  * 35% Correct extraction of statistics
  * 45% Correct construction of loss function in a loss class (see the sketch below)
  * 10% Correct initialization and optimization of image variable in a dataset class
  * 10% Awesome looking final image
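
As a rough illustration of the "statistics" and "loss" pieces of the rubric, here is a minimal sketch of a Gram-matrix style loss in PyTorch. The names ''gram_matrix'' and ''StyleLoss'' are our own, and this is one plausible shape for the loss class, not a required one:

<code python>
import torch.nn as nn
import torch.nn.functional as F

def gram_matrix(activations):
    # activations: a (batch, channels, height, width) feature map from one VGG layer
    b, c, h, w = activations.size()
    features = activations.view(b * c, h * w)
    # inner products between channel activations, normalized by feature-map size
    return features @ features.t() / (b * c * h * w)

class StyleLoss(nn.Module):
    def __init__(self, target_activations):
        super(StyleLoss, self).__init__()
        # precompute the Gram matrix of the style image's activations once
        self.target = gram_matrix(target_activations).detach()

    def forward(self, input_activations):
        # mean squared error between the two Gram matrices
        return F.mse_loss(gram_matrix(input_activations), self.target)
</code>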
  
  
Once the images have been uploaded to the local filesystem, you can use:
  
<code>
</code>
  
**For reference on the network, we will give you a [[https://pytorch.org/tutorials/advanced/neural_style_tutorial.html#sphx-glr-download-advanced-neural-style-tutorial-py|pytorch implementation]] to look at and to implement. It should not simply be copy-pasted; rather, you should implement the steps described in the paper. The tutorial uses more intuitive notation than the Gatys et al. paper and does a good job of explaining what each step does.**
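
To make the overall flow concrete, here is a minimal sketch of the optimization loop described in the paper: the trainable parameter is the image itself, not the network weights. Everything here (''content_img'', ''vgg'', ''content_loss'', ''style_loss'', ''style_weight'', the optimizer choice, learning rate, and step count) is illustrative rather than prescribed:

<code python>
import torch

# start the image being optimized from a copy of the content image
opt_img = content_img.clone().requires_grad_(True)
optimizer = torch.optim.Adam([opt_img], lr=0.05)  # optimize pixels, not weights

for step in range(300):
    optimizer.zero_grad()
    feats = vgg(opt_img)  # intermediate VGG activations for the current image
    loss = content_loss(feats) + style_weight * style_loss(feats)
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        opt_img.clamp_(0, 1)  # keep pixel values in a displayable range
</code>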
  
Additionally, the paper talks about the VGG network; the authors use the VGG16 network, and we will do the same. You do NOT need to implement this! PyTorch ships a pretrained VGG16 model in torchvision. Here we will show you how to access it.
<code python>
    self.vgg = models.vgg16(pretrained=True).features.eval()
    for i, m in enumerate(self.vgg.children()):
        if isinstance(m, nn.ReLU):  # set the relu layers to NOT do the relu in place;
            m.inplace = False       # the model has a hard time going backwards on in-place functions

        if i in requested:
            def curry(i):
                def hook(module, input, output):
                    self.intermediates[i] = output
                return hook
            m.register_forward_hook(curry(i))
</code>
You are welcome and encouraged to submit any other style transfer photographs you have, as long as you also submit the required image. Show us the awesome results you can generate!
  
----
====Hints and usefulness====
  
A former student contributed the following:
  
Normalizing the image at each timestep is critical. Here's what I did.

Some extra things I did in the code snippet (in case they are useful):

  * I changed the VGG code to use a dict, which in my opinion made things a lot easier.
  * Swapped the max pool layers for avg pool layers (rather hackily...)
  * I used a style scale of 500000 and a content scale of 1 (not in this code; see the sketch below)
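
To show where those scales would plug in, here is a tiny sketch; the two loss values and their names are assumed to have been computed elsewhere from the requested intermediates:

<code python>
style_scale, content_scale = 500000, 1  # the scales reported above
total_loss = style_scale * style_loss_value + content_scale * content_loss_value
total_loss.backward()  # gradients flow back to the image being optimized
</code>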

PS: if you try to use torchvision.transforms.Normalize(), it won't work because it is missing a ''forward()'' and thus a ''backward()'' as well...

<code python>
from collections import OrderedDict

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class Normalization(nn.Module):
    def __init__(
        self,
        mean=torch.tensor([0.485, 0.456, 0.406]).to(device),
        std=torch.tensor([0.229, 0.224, 0.225]).to(device),
    ):
        super(Normalization, self).__init__()
        # reshape to (C, 1, 1) so the statistics broadcast over H and W
        self.mean = mean.view(-1, 1, 1)
        self.std = std.view(-1, 1, 1)

    def forward(self, img):
        return (img - self.mean) / self.std


class VGGIntermediate(nn.Module):
    def __init__(self, requested=[], transforms=[Normalization()]):
        super(VGGIntermediate, self).__init__()

        self.transforms = transforms
        self.vgg = models.vgg16(pretrained=True).features.eval()

        layers_in_order = [
            "conv1_1", "relu1_1", "conv1_2", "relu1_2", "maxpool1",
            "conv2_1", "relu2_1", "conv2_2", "relu2_2", "maxpool2",
            "conv3_1", "relu3_1", "conv3_2", "relu3_2", "conv3_3", "relu3_3", "maxpool3",
            "conv4_1", "relu4_1", "conv4_2", "relu4_2", "conv4_3", "relu4_3", "maxpool4",
            "conv5_1", "relu5_1", "conv5_2", "relu5_2", "conv5_3", "relu5_3", "maxpool5",
        ]

        self.intermediates = OrderedDict()
        for layer_name, m in zip(layers_in_order, self.vgg.children()):
            if isinstance(m, nn.ReLU):
                # out-of-place ReLU so autograd can differentiate through it
                m.inplace = False
            elif isinstance(m, nn.MaxPool2d):
                # swap max pooling for average pooling; binding m=m as a default
                # argument freezes each layer's own parameters into its lambda
                m.forward = lambda x, m=m: F.avg_pool2d(
                    x, m.kernel_size, m.stride, m.padding
                )

            if layer_name in requested:

                def curry(name):
                    def hook(module, input, output):
                        self.intermediates[name] = output

                    return hook

                m.register_forward_hook(curry(layer_name))

    def forward(self, x):
        for transform in self.transforms:
            x = transform(x)
        self.vgg(x)
        return self.intermediates
</code>
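
Here is an illustrative usage sketch for the class above; the layer choices and the variable ''img'' are our own, not prescribed by the lab:

<code python>
style_layers = ["relu1_2", "relu2_2", "relu3_3", "relu4_3"]
content_layers = ["relu4_2"]

vgg = VGGIntermediate(requested=style_layers + content_layers).to(device)

feats = vgg(img)  # img: a hypothetical (1, 3, H, W) tensor on device, scaled to [0, 1]
style_feats = [feats[name] for name in style_layers]
content_feats = [feats[name] for name in content_layers]
</code>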
  