Differences

This shows you the differences between two versions of the page.

--- cs501r_f2016:lab13 [2017/11/11 17:05]
wingated
+++ cs501r_f2016:lab13 [2021/06/30 23:42]
@@ Line 1: / Line 1: @@
-====Objective:====
-To explore an alternative use of DNNs by implementing the style transfer algorithm.
-----
-====Deliverable:====
-{{ :cs501r_f2016:style1.png?300|}}
-For this lab, you will need to implement the style transfer algorithm of [[https://arxiv.org/pdf/1508.06576v2.pdf|Gatys et al.]]
-  - You must extract statistics from the content and style images
-  - You must formulate an optimization problem over an input image
-  - You must optimize the image to match both style and content
-You should turn in the following:
-  - The final image that you generated
-  - Your code
-An example image that I generated is shown at the right.
-----
-====Grading standards:====
-Your code will be graded on the following:
-  * 35% Correct extraction of statistics
-  * 35% Correct construction of cost function
-  * 20% Correct initialization and optimization of image variable
-  * 10% Awesome looking final image
-----
-====Description:====
-For this lab, you should implement the style transfer algorithm referenced above.  We are providing the following, [[https://www.dropbox.com/sh/tt0ctms12aumgui/AACRKSSof6kw-wi8vs1v8ls3a?dl=0|available from a Dropbox folder]]:
-  - lab10_scaffold.py - Lab 10 scaffolding code
-  - vgg16.py.txt - The VGG16 model
-  - content.png - An example content image
-  - style.png - An example style image
-You will also need the VGG16 pre-trained weights:
-  - [[http://liftothers.org/byu/vgg16_weights.npz|VGG16 weights]]
-In the scaffolding code, you will find some examples of how to use the provided VGG model.  (This model is a slightly modified version of [[https://www.cs.toronto.edu/~frossard/post/vgg16/|code available here]]).
-**Note:** In class, we discussed how to construct a computation graph that reuses the VGG network 3 times (once each for the content, style, and optimization images).  It turns out that you don't need to do that.  Instead, you only need to //evaluate// the VGG network on the content and style images, and save the resulting activations.
-The activations can be used to construct a cost function directly.  In other words, we don't need to keep around the content/style VGG networks, because we'll never back-propagate through them.
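A minimal sketch of this idea in NumPy. The function `toy_features` is a hypothetical stand-in for a VGG forward pass (it is not the API in the scaffolding code), and the image is random data. The point is only the pattern: the content activations are computed once and kept as a constant array, and the content loss (half the summed squared difference, as in Gatys et al.) compares the optimization image's activations against that constant.

```python
import numpy as np

def toy_features(image):
    """Hypothetical stand-in for one VGG layer: 2x2 average pooling."""
    h, w, c = image.shape
    return image.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def content_loss(features, target):
    """Content loss: half the summed squared difference to a fixed target."""
    return 0.5 * np.sum((features - target) ** 2)

rng = np.random.default_rng(0)
content_image = rng.uniform(0, 255, size=(8, 8, 3))  # made-up image data

# Evaluate the "network" once on the content image and keep the result
# as a constant; nothing is ever back-propagated through this pass.
content_target = toy_features(content_image)

# Only the optimization image goes through the network during training.
opt_image = content_image.copy()
loss_at_init = content_loss(toy_features(opt_image), content_target)
```

Because `opt_image` starts as a copy of the content image, `loss_at_init` is exactly 0.0 in this sketch.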
-The steps for completion of this lab are:
-  - Run the VGG network on the content and style images.  Save the activations.
-  - Construct a content loss function, based on the paper
-  - Construct a style loss function, based on the paper
-    - For each layer specified in the paper (also noted in the code), you'll need to construct a Gram matrix
-    - That Gram matrix should match an equivalent Gram matrix computed on the style activations
-  - Construct an Adam optimizer, step size 0.1
-  - Initialize all of your variables and reload your VGG weights
-  - Initialize your optimization image to be the content image (or another image of your choosing)
-  - Optimize!
-Some of these steps are already done in the scaffolding code.
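The Gram matrix step above can be sketched in NumPy as follows. This assumes activations stored as a (height, width, channels) array and uses the per-layer normalization 1 / (4 N^2 M^2) from Gatys et al.; the function names and the random activations are illustrative and are not taken from the scaffolding code.

```python
import numpy as np

def gram_matrix(activations):
    """Gram matrix of one layer's activations, shape (height, width, channels)."""
    h, w, c = activations.shape
    flat = activations.reshape(h * w, c)  # one row per spatial position
    return flat.T @ flat                  # (channels, channels) correlations

def layer_style_loss(activations, style_gram):
    """Per-layer style loss, normalized by 4 * N^2 * M^2."""
    h, w, c = activations.shape
    n, m = c, h * w  # N = number of filters, M = size of each feature map
    diff = gram_matrix(activations) - style_gram
    return np.sum(diff ** 2) / (4.0 * n ** 2 * m ** 2)

rng = np.random.default_rng(0)
style_acts = rng.normal(size=(4, 4, 3))  # made-up activations, not VGG output
style_gram = gram_matrix(style_acts)     # computed once from the style image

other_acts = rng.normal(size=(4, 4, 3))
loss_same = layer_style_loss(style_acts, style_gram)
loss_other = layer_style_loss(other_acts, style_gram)
```

In a full implementation, one such term would be computed for each style layer and the terms summed; how the layers are weighted is left to the paper and the scaffolding code.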
-Note that I ran my DNN for about 6000 steps to generate the image shown above.
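To illustrate what the optimizer does to the image variable, here is a hand-written Adam loop in NumPy with step size 0.1. It is a sketch under assumptions: the loss is a toy pixel-space quadratic rather than the VGG-based loss, and the beta and epsilon values are the commonly used defaults, which the lab does not specify. In the lab itself you would use the framework's built-in Adam optimizer.

```python
import numpy as np

def adam_minimize(grad_fn, x0, steps, step_size=0.1,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Run plain Adam updates on an array, starting from x0."""
    x = x0.astype(np.float64).copy()
    m = np.zeros_like(x)  # running mean of gradients
    v = np.zeros_like(x)  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= step_size * m_hat / (np.sqrt(v_hat) + eps)
    return x

rng = np.random.default_rng(0)
target = rng.uniform(0, 255, size=(4, 4, 3))  # toy target image

def toy_loss(img):
    """Toy stand-in for the real loss: half the squared pixel distance."""
    return 0.5 * np.sum((img - target) ** 2)

def toy_grad(img):
    """Gradient of toy_loss with respect to the image."""
    return img - target

start = np.full_like(target, 128.0)
result = adam_minimize(toy_grad, start, steps=200)
```

Each Adam update changes a pixel by roughly the step size at most, so with a step size of 0.1 the image changes slowly, which fits with needing thousands of iterations.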
-Here was my loss function over time:
-<code>
-ITER    LOSS            STYLE LOSS      CONTENT LOSS
-       210537.875000   210537872.00000 0.000000
-     73993.000000    67282552.000000 6710.441406
-     47634.054688    39536856.000000 8097.196777
-     36499.234375    28016930.000000 8482.302734
-     30405.132812    21805504.000000 8599.625977
-     26572.333984    17947418.000000 8624.916016
-     23952.351562    15339518.000000 8612.833008
-     22057.589844    13475838.000000 8581.751953
-     20623.390625    12093137.000000 8530.253906
-     19504.234375    11023667.000000 8480.566406
-    18598.349609    10174618.000000 8423.731445
-    17857.289062    9491233.000000  8366.055664
-    17243.207031    8932358.000000  8310.849609
-    16727.312500    8470261.000000  8257.049805
-    16287.441406    8079912.500000  8207.528320
-    15904.160156    7747010.500000  8157.148926
-    15567.595703    7453235.500000  8114.359863
-    15269.226562    7199946.500000  8069.279297
-    15003.159180    6973264.000000  8029.895020
-    14762.021484    6776666.500000  7985.354492
-    14544.566406    6602410.000000  7942.156738
-    14347.167969    6442019.000000  7905.148926
-    14166.757812    6299105.500000  7867.651367
-    13999.201172    6169558.500000  7829.643066
-    13845.177734    6053753.000000  7791.424316
-    13701.140625    5946503.500000  7754.636230
-    13566.027344    5846906.000000  7719.121582
-    13440.531250    5751874.500000  7688.655762
-    13322.011719    5664197.500000  7657.814453
-    13210.117188    5585183.000000  7624.934570
-    13105.109375    5510268.000000  7594.841797
-    13005.414062    5440027.500000  7565.385742
-    12912.160156    5376126.000000  7536.033203
-    12824.537109    5316451.500000  7508.085938
-    12742.234375    5259337.500000  7482.895996
-    12663.185547    5202367.500000  7460.817871
-    12588.695312    5151772.000000  7436.922363
-    12517.728516    5103315.000000  7414.413574
-    12450.191406    5055678.000000  7394.513184
-    12385.476562    5012455.000000  7373.021484
-    12323.820312    4973657.000000  7350.163086
-    12263.249023    4937481.000000  7325.767578
-    12204.673828    4898750.000000  7305.923340
-    12148.785156    4860086.000000  7288.698242
-    12095.140625    4822883.500000  7272.257324
-    12043.544922    4787642.500000  7255.902832
-    11992.242188    4753499.500000  7238.742188
-    11942.533203    4722825.500000  7219.708008
-    11895.559570    4695372.500000  7200.187012
-    11849.578125    4666181.000000  7183.397461
-    11804.967773    4639222.500000  7165.745117
-    11762.816406    4614679.500000  7148.136719
-    11722.379883    4589744.000000  7132.635742
-    11682.291016    4565345.000000  7116.945312
-    11642.744141    4541704.500000  7101.039062
-    11604.595703    4519445.000000  7085.149902
-    11568.400391    4497892.000000  7070.507812
-    11533.195312    4478154.000000  7055.040527
-    11497.519531    4459191.000000  7038.328125
-    11463.125977    4439539.000000  7023.586914
-    11429.999023    4421518.000000  7008.480957
-</code>
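The logged values are consistent with a total loss equal to the content loss plus 0.001 times the style loss. That weight is inferred from the numbers in the log, not stated in the lab text. A quick check on three rows copied from the log:

```python
# Rows copied from the log above: (total loss, style loss, content loss).
rows = [
    (210537.875000, 210537872.0, 0.000000),
    (73993.000000, 67282552.0, 6710.441406),
    (11429.999023, 4421518.0, 7008.480957),
]

STYLE_WEIGHT = 1e-3  # inferred from the log, not given in the lab text

def total_loss(style, content, style_weight=STYLE_WEIGHT):
    """Weighted sum of the style and content terms."""
    return content + style_weight * style

# Largest relative gap between the logged total and the reconstructed one.
max_rel_error = max(abs(total_loss(s, c) - t) / t for t, s, c in rows)
```

The relative error stays below one part in a million for these rows, which is within rounding of the printed values.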
-----
-====Hints:====
-As a sanity check, if you initialize your image to the content image and your loss function is strictly the content loss, your loss should be exactly 0.0.
-I found that it was important to clip pixel values to be in [0,255].  To do that, every 100 iterations I extracted the image, clipped it, and then assigned it back in.
-...although now that I think about it, perhaps I should have been operating on whitened images from the beginning!  You should probably try that.
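The clipping step described above can be sketched in NumPy as follows. The optimizer update is left as a placeholder comment, and the tiny array stands in for the image; in a graph-based framework you would read the image variable out, clip it, and assign it back.

```python
import numpy as np

def clip_pixels(image, low=0.0, high=255.0):
    """Return a copy of the image with every value forced into [low, high]."""
    return np.clip(image, low, high)

CLIP_EVERY = 100  # clip interval taken from the hint above

image = np.array([[-12.5, 40.0], [255.0, 301.2]])  # made-up pixel values
for step in range(1, 201):
    # ... one optimizer update on `image` would go here ...
    if step % CLIP_EVERY == 0:
        image = clip_pixels(image)
```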
-----
-====Bonus:====
-There's no official extra credit for this lab, but have some fun with it!  Try different content and different styles.  See if you can get nicer, higher resolution images out of it.
-Also, take a look at the vgg16.py code.  What happens if you swap out max pooling for average pooling?
-What difference does whitening the input images make?
-Show me the awesome results you can generate!
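For the pooling question, here is a small NumPy sketch of the two operations on a single feature map, assuming 2x2 windows with stride 2. It is illustrative only and is not the code in vgg16.py.

```python
import numpy as np

def pool2x2(feature_map, mode="max"):
    """2x2 pooling with stride 2 over a (height, width) map with even sides."""
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "avg":
        return blocks.mean(axis=(1, 3))
    raise ValueError(f"unknown pooling mode: {mode}")

fmap = np.array([[1.0, 2.0, 0.0, 0.0],
                 [3.0, 4.0, 0.0, 8.0],
                 [5.0, 5.0, 1.0, 1.0],
                 [5.0, 5.0, 1.0, 1.0]])

max_pooled = pool2x2(fmap, "max")  # [[4.0, 8.0], [5.0, 1.0]]
avg_pooled = pool2x2(fmap, "avg")  # [[2.5, 2.0], [5.0, 1.0]]
```

Max pooling passes gradient only to the largest value in each window, while average pooling spreads it across all four, which is one reason the two can produce visibly different images.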
