====Objective:====
----
====Deliverable:====

{{ :cs501r_f2016:style1.png?300|}}
  
For this lab, you will need to implement the style transfer algorithm of [[https://arxiv.org/pdf/1508.06576v2.pdf|Gatys et al]].
  - The final image that you generated
  - Your code

An example image that I generated is shown at the right.
  
----
====Grading standards:====
  
Your code will be graded on the following:
  
  * 35% Correct extraction of statistics
  * 35% Correct construction of cost function
  * 20% Correct initialization and optimization of image variable
  * 10% Awesome looking final image
  
----
====Description:====
  
For this lab, you should implement the style transfer algorithm referenced above.  We are providing the following, [[https://www.dropbox.com/sh/tt0ctms12aumgui/AACRKSSof6kw-wi8vs1v8ls3a?dl=0|available from a dropbox folder]]:

  - lab10_scaffold.py - Lab 10 scaffolding code
  - vgg16.py - The VGG16 model
  - content.png - An example content image
  - style.png - An example style image

You will also need the VGG16 pre-trained weights:
  
  - [[http://liftothers.org/byu/vgg16_weights.npz|VGG16 weights]]
  
In the scaffolding code, you will find some examples of how to use the provided VGG model.  (This model is a slightly modified version of [[https://www.cs.toronto.edu/~frossard/post/vgg16/|code available here]]).
  
**Note:** In class, we discussed how to construct a computation graph that reuses the VGG network 3 times (one each for the content, style, and optimization images).  It turns out that you don't need to do that.  In fact, we merely need to //evaluate// the VGG network on the content and style images, and save the resulting activations.

The activations can be used to construct a cost function directly.  In other words, we don't need to keep around the content/style VGG networks, because we'll never back-propagate through them.
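
For instance, here is a minimal sketch of that evaluate-and-save step.  It assumes the scaffold exposes an input placeholder ''vgg.imgs'' and per-layer activation tensors named after the paper's layers; check vgg16.py for the real attribute names:

<code python>
import numpy as np

# Evaluate VGG once on each fixed image and keep the numpy results.
# Layer choices follow Gatys et al: conv4_2 for content, conv1_1..conv5_1 for style.
# The attribute names (vgg.imgs, vgg.conv4_2, ...) are assumptions -- see vgg16.py.
content_acts = sess.run(vgg.conv4_2,
                        feed_dict={vgg.imgs: content_img[np.newaxis]})
style_acts = sess.run([vgg.conv1_1, vgg.conv2_1, vgg.conv3_1,
                       vgg.conv4_1, vgg.conv5_1],
                      feed_dict={vgg.imgs: style_img[np.newaxis]})
# content_acts and style_acts are now plain numpy arrays; they enter the cost
# function as constants, so nothing ever back-propagates into these networks.
</code>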
  
The steps for completion of this lab are:
  
  - Run the VGG network on the content and style images.  Save the activations
  - Construct a content loss function, based on the paper
  - Construct a style loss function, based on the paper (see the sketch below)
    - For each layer specified in the paper (also noted in the code), you'll need to construct a Gram matrix
    - That Gram matrix should match an equivalent Gram matrix computed on the style activations
  - Construct an Adam optimizer, step size 0.1
  - Initialize all of your variables and reload your VGG weights
  - Initialize your optimization image to be the content image (or another image of your choosing)
  - Optimize!
  
Some of these steps are already done in the scaffolding code.
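
To make the cost construction concrete, here is a rough sketch of the Gram matrix and the two losses.  The Gram/loss formulas follow the paper; everything else (the variable names, the ''alpha''/''beta'' weights, the image variable ''opt_img'') is illustrative, not the scaffold's actual code:

<code python>
import tensorflow as tf

def gram_matrix(acts):
    # acts: [1, H, W, C] feature maps.  Flatten space, keep channels.
    C = acts.get_shape().as_list()[3]
    F = tf.reshape(acts, [-1, C])              # [H*W, C]
    return tf.matmul(F, F, transpose_a=True)   # [C, C] Gram matrix

# Content loss: squared error against the saved (constant) content activations.
content_loss = 0.5 * tf.reduce_sum(tf.square(opt_content_acts - content_acts))

# Style loss: per-layer Gram mismatch, normalized as in the paper.
# (The paper also weights each layer by w_l = 1/5; fold that in as you like.)
style_loss = 0.0
for acts, target_gram, (H, W, C) in zip(opt_style_acts, style_grams, style_shapes):
    G = gram_matrix(acts)
    style_loss += tf.reduce_sum(tf.square(G - target_gram)) / (
        4.0 * (C ** 2) * ((H * W) ** 2))

alpha, beta = 1e-3, 1.0   # illustrative; the paper discusses the alpha/beta ratio
total_loss = alpha * content_loss + beta * style_loss

# Adam, step size 0.1, optimizing ONLY the image -- not the VGG weights.
train_step = tf.train.AdamOptimizer(0.1).minimize(total_loss, var_list=[opt_img])
</code>

The ''var_list'' argument is the important design choice here: the image is the only trainable variable, and the VGG weights stay frozen.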
  
Note that I ran my DNN for about 6000 steps to generate the image shown above.
  
Here was my loss function over time:
  
- +<​code>​ 
----- +ITER    LOSS            STYLE LOSS      CONTENT LOSS 
-====Writeup:​==== +0       ​210537.875000 ​  ​210537872.00000 0.000000 
- +100     ​73993.000000 ​   67282552.000000 6710.441406 
-As discussed in the "​Deliverable"​ section, your writeup must include the following: +200     ​47634.054688 ​   39536856.000000 8097.196777 
- +300     ​36499.234375 ​   28016930.000000 8482.302734 
-  ​- A description of your test/​training split +400     ​30405.132812 ​   21805504.000000 8599.625977 
-  ​- A description of your resnet architecture (layers, strides, nonlinearities,​ etc.) +500     ​26572.333984 ​   17947418.000000 8624.916016 
-  ​- How you assessed whether or not your architecture was working +600     ​23952.351562 ​   15339518.000000 8612.833008 
-  ​- The final performance of your classifier +700     ​22057.589844 ​   13475838.000000 8581.751953 
- +800     ​20623.390625 ​   12093137.000000 8530.253906 
-This writeup should be small - less than 1 page.  ​You don't need to wax eloquent.+900     ​19504.234375 ​   11023667.000000 8480.566406 
 +1000    18598.349609 ​   10174618.000000 8423.731445 
 +1100    17857.289062 ​   9491233.000000 ​ 8366.055664 
 +1200    17243.207031 ​   8932358.000000 ​ 8310.849609 
 +1300    16727.312500 ​   8470261.000000 ​ 8257.049805 
 +1400    16287.441406 ​   8079912.500000 ​ 8207.528320 
 +1500    15904.160156 ​   7747010.500000 ​ 8157.148926 
 +1600    15567.595703 ​   7453235.500000 ​ 8114.359863 
 +1700    15269.226562 ​   7199946.500000 ​ 8069.279297 
 +1800    15003.159180 ​   6973264.000000 ​ 8029.895020 
 +1900    14762.021484 ​   6776666.500000 ​ 7985.354492 
 +2000    14544.566406 ​   6602410.000000 ​ 7942.156738 
 +2100    14347.167969 ​   6442019.000000 ​ 7905.148926 
 +2200    14166.757812 ​   6299105.500000 ​ 7867.651367 
 +2300    13999.201172 ​   6169558.500000 ​ 7829.643066 
 +2400    13845.177734 ​   6053753.000000 ​ 7791.424316 
 +2500    13701.140625 ​   5946503.500000 ​ 7754.636230 
 +2600    13566.027344 ​   5846906.000000 ​ 7719.121582 
 +2700    13440.531250 ​   5751874.500000 ​ 7688.655762 
 +2800    13322.011719 ​   5664197.500000 ​ 7657.814453 
 +2900    13210.117188 ​   5585183.000000 ​ 7624.934570 
 +3000    13105.109375 ​   5510268.000000 ​ 7594.841797 
 +3100    13005.414062 ​   5440027.500000 ​ 7565.385742 
 +3200    12912.160156 ​   5376126.000000 ​ 7536.033203 
 +3300    12824.537109 ​   5316451.500000 ​ 7508.085938 
 +3400    12742.234375 ​   5259337.500000 ​ 7482.895996 
 +3500    12663.185547 ​   5202367.500000 ​ 7460.817871 
 +3600    12588.695312 ​   5151772.000000 ​ 7436.922363 
 +3700    12517.728516 ​   5103315.000000 ​ 7414.413574 
 +3800    12450.191406 ​   5055678.000000 ​ 7394.513184 
 +3900    12385.476562 ​   5012455.000000 ​ 7373.021484 
 +4000    12323.820312 ​   4973657.000000 ​ 7350.163086 
 +4100    12263.249023 ​   4937481.000000 ​ 7325.767578 
 +4200    12204.673828 ​   4898750.000000 ​ 7305.923340 
 +4300    12148.785156 ​   4860086.000000 ​ 7288.698242 
 +4400    12095.140625 ​   4822883.500000 ​ 7272.257324 
 +4500    12043.544922 ​   4787642.500000 ​ 7255.902832 
 +4600    11992.242188 ​   4753499.500000 ​ 7238.742188 
 +4700    11942.533203 ​   4722825.500000 ​ 7219.708008 
 +4800    11895.559570 ​   4695372.500000 ​ 7200.187012 
 +4900    11849.578125 ​   4666181.000000 ​ 7183.397461 
 +5000    11804.967773 ​   4639222.500000 ​ 7165.745117 
 +5100    11762.816406 ​   4614679.500000 ​ 7148.136719 
 +5200    11722.379883 ​   4589744.000000 ​ 7132.635742 
 +5300    11682.291016 ​   4565345.000000 ​ 7116.945312 
 +5400    11642.744141 ​   4541704.500000 ​ 7101.039062 
 +5500    11604.595703 ​   4519445.000000 ​ 7085.149902 
 +5600    11568.400391 ​   4497892.000000 ​ 7070.507812 
 +5700    11533.195312 ​   4478154.000000 ​ 7055.040527 
 +5800    11497.519531 ​   4459191.000000 ​ 7038.328125 
 +5900    11463.125977 ​   4439539.000000 ​ 7023.586914 
 +6000    11429.999023 ​   4421518.000000 ​ 7008.480957 
 +</​code>​
  
----
====Hints:====
  
You should make sure that if you initialize your image to the content image and your loss function is strictly the content loss, your loss comes out to exactly 0.0.
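
In code, that sanity check might look like this (a sketch; it reuses the hypothetical ''opt_img''/''content_loss'' names from the sketch above):

<code python>
# With the optimization image set to the content image and the style term
# switched off, the content loss should evaluate to exactly 0.0.
sess.run(tf.assign(opt_img, content_img[np.newaxis].astype(np.float32)))
print(sess.run(content_loss))   # expect 0.0
</code>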
  
I found that it was important to clip pixel values to be in [0,255].  To do that, every 100 iterations I extracted the image, clipped it, and then assigned it back in.
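
One way to implement that (again a sketch, assuming a ''tf.Variable'' called ''opt_img''; build the assign op once rather than inside the loop):

<code python>
clip_in = tf.placeholder(tf.float32, opt_img.get_shape())
clip_op = tf.assign(opt_img, clip_in)

for i in range(6001):
    sess.run(train_step)                     # one Adam step on total_loss
    if i % 100 == 0:
        img = sess.run(opt_img)              # extract the current image
        img = np.clip(img, 0.0, 255.0)       # clip to the valid pixel range
        sess.run(clip_op, feed_dict={clip_in: img})   # assign it back in
</code>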
  
**...although now that I think about it, perhaps I should have been operating on whitened images from the beginning!  You should probably try that.**
  
----
====Bonus:====
  
There's no official extra credit for this lab, but have some fun with it!  Try different content and different styles.  See if you can get nicer, higher resolution images out of it.

Also, take a look at the vgg16.py code.  What happens if you swap out max pooling for average pooling?
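
If you look at how the model builds its pooling layers, the experiment is a one-line change per layer.  A sketch (the exact call site and tensor names in vgg16.py may differ):

<code python>
# Original (max pooling):
pool1 = tf.nn.max_pool(conv1_2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                       padding='SAME', name='pool1')

# Average-pooling variant -- Gatys et al report smoother results with this:
pool1 = tf.nn.avg_pool(conv1_2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                       padding='SAME', name='pool1')
</code>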
  
What difference does whitening the input images make?
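
A simple per-image whitening sketch, under one reasonable interpretation (zero mean, unit variance; keep the statistics so you can undo it for display):

<code python>
def whiten(img):
    # Zero-mean, unit-variance version of the image, plus the stats to undo it.
    mu, sigma = img.mean(), img.std()
    return (img - mu) / sigma, mu, sigma

def unwhiten(img, mu, sigma):
    # Map a whitened image back to pixel space for viewing/saving.
    return img * sigma + mu
</code>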
  
Show me the awesome results you can generate!
  