====Objective:====

To explore an alternative use of DNNs by implementing the style transfer algorithm.

----
====Deliverable:====

For this lab, you will need to implement the style transfer algorithm of [[https://arxiv.org/pdf/1508.06576v2.pdf|Gatys et al]]. A sketch of the core losses follows the list below.

  - You must extract statistics from the content and style images
  - You must formulate an optimization problem over an input image
  - You must optimize the image to match both style and content
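
The statistics in question are the raw activations (for content) and the Gram matrices of activations (for style) at selected VGG layers. Here is a minimal sketch of the loss ingredients, assuming TensorFlow and activation tensors of shape ''(1, H, W, C)''; the function names are illustrative, not part of the provided scaffolding:

<code python>
import tensorflow as tf

def gram_matrix(acts):
    # acts: a (1, H, W, C) activation tensor from one VGG layer
    _, h, w, c = acts.get_shape().as_list()
    feats = tf.reshape(acts, (h * w, c))              # flatten the spatial dims
    return tf.matmul(feats, feats, transpose_a=True)  # (C, C) feature correlations

def content_loss(content_acts, opt_acts):
    # squared error between content-image and optimization-image activations
    return 0.5 * tf.reduce_sum(tf.square(opt_acts - content_acts))

def style_loss(style_acts, opt_acts):
    # squared error between Gram matrices, normalized as in the paper
    _, h, w, c = opt_acts.get_shape().as_list()
    G = gram_matrix(opt_acts)
    A = gram_matrix(style_acts)
    return tf.reduce_sum(tf.square(G - A)) / (4.0 * (c ** 2) * ((h * w) ** 2))
</code>

The total objective is a weighted sum, ''alpha * content_loss + beta * style_loss'' (summed over your chosen style layers), minimized with respect to the pixels of the optimization image.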

You should turn in the following:

  - The final image that you generated
  - Your code

----
====Grading standards:====

Your notebook will be graded on the following:

  * 35% Correct implementation of Siamese network
  * 35% Correct implementation of ResNet
  * 20% Reasonable effort to find a good-performing topology
  * 10% Results writeup

----
====Description:====

For this lab, you should implement the style transfer algorithm referenced above. We are providing the following:

  - [[http://liftothers.org/byu/lab10_scaffold.py|Lab 10 scaffolding code]]
  - [[http://liftothers.org/byu/vgg16.py|The VGG16 model]]
  - [[http://liftothers.org/byu/vgg16_weights.npz|VGG16 weights]]
  - [[http://liftothers.org/byu/content.png|An example content image]]
  - [[http://liftothers.org/byu/style.png|An example style image]]

In the scaffolding code, you will find some examples of how to use the provided VGG model. (This model is a slightly modified version of the [[https://www.cs.toronto.edu/~frossard/post/vgg16/|code available here]].)

**Note:** In class, we discussed how to construct a computation graph that reuses the VGG network three times (once each for the content, style, and optimization images). It turns out that you don't need to do that. In fact, you merely need to //evaluate// the VGG network on the content and style images, and save the resulting activations.
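
For example, something like the following. This is a sketch only: it assumes the ''vgg16'' class from the provided ''vgg16.py'' exposes its layer activations as attributes such as ''conv4_2'' (as in the Frossard code it is based on), and that ''content_img'' and ''style_img'' are already-preprocessed ''(1, 224, 224, 3)'' arrays:

<code python>
import tensorflow as tf
from vgg16 import vgg16   # the provided model file

sess = tf.Session()
imgs = tf.placeholder(tf.float32, [1, 224, 224, 3])
vgg = vgg16(imgs, 'vgg16_weights.npz', sess)

# evaluate the network ONCE on each image and keep the activations
# around as plain numpy arrays -- no need to build three copies of VGG
content_acts = sess.run(vgg.conv4_2, feed_dict={imgs: content_img})
style_acts = [sess.run(layer, feed_dict={imgs: style_img})
              for layer in [vgg.conv1_1, vgg.conv2_1, vgg.conv3_1,
                            vgg.conv4_1, vgg.conv5_1]]
</code>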

The steps for completion of this lab are:

  - Load all of the data. Create a test/training split.
  - Establish a baseline accuracy (i.e., if you randomly predict same/different, what accuracy do you achieve?)
  - Use TensorFlow to create your Siamese network.
    - Use ResNets to extract features from the images
    - Make sure that parameters are shared across both halves of the network! (See the weight-sharing sketch after this list.)
  - Train the network using an optimizer of your choice
    - You should use some sort of SGD.
    - You will need to sample same/different pairs (a sampling sketch appears at the end of the Hints section).
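
The crucial Siamese trick is the weight sharing. In TensorFlow 1.x this can be done with variable scopes; here is a minimal sketch, with a tiny stand-in feature extractor where your ResNet would go:

<code python>
import tensorflow as tf

def features(x):
    # stand-in feature extractor -- replace with your ResNet.
    # anything created with tf.get_variable inside the scope is shared.
    W = tf.get_variable('conv1_w', [3, 3, 1, 16],
                        initializer=tf.truncated_normal_initializer(stddev=0.1))
    h = tf.nn.relu(tf.nn.conv2d(x, W, strides=[1, 2, 2, 1], padding='SAME'))
    return tf.reshape(h, [tf.shape(h)[0], -1])

img_a = tf.placeholder(tf.float32, [None, 250, 250, 1])
img_b = tf.placeholder(tf.float32, [None, 250, 250, 1])

with tf.variable_scope('siamese') as scope:
    feats_a = features(img_a)
    scope.reuse_variables()      # the second half reuses the SAME weights
    feats_b = features(img_b)
</code>

Because both halves read from the same variables, swapping ''img_a'' and ''img_b'' changes nothing, which is exactly the symmetry property mentioned below.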

Note: you will NOT be graded on the accuracy of your final classifier, as long as you make a good-faith effort to come up with something that performs reasonably well.

Your ResNet should extract a vector of features from each image. Those feature vectors should then be compared to calculate an "energy"; that energy should then be input into a contrastive loss function, as discussed in class.
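
A minimal sketch of one standard contrastive loss (the Hadsell et al. margin form; the margin value here is an assumed hyperparameter, not something we prescribe):

<code python>
import tensorflow as tf

def contrastive_loss(feats_a, feats_b, same, margin=1.0):
    # same is 1.0 for same-subject pairs, 0.0 for different-subject pairs.
    # energy: Euclidean distance between the two feature vectors
    energy = tf.sqrt(tf.reduce_sum(tf.square(feats_a - feats_b), axis=1) + 1e-8)
    # pull same-pairs together, push different-pairs out past the margin
    loss_same = same * tf.square(energy)
    loss_diff = (1.0 - same) * tf.square(tf.maximum(0.0, margin - energy))
    return tf.reduce_mean(loss_same + loss_diff)
</code>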

Remember that your network should be symmetric, so if you swap the input images, nothing should change.

Note that some people in the database only have one image. These images are still useful, however (why?), so don't just throw them away.

----
====Writeup:====

As discussed in the "Deliverable" section, your writeup must include the following:

  - A description of your test/training split
  - A description of your ResNet architecture (layers, strides, nonlinearities, etc.)
  - How you assessed whether or not your architecture was working
  - The final performance of your classifier

This writeup should be short (less than one page). You don't need to wax eloquent.

----
====Hints:====

To help you get started, here's a simple script that will load all of the images and calculate labels. It assumes that the face database has been unpacked in the current directory, and that there exists a file called ''list.txt'' that was generated with the following command:

<code bash>
find ./lfw2/ -name \*.jpg > list.txt
</code>

After running this code, the images will be in the ''data'' array, and the labels will be in the ''labels'' array:

<code python>
from PIL import Image
import numpy as np

#
# assumes list.txt is a list of filenames, formatted as
#
# ./lfw2//Aaron_Eckhart/Aaron_Eckhart_0001.jpg
# ./lfw2//Aaron_Guiel/Aaron_Guiel_0001.jpg
# ...
#

files = open('./list.txt').readlines()

data = np.zeros((len(files), 250, 250))
labels = np.zeros((len(files), 1))

# a little hash map mapping subjects to IDs
ids = {}
scnt = 0

# load in all of our images
ind = 0
for fn in files:

    # the subject name is the directory component of the path
    subject = fn.split('/')[3]
    if subject not in ids:
        ids[subject] = scnt
        scnt += 1
    label = ids[subject]

    data[ind, :, :] = np.array(Image.open(fn.rstrip()))
    labels[ind] = label
    ind += 1

# data is (13233, 250, 250)
# labels is (13233, 1)
</code>
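
Finally, a sketch of same/different pair sampling on top of the ''data'' and ''labels'' arrays built above. The grouping step is our own convenience, not part of the provided code:

<code python>
import numpy as np

# group image indices by subject so that same-pairs can be drawn directly
by_subject = {}
for ind in range(labels.shape[0]):
    by_subject.setdefault(int(labels[ind, 0]), []).append(ind)
multi = [inds for inds in by_subject.values() if len(inds) >= 2]

def sample_pair(same):
    # returns (image_a, image_b, label), with label 1.0 for a same-subject pair
    if same:
        inds = multi[np.random.randint(len(multi))]
        i, j = np.random.choice(inds, 2, replace=False)
    else:
        i, j = np.random.randint(labels.shape[0], size=2)
        while labels[i, 0] == labels[j, 0]:
            j = np.random.randint(labels.shape[0])
    return data[i], data[j], float(same)

# e.g., build a half-same / half-different minibatch:
# batch = [sample_pair(k % 2 == 0) for k in range(16)]
</code>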