====Objective:====
  
To gain experience coding a DNN architecture and learning program end-to-end, and to gain experience with Siamese networks and ResNets.
  
----
====Deliverable:====
  
For this lab, you will need to implement a simple face similarity detector.
  
  - You must implement a siamese network that accepts two input images
  - The network must output the probability that the two images are the same class
  - Your implementation should use a ResNet architecture
  
You should turn in the following:
  
  - A tensorboard screenshot showing that your architecture is, indeed, a siamese architecture (see the sketch below for one way to export your graph)
  - Your code
  - A small writeup (<1/2 page) describing your test/training split, your resnet architecture, and the final performance of your classifier.
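
If you haven't used tensorboard before, one simple way to produce the screenshot is to write your graph to a log directory with a summary writer and then point tensorboard at that directory.  This is only a minimal sketch, assuming a TensorFlow 1.x-style API; the log directory name is made up:

<code python>
import tensorflow as tf

# ... build your siamese graph first ...

with tf.Session() as sess:
    sess.run( tf.global_variables_initializer() )

    # writing the graph once is enough to inspect the architecture in tensorboard
    writer = tf.summary.FileWriter( './tf_logs', sess.graph )
    writer.close()

# then, from the shell:  tensorboard --logdir=./tf_logs
</code>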
  
You should use the [[http://www.openu.ac.il/home/hassner/data/lfwa/|Labeled Faces in the Wild-a]] dataset (also available for [[http://liftothers.org/byu/lfwa.tar.gz|download from liftothers]]).
  
----
Your notebook will be graded on the following:
  
  * 35% Correct implementation of Siamese network
  * 35% Correct implementation of ResNet
  * 20% Reasonable effort to find a good-performing topology
  * 10% Results writeup
  
----
====Description:====
  
For this lab, you should implement a Siamese network, and train it to recognize whether or not two faces are the same or different.
  
No scaffolding code (except for a simple script for loading the images, below) will be provided.  The goal of this lab is for you to experiment with implementing an idea end-to-end.
  
The steps for completion of this lab are:
  
  - Load all of the data.  Create a test/training split.
  - Establish a baseline accuracy (i.e., if you randomly predict same/different, what accuracy do you achieve?)
  - Use tensorflow to create your siamese network.
    - Use ResNets to extract features from the images
    - Make sure that parameters are shared across both halves of the network!  (A sketch of one way to do this appears after this list.)
  - Train the network using an optimizer of your choice
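
Here is a minimal sketch of one way to share parameters across the two halves of a siamese network, assuming a TensorFlow 1.x-style API.  The tiny two-block ResNet and the layer sizes are placeholders for illustration only; your real feature extractor should be deeper:

<code python>
import tensorflow as tf

def res_block( x, channels, name ):
    # a basic residual block: relu( f(x) + x )
    with tf.variable_scope( name ):
        h = tf.layers.conv2d( x, channels, 3, padding='same', activation=tf.nn.relu, name='conv1' )
        h = tf.layers.conv2d( h, channels, 3, padding='same', name='conv2' )
        return tf.nn.relu( h + x )

def extract_features( images, reuse ):
    # both towers call this function; reuse=True makes the second tower
    # share every variable created by the first
    with tf.variable_scope( 'resnet', reuse=reuse ):
        h = tf.layers.conv2d( images, 16, 3, padding='same', activation=tf.nn.relu, name='conv_in' )
        h = res_block( h, 16, 'block1' )
        h = res_block( h, 16, 'block2' )
        h = tf.reduce_mean( h, axis=[1, 2] )   # global average pooling
        return tf.layers.dense( h, 64, name='features' )

image_a = tf.placeholder( tf.float32, [ None, 250, 250, 1 ] )
image_b = tf.placeholder( tf.float32, [ None, 250, 250, 1 ] )

features_a = extract_features( image_a, reuse=False )
features_b = extract_features( image_b, reuse=True )    # weights shared with the first tower
</code>

If you open the resulting graph in tensorboard, both towers should point at the same set of ''resnet/...'' variables; that is a quick sanity check that the sharing is actually happening.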
  
Note: you will NOT be graded on the accuracy of your final classifier, as long as you make a good faith effort to come up with something that performs reasonably well.
  
Your ResNet should extract a vector of features from each image.  Those feature vectors should then be compared to calculate an "energy"; that energy should then be input into a contrastive loss function, as discussed in class.
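
As an illustration only (not necessarily the exact formulation from class), the energy could be the Euclidean distance between the two feature vectors, and the contrastive loss could penalize large energies for same pairs and small energies for different pairs.  A sketch, with an arbitrary margin value:

<code python>
import tensorflow as tf

def contrastive_loss( features_a, features_b, label, margin=1.0 ):
    # label is 1.0 when the two images show the same person, 0.0 otherwise
    # energy: Euclidean distance between the two feature vectors
    # (the small epsilon keeps the sqrt gradient finite at zero distance)
    energy = tf.sqrt( tf.reduce_sum( tf.square( features_a - features_b ), axis=1 ) + 1e-6 )

    # same pairs are pulled together; different pairs are pushed apart
    # until they are at least 'margin' away from each other
    same_term = label * tf.square( energy )
    diff_term = ( 1.0 - label ) * tf.square( tf.maximum( margin - energy, 0.0 ) )
    return tf.reduce_mean( same_term + diff_term )
</code>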
  
Note that some people in the database only have one image.  These images are still useful, however (why?), so don't just throw them away.
  
----
====Writeup:====

As discussed in the "Deliverable" section, your writeup must include the following:

  - A description of your test/training split
  - A description of your resnet architecture (layers, strides, nonlinearities, etc.)
  - How you assessed whether or not your architecture was working
  - The final performance of your classifier

This writeup should be small - between 1/2 and 1 page.  You don't need to wax eloquent.

----
====Hints:====
  
To help you get started, here's a simple script that will load all of the images and calculate labels.  It assumes that the face database has been unpacked in the current directory, and that there exists a file called ''list.txt'' that was generated with the following command:
  
<code bash>
find ./lfw2/ -name \*.jpg > list.txt
</code>
  
After running this code, the data will be in the ''data'' array, and the labels will be in the ''labels'' array:
  
<code python>
  
from PIL import Image
import numpy as np
  
# assumes list.txt is a list of filenames, formatted as
#
# ./lfw2//Aaron_Eckhart/Aaron_Eckhart_0001.jpg
# ./lfw2//Aaron_Guiel/Aaron_Guiel_0001.jpg
# ...
#
  
files = open( './list.txt' ).readlines()
  
data = np.zeros(( len(files), 250, 250 ))
labels = np.zeros(( len(files), 1 ))
  
# little hash map mapping subjects to IDs
ids = {}
scnt = 0
  
# load in all of our images
ind = 0
for fn in files:
  
    # the subject name is the directory the image lives in
    subject = fn.split('/')[3]
    if subject not in ids:
        ids[ subject ] = scnt
        scnt += 1
    label = ids[ subject ]

    data[ ind, :, : ] = np.array( Image.open( fn.rstrip() ) )
    labels[ ind ] = label
    ind += 1
  
# data is (13233, 250, 250)
# labels is (13233, 1)
  
</code>
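
The script above only gives you individual images and subject IDs, so you will still need to form same/different pairs for training.  Here is one possible way to sample random pairs, continuing from the ''files'', ''data'', and ''labels'' variables above; the 50/50 same/different split and the batch size are arbitrary choices, not requirements:

<code python>
import numpy as np

# group image indices by subject so that "same" pairs can be drawn;
# subjects with only a single image can still show up in "different" pairs
by_subject = {}
for i in range( len( files ) ):
    by_subject.setdefault( int( labels[ i, 0 ] ), [] ).append( i )

multi = [ idxs for idxs in by_subject.values() if len( idxs ) > 1 ]

def sample_pair( same ):
    if same:
        idxs = multi[ np.random.randint( len( multi ) ) ]
        i, j = np.random.choice( idxs, 2, replace=False )
    else:
        i, j = np.random.randint( len( files ), size=2 )
        while labels[ i, 0 ] == labels[ j, 0 ]:
            j = np.random.randint( len( files ) )
    return data[ i ], data[ j ], float( same )

# example: one batch of 16 pairs, half same and half different
batch = [ sample_pair( k % 2 == 0 ) for k in range( 16 ) ]
</code>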