====Objective:====
  
To gain experience coding a DNN architecture and learning program end-to-end, and to gain experience with Siamese networks and ResNets.
  
----
====Deliverable:====
  
For this lab, you will need to implement a simple face similarity detector.
  
  - You must implement a siamese network that accepts two input images
  - The network must output the probability that the two images are the same class
  - Your implementation should use a ResNet architecture
  
You should turn in the following:
  
  - A tensorboard screenshot showing that your architecture is, indeed, a siamese architecture
  - Your code
  - A small writeup (<1/2 page) describing your test/training split, your resnet architecture, and the final performance of your classifier.
  
You should use the [[http://www.openu.ac.il/home/hassner/data/lfwa/|Labeled Faces in the Wild-a]] dataset (also available for [[http://liftothers.org/byu/lfwa.tar.gz|download from liftothers]]).
  
----

Your notebook will be graded on the following:
  
  * 35% Correct implementation of Siamese network
  * 35% Correct implementation of Resnet
  * 20% Reasonable effort to find a good-performing topology
  * 10% Results writeup
  
----
====Description:====
  
For this lab, you should implement a Siamese network, and train it to recognize whether two face images show the same person or different people.
  
No scaffolding code (except for a simple script for loading the images, shown below) will be provided.  The goal of this lab is for you to experiment with implementing an idea end-to-end.
  
The steps for completion of this lab are:

  - Load all of the data.  Create a test/training split.
  - Establish a baseline accuracy (i.e., if you randomly predict same/different, what accuracy do you achieve?)
  - Use tensorflow to create your siamese network.
    - Use ResNets to extract features from the images
    - Make sure that parameters are shared across both halves of the network!
  - Train the network using an optimizer of your choice
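The baseline step can be sketched quickly with numpy.  This sketch assumes a balanced set of same/different pair labels (an assumption for illustration; your sampled pairs may well be skewed toward "different"), under which random guessing should land near 50%:

<code python>
import numpy as np

rng = np.random.default_rng(0)

# hypothetical balanced pair labels: 1 = same person, 0 = different
pair_labels = rng.integers(0, 2, size=10000)

# baseline 1: guess same/different uniformly at random
guesses = rng.integers(0, 2, size=pair_labels.shape)
random_baseline = np.mean(guesses == pair_labels)

# baseline 2: always predict the majority class
p_same = pair_labels.mean()
majority_baseline = max(p_same, 1.0 - p_same)
</code>

If your pair-sampling scheme is unbalanced, the majority-class baseline is the one your trained network actually has to beat.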
  
Note: you will NOT be graded on the accuracy of your final classifier, as long as you make a good faith effort to come up with something that performs reasonably well.
  
Your ResNet should extract a vector of features from each image.  Those feature vectors should then be compared to calculate an "energy"; that energy should then be input into a contrastive loss function, as discussed in class.
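As a rough numpy sketch of the weight-sharing and loss ideas: a single shared weight matrix stands in for the shared ResNet (the linear extractor, the 32-d feature size, and the margin of 1.0 are placeholders, not the architecture you should build), and the loss below is one common form of the contrastive loss:

<code python>
import numpy as np

rng = np.random.default_rng(0)

# one SHARED weight matrix: both halves of the siamese network
# must apply the same parameters to their respective inputs
W = rng.normal(size=(250 * 250, 32)) * 0.01

def features(img):
    # flatten a 250x250 image and project to a 32-d feature vector (ReLU)
    return np.maximum(0.0, img.reshape(-1) @ W)

def contrastive_loss(f1, f2, y, margin=1.0):
    # "energy" = Euclidean distance between the two feature vectors
    energy = np.linalg.norm(f1 - f2)
    # y = 1: same pair, pull features together; y = 0: push apart up to margin
    return y * energy**2 + (1.0 - y) * max(0.0, margin - energy)**2

img = rng.random((250, 250))
f = features(img)
# identical inputs through the shared extractor give zero energy,
# so a "same" pair (y = 1) incurs zero loss
loss_same = contrastive_loss(f, f, 1)
</code>

In your tensorflow version, "sharing" means both towers reference the same variables, not two copies initialized identically.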
  
Note that some people in the database only have one image.  These images are still useful, however (why?), so don't just throw them away.
  
----
====Writeup:====
  
As discussed in the "Deliverable" section, your writeup must include the following:
  
  - A description of your test/training split
  - A description of your resnet architecture (layers, strides, nonlinearities, etc.)
  - How you assessed whether or not your architecture was working
  - The final performance of your classifier
  
This writeup should be small - between 1/2 and 1 page.  You don't need to wax eloquent.
  
----
====Hints:====
  
To help you get started, here's a simple script that will load all of the images and calculate labels.  It assumes that the face database has been unpacked in the current directory, and that there exists a file called ''list.txt'' that was generated with the following command:
  
<code bash>
find ./lfw2/ -name \*.jpg > list.txt
</code>
  
After running this code, the data will be in the ''data'' tensor, and the labels will be in the ''labels'' tensor:
  
<code python>
  
from PIL import Image
import numpy as np
  
# assumes list.txt is a list of filenames, formatted as
#
# ./lfw2//Aaron_Eckhart/Aaron_Eckhart_0001.jpg
# ./lfw2//Aaron_Guiel/Aaron_Guiel_0001.jpg
# ...
#
  
files = open( './list.txt' ).readlines()
  
data = np.zeros(( len(files), 250, 250 ))
labels = np.zeros(( len(files), 1 ))
  
# a little hash map mapping subjects to IDs
ids = {}
scnt = 0
  
# load in all of our images
ind = 0
for fn in files:
  
    subject = fn.split('/')[3]
    if subject not in ids:   # has_key() is Python 2 only
        ids[ subject ] = scnt
        scnt += 1
    label = ids[ subject ]

    data[ ind, :, : ] = np.array( Image.open( fn.rstrip() ) )
    labels[ ind ] = label
    ind += 1
  
# data is (13233, 250, 250)
# labels is (13233, 1)
  
</code>
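Building on the loader, here is a minimal sketch of a random test/training split plus naive pair sampling.  It uses random stand-in labels so it runs without the dataset (the label counts below are placeholders, not LFW-a's real statistics); note how a subject with only one image can still appear in "different" pairs:

<code python>
import numpy as np

rng = np.random.default_rng(0)

n = 1000
labels = rng.integers(0, 300, size=(n, 1))  # stand-in for the loader's labels tensor

# simple random 80/20 test/training split over images
perm = rng.permutation(n)
split = int(0.8 * n)
train_idx, test_idx = perm[:split], perm[split:]

def sample_pair(idx):
    # draw two distinct images; the pair label is 1 iff they share a subject.
    # subjects with a single image still contribute to "different" pairs
    i, j = rng.choice(idx, size=2, replace=False)
    return i, j, int(labels[i, 0] == labels[j, 0])

i, j, same = sample_pair(train_idx)
</code>

A split over raw images lets the same person appear in both sets; splitting by subject instead is stricter and worth discussing in your writeup.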
cs501r_f2016/tmp.1474749533.txt.gz · Last modified: 2021/06/30 23:40 (external edit)