====Objective:====

To gain experience coding a DNN architecture and learning program end-to-end, and to gain experience with Siamese networks and ResNets.
  
----
====Deliverable:====
  
For this lab, you will need to implement a simple face similarity detector.

  - You must implement a Siamese network that accepts two input images
  - The network must output the probability that the two images are the same class
  - Your implementation should use a ResNet architecture
  
You should turn in the following:

  - A TensorBoard screenshot showing that your architecture is, indeed, a Siamese architecture
  - Your code
  - A small writeup (<1/2 page) describing your test/training split, your ResNet architecture, and the final performance of your classifier.
  
You should use the [[http://www.openu.ac.il/home/hassner/data/lfwa/|Labeled Faces in the Wild-a]] dataset (also available for [[http://liftothers.org/byu/lfwa.tar.gz|download from liftothers]]).
  
----

Your notebook will be graded on the following:
  
  * 35% Correct implementation of Siamese network
  * 35% Correct implementation of ResNet
  * 20% Reasonable effort to find a good-performing topology
  * 10% Results writeup
  
----
====Description:====
  
For this lab, you should implement a Siamese network, and train it to recognize whether two faces are the same or different.
  
No scaffolding code (except for a simple script for loading the images, below) will be provided.  The goal of this lab is for you to experiment with implementing an idea end-to-end.
  
The steps for completion of this lab are:
  
  - Load all of the data.  Create a test/training split.
  - Establish a baseline accuracy (i.e., if you randomly predict same/different, what accuracy do you achieve?)
  - Use TensorFlow to create your Siamese network.
    - Use ResNets to extract features from the images
    - Make sure that parameters are shared across both halves of the network!
  - Train the network using an optimizer of your choice.
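The baseline in step 2 can be sanity-checked with a quick simulation.  This is only a sketch: the number of pairs and the 50/50 same/different balance are illustrative assumptions, not part of the assignment.

<code python>
import numpy as np

# Simulate randomly guessing same/different on balanced pairs.
# 10,000 pairs and the 50/50 class balance are illustrative assumptions.
rng = np.random.default_rng(0)
true_labels = rng.integers(0, 2, size=10_000)   # 1 = same person, 0 = different
random_preds = rng.integers(0, 2, size=10_000)  # coin-flip predictions

baseline_accuracy = (random_preds == true_labels).mean()  # ~0.5 on balanced pairs
</code>

If your sampled pairs are not balanced, the number to beat is instead the frequency of the majority class.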
  
Note: you will NOT be graded on the accuracy of your final classifier, as long as you make a good-faith effort to come up with something that performs reasonably well.
  
Your ResNet should extract a vector of features from each image.  Those feature vectors should then be compared to calculate an "energy"; that energy should then be input into a contrastive loss function, as discussed in class.
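As one concrete reading of the energy-plus-contrastive-loss pipeline, here is a minimal NumPy sketch.  The Euclidean-distance energy, the margin value, and the feature vectors are assumptions for illustration; match whatever exact form was given in class.

<code python>
import numpy as np

def contrastive_loss(energy, same, margin=1.0):
    # Pulls same-class pairs together (drives energy toward 0) and pushes
    # different-class pairs apart until the energy exceeds the margin.
    # same: 1 if the two images show the same person, 0 otherwise.
    return same * energy**2 + (1 - same) * np.maximum(0.0, margin - energy)**2

# hypothetical feature vectors from the two shared-weight ResNet halves
f1 = np.array([0.3, 0.9, -0.2])
f2 = np.array([0.4, 0.8, -0.1])
energy = np.linalg.norm(f1 - f2)  # Euclidean distance as the "energy"

loss_if_same = contrastive_loss(energy, 1)
loss_if_diff = contrastive_loss(energy, 0)
</code>

Note the asymmetry: a same-person pair is penalized for any nonzero energy, while a different-person pair is only penalized while its energy is still inside the margin.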
  
Note that some people in the database only have one image.  These images are still useful, however (why?), so don't just throw them away.
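One way to see why single-image subjects still help: they can sit on either side of a "different" pair.  A sketch of pair sampling under that idea (the helper function and its interface are illustrative, not provided scaffolding):

<code python>
import numpy as np

def sample_pair(labels, same, rng):
    # Draw one (i, j) index pair whose ground truth matches `same`.
    # A subject with only one image can never appear in a 'same' pair,
    # but is perfectly usable in a 'different' pair.
    labels = np.asarray(labels).ravel()
    while True:
        i, j = rng.integers(0, len(labels), size=2)
        if i != j and (labels[i] == labels[j]) == same:
            return i, j

rng = np.random.default_rng(0)
toy_labels = [0, 0, 1, 2]   # subjects 1 and 2 each have only a single image
i, j = sample_pair(toy_labels, same=False, rng=rng)
</code>

(Rejection sampling like this is fine for a sketch; for training you would precompute per-subject index lists so sampling stays cheap.)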
  
----
====Writeup:====
  
As discussed in the "Deliverable" section, your writeup must include the following:
  
  - A description of your test/training split
  - A description of your ResNet architecture (layers, strides, nonlinearities, etc.)
  - How you assessed whether or not your architecture was working
  - The final performance of your classifier
  
This writeup should be small, between 1/2 and 1 page.  You don't need to wax eloquent.
  
----
====Hints:====
  
To help you get started, here's a simple script that will load all of the images and calculate labels.  It assumes that the face database has been unpacked in the current directory, and that there exists a file called ''list.txt'' that was generated with the following command:
  
<code bash>
find ./lfw2/ -name \*.jpg > list.txt
</code>
  
After running this code, the data will be in the ''data'' tensor, and the labels will be in the ''labels'' tensor:
  
<code python>
from PIL import Image
import numpy as np

#
# assumes list.txt is a list of filenames, formatted as
#
# ./lfw2//Aaron_Eckhart/Aaron_Eckhart_0001.jpg
# ./lfw2//Aaron_Guiel/Aaron_Guiel_0001.jpg
# ...
#

files = open( './list.txt' ).readlines()

data = np.zeros(( len(files), 250, 250 ))
labels = np.zeros(( len(files), 1 ))

# a little hash map mapping subjects to IDs
ids = {}
scnt = 0

# load in all of our images
ind = 0
for fn in files:

    # the subject name is the next-to-last path component
    subject = fn.split('/')[-2]
    if subject not in ids:
        ids[ subject ] = scnt
        scnt += 1
    label = ids[ subject ]

    data[ ind, :, : ] = np.array( Image.open( fn.rstrip() ) )
    labels[ ind ] = label
    ind += 1

# data is (13233, 250, 250)
# labels is (13233, 1)
</code>
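With ''data'' and ''labels'' loaded, step 1's test/training split can be as simple as a shuffled index split.  A sketch: the 80/20 ratio is an arbitrary choice, and note this splits by image, so the same subject may appear on both sides.

<code python>
import numpy as np

rng = np.random.default_rng(0)
n_images = 13233                   # size of the LFW-a dataset
perm = rng.permutation(n_images)   # shuffle image indices
n_train = int(0.8 * n_images)      # 80/20 split -- an arbitrary choice

train_idx, test_idx = perm[:n_train], perm[n_train:]
# train set: data[train_idx], labels[train_idx]
# test set:  data[test_idx],  labels[test_idx]
</code>

A stricter alternative is to split by subject ID, so that no person appears in both sets; that measures generalization to unseen faces and is worth a sentence in your writeup either way.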
cs501r_f2016/tmp.1474749098.txt.gz · Last modified: 2021/06/30 23:40 (external edit)