====Objective:====
  
To gain experience coding a DNN architecture and learning program end-to-end, and to gain experience with Siamese networks and ResNets.
  
----
====Deliverable:====
  
For this lab, you will need to implement a simple face similarity detector.
  
  - You must implement a siamese network that accepts two input images
  - The network must output the probability that the two images are the same class
  - Your implementation should use a ResNet architecture
  
You should turn in the following:
  
  - A tensorboard screenshot showing that your architecture is, indeed, a siamese architecture (see the sketch below for one way to export your graph)
  - Your code
  - A small writeup (<1/2 page) describing your test/training split, your resnet architecture, and the final performance of your classifier.
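
If you haven't used tensorboard before, one simple way to produce the screenshot is to write your graph to a log directory with a summary writer and then point tensorboard at that directory.  This is only a minimal sketch, assuming a TensorFlow 1.x-style API; the log directory name is made up:

<code python>
import tensorflow as tf

# ... build your siamese graph first ...

with tf.Session() as sess:
    sess.run( tf.global_variables_initializer() )

    # writing the graph once is enough to inspect the architecture in tensorboard
    writer = tf.summary.FileWriter( './tf_logs', sess.graph )
    writer.close()

# then, from the shell:  tensorboard --logdir=./tf_logs
</code>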
  
You should use the [[http://www.openu.ac.il/home/hassner/data/lfwa/|Labeled Faces in the Wild-a]] dataset (also available for [[http://liftothers.org/byu/lfwa.tar.gz|download from liftothers]]).
  
----
Your notebook will be graded on the following:
  
  * 35% Correct implementation of Siamese network
  * 35% Correct implementation of ResNet
  * 20% Reasonable effort to find a good-performing topology
  * 10% Results writeup
  
----
====Description:====
  
For this lab, you should implement a Siamese network, and train it to recognize whether or not two faces are the same or different.
  
No scaffolding code (except for a simple script for loading the images, below) will be provided.  The goal of this lab is for you to experiment with implementing an idea end-to-end.
  
The steps for completion of this lab are:
  
  - Load all of the data.  Create a test/training split.
  - Establish a baseline accuracy (i.e., if you randomly predict same/different, what accuracy do you achieve?)
  - Use tensorflow to create your siamese network.
    - Use ResNets to extract features from the images
    - Make sure that parameters are shared across both halves of the network!  (A sketch of one way to do this appears after this list.)
  - Train the network using an optimizer of your choice
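
Here is a minimal sketch of one way to share parameters across the two halves of a siamese network, assuming a TensorFlow 1.x-style API.  The tiny two-block ResNet and the layer sizes are placeholders for illustration only; your real feature extractor should be deeper:

<code python>
import tensorflow as tf

def res_block( x, channels, name ):
    # a basic residual block: relu( f(x) + x )
    with tf.variable_scope( name ):
        h = tf.layers.conv2d( x, channels, 3, padding='same', activation=tf.nn.relu, name='conv1' )
        h = tf.layers.conv2d( h, channels, 3, padding='same', name='conv2' )
        return tf.nn.relu( h + x )

def extract_features( images, reuse ):
    # both towers call this function; reuse=True makes the second tower
    # share every variable created by the first
    with tf.variable_scope( 'resnet', reuse=reuse ):
        h = tf.layers.conv2d( images, 16, 3, padding='same', activation=tf.nn.relu, name='conv_in' )
        h = res_block( h, 16, 'block1' )
        h = res_block( h, 16, 'block2' )
        h = tf.reduce_mean( h, axis=[1, 2] )   # global average pooling
        return tf.layers.dense( h, 64, name='features' )

image_a = tf.placeholder( tf.float32, [ None, 250, 250, 1 ] )
image_b = tf.placeholder( tf.float32, [ None, 250, 250, 1 ] )

features_a = extract_features( image_a, reuse=False )
features_b = extract_features( image_b, reuse=True )    # weights shared with the first tower
</code>

If you open the resulting graph in tensorboard, both towers should point at the same set of ''resnet/...'' variables; that is a quick sanity check that the sharing is actually happening.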
  
Note: you will NOT be graded on the accuracy of your final classifier, as long as you make a good faith effort to come up with something that performs reasonably well.
  
Your ResNet should extract a vector of features from each image.  Those feature vectors should then be compared to calculate an "energy"; that energy should then be input into a contrastive loss function, as discussed in class.
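
As an illustration only (not necessarily the exact formulation from class), the energy could be the Euclidean distance between the two feature vectors, and the contrastive loss could penalize large energies for same pairs and small energies for different pairs.  A sketch, with an arbitrary margin value:

<code python>
import tensorflow as tf

def contrastive_loss( features_a, features_b, label, margin=1.0 ):
    # label is 1.0 when the two images show the same person, 0.0 otherwise
    # energy: Euclidean distance between the two feature vectors
    # (the small epsilon keeps the sqrt gradient finite at zero distance)
    energy = tf.sqrt( tf.reduce_sum( tf.square( features_a - features_b ), axis=1 ) + 1e-6 )

    # same pairs are pulled together; different pairs are pushed apart
    # until they are at least 'margin' away from each other
    same_term = label * tf.square( energy )
    diff_term = ( 1.0 - label ) * tf.square( tf.maximum( margin - energy, 0.0 ) )
    return tf.reduce_mean( same_term + diff_term )
</code>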
  
Note that some people in the database only have one image.  These images are still useful, however (why?), so don't just throw them away.
  
----
====Writeup:====

As discussed in the "Deliverable" section, your writeup must include the following:

  - A description of your test/training split
  - A description of your resnet architecture (layers, strides, nonlinearities, etc.)
  - How you assessed whether or not your architecture was working
  - The final performance of your classifier

This writeup should be small - between 1/2 and 1 page.  You don't need to wax eloquent.

----
====Hints:====
  
To help you get started, here's a simple script that will load all of the images and calculate labels.  It assumes that the face database has been unpacked in the current directory, and that there exists a file called ''list.txt'' that was generated with the following command:
  
<code bash>
find ./lfw2/ -name \*.jpg > list.txt
</code>
  
After running this code, the data will be in the ''data'' array, and the labels will be in the ''labels'' array:
  
<code python>
  
from PIL import Image
import numpy as np
  
# assumes list.txt is a list of filenames, formatted as
#
# ./lfw2//Aaron_Eckhart/Aaron_Eckhart_0001.jpg
# ./lfw2//Aaron_Guiel/Aaron_Guiel_0001.jpg
# ...
#
  
files = open( './list.txt' ).readlines()
  
data = np.zeros(( len(files), 250, 250 ))
labels = np.zeros(( len(files), 1 ))
  
# little hash map mapping subjects to IDs
ids = {}
scnt = 0
  
# load in all of our images
ind = 0
for fn in files:
  
    # the subject name is the directory the image lives in
    subject = fn.split('/')[3]
    if subject not in ids:
        ids[ subject ] = scnt
        scnt += 1
    label = ids[ subject ]

    data[ ind, :, : ] = np.array( Image.open( fn.rstrip() ) )
    labels[ ind ] = label
    ind += 1
  
# data is (13233, 250, 250)
# labels is (13233, 1)
  
</code>
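
The script above only gives you individual images and subject IDs, so you will still need to form same/different pairs for training.  Here is one possible way to sample random pairs, continuing from the ''files'', ''data'', and ''labels'' variables above; the 50/50 same/different split and the batch size are arbitrary choices, not requirements:

<code python>
import numpy as np

# group image indices by subject so that "same" pairs can be drawn;
# subjects with only a single image can still show up in "different" pairs
by_subject = {}
for i in range( len( files ) ):
    by_subject.setdefault( int( labels[ i, 0 ] ), [] ).append( i )

multi = [ idxs for idxs in by_subject.values() if len( idxs ) > 1 ]

def sample_pair( same ):
    if same:
        idxs = multi[ np.random.randint( len( multi ) ) ]
        i, j = np.random.choice( idxs, 2, replace=False )
    else:
        i, j = np.random.randint( len( files ), size=2 )
        while labels[ i, 0 ] == labels[ j, 0 ]:
            j = np.random.randint( len( files ) )
    return data[ i ], data[ j ], float( same )

# example: one batch of 16 pairs, half same and half different
batch = [ sample_pair( k % 2 == 0 ) for k in range( 16 ) ]
</code>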