====Objective:====

To gain experience coding a DNN architecture and learning program end-to-end, and to gain experience with Siamese networks and ResNets.

----

====Deliverable:====

For this lab, you will need to implement a simple face similarity detector.

  - You must implement a siamese network that accepts two input images
  - The network must output the probability that the two images are the same class
  - Your implementation should use a ResNet architecture

You should turn in the following:

  - A tensorboard screenshot showing that your architecture is, indeed, a siamese architecture (see below for one way to write the graph out)
  - Your code
  - A small writeup (1/2 to 1 page) describing your test/training split, your ResNet architecture, and the final performance of your classifier.
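
For the screenshot, it is enough to write your computation graph to a log directory and open it in tensorboard; the shared structure of the two branches should be visible. A minimal sketch, assuming a TF 1.x-style session named ''sess'' (the log directory name is arbitrary):

<code python>
import tensorflow as tf

# write the computation graph to ./tf_logs so tensorboard can render it;
# then run:  tensorboard --logdir ./tf_logs
writer = tf.summary.FileWriter( './tf_logs', graph=sess.graph )
writer.flush()
</code>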
You should use the [[http://www.openu.ac.il/home/hassner/data/lfwa/|Labeled Faces in the Wild-a]] dataset (also available for [[http://liftothers.org/byu/lfwa.tar.gz|download from liftothers]]).
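
If you are working from a terminal, something like the following should fetch and unpack the liftothers copy (assuming ''wget'' is available; adjust paths to taste):

<code bash>
wget http://liftothers.org/byu/lfwa.tar.gz
tar xzf lfwa.tar.gz
</code>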
----

====Grading standards:====

Your notebook will be graded on the following:

  * 35% Correct implementation of Siamese network
  * 35% Correct implementation of ResNet
  * 20% Reasonable effort to find a good-performing topology
  * 10% Results writeup

----

====Description:====

For this lab, you should implement a Siamese network, and train it to recognize whether two faces are the same or different.

No scaffolding code (except for a simple script for loading the images below) will be provided. The goal of this lab is for you to experiment with implementing an idea end-to-end.

The steps for completion of this lab are:

  - Load all of the data. Create a test/training split.
  - Establish a baseline accuracy (i.e., if you randomly predict same/different, what accuracy do you achieve?)
  - Use tensorflow to create your siamese network.
    - Use ResNets to extract features from the images
    - Make sure that parameters are shared across both halves of the network! (see the sketch after this list)
  - Train the network using an optimizer of your choice
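
To make the weight-sharing point concrete, here is a minimal sketch of the siamese skeleton in TF 1.x. The ''resnet_features'' body below is only a stand-in, not a real ResNet; substitute your own stack of residual blocks:

<code python>
import tensorflow as tf

def resnet_features( x ):
    # stand-in feature extractor; replace with your residual blocks
    h = tf.layers.conv2d( x, 32, 3, strides=2, activation=tf.nn.relu, name='conv1' )
    h = tf.layers.flatten( h )
    return tf.layers.dense( h, 128, name='feats' )

img_a = tf.placeholder( tf.float32, [None, 250, 250, 1] )
img_b = tf.placeholder( tf.float32, [None, 250, 250, 1] )

# reusing the same variable scope means both branches are computed
# by one set of parameters; that is what makes the network "siamese"
with tf.variable_scope( 'siamese' ) as scope:
    feats_a = resnet_features( img_a )
    scope.reuse_variables()
    feats_b = resnet_features( img_b )
</code>

If the sharing is working, tensorboard will show a single set of variables feeding both branches, which is exactly the screenshot asked for in the deliverable.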
Note: you will NOT be graded on the accuracy of your final classifier, as long as you make a good faith effort to come up with something that performs reasonably well.

Your ResNet should extract a vector of features from each image. Those feature vectors should then be compared to calculate an "energy"; that energy should then be input into a contrastive loss function, as discussed in class.
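
Continuing the sketch above from ''feats_a'' and ''feats_b'', one common choice (only a suggestion; match whatever form was given in class) is Euclidean distance for the energy and a margin-based contrastive loss:

<code python>
# energy: Euclidean distance between the two feature vectors;
# the small epsilon keeps the sqrt gradient finite at zero distance
energy = tf.sqrt( tf.reduce_sum( tf.square( feats_a - feats_b ), axis=1 ) + 1e-6 )

# "same" is 1.0 for pairs of the same person, 0.0 otherwise
same = tf.placeholder( tf.float32, [None] )

# contrastive loss: pull matching pairs together, push non-matching
# pairs apart until their energy is at least "margin"
margin = 1.0
loss = 0.5 * tf.reduce_mean(
    same * tf.square( energy ) +
    (1.0 - same) * tf.square( tf.maximum( margin - energy, 0.0 ) ) )

train_step = tf.train.AdamOptimizer( 1e-4 ).minimize( loss )
</code>

The margin and learning rate here are arbitrary starting points; both are worth tuning as part of your "good-performing topology" search.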
Note that some people in the database only have one image. These images are still useful, however (why?), so don't just throw them away.

----

====Writeup:====

As discussed in the "Deliverable" section, your writeup must include the following:

  - A description of your test/training split
  - A description of your ResNet architecture (layers, strides, nonlinearities, etc.)
  - How you assessed whether or not your architecture was working
  - The final performance of your classifier

This writeup should be short: between 1/2 and 1 page. You don't need to wax eloquent.

----

====Hints:====

To help you get started, here's a simple script that will load all of the images and calculate labels. It assumes that the face database has been unpacked in the current directory, and that there exists a file called ''list.txt'' that was generated with the following command:

<code bash>
find ./lfw2/ -name \*.jpg > list.txt
</code>

After running this code, the data will be in the ''data'' array, and the labels will be in the ''labels'' array:

<code python>
from PIL import Image
import numpy as np

#
# assumes list.txt is a list of filenames, formatted as
#
# ./lfw2//Aaron_Eckhart/Aaron_Eckhart_0001.jpg
# ./lfw2//Aaron_Guiel/Aaron_Guiel_0001.jpg
# ...
#

files = open( './list.txt' ).readlines()

data = np.zeros(( len(files), 250, 250 ))
labels = np.zeros(( len(files), 1 ))

# a little hash map mapping subjects to IDs
ids = {}
scnt = 0

# load in all of our images
ind = 0
for fn in files:
    subject = fn.split('/')[3]
    if subject not in ids:   # works in both python 2 and 3 (has_key is gone in 3)
        ids[ subject ] = scnt
        scnt += 1
    label = ids[ subject ]

    data[ ind, :, : ] = np.array( Image.open( fn.rstrip() ) )
    labels[ ind ] = label
    ind += 1

# data is (13233, 250, 250)
# labels is (13233, 1)
</code>
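
One way to turn ''data'' and ''labels'' into training batches is to sample random pairs. The ''sample_pairs'' helper below is a hypothetical sketch, not required code; note that uniformly random pairs are almost all "different" (which is also why the baseline from step 2 is so lopsided), so you will likely want to balance same/different pairs in practice. This is also where single-image subjects earn their keep: they can still appear in "different" pairs.

<code python>
import numpy as np

# hypothetical helper: sample a batch of (image_a, image_b, same?) triples
# from the data/labels arrays built above
def sample_pairs( data, labels, batch_size=32 ):
    n = data.shape[0]
    xa = np.zeros(( batch_size, 250, 250, 1 ))
    xb = np.zeros(( batch_size, 250, 250, 1 ))
    same = np.zeros(( batch_size, ))
    for i in range( batch_size ):
        a, b = np.random.randint( 0, n, size=2 )
        xa[ i, :, :, 0 ] = data[ a ]
        xb[ i, :, :, 0 ] = data[ b ]
        same[ i ] = 1.0 if labels[ a, 0 ] == labels[ b, 0 ] else 0.0
    return xa, xb, same
</code>

Each batch can then be fed to the ''img_a'', ''img_b'', and ''same'' placeholders from the sketches above.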