====Objective:====

To gain experience coding a DNN architecture and learning program end-to-end, and to gain experience with Siamese networks and ResNets.

----

====Deliverable:====

For this lab, you will need to implement a simple face similarity detector.

  - You must implement a Siamese network that accepts two input images
  - The network must output the probability that the two images are the same class
  - Your implementation should use a ResNet architecture

You should turn in the following:

  - A tensorboard screenshot showing that your architecture is, indeed, a Siamese architecture
  - Your code
  - A small writeup (<1/2 page) describing your test/training split, your ResNet architecture, and the final performance of your classifier

You should use the [[http://www.openu.ac.il/home/hassner/data/lfwa/|Labeled Faces in the Wild-a]] dataset (also available for [[http://liftothers.org/byu/lfwa.tar.gz|download from liftothers]]).

----

====Grading standards:====

Your notebook will be graded on the following:

  * 35% Correct implementation of Siamese network
  * 35% Correct implementation of ResNet
  * 20% Reasonable effort to find a good-performing topology
  * 10% Results writeup

----

====Description:====

For this lab, you should implement a Siamese network and train it to recognize whether two faces are the same or different. No scaffolding code is provided, except for a simple image-loading script in the Hints section below. The goal of this lab is for you to experiment with implementing an idea end-to-end.

The steps for completion of this lab are:

  - Load all of the data. Create a test/training split.
  - Establish a baseline accuracy (i.e., if you randomly predict same/different, what accuracy do you achieve?)
  - Use tensorflow to create your Siamese network.
    - Use ResNets to extract features from the images
    - Make sure that parameters are shared across both halves of the network!
  - Train the network using an optimizer of your choice.

Note: you will NOT be graded on the accuracy of your final classifier, as long as you make a good-faith effort to come up with something that performs reasonably well.

Your ResNet should extract a vector of features from each image. Those feature vectors should then be compared to calculate an "energy"; that energy should then be input into a contrastive loss function, as discussed in class. (Sketches of one possible pairing scheme, weight-sharing setup, and contrastive loss appear at the end of the Hints section below.)

Note that some people in the database have only one image. These images are still useful, however (why?), so don't just throw them away.

----

====Writeup:====

As discussed in the "Deliverable" section, your writeup must include the following:

  - A description of your test/training split
  - A description of your ResNet architecture (layers, strides, nonlinearities, etc.)
  - How you assessed whether or not your architecture was working
  - The final performance of your classifier

This writeup should be small (between 1/2 and 1 page). You don't need to wax eloquent.

----

====Hints:====

To help you get started, here's a simple script that will load all of the images and calculate labels. It assumes that the face database has been unpacked in the current directory, and that there exists a file called ''list.txt'' that was generated with the following command:

  find ./lfw2/ -name \*.jpg > list.txt

After running this code, the data will be in the ''data'' tensor, and the labels will be in the ''labels'' tensor:
  from PIL import Image
  import numpy as np

  # assumes list.txt is a list of filenames, formatted as
  #
  # ./lfw2//Aaron_Eckhart/Aaron_Eckhart_0001.jpg
  # ./lfw2//Aaron_Guiel/Aaron_Guiel_0001.jpg
  # ...

  files = open('./list.txt').readlines()

  data = np.zeros((len(files), 250, 250))
  labels = np.zeros((len(files), 1))

  # a little hash map, mapping subject names to integer IDs
  ids = {}
  scnt = 0

  # load in all of our images
  ind = 0
  for fn in files:
      subject = fn.split('/')[3]
      if subject not in ids:  # note: dict.has_key was removed in Python 3
          ids[subject] = scnt
          scnt += 1
      label = ids[subject]

      data[ind, :, :] = np.array(Image.open(fn.rstrip()))
      labels[ind] = label
      ind += 1

  # data is (13233, 250, 250)
  # labels is (13233, 1)
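
If you're unsure how to turn these labels into training examples, here is a minimal sketch of one possible pairing scheme. Everything in it (the ''make_pairs'' name, the 50/50 same/different balance, and the random seed) is an illustrative assumption, not a requirement; you may also want to split train/test by subject rather than by image, so that no identity appears in both sets. Notice that subjects with only a single image can still contribute to "different" pairs:

  import numpy as np

  def make_pairs(labels, n_pairs, seed=0):
      # labels: the (N, 1) integer subject IDs produced by the loader above
      rng = np.random.default_rng(seed)
      labels = labels.ravel().astype(int)

      # group image indices by subject
      by_subject = {}
      for i, lab in enumerate(labels):
          by_subject.setdefault(lab, []).append(i)

      # only subjects with two or more images can form a "same" pair
      multi = [np.array(ixs) for ixs in by_subject.values() if len(ixs) >= 2]

      pairs, ys = [], []
      for _ in range(n_pairs // 2):
          # positive pair: two different images of the same subject
          ixs = multi[rng.integers(len(multi))]
          i, j = rng.choice(ixs, size=2, replace=False)
          pairs.append((i, j)); ys.append(1)

          # negative pair: two different subjects (single-image
          # subjects are still useful here)
          i, j = rng.integers(len(labels), size=2)
          while labels[i] == labels[j]:
              j = rng.integers(len(labels))
          pairs.append((i, j)); ys.append(0)

      return np.array(pairs), np.array(ys, dtype=np.float32)

''pairs[:, 0]'' and ''pairs[:, 1]'' then index into ''data'' to give the two input batches, and ''ys'' gives the same/different labels.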
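
For the parameter-sharing requirement, one simple way to guarantee that both halves of the network share weights is to build a single feature extractor and call that same instance on both inputs. The sketch below uses ''tf.keras''; the filter counts, number of residual blocks, and embedding size are illustrative assumptions only, and a good-faith topology search is still up to you:

  import tensorflow as tf

  def residual_block(x, filters, stride=1):
      # shortcut path; a 1x1 convolution matches shapes when they change
      shortcut = x
      if stride != 1 or x.shape[-1] != filters:
          shortcut = tf.keras.layers.Conv2D(filters, 1, strides=stride, padding='same')(x)

      # main path: two 3x3 convolutions with batch norm and ReLU
      y = tf.keras.layers.Conv2D(filters, 3, strides=stride, padding='same')(x)
      y = tf.keras.layers.BatchNormalization()(y)
      y = tf.keras.layers.ReLU()(y)
      y = tf.keras.layers.Conv2D(filters, 3, padding='same')(y)
      y = tf.keras.layers.BatchNormalization()(y)

      # the residual connection itself
      y = tf.keras.layers.Add()([y, shortcut])
      return tf.keras.layers.ReLU()(y)

  def build_tower(embedding_dim=128):
      # maps one 250x250 grayscale face to a feature vector
      inp = tf.keras.Input(shape=(250, 250, 1))
      x = tf.keras.layers.Conv2D(32, 7, strides=2, padding='same', activation='relu')(inp)
      x = tf.keras.layers.MaxPool2D(3, strides=2)(x)
      x = residual_block(x, 32)
      x = residual_block(x, 64, stride=2)
      x = residual_block(x, 128, stride=2)
      x = tf.keras.layers.GlobalAveragePooling2D()(x)
      return tf.keras.Model(inp, tf.keras.layers.Dense(embedding_dim)(x))

  # weight sharing: build ONE tower and apply it to BOTH inputs
  tower = build_tower()
  img_a = tf.keras.Input(shape=(250, 250, 1))
  img_b = tf.keras.Input(shape=(250, 250, 1))
  feat_a = tower(img_a)
  feat_b = tower(img_b)

Since the loader produces (N, 250, 250) grayscale images, remember to add a trailing channel axis (e.g. ''data[..., np.newaxis]'') before feeding in batches.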
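
Finally, a sketch of one common form of the contrastive loss, using the Euclidean distance between the two feature vectors as the energy. The margin of 1.0 and the epsilon inside the square root are arbitrary choices you should feel free to tune:

  import tensorflow as tf

  def contrastive_loss(y_same, feat_a, feat_b, margin=1.0):
      # energy: Euclidean distance between the two embeddings
      # (the small epsilon keeps the gradient of sqrt finite at zero)
      energy = tf.sqrt(tf.reduce_sum(tf.square(feat_a - feat_b), axis=1) + 1e-9)
      y_same = tf.cast(y_same, tf.float32)  # 1 = same person, 0 = different

      # pull "same" pairs together; push "different" pairs at least
      # `margin` apart
      loss = y_same * tf.square(energy) \
           + (1.0 - y_same) * tf.square(tf.maximum(margin - energy, 0.0))
      return tf.reduce_mean(loss)

To report the required same-class probability, one option is to feed the energy through a small dense layer with a sigmoid; another is to pick an energy threshold on a validation set and report accuracy at that threshold.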