====WARNING: THIS LAB SPEC IS UNDER DEVELOPMENT====


====Objective:====

To learn about deconvolutions, variable sharing, trainable variables,
and generative adversarial models.

----
====Deliverable:====

{{ :cs501r_f2016:lab7_gan_results.png?200|}}

For this lab, you will need to implement a generative adversarial
network (GAN).
Specifically, we will be using the technique outlined in the paper [[https://arxiv.org/pdf/1704.00028|Improved Training of Wasserstein GANs]].

You should turn in an IPython notebook that shows two plots.  The first plot should show random samples from the final generator.  The second should show interpolation between two faces, produced by interpolating in ''z'' space.

You must also turn in your code.  Your code does not need to be in the notebook if it's easier to turn it in separately, but please zip your code and notebook together into a single zip file.

An example of my final samples is shown at the right.

**NOTE:** this lab is complex.  Please read through **the entire
spec** before diving in.

----
====Grading standards:====

Your code/image will be graded on the following:

  * 20% Correct implementation of discriminator
  * 20% Correct implementation of generator
  * 20% Correct implementation of loss functions
  * 20% Correct sharing of variables
  * 10% Correct training of subsets of variables
  * 10% Tidy and legible final image

----
====Description:====

This lab will help you develop several new TensorFlow skills, as well
as understand some best practices needed for building large models.
In addition, we'll be able to create networks that generate neat images!

The most important new concepts here are //deconvolutions//,
//variable reuse//, and //trainable variables//.  Deconvolutions are
what we will use to map a ''z'' vector to an image.  Because we'll
want to refer to the discriminator in two different contexts, we'll
want to reuse its variables (instead of creating two different
discriminators!).  And because we'll want to optimize the
discriminator and generator separately, we'll need to be able to train
on subsets of variables.

In the scaffold code, you will find the following:

  - A small set of primitives for creating linear layers, convolution layers, and deconvolution layers.
  - A few placeholders where you should put your models
  - An optimization loop
  - A bit of code to visualize samples from the model

An important part of this lab is reading this code, so please take the
time to thoroughly read and understand what it's doing.

Let's dive in!

----
**Part 0: naming your variables, and training on subsets of variables**

Before filling in any code, we need to think ahead a bit.  We're going
to create a large-ish computation graph that describes everything
about our GAN, including the generator and discriminator.  However,
when we train the discriminator, we'll want to adjust only the
variables involved in the discriminator, and when we train the
generator, we'll want to adjust only the variables involved in the
generator.

How can we accomplish this?  Well, TensorFlow has a handy function
called ''trainable_variables'' that returns a list of all the
trainable variables in your graph.  By itself, this isn't quite
enough -- we still need to distinguish generator variables from
discriminator variables.

Here's how I solved this problem: by naming my variables consistently,
and then creating a list of only discriminator / generator variables.
So, for example, here's how I set up a trainer that optimizes my
discriminator loss function (''d_loss'') by tweaking only
discriminator variables (''d_vars''):

<code python>
    t_vars = tf.trainable_variables()
    d_vars = [var for var in t_vars if 'd_' in var.name]
    d_optim = tf.train.AdamOptimizer( 0.0002, beta1=0.5 ).minimize( d_loss, var_list=d_vars )
</code>

The critical part is that I created the ''var_list'' populated with
only the subset of the variables I needed.

Note that for compatibility with the provided optimization code, you
should name your train steps ''d_optim'' and ''g_optim''.
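
The generator's trainer is symmetric.  Assuming you prefix all of your generator variable names with ''g_'' and call your generator loss function ''g_loss'' (names chosen here for illustration), it might look like this:

<code python>
    g_vars = [var for var in t_vars if 'g_' in var.name]
    g_optim = tf.train.AdamOptimizer( 0.0002, beta1=0.5 ).minimize( g_loss, var_list=g_vars )
</code>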

----
**Part 1: create your placeholders**

What are the inputs to a GAN?  At some point, we'll need to be able to
pass in a ''z'' variable and some real images.  So, you'll only need
two placeholders in the entire computation graph!  If you name them
''z'' and ''true_images'', then your code will be compatible with the
provided optimization loop.
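
For example, something like the following sketch would work (the 100-dimensional ''z'' comes from Part 3, and the flattened 28x28 images from Part 2):

<code python>
    z           = tf.placeholder( tf.float32, [ None, 100 ], name='z' )
    true_images = tf.placeholder( tf.float32, [ None, 784 ], name='true_images' )
</code>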

----
**Part 2: create your discriminator**

To start, complete the ''disc_model'' function.  This is the
discriminator.  Its job is to accept as input a batch of images (call
it ''imgs''), and output a batch of probabilities (where each
probability is the probability of the image being a **real** image).

Your discriminator should have the following layers:
  - ''H0'': A 2d convolution on ''imgs'' with 32 filters, followed by a leaky relu
  - ''H1'': A 2d convolution on ''H0'' with 64 filters, followed by a leaky relu
  - ''H2'': A linear layer from ''H1'' to a 1024-dimensional vector, followed by a leaky relu
  - ''H3'': A linear layer mapping ''H2'' to a single scalar (per image)
  - The final output should be a sigmoid of ''H3''.

The hardest part of creating your discriminator will be getting all of
the dimensions to line up.  Here are a few hints to help you (a sketch of the whole function follows):

  - The images that are passed in will have dimensions of ''[None,784]''.  However, that's not compatible with a convolution!  So, we need to reshape it.  The first line of your function ought to be something like: ''imgs = tf.reshape( imgs, [ batch_size, 28, 28, 1 ] )''.  Note that it's 4-dimensional -- that's important!
  - Similarly, the output of the ''H1'' layer will be a 4-dimensional tensor, but it needs to go through a linear layer to get mapped down to 1024 dimensions.  The easiest way to accomplish this is to reshape ''H1'' to be 2-dimensional, maybe something like: ''h1 = tf.reshape( h1, [ batch_size, -1 ] )''
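
Putting the pieces together, ''disc_model'' might look something like the sketch below.  Note that ''conv2d'', ''linear'', and ''lrelu'' are stand-ins for the layer primitives in the scaffold code -- your scaffold's names and signatures may differ:

<code python>
def disc_model( imgs ):
    # reshape [None,784] -> [batch_size,28,28,1] so the convolutions work
    imgs = tf.reshape( imgs, [ batch_size, 28, 28, 1 ] )
    h0 = lrelu( conv2d( imgs, 32, name='d_h0' ) )   # note the 'd_' prefixes!
    h1 = lrelu( conv2d( h0, 64, name='d_h1' ) )
    h1 = tf.reshape( h1, [ batch_size, -1 ] )       # flatten for the linear layer
    h2 = lrelu( linear( h1, 1024, name='d_h2' ) )
    h3 = linear( h2, 1, name='d_h3' )
    return tf.sigmoid( h3 )
</code>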

----
**Part 3: create your generator**

Now, let's fill in the generator function.  The generator's job is to
accept a batch of ''z'' variables (each of dimension 100), and then
return a batch of images (each image will be 28x28, but for
compatibility with the discriminator, we will reshape it to be a
784-dimensional vector).

Your generator should have the following layers:
  - ''H1'': A linear layer, mapping ''z'' to 128*7*7 features, followed by a relu
  - ''D2'': a deconvolution layer, mapping ''H1'' to a tensor that is ''[batch_size,14,14,128]'', followed by a relu
  - ''D3'': a deconvolution layer, mapping ''D2'' to a tensor that is ''[batch_size,28,28,1]''
  - The final output should be a sigmoid of ''D3''

Note that you should reshape ''D3'' to be ''[batch_size,784]'' for
compatibility with the discriminator.
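
As a sketch (again, ''linear'' and ''deconv2d'' stand in for the scaffold's primitives, and the function name is an illustrative choice):

<code python>
def gen_model( z ):
    h1 = tf.nn.relu( linear( z, 128*7*7, name='g_h1' ) )
    h1 = tf.reshape( h1, [ batch_size, 7, 7, 128 ] )   # un-flatten before deconvolving
    d2 = tf.nn.relu( deconv2d( h1, [ batch_size, 14, 14, 128 ], name='g_d2' ) )
    d3 = deconv2d( d2, [ batch_size, 28, 28, 1 ], name='g_d3' )
    return tf.reshape( tf.sigmoid( d3 ), [ batch_size, 784 ] )
</code>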

----
**Part 4: create your loss functions and training ops**

{{ :cs501r_f2016:lab7_graph.png?200|}}

You should create two loss functions, one for the discriminator, and
one for the generator.  Refer to the slides on GANs for details on the
loss functions.  Note that the slides and the following discussion are
framed in terms of maximizing, but for consistency with my code (and
other labs), you may wish to frame your cost functions in terms of
minimization.

This is possibly the hardest part of the lab, even though the code is
relatively simple.  Here's how we need to wire up all of the pieces:

  - We need to pass the ''z'' variable into the generative model, and call the output ''sample_images''
  - We need to pass some true images into the discriminator, and get back some probabilities.
  - We need to pass some sampled images into the discriminator, and get back some (different) probabilities.
  - We need to construct a loss function for the discriminator that attempts to maximize the log of the output probabilities on the true images plus the log of (1.0 minus the output probabilities on the sampled images); these two halves can be summed together.
  - We need to construct a loss function for the generator that attempts to maximize the log of the output probabilities on the sampled images.
  - For debugging purposes, I highly recommend you create an additional op called ''d_acc'' that calculates classification accuracy on a batch.  This can just check the output probabilities of the discriminator on the real and sampled images, and see if they're greater (or less) than 0.5.
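
Framed as minimization, the wiring might look like this sketch (using the illustrative ''gen_model'' name from the Part 3 sketch, and ignoring, for the moment, the variable sharing needed between the two ''disc_model'' calls -- see below):

<code python>
    sample_images = gen_model( z )

    d_true   = disc_model( true_images )     # probabilities on real images
    d_sample = disc_model( sample_images )   # probabilities on sampled images

    # minimize the negative of the quantities we want to maximize
    d_loss = -tf.reduce_mean( tf.log( d_true ) + tf.log( 1.0 - d_sample ) )
    g_loss = -tf.reduce_mean( tf.log( d_sample ) )

    # debugging op: how often is the discriminator right?
    d_acc = 0.5 * ( tf.reduce_mean( tf.cast( d_true > 0.5, tf.float32 ) ) +
                    tf.reduce_mean( tf.cast( d_sample < 0.5, tf.float32 ) ) )
</code>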

**Here's the tricky part**.  Note that in wiring up our overall model,
we need to use the discriminator twice -- once on real images, and once
on sampled images.  You've already coded up a nice function that
encapsulates the discriminator, but we don't want to just call it
twice -- that would create two copies of all of the variables.

Instead, we need to //share variables// -- the idea is that we want to
call our discriminator function twice to perform the same
classification logic, but use the same variables each time.
TensorFlow has a mechanism
to help with this, which you should [[https://www.tensorflow.org/versions/r0.11/how_tos/variable_scope/index.html|read about here]].

Note that the provided layers already use ''get_variable'', so sharing
variables should be as straightforward as figuring out when to call
the ''reuse_variables'' function!

I highly recommend using Tensorboard to visualize your final
computation graph to make sure you got this right.  Check out my computation graph image on the right -- you can see the two discriminator blocks, and you can see that the same variables are feeding into both of them.
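
For example, one common pattern looks like the sketch below (the scope name is arbitrary, but using ''d_'' keeps the Part 0 name filter working):

<code python>
    with tf.variable_scope( 'd_' ) as scope:
        d_true = disc_model( true_images )
        scope.reuse_variables()                  # second call reuses the same variables
        d_sample = disc_model( sample_images )
</code>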

----
**Part 5: Run it and generate your final image!**

Assuming you've named all of your placeholders and ops properly, you
can use the provided optimization code.  It's set to run for 500
iterations, and print out some debugging information every 10 steps.

Note that the loop takes 3 steps for the generator for every 1 step
taken by the discriminator!  This is to help maintain the "balance of
power" we talked about in class.
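
If you end up writing your own loop instead, the 3:1 schedule might look something like this sketch (''batch_size'', the noise distribution, and the ''get_real_batch'' helper are all illustrative assumptions):

<code python>
    for step in range( 500 ):
        batch_z    = np.random.uniform( -1, 1, [ batch_size, 100 ] )
        batch_imgs = get_real_batch( batch_size )   # hypothetical data helper

        # one discriminator step...
        sess.run( d_optim, feed_dict={ z: batch_z, true_images: batch_imgs } )

        # ...then three generator steps, to maintain the balance of power
        for _ in range( 3 ):
            sess.run( g_optim, feed_dict={ z: batch_z } )
</code>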

Assuming everything has gone well, you should see output something
like this:

<code>
0       1.37 0.71 0.88
10      0.90 0.98 1.00
20      0.69 0.93 1.00
30      0.89 1.14 0.91
40      0.94 1.06 0.86
50      0.77 1.20 0.96
60      0.59 1.55 0.94
70      0.46 1.47 0.97
80      0.58 1.64 0.94
90      0.42 1.64 0.98
100     0.73 1.14 0.87
110     0.74 1.51 0.91
120     0.78 1.35 0.86
130     1.08 1.31 0.71
140     1.39 0.94 0.61
150     0.90 1.24 0.82
160     1.26 1.00 0.66
170     0.90 1.03 0.81
180     1.02 1.04 0.76
...
490     1.25 1.12 0.68
</code>

We can see the struggle between the generator and discriminator
clearly here.  After the iteration number, the first column is the
loss function for the discriminator, the second column is the loss
function for the generator, and the final column is the
discriminator's classification accuracy.

Initially, the discriminator is able to distinguish almost perfectly
between true and fake images, but by the end of training, it's only
running at 68% accuracy.  Not bad!

Note that for your final image, you may need to train longer -- I used
5000 steps instead of 500.

**Hint for debugging**: if you ever see the cost function for the generator climbing higher and higher, it means that the discriminator is too powerful.