====WARNING: THIS LAB SPEC IS UNDER DEVELOPMENT====


====Objective:====
  
  
For this lab, you will need to implement a generative adversarial
network (GAN).  Specifically, we will be using the technique outlined
in the paper [[https://arxiv.org/pdf/1704.00028|Improved Training of Wasserstein GANs]].
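
The heart of that technique is a gradient penalty on the critic
(discriminator).  For reference, here is a minimal sketch of just the
penalty term; the names ''disc'', ''real_imgs'', and ''fake_imgs'' are
assumptions for illustration, not part of the scaffold:

<code python>
import tensorflow as tf

batch_size = 128   # assumed batch size
lam = 10.0         # penalty weight; the paper uses 10

# Interpolate between real and generated images, then penalize the
# critic for having gradient norm away from 1 at those points.
eps = tf.random_uniform([batch_size, 1], 0.0, 1.0)
x_hat = eps * real_imgs + (1.0 - eps) * fake_imgs   # [batch_size, 784]

grads = tf.gradients(disc(x_hat), [x_hat])[0]
grad_norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=1) + 1e-12)
grad_penalty = lam * tf.reduce_mean(tf.square(grad_norm - 1.0))
</code>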
  
You should turn in an iPython notebook that shows two plots.  The
first plot should be random samples from the final generator.  The
second should show interpolation between two faces by interpolating
in ''z'' space.
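
For the interpolation plot, one simple approach is to linearly
interpolate between two random codes and decode each intermediate
point.  A minimal sketch, where ''sess'', ''sample_images'', and
''z'' are the assumed names of your session, generator output, and
latent placeholder:

<code python>
import numpy as np

z_dim = 100   # assumed latent dimensionality
z0 = np.random.uniform(-1, 1, [1, z_dim])
z1 = np.random.uniform(-1, 1, [1, z_dim])

# Ten evenly spaced points on the line segment from z0 to z1.
alphas = np.linspace(0.0, 1.0, 10)
z_batch = np.concatenate([(1 - a) * z0 + a * z1 for a in alphas], axis=0)

# Decode all ten codes in one batch, then plot the resulting images.
imgs = sess.run(sample_images, feed_dict={z: z_batch})
</code>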
  
You must also turn in your code.  Your code does not need to be in
the notebook if it's easier to turn it in separately, but please zip
your code and notebook together in a single zip file.
  
An example of my final samples is shown at the right.
  
**NOTE:** this lab is complex.  Please read through **the entire
discriminator and generator separately, we'll need to be able to
train on subsets of variables.

This lab is a bit more complex than some of the others, so we are
providing some scaffold code:

[[http://liftothers.org/byu/lab7_scaffold.py|Lab 7 scaffold code]]
  
In the scaffold code, you will find the following:
  
  - A small set of primitives for creating linear layers, convolution layers, and deconvolution layers.
  - A few placeholders where you should put your models
  - An optimization loop
  - ''H0'': A 2d convolution on ''imgs'' with 32 filters, followed by a leaky relu
  - ''H1'': A 2d convolution on ''H0'' with 64 filters, followed by a leaky relu
  - ''H2'': A linear layer from ''H1'' to a 1024 dimensional vector, followed by a leaky relu
  - ''H3'': A linear layer mapping ''H2'' to a single scalar (per image)
  - The final output should be a sigmoid of ''H3''.
the dimensions to line up.  Here are a few hints to help you (a
sketch combining them appears after the list):
  
  - The images that are passed in will have dimension of ''[None,784]''.  However, that's not compatible with a convolution!  So, we need to reshape it.  The first line of your function ought to be something like: ''imgs = tf.reshape( imgs, [ batch_size, 28, 28, 1 ] )''.  Note that it's 4-dimensional - that's important!
  - Similarly, the output of the ''H1'' layer will be a 4 dimensional tensor, but it needs to go through a linear layer to get mapped down to 1024 dimensions.  The easiest way to accomplish this is to reshape ''H1'' to be 2-dimensional, maybe something like: ''h1 = tf.reshape( h1, [ batch_size, -1 ] )''
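
Putting the layer list and the hints together, here is a minimal
sketch of the discriminator.  The ''conv2d'' and ''linear'' helpers
stand in for the scaffold's primitives; their signatures here are
assumptions:

<code python>
import tensorflow as tf

def lrelu(x, leak=0.2):
    # Leaky relu, built from primitives available in any TF version.
    return tf.maximum(leak * x, x)

def discriminator(imgs, batch_size=128):
    # Convolutions need a 4d tensor: [batch, height, width, channels].
    imgs = tf.reshape(imgs, [batch_size, 28, 28, 1])

    h0 = lrelu(conv2d(imgs, num_filters=32, name="d_h0"))
    h1 = lrelu(conv2d(h0, num_filters=64, name="d_h1"))

    # Flatten back down to 2d before the linear layers.
    h1 = tf.reshape(h1, [batch_size, -1])
    h2 = lrelu(linear(h1, 1024, name="d_h2"))
    h3 = linear(h2, 1, name="d_h3")   # one scalar per image

    return tf.nn.sigmoid(h3)
</code>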
  
----
Your generator should have the following layers:
  - ''H1'': A linear layer, mapping ''z'' to 128*7*7 features, followed by a relu
  - ''D2'': a deconvolution layer, mapping ''H1'' to a tensor that is ''[batch_size,14,14,128]'', followed by a relu
  - ''D3'': a deconvolution layer, mapping ''D2'' to a tensor that is ''[batch_size,28,28,1]''
  - The final output should be sigmoid of ''D3''
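
A matching sketch of the generator, again assuming scaffold-style
''linear'' and ''deconv2d'' helpers (where ''deconv2d'' takes the
desired output shape, like TensorFlow's ''conv2d_transpose''):

<code python>
import tensorflow as tf

def generator(z, batch_size=128):
    # Map z up to enough features to fill a 7x7x128 tensor.
    h1 = tf.nn.relu(linear(z, 128 * 7 * 7, name="g_h1"))
    h1 = tf.reshape(h1, [batch_size, 7, 7, 128])

    # Each deconvolution doubles the spatial size: 7 -> 14 -> 28.
    d2 = tf.nn.relu(deconv2d(h1, [batch_size, 14, 14, 128], name="g_d2"))
    d3 = deconv2d(d2, [batch_size, 28, 28, 1], name="g_d3")

    return tf.nn.sigmoid(d3)
</code>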
  
----
**Part 4: create your loss functions and training ops**

{{ :cs501r_f2016:lab7_graph.png?200|}}
  
You should create two loss functions, one for the discriminator, and
relatively simple.  Here's how we need to wire up all of the pieces
(a sketch of the losses follows the list):
  
  - We need to pass the ''z'' variable into the generative model, and call the output ''sample_images''
  - We need to pass some true images into the discriminator, and get back some probabilities.
  - We need to pass some sampled images into the discriminator, and get back some (different) probabilities.
  - We need to construct a loss function for the discriminator that attempts to maximize the log of the output probabilities on the true images and the log of 1.0 - the output probabilities on the sampled images; these two halves can be summed together
  - We need to construct a loss function for the generator that attempts to maximize the log of the output probabilities on the sampled images
  - For debugging purposes, I highly recommend you create an additional op called ''d_acc'' that calculates classification accuracy on a batch.  This can just check the output probabilities of the discriminator on the real and sampled images, and see if they're greater (or less) than 0.5.
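
A minimal sketch of items 4-6, assuming ''p_real'' and ''p_fake'' are
the discriminator's output probabilities on the true and sampled
images:

<code python>
import tensorflow as tf

eps = 1e-8   # keeps the logs away from log(0)

# Discriminator: maximize log(p_real) + log(1 - p_fake); we minimize
# the negative of that sum.
d_loss = -tf.reduce_mean(tf.log(p_real + eps) +
                         tf.log(1.0 - p_fake + eps))

# Generator: maximize log(p_fake).
g_loss = -tf.reduce_mean(tf.log(p_fake + eps))

# Debugging op: fraction of the batch the discriminator gets right.
d_acc = 0.5 * (tf.reduce_mean(tf.cast(p_real > 0.5, tf.float32)) +
               tf.reduce_mean(tf.cast(p_fake < 0.5, tf.float32)))
</code>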
  
**Here's the tricky part**.  Note that in wiring up our overall model,
  
I highly recommend using Tensorboard to visualize your final
computation graph to make sure you got this right.  Check out my
computation graph image on the right - you can see the two
discriminator blocks, and you can see that the same variables are
feeding into both of them.
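
One way to train the two subsets of variables separately is to filter
''tf.trainable_variables()'' by name and hand each optimizer its own
''var_list''.  This sketch assumes your variables were created with
''d_''/''g_'' name prefixes, as in the earlier sketches:

<code python>
import tensorflow as tf

# Collect each sub-network's variables by name prefix.
t_vars = tf.trainable_variables()
d_vars = [v for v in t_vars if v.name.startswith("d_")]
g_vars = [v for v in t_vars if v.name.startswith("g_")]

# Each training op only updates its own subset of variables.
d_optim = tf.train.AdamOptimizer(1e-4).minimize(d_loss, var_list=d_vars)
g_optim = tf.train.AdamOptimizer(1e-4).minimize(g_loss, var_list=g_vars)
</code>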
  
----
like this:
  
<code>
0       1.37 0.71 0.88
10      0.90 0.98 1.00
...
490     1.25 1.12 0.68
</code>
  
Note that we see the struggle between the generator and discriminator
5000 steps, instead of 500.
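
The loop that produces output like the above might look something
like the following sketch.  The op names (''d_optim'', ''d_loss'',
etc.) and the ''true_images'' placeholder come from the earlier
sketches, and printing (step, discriminator loss, generator loss,
discriminator accuracy) is just one plausible choice of columns:

<code python>
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/")

for step in range(5000):
    batch_imgs, _ = mnist.train.next_batch(batch_size)
    batch_z = np.random.uniform(-1, 1, [batch_size, z_dim])

    # Alternate: one discriminator update, then one generator update.
    _, dl = sess.run([d_optim, d_loss],
                     feed_dict={true_images: batch_imgs, z: batch_z})
    _, gl, acc = sess.run([g_optim, g_loss, d_acc],
                          feed_dict={true_images: batch_imgs, z: batch_z})

    if step % 10 == 0:
        print("%d\t%.2f %.2f %.2f" % (step, dl, gl, acc))
</code>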
  
**Hint for debugging**: if you ever see the cost function for the
generator going higher and higher, it means that the discriminator is
too powerful.