====WARNING: THIS LAB SPEC IS UNDER DEVELOPMENT====

====Objective:====
====Deliverable:====
{{ :cs501r_f2016:lab7_gan_results.png?200|}}
For this lab, you will need to implement a generative adversarial network (GAN). Specifically, we will be using the technique outlined in the paper [[https://arxiv.org/pdf/1704.00028|Improved Training of Wasserstein GANs]].
You should turn in an iPython notebook that shows two plots. The first plot should be random samples from the final generator. The second should show interpolation between two faces by interpolating in ''z'' space.

You must also turn in your code. Your code does not need to be in the notebook if it's easier to turn it in separately, but please zip your code and notebook together in a single zip file.

An example of my final samples is shown at the right.
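The interpolation plot can be generated by linearly blending two latent vectors and pushing each blend through the trained generator. Here is a minimal numpy sketch of just the blending step; the ''z_dim'' value and the function name are illustrative assumptions, and the generator call itself is omitted:

```python
import numpy as np

def interpolate_z(z0, z1, n_steps=8):
    """Linearly interpolate between two latent vectors in z space.

    Returns an array of shape [n_steps, z_dim]; feeding each row
    through the trained generator yields the interpolation plot.
    """
    ts = np.linspace(0.0, 1.0, n_steps)[:, None]  # [n_steps, 1]
    return (1.0 - ts) * z0[None, :] + ts * z1[None, :]

z_dim = 100  # assumed latent dimensionality
rng = np.random.default_rng(0)
z0, z1 = rng.normal(size=z_dim), rng.normal(size=z_dim)
zs = interpolate_z(z0, z1, n_steps=8)  # first row is z0, last row is z1
```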
**NOTE:** this lab is complex. Please read through **the entire lab spec** before you start.
Because we will train the discriminator and generator separately, we'll need to be able to train on subsets of variables.
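The idea of training on a subset of variables can be illustrated with a toy gradient step that updates one parameter group while freezing the other (in TensorFlow 1.x this corresponds to passing a ''var_list'' argument to the optimizer's ''minimize'' call). This numpy sketch, with made-up parameter names, shows only the freezing logic:

```python
import numpy as np

# Toy parameters for a "discriminator" group and a "generator" group.
params = {"d_w": np.array([1.0]), "g_w": np.array([2.0])}
grads  = {"d_w": np.array([0.5]), "g_w": np.array([0.5])}

def sgd_step(params, grads, var_list, lr=0.1):
    """Apply a gradient step only to the variables named in var_list."""
    for name in var_list:
        params[name] = params[name] - lr * grads[name]
    return params

# A "discriminator" training step: g_w has a gradient, but stays frozen.
params = sgd_step(params, grads, var_list=["d_w"])
```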
- | |||
- | This lab is a bit more complex than some of the others, so we are | ||
- | providing some scaffold code: | ||
- | |||
- | [[http://liftothers.org/byu/lab7_scaffold.py|Lab 7 scaffold code]] | ||
In the scaffold code, you will find the following:
  - A small set of primitives for creating linear layers, convolution layers, and deconvolution layers.
  - A few placeholders where you should put your models
  - An optimization loop
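As a rough illustration of what the linear-layer primitive computes (a numpy sketch of the math, not the scaffold's actual TensorFlow code; the function name and initialization scale are assumptions):

```python
import numpy as np

def linear(x, out_dim, rng):
    """y = x W + b: the computation behind a 'linear layer' primitive.

    x: [batch_size, in_dim]; returns [batch_size, out_dim].
    """
    in_dim = x.shape[1]
    W = rng.normal(scale=0.02, size=(in_dim, out_dim))  # weight matrix
    b = np.zeros(out_dim)                               # bias vector
    return x @ W + b

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 784))   # a batch of flattened 28x28 images
y = linear(x, 1024, rng)         # maps each image to 1024 features
```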
Your discriminator should have the following layers:
  - ''H0'': A 2d convolution on ''imgs'' with 32 filters, followed by a leaky relu
  - ''H1'': A 2d convolution on ''H0'' with 64 filters, followed by a leaky relu
  - ''H2'': A linear layer from ''H1'' to a 1024 dimensional vector, followed by a leaky relu
  - ''H3'': A linear layer mapping ''H2'' to a single scalar (per image)
  - The final output should be a sigmoid of ''H3''.
There are a couple of tricky parts to getting the dimensions to line up. Here are a few hints to help you:
  - The images that are passed in will have dimension of ''[None,784]''. However, that's not compatible with a convolution! So, we need to reshape it. The first line of your function ought to be something like: ''imgs = tf.reshape( imgs, [ batch_size, 28, 28, 1 ] )''. Note that it's 4-dimensional - that's important!
  - Similarly, the output of the ''H1'' layer will be a 4 dimensional tensor, but it needs to go through a linear layer to get mapped down to 1024 dimensions. The easiest way to accomplish this is to reshape ''H1'' to be 2-dimensional, maybe something like: ''h1 = tf.reshape( h1, [ batch_size, -1 ] )''
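To check that these reshapes line up, here is a numpy sketch of the discriminator's shape bookkeeping. The stride-2 downsampling in the convolutions is an assumption about the conv primitive, and the convolutions themselves are replaced by shape arithmetic:

```python
import numpy as np

batch_size = 16

def leaky_relu(x, alpha=0.2):
    """Leaky relu: pass positives through, scale negatives by alpha."""
    return np.where(x > 0, x, alpha * x)

x = np.random.default_rng(0).normal(size=(batch_size, 784))

# Reshape flat images into 4d tensors so convolution is possible.
imgs = x.reshape(batch_size, 28, 28, 1)

# H0/H1: assumed stride-2 convolutions halve the spatial dims:
# [16,28,28,1] -> [16,14,14,32] -> [16,7,7,64]
h1 = np.zeros((batch_size, 7, 7, 64))

# Flatten back to 2d before the linear layers H2/H3.
h1_flat = h1.reshape(batch_size, -1)
```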
----
Your generator should have the following layers:
  - ''H1'': A linear layer, mapping ''z'' to 128*7*7 features, followed by a relu
  - ''D2'': a deconvolution layer, mapping ''H1'' to a tensor that is ''[batch_size,14,14,128]'', followed by a relu
  - ''D3'': a deconvolution layer, mapping ''D2'' to a tensor that is ''[batch_size,28,28,1]''
  - The final output should be sigmoid of ''D3''
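The shape progression above can be traced in a numpy sketch. The learned deconvolutions are replaced here by nearest-neighbor upsampling and a channel average purely for illustration; that substitution is an assumption, not the scaffold's actual deconv primitive:

```python
import numpy as np

batch_size, z_dim = 16, 100
rng = np.random.default_rng(0)
z = rng.normal(size=(batch_size, z_dim))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# H1: linear layer to 128*7*7 features, relu, reshaped to be spatial.
h1 = np.maximum(rng.normal(size=(batch_size, 128 * 7 * 7)), 0.0)
h1 = h1.reshape(batch_size, 7, 7, 128)

# D2: stand-in "deconvolution" to [batch_size,14,14,128] via
# nearest-neighbor upsampling, followed by a relu.
d2 = np.maximum(h1.repeat(2, axis=1).repeat(2, axis=2), 0.0)

# D3: upsample again and collapse channels to [batch_size,28,28,1].
d3 = d2.repeat(2, axis=1).repeat(2, axis=2).mean(axis=3, keepdims=True)

# Final output: sigmoid squashes pixel values into (0, 1).
out = sigmoid(d3)
```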
----
**Part 4: create your loss functions and training ops**
+ | |||
+ | {{ :cs501r_f2016:lab7_graph.png?200|}} | ||
You should create two loss functions, one for the discriminator and one for the generator. The wiring is relatively simple. Here's how we need to wire up all of the pieces:
  - We need to pass the ''z'' variable into the generative model, and call the output ''sample_images''
  - We need to pass some true images into the discriminator, and get back some probabilities.
  - We need to pass some sampled images into the discriminator, and get back some (different) probabilities.
  - We need to construct a loss function for the discriminator that attempts to maximize the log of the output probabilities on the true images and the log of 1.0 - the output probabilities on the sampled images; these two halves can be summed together
  - We need to construct a loss function for the generator that attempts to maximize the log of the output probabilities on the sampled images
  - For debugging purposes, I highly recommend you create an additional op called ''d_acc'' that calculates classification accuracy on a batch. This can just check the output probabilities of the discriminator on the real and sampled images, and see if they're greater (or less) than 0.5.
**Here's the tricky part**. Note that in wiring up our overall model, we use the discriminator twice - once on the true images and once on the sampled images - and both uses must share the same variables.
I highly recommend using Tensorboard to visualize your final computation graph to make sure you got this right. Check out my computation graph image on the right - you can see the two discriminator blocks, and you can see that the same variables are feeding into both of them.
----

As it runs, your optimization loop should print some progress statistics. My output looks something like this:
<code>
0 1.37 0.71 0.88
10 0.90 0.98 1.00
...
490 1.25 1.12 0.68
</code>
Note that we see the struggle between the generator and discriminator reflected in these numbers. For your final results, you will probably want to train for longer - say, 5000 steps, instead of 500.
**Hint for debugging**: if you ever see the cost function for the generator going higher and higher, it means that the discriminator is too powerful.