====WARNING THIS LAB SPEC IS UNDER DEVELOPMENT:====

====Objective:====
----
====Deliverable:====

{{ :cs501r_f2016:lab7_gan_results.png?200|}}
For this lab, you will need to implement a generative adversarial
network (GAN).
Specifically, we will be using the technique outlined in the paper [[https://arxiv.org/pdf/1704.00028|Improved Training of Wasserstein GANs]].

You should turn in an iPython notebook that shows two plots. The first plot should contain random samples from the final generator. The second should show interpolation between two faces by interpolating in ''z'' space.

You must also turn in your code. It does not need to be in the notebook if it's easier to turn it in separately, but please zip your code and notebook together in a single zip file.

An example of my final samples is shown at the right.
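
For the interpolation plot, here is a minimal sketch of one way to generate the interpolated samples. It assumes a trained generator wired up as a ''sample_images'' op (as in Part 4) fed by a placeholder ''z_placeholder'', plus ''z_dim'', ''batch_size'', and a live session ''sess''; all of those names are mine, not the scaffold's.

<code>
import numpy as np

# Pick two random codes and walk between them in z space.
z0 = np.random.uniform(-1, 1, size=(z_dim,))
z1 = np.random.uniform(-1, 1, size=(z_dim,))
alphas = np.linspace(0.0, 1.0, batch_size)
zs = np.stack([(1.0 - a) * z0 + a * z1 for a in alphas])

# Decode every interpolated code; imgs will hold one image per row of zs.
imgs = sess.run(sample_images, feed_dict={z_placeholder: zs})
</code>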
**NOTE:** this lab is complex. Please read through **the entire
discriminator and generator separately, we'll need to be able to train
on subsets of variables.

In the scaffold code, you will find the following:
- A small set of primitives for creating linear layers, convolution layers, and deconvolution layers.
- A few placeholders where you should put your models
- An optimization loop
- ''H0'': A 2d convolution on ''imgs'' with 32 filters, followed by a leaky relu
- ''H1'': A 2d convolution on ''H0'' with 64 filters, followed by a leaky relu
- ''H2'': A linear layer from ''H1'' to a 1024 dimensional vector, followed by a leaky relu
- ''H3'': A linear layer mapping ''H2'' to a single scalar (per image)
- The final output should be a sigmoid of ''H3''.
the dimensions to line up. Here are a few hints to help you, followed by a sketch of one possible implementation:
- The images that are passed in will have dimensions of ''[None,784]''. However, that's not compatible with a convolution! So, we need to reshape it. The first line of your function ought to be something like: ''imgs = tf.reshape( imgs, [ batch_size, 28, 28, 1 ] )''. Note that it's 4-dimensional - that's important!
- Similarly, the output of the ''H1'' layer will be a 4-dimensional tensor, but it needs to go through a linear layer to get mapped down to 1024 dimensions. The easiest way to accomplish this is to reshape ''H1'' to be 2-dimensional, maybe something like: ''h1 = tf.reshape( h1, [ batch_size, -1 ] )''
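
Putting the layer list and the hints together, here is a minimal sketch of one possible discriminator, using plain ''tf.layers'' calls rather than the scaffold's primitives (whose names may differ). The 5x5 kernels and stride of 2 are my assumptions; the spec only fixes the filter counts. The ''reuse'' flag anticipates the variable-sharing issue discussed in Part 4.

<code>
import tensorflow as tf

def discriminator(imgs, batch_size, reuse=False):
    # All discriminator variables live under the 'd' scope so we can
    # reuse them on the second call and train them separately later.
    with tf.variable_scope('d', reuse=reuse):
        # Convolutions need a 4-dimensional tensor, so reshape first.
        imgs = tf.reshape(imgs, [batch_size, 28, 28, 1])

        # H0: 2d convolution with 32 filters, leaky relu.
        h0 = tf.nn.leaky_relu(tf.layers.conv2d(imgs, 32, 5, strides=2, padding='same', name='h0'))

        # H1: 2d convolution with 64 filters, leaky relu.
        h1 = tf.nn.leaky_relu(tf.layers.conv2d(h0, 64, 5, strides=2, padding='same', name='h1'))

        # Flatten back down to 2 dimensions before the linear layers.
        h1 = tf.reshape(h1, [batch_size, -1])

        # H2: linear layer to a 1024-dimensional vector, leaky relu.
        h2 = tf.nn.leaky_relu(tf.layers.dense(h1, 1024, name='h2'))

        # H3: linear layer to a single scalar per image.
        h3 = tf.layers.dense(h2, 1, name='h3')

        # Final output: sigmoid of H3.
        return tf.sigmoid(h3)
</code>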
----
Your generator should have the following layers (a sketch follows the list):
- ''H1'': A linear layer, mapping ''z'' to 128*7*7 features, followed by a relu
- ''D2'': a deconvolution layer, mapping ''H1'' to a tensor that is ''[batch_size,14,14,128]'', followed by a relu
- ''D3'': a deconvolution layer, mapping ''D2'' to a tensor that is ''[batch_size,28,28,1]''
- The final output should be sigmoid of ''D3''
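
As with the discriminator, here is a minimal sketch of one possible generator under the same assumptions (''tf.layers'' instead of the scaffold's primitives; the 5x5 kernels and stride of 2 are my choices):

<code>
import tensorflow as tf

def generator(z, batch_size):
    # Keep generator variables under the 'g' scope so we can train
    # them separately from the discriminator's.
    with tf.variable_scope('g'):
        # H1: linear layer mapping z to 128*7*7 features, relu.
        h1 = tf.nn.relu(tf.layers.dense(z, 128 * 7 * 7, name='h1'))

        # Reshape to a 4-dimensional tensor so we can deconvolve it.
        h1 = tf.reshape(h1, [batch_size, 7, 7, 128])

        # D2: deconvolution up to [batch_size, 14, 14, 128], relu.
        d2 = tf.nn.relu(tf.layers.conv2d_transpose(h1, 128, 5, strides=2, padding='same', name='d2'))

        # D3: deconvolution up to [batch_size, 28, 28, 1].
        d3 = tf.layers.conv2d_transpose(d2, 1, 5, strides=2, padding='same', name='d3')

        # Final output: sigmoid of D3.
        return tf.sigmoid(d3)
</code>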
----
**Part 4: create your loss functions and training ops**

{{ :cs501r_f2016:lab7_graph.png?200|}}
You should create two loss functions, one for the discriminator, and
relatively simple. Here's how we need to wire up all of the pieces; a sketch follows the list:
- We need to pass the ''z'' variable into the generative model, and call the output ''sample_images''
- We need to pass some true images into the discriminator, and get back some probabilities.
- We need to pass some sampled images into the discriminator, and get back some (different) probabilities.
- We need to construct a loss function for the discriminator that attempts to maximize the log of the output probabilities on the true images plus the log of 1.0 minus the output probabilities on the sampled images; these two halves can be summed together.
- We need to construct a loss function for the generator that attempts to maximize the log of the output probabilities on the sampled images.
- For debugging purposes, I highly recommend you create an additional op called ''d_acc'' that calculates classification accuracy on a batch. This can just check whether the output probabilities of the discriminator on the real and sampled images are greater (or less) than 0.5.
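
Continuing the sketches above, here is one way this wiring could look. ''true_images'' and ''z'' stand in for whatever placeholders the scaffold provides, and note that this is the standard GAN loss the list describes, not the Wasserstein variant mentioned in the deliverable:

<code>
# Pass z through the generator; call the output sample_images.
sample_images = generator(z, batch_size)

# Probabilities on true images, then on samples. The second call must
# REUSE the first call's variables - see "the tricky part" below.
p_real = discriminator(true_images, batch_size)
p_fake = discriminator(sample_images, batch_size, reuse=True)

# Discriminator: maximize log(p_real) + log(1 - p_fake);
# equivalently, minimize the negative of that sum.
d_loss = -tf.reduce_mean(tf.log(p_real) + tf.log(1.0 - p_fake))

# Generator: maximize log(p_fake); again, we minimize the negative.
g_loss = -tf.reduce_mean(tf.log(p_fake))

# Debugging op: fraction of the batch the discriminator gets right.
d_acc = 0.5 * (tf.reduce_mean(tf.cast(p_real > 0.5, tf.float32)) +
               tf.reduce_mean(tf.cast(p_fake < 0.5, tf.float32)))
</code>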
**Here's the tricky part**. Note that in wiring up our overall model,
I highly recommend using TensorBoard to visualize your final
computation graph to make sure you got this right. Check out my computation graph image on the right - you can see the two discriminator blocks, and you can see that the same variables are feeding into both of them.
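
To train on subsets of variables, as mentioned earlier, one option is to hand each optimizer only its own network's variables via ''var_list''. This sketch assumes the ''d''/''g'' scoping used in the sketches above; the optimizer and learning rate are my choices, not the spec's:

<code>
# Collect each network's variables by scope name.
d_vars = [v for v in tf.trainable_variables() if v.name.startswith('d/')]
g_vars = [v for v in tf.trainable_variables() if v.name.startswith('g/')]

# Each training op only updates its own network's variables.
d_train_op = tf.train.AdamOptimizer(1e-4).minimize(d_loss, var_list=d_vars)
g_train_op = tf.train.AdamOptimizer(1e-4).minimize(g_loss, var_list=g_vars)
</code>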
----
like this:
<code>
0 1.37 0.71 0.88
10 0.90 0.98 1.00
...
490 1.25 1.12 0.68
</code>
Note that we see the struggle between the generator and discriminator
5000 steps, instead of 500.
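
For reference, here is a minimal sketch of an alternating training loop that could produce output like the log above. The data-feeding call is hypothetical, and the column order (step, discriminator loss, generator loss, discriminator accuracy) is my assumption:

<code>
import numpy as np

for step in range(500):
    zs = np.random.uniform(-1, 1, size=(batch_size, z_dim))
    batch = get_training_batch(batch_size)  # hypothetical data feed

    # One discriminator step, then one generator step.
    _, dl, da = sess.run([d_train_op, d_loss, d_acc],
                         feed_dict={true_images: batch, z: zs})
    _, gl = sess.run([g_train_op, g_loss], feed_dict={z: zs})

    if step % 10 == 0:
        print(step, dl, gl, da)
</code>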
**Hint for debugging**: if you ever see the cost function for the generator going higher and higher, it means that the discriminator is too powerful.