cs501r_f2018:lab8 — revision 2018/10/31 17:54 (shreeya)
  * A fully connected layer
  * 4 convolution transposed layers, each followed by a batch norm layer and relu (except for the final layer)
  * Followed by a sigmoid (the true images have values between 0 and 1, so your generated images should too)
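A minimal sketch of one such generator for 64x64 images. The channel counts, kernel sizes, and ''z_dim'' here are illustrative assumptions, not the reference implementation:

<code python>
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100, ch=64):
        super(Generator, self).__init__()
        self.ch = ch
        # fully connected layer projects the noise vector to a 4x4 feature map
        self.fc = nn.Linear(z_dim, ch * 8 * 4 * 4)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(ch * 8, ch * 4, 4, stride=2, padding=1),  # 4x4 -> 8x8
            nn.BatchNorm2d(ch * 4),
            nn.ReLU(),
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, stride=2, padding=1),  # 8x8 -> 16x16
            nn.BatchNorm2d(ch * 2),
            nn.ReLU(),
            nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1),      # 16x16 -> 32x32
            nn.BatchNorm2d(ch),
            nn.ReLU(),
            nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1),           # 32x32 -> 64x64
            nn.Sigmoid())  # final layer: no batch norm / relu, just the sigmoid

    def forward(self, z):
        x = self.fc(z).view(z.size(0), self.ch * 8, 4, 4)
        return self.net(x)
</code>

Each transposed convolution with kernel 4, stride 2, padding 1 doubles the spatial size, so four of them take the 4x4 projection up to 64x64.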
==Part 1: Implement a discriminator network==
Again, you are encouraged to use either a DCGAN-like architecture, or a ResNet.

Our reference implementation used 4 convolution layers, each followed by a batch norm layer and leaky relu (leak 0.2). Note: no batch norm on the first layer.

Note that the discriminator simply outputs a single scalar value. This value should be unconstrained (ie, it can be positive or negative), so you should **not** use a relu/sigmoid on the output of your network.
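A sketch of a critic along these lines, assuming 64x64 RGB inputs (the channel counts and kernel sizes are illustrative, not the reference implementation):

<code python>
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, ch=64):
        super(Discriminator, self).__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 4, stride=2, padding=1),           # 64x64 -> 32x32
            nn.LeakyReLU(0.2),                                  # no batch norm on the first layer
            nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1),      # 32x32 -> 16x16
            nn.BatchNorm2d(ch * 2),
            nn.LeakyReLU(0.2),
            nn.Conv2d(ch * 2, ch * 4, 4, stride=2, padding=1),  # 16x16 -> 8x8
            nn.BatchNorm2d(ch * 4),
            nn.LeakyReLU(0.2),
            nn.Conv2d(ch * 4, 1, 8))  # 8x8 -> one unconstrained scalar, no activation

    def forward(self, x):
        return self.net(x).view(-1)
</code>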
  * **Reuse of variables.** Remember that because the discriminator is called multiple times, you must ensure that you do not create new copies of the variables. Use ''requires_grad = True'' for the parameters of the discriminator. An easy way to do this is to iterate through the discriminator model's parameters and set ''param.requires_grad = True''
  * **Trainable variables.** In the algorithm, two different Adam optimizers are created, one for the generator and one for the discriminator. You must make sure that each optimizer is only training the proper subset of variables!
+ | |||
+ | <code python> | ||
+ | #initialize your generator and discriminator models | ||
+ | |||
+ | #initialize separate optimizer for both gen and disc | ||
+ | |||
+ | #initialize your dataset and dataloader | ||
+ | |||
+ | for e in epochs: | ||
+ | for true_img in trainloader: | ||
+ | | ||
+ | #train discriminator# | ||
+ | | ||
+ | #because you want to be able to backprop through the params in discriminator | ||
+ | for p in disc_model.parameters(): | ||
+ | p.requires_grad = True | ||
+ | | ||
+ | for p in gen_model.parameters(): | ||
+ | p.requires_grad = False | ||
+ | | ||
+ | for n in range(critic_iters): | ||
+ | disc_optim.zero_grad() | ||
+ | | ||
+ | # generate noise tensor z | ||
+ | # calculate disc loss: you will need autograd.grad | ||
+ | # call dloss.backward() and disc_optim.step() | ||
+ | | ||
+ | #train generator# | ||
+ | for p in disc_model.parameters(): | ||
+ | p.requires_grad = False | ||
+ | | ||
+ | for p in gen_model.parameters(): | ||
+ | p.requires_grad = True | ||
+ | | ||
+ | gen_optim.zero_grad() | ||
+ | | ||
+ | # generate noise tensor z | ||
+ | # calculate loss for gen | ||
+ | # call gloss.backward() and gen_optim.step() | ||
+ | | ||
+ | </code> | ||
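The ''autograd.grad'' call in the discriminator step is what computes the WGAN-GP gradient penalty. A sketch of that term, where ''disc'' is any critic returning one scalar per image (the interpolation and penalty weight follow the WGAN-GP algorithm; the function name is ours):

<code python>
import torch
from torch import autograd

def gradient_penalty(disc, real, fake, lam=10.0):
    # x_hat: random per-image interpolation between real and generated batches
    eps = torch.rand(real.size(0), 1, 1, 1)
    x_hat = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    out = disc(x_hat)
    # gradient of the critic's output w.r.t. the interpolated images;
    # create_graph=True so the penalty itself can be backpropped through
    grads = autograd.grad(outputs=out, inputs=x_hat,
                          grad_outputs=torch.ones_like(out),
                          create_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lam * ((grad_norm - 1.0) ** 2).mean()
</code>

The full discriminator loss is then the critic's score on fakes minus its score on reals, plus this penalty.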
----
====Hints and implementation notes:====
+ | |||
+ | We have recently tried turning off the batchnorms in both the generator and discriminator, and have gotten good results -- you may want to start without them, and only add them if you need them. Plus, it's faster without the batchnorms. | ||
The reference implementation was trained for 8 hours on a GTX 1070. It ran for 25 epochs (ie, 25 scans through all 200,000 images), with batches of size 64 (3125 batches / epoch).

However, we were able to get reasonable (if blurry) faces after training for 2-3 hours.

I didn't try to optimize the hyperparameters; these are the values that I used:
lambda = 10
ncritic = 1  # 5
learning_rate = 0.0002  # 0.0001
batch_size = 200
batch_norm_decay = 0.9
batch_norm_epsilon = 1e-5
</code>
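To keep each Adam optimizer training only the proper subset of variables (the **Trainable variables** note above), construct each one from only its own model's parameters. A sketch using the learning rate above; the tiny ''nn.Linear'' stand-ins are just placeholders for your real networks:

<code python>
import torch
import torch.nn as nn

# hypothetical stand-ins for your real generator / discriminator
gen_model = nn.Linear(100, 10)
disc_model = nn.Linear(10, 1)

# each optimizer sees only its own network's parameters
gen_optim = torch.optim.Adam(gen_model.parameters(), lr=0.0002)
disc_optim = torch.optim.Adam(disc_model.parameters(), lr=0.0002)
</code>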
!wget --load-cookies cookies.txt 'https://docs.google.com/uc?export=download&confirm='"$(wget --save-cookies cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=0B7EVK8r0v71pZjFTYXZWM3FlRnM' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')"'&id=0B7EVK8r0v71pZjFTYXZWM3FlRnM' -O img_align_celeba.zip
!unzip -q img_align_celeba
!mkdir test
!mv img_align_celeba test
</code>
+ | |||
+ | And using the data in a dataset class: | ||
<code python>
import os
import torchvision
from torch.utils.data import Dataset
from torchvision import transforms

class CelebaDataset(Dataset):
    def __init__(self, root, size=128, train=True):
        super(CelebaDataset, self).__init__()
        self.dataset_folder = torchvision.datasets.ImageFolder(
            os.path.join(root),
            transform=transforms.Compose([
                transforms.Resize((size, size)),
                transforms.ToTensor()]))

    def __getitem__(self, index):
        # ImageFolder returns an (image, label) pair; we only need the image
        img = self.dataset_folder[index]
        return img[0]

    def __len__(self):
        return len(self.dataset_folder)
</code>