cs501r_f2018:lab8

Differences

This shows you the differences between two versions of the page.

cs501r_f2018:lab8 [2018/10/29 20:40]
wingated
cs501r_f2018:lab8 [2021/06/30 23:42] (current)
Line 90: Line 90:
     for p in disc_model.parameters():
       p.requires_grad = True
+
+    for p in gen_model.parameters():
+      p.requires_grad = False
 
     for n in range(critic_iters):
Line 101: Line 104:
     for p in disc_model.parameters():
       p.requires_grad = False
-      gen_optim.zero_grad()
 
-      # generate noise tensor z
-      # calculate loss for gen
-      # call gloss.backward() and gen_optim.step()
+    for p in gen_model.parameters():
+      p.requires_grad = True
+
+    gen_optim.zero_grad()
+
+    # generate noise tensor z
+    # calculate loss for gen
+    # call gloss.backward() and gen_optim.step()
 
 </code>
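The two hunks above switch `requires_grad` on and off for the discriminator and the generator in each training phase. A minimal sketch of that toggling pattern follows. The helper name `set_requires_grad` and the `StubModel` class are assumptions made for illustration (neither appears in the lab code); the stub stands in for a `torch.nn.Module` so the sketch runs without PyTorch.

```python
from types import SimpleNamespace


def set_requires_grad(model, flag):
    """Hypothetical helper: set requires_grad on every parameter of `model`."""
    for p in model.parameters():
        p.requires_grad = flag


class StubModel:
    """Stand-in for a torch.nn.Module (assumption, for illustration only)."""

    def __init__(self, n_params):
        self._params = [SimpleNamespace(requires_grad=True) for _ in range(n_params)]

    def parameters(self):
        return iter(self._params)


disc_model = StubModel(3)
gen_model = StubModel(2)

# Critic phase: the discriminator is trainable, the generator is frozen.
set_requires_grad(disc_model, True)
set_requires_grad(gen_model, False)
critic_phase = (
    all(p.requires_grad for p in disc_model.parameters()),
    any(p.requires_grad for p in gen_model.parameters()),
)

# Generator phase: the discriminator is frozen, the generator is trainable.
set_requires_grad(disc_model, False)
set_requires_grad(gen_model, True)
gen_phase = (
    any(p.requires_grad for p in disc_model.parameters()),
    all(p.requires_grad for p in gen_model.parameters()),
)
```

With real PyTorch modules the same helper should apply unchanged, since it only relies on `parameters()` and the `requires_grad` attribute.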
Line 120: Line 127:
 ----
 ====Hints and implementation notes:====
 +
 +We have recently tried turning off the batchnorms in both the generator and discriminator,​ and have gotten good results -- you may want to start without them, and only add them if you need them.  Plus, it's faster without the batchnorms.
  
 The reference implementation was trained for 8 hours on a GTX 1070.  It ran for 25 epochs (ie, scan through all 200,000 images), with batches of size 64 (3125 batches / epoch).
  
-Although, it might work with far fewer ie epochs...
+However, we were able to get reasonable (if blurry) faces after training for 2-3 hours.
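
The batch count quoted for the reference run can be checked with a few lines of arithmetic. This is a rough sketch that uses only the figures stated in the notes (200,000 images, batches of 64, 25 epochs, 8 hours); the variable names are chosen here for illustration, and the per-batch time is an average implied by those figures, not a measured value.

```python
num_images = 200_000   # images per epoch, as stated in the notes
batch_size = 64        # batch size of the reference run
epochs = 25
train_hours = 8

# 200,000 / 64 divides evenly, so integer division loses nothing here.
batches_per_epoch = num_images // batch_size
total_batches = batches_per_epoch * epochs

# Average wall-clock time per batch implied by the 8-hour total.
seconds_per_batch = train_hours * 3600 / total_batches

print(batches_per_epoch, total_batches, round(seconds_per_batch, 3))
```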
  
 I didn't try to optimize the hyperparameters; these are the values that I used:
Line 132: Line 141:
 lambda = 10
 ncritic = 1 # 5
-alpha = 0.0002 # 0.0001
+learning_rate = 0.0002 # 0.0001
 batch_size = 200
 
-batch_norm decay=0.9
-batch_norm epsilon=1e-5
+batch_norm_decay=0.9
+batch_norm_epsilon=1e-5
 </code>
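
One way to carry the values listed above into Python code is a small frozen dataclass. This is a sketch, not the reference implementation: the class name `Hyperparams` and the field name `lambda_gp` are assumptions. The rename is needed because `lambda` is a reserved word in Python and cannot be used as a variable name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparams:
    """Hypothetical container for the values listed in the notes."""

    # `lambda` is a Python keyword, so the field needs another name (assumed here).
    lambda_gp: float = 10.0
    ncritic: int = 1
    learning_rate: float = 0.0002
    batch_size: int = 200
    batch_norm_decay: float = 0.9
    batch_norm_epsilon: float = 1e-5


hp = Hyperparams()
print(hp)
```

Freezing the dataclass means a stray assignment such as `hp.batch_size = 64` raises an error instead of silently changing a run's settings.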
  
Line 145: Line 154:
 !wget --load-cookies cookies.txt 'https://docs.google.com/uc?export=download&confirm='"$(wget --save-cookies cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=0B7EVK8r0v71pZjFTYXZWM3FlRnM' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')"'&id=0B7EVK8r0v71pZjFTYXZWM3FlRnM' -O img_align_celeba.zip
 !unzip -q img_align_celeba
+!mkdir test
+!mv img_align_celeba test
 </code>
  
cs501r_f2018/lab8.1540845603.txt.gz · Last modified: 2021/06/30 23:40 (external edit)