cs501r_f2016:tmp

Differences

This shows you the differences between two versions of the page.

cs501r_f2016:tmp [2016/09/24 20:36]
wingated
cs501r_f2016:tmp [2016/09/24 20:38]
wingated
Line 12:
  - You must implement dropout (NOT using the pre-defined Tensorflow layers)
  - You must implement dropconnect
-  - You must experiment with L1/L2 weight regularization
+  - You must experiment with L1 weight regularization

You should turn in an iPython notebook that shows three plots, one for each of the regularization methods.
Line 47:
For the first part of the lab, you should implement dropout.  The paper upon which you should base your implementation is found at:

-[[https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf|Dropout]]
+[[https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf|The dropout paper]]

The relevant equations are found in section 4 (pg 1933).  You may also refer to the slides.
Line 64:

Once you understand dropout, implementing it is not hard; you should only have to add ~10 lines of code.
+
+Also note that because dropout involves some randomness, your curve may not match mine exactly; this is expected.
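
A minimal sketch of one way to do this in TensorFlow (this is the inverted-dropout variant, which rescales at training time instead of rescaling the weights at test time as the paper describes; the names ''h'' and ''keep_prob'' are illustrative, not required by the lab):

<code python>
import tensorflow as tf

keep_prob = tf.placeholder(tf.float32)  # feed e.g. 0.5 while training, 1.0 at test time

def dropout(h, keep_prob):
    # Sample a Bernoulli(keep_prob) mask with the same shape as the activations.
    mask = tf.cast(tf.random_uniform(tf.shape(h)) < keep_prob, h.dtype)
    # Zero out the dropped units and rescale the survivors so the expected
    # activation is unchanged (inverted dropout).
    return (h * mask) / keep_prob
</code>

Feeding ''keep_prob = 1.0'' at test time makes the mask all ones, so no separate test-time rescaling is needed.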
  
**Part 2: implement dropconnect**
Line 77: Line 79:
Dropconnect seems to want more training steps than dropout, so you should run the optimizer for 1500 iterations.
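
Continuing the sketch above, dropconnect applies the same kind of Bernoulli mask to the weight matrix rather than to the layer's output (again, the variable names are hypothetical):

<code python>
def dropconnect(W, keep_prob):
    # Mask individual weights instead of whole activations, rescaling
    # so the expected pre-activation is unchanged.
    mask = tf.cast(tf.random_uniform(tf.shape(W)) < keep_prob, W.dtype)
    return (W * mask) / keep_prob

# A layer then uses the masked weights, e.g.:
# h1 = tf.nn.relu(tf.matmul(x, dropconnect(W1, keep_prob)) + b1)
</code>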
  
-**Part 3: implement L1/L2 regularization**
+**Part 3: implement L1 regularization**

-For this part, you should implement both L1 and L2 regularization on the weights.  This will change your computation graph a bit, and specifically will change your cost function -- instead of optimizing just ''cross_entropy'', you should optimize ''cross_entropy + lam*regularizers'', where ''lam'' is the \lambda regularization parameter from the slides.  You should regularize all of the weights and biases (six variables in total).
+For this part, you should implement L1 regularization on the weights.  This will change your computation graph a bit, and specifically will change your cost function -- instead of optimizing just ''cross_entropy'', you should optimize ''cross_entropy + lam*regularizers'', where ''lam'' is the \lambda regularization parameter from the slides.  You should regularize all of the weights and biases (six variables in total).
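
As a sketch of that cost-function change (the six variable names are hypothetical stand-ins for your network's weights and biases):

<code python>
lam = 0.01  # one of the lambda values you will scan over

# L1 penalty: the sum of absolute values of every weight and bias.
regularizers = tf.add_n([tf.reduce_sum(tf.abs(v))
                         for v in [W1, b1, W2, b2, W3, b3]])

loss = cross_entropy + lam * regularizers
</code>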
  
You should create a plot of test/training performance as you scan across values of lambda.  You should test at least [0.1, 0.01, 0.001].