Differences

This shows you the differences between two versions of the page.

--- cs501r_f2016:tmp [2016/09/24 20:36] wingated
+++ cs501r_f2016:tmp [2016/09/24 20:38] wingated
@@ Line 12: / Line 12: @@
   - You must implement dropout (NOT using the pre-defined Tensorflow layers)
   - You must implement dropconnect
-  - You must experiment with L1/L2 weight regularization
+  - You must experiment with L1 weight regularization
 You should turn in an IPython notebook that shows three plots, one for each of the regularization methods.
@@ Line 47: / Line 47: @@
 For the first part of the lab, you should implement dropout.  The paper upon which you should base your implementation is found at:
-[[https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf|Dropout]]
+[[https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf|The dropout paper]]
 The relevant equations are found in section 4 (pg 1933).  You may also refer to the slides.
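The dropout equations referenced above can be sketched roughly as follows. This is a minimal NumPy illustration rather than the lab's TensorFlow solution: the function name ''dropout_forward'', the layer shapes, and the keep probability of 0.8 are all assumptions made for the example. It follows the paper's convention of masking a layer's inputs with Bernoulli(p) samples during training and scaling the weights by p at test time.

```python
import numpy as np

def dropout_forward(y, W, b, p_keep, rng, train=True):
    """One linear layer with dropout applied to its inputs (hypothetical helper)."""
    if train:
        # r_j ~ Bernoulli(p): each input unit is kept with probability p_keep
        r = rng.binomial(1, p_keep, size=y.shape)
        y_tilde = r * y              # thinned inputs
        return y_tilde @ W + b
    # Test time: no sampling; weights are scaled by p_keep instead
    return y @ (p_keep * W) + b

# Toy data with made-up shapes: batch of 4, 5 inputs, 3 outputs
rng = np.random.default_rng(0)
y = np.ones((4, 5))
W = np.full((5, 3), 0.5)
b = np.zeros(3)

train_out = dropout_forward(y, W, b, p_keep=0.8, rng=rng, train=True)
test_out = dropout_forward(y, W, b, p_keep=0.8, rng=rng, train=False)
no_drop = dropout_forward(y, W, b, p_keep=1.0, rng=rng, train=True)
```

With a keep probability of 1.0 the mask is all ones, so the training-time output reduces to the ordinary layer output; that makes a convenient sanity check before trying smaller values.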
@@ Line 64: / Line 64: @@
 Once you understand dropout, implementing it is not hard; you should only have to add ~10 lines of code.
+Also note that because dropout involves some randomness, your curve may not match mine exactly; this is expected.
 **Part 2: implement dropconnect**
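For comparison with dropout, dropconnect masks individual weights rather than whole units. The sketch below is one possible NumPy reading of that idea and is not taken from the lab: the name ''dropconnect_forward'', the per-example weight mask, and the simple mean-weight scaling at test time are assumptions chosen to keep the example short.

```python
import numpy as np

def dropconnect_forward(v, W, b, p_keep, rng, train=True):
    """One linear layer with dropconnect on its weights (hypothetical helper)."""
    if train:
        # Sample a separate Bernoulli mask over the weight matrix for each example
        M = rng.binomial(1, p_keep, size=(v.shape[0],) + W.shape)
        # (M * W) has shape (batch, in, out); contract over the input axis
        return np.einsum("bi,bio->bo", v, M * W) + b
    # Assumed simplification at test time: use the expected weights p_keep * W
    return v @ (p_keep * W) + b

# Toy data with made-up shapes: batch of 4, 5 inputs, 3 outputs
rng = np.random.default_rng(0)
v = rng.normal(size=(4, 5))
W = rng.normal(size=(5, 3))
b = np.zeros(3)

dc_train = dropconnect_forward(v, W, b, p_keep=0.5, rng=rng, train=True)
dc_full = dropconnect_forward(v, W, b, p_keep=1.0, rng=rng, train=True)
```

Sampling one mask per example costs more memory than sharing a single mask across the batch; sharing a mask would be a further simplification if memory is tight.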
@@ Line 77: / Line 79: @@
 Dropconnect seems to want more training steps than dropout, so you should run the optimizer for 1500 iterations.
-**Part 3: implement L1/L2 regularization**
+**Part 3: implement L1 regularization**
-For this part, you should implement both L1 and L2 regularization on the weights.  This will change your computation graph a bit, and specifically will change your cost function -- instead of optimizing just ''cross_entropy'', you should optimize ''cross_entropy + lam*regularizers'', where ''lam'' is the \lambda regularization parameter from the slides.  You should regularize all of the weights and biases (six variables in total).
+For this part, you should implement L1 regularization on the weights.  This will change your computation graph a bit, and specifically will change your cost function -- instead of optimizing just ''cross_entropy'', you should optimize ''cross_entropy + lam*regularizers'', where ''lam'' is the \lambda regularization parameter from the slides.  You should regularize all of the weights and biases (six variables in total).
 You should create a plot of test/training performance as you scan across values of lambda.  You should test at least [0.1, 0.01, 0.001].
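The regularized cost described in Part 3, ''cross_entropy + lam*regularizers'', can be illustrated with a small NumPy sketch. The helper names ''l1_penalty'' and ''regularized_cost'', the toy variables, and the cross-entropy value of 2.0 are assumptions for the example; in the lab the same sum would be built from the six weight and bias variables in the computation graph.

```python
import numpy as np

def l1_penalty(variables):
    """Sum of absolute values over every weight and bias array."""
    return sum(np.abs(v).sum() for v in variables)

def regularized_cost(cross_entropy, variables, lam):
    """Cost to optimize: cross_entropy + lam * (L1 penalty)."""
    return cross_entropy + lam * l1_penalty(variables)

# Toy stand-ins for the model's variables (the lab has six; two are shown here)
variables = [
    np.array([[1.0, -2.0], [3.0, -4.0]]),  # a weight matrix
    np.array([0.5, -0.5]),                 # a bias vector
]
cross_entropy = 2.0  # made-up value for illustration

# Scan the lambda values suggested in the lab text
costs = {lam: regularized_cost(cross_entropy, variables, lam)
         for lam in [0.1, 0.01, 0.001]}
```

In the actual lab each value of ''lam'' would require retraining the network and recording test and training performance, not just re-evaluating the cost.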
