====Deliverable:====
  
{{ :cs501r_f2016:lab6_v2.png?direct&200|}}
  
For this lab, you will need to implement three different regularization methods from the literature, and explore the parameters of each.
An example of my training/test performance for dropout is shown at the right.
  
**NOTE**: because this lab can be more computationally time-consuming than the others (since we're scanning across parameters), you are welcome to turn in your plots and your code separately.  (This means, for example, that you can develop and run all of your code using an IDE other than the Jupyter notebook, collect the data, and then run a separate little script to generate the plots.  Or, a particularly enterprising student may use his or her new supercomputer account to sweep all of the parameter values in parallel (!) ).  If you do this, you will need to zip up your images and code into a single file for submission to Learning Suite.
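If you go the save-then-plot route, a minimal sketch of the plotting half might look like the following.  The file names and array names here are only placeholders for whatever your own training script saves; they are not part of the scaffold code.

<code python>
# plot_results.py -- a separate little script that only makes the figure.
# It assumes your training code already saved its curves with, e.g.,
#   np.save("dropout_train_accs.npy", train_accs)
#   np.save("dropout_test_accs.npy",  test_accs)
import numpy as np
import matplotlib.pyplot as plt

train_accs = np.load("dropout_train_accs.npy")
test_accs  = np.load("dropout_test_accs.npy")

plt.plot(train_accs, label="train")
plt.plot(test_accs, label="test")
plt.xlabel("training step")
plt.ylabel("classification accuracy")
plt.legend()
plt.savefig("dropout_accuracy.png")
</code>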
  
----
====Description:====
  
This lab is a chance for you to start reading the literature on deep neural networks, and understand how to replicate methods from the literature.  You will implement different regularization methods, and will benchmark each one.
  
To help ensure that everyone is starting off on the same footing, you should download the following scaffold code:
  
[[http://liftothers.org/byu/lab6_scaffold.py|Lab 6 scaffold code (UPDATED WITH RELUs)]]
  
For all 3 methods, we will run on a single, deterministic batch of the first 1000 images from the MNIST dataset.  This will help us to overfit, and will hopefully be small enough not to tax your computers too much.
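If the scaffold does not already build this batch for you, here is one possible way to do it.  This is only a sketch using ''tf.keras.datasets'', which may differ from whatever loader the scaffold code uses.

<code python>
# A sketch of building one deterministic batch of the first 1000 MNIST images.
# The scaffold may already provide its own loader; this is only an illustration.
import numpy as np
import tensorflow as tf

(train_images, train_labels), _ = tf.keras.datasets.mnist.load_data()

# Flatten to 784-dim vectors, scale to [0, 1], and keep only the first 1000 examples.
batch_xs = train_images[:1000].reshape(1000, 784).astype(np.float32) / 255.0
batch_ys = np.eye(10, dtype=np.float32)[train_labels[:1000]]  # one-hot labels

# Every training step uses this same (batch_xs, batch_ys) pair, so the
# network can (and should) overfit it.
</code>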
For this part, you should implement L1 regularization on the weights.  This will change your computation graph a bit, and specifically will change your cost function -- instead of optimizing ''cross_entropy'', you must optimize ''cross_entropy + lam*regularizer'', where ''lam'' is the \lambda parameter from the class slides.
  
You should place an L1 regularizer on each of the weight and bias variables (a total of 8).  A different way of saying this is that the regularization term should be the sum of the absolute values of all of the individual entries of all of the weights and biases; that entire sum is then multiplied by \lambda.
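For example, if your weight and bias variables were named ''W1''..''W4'' and ''b1''..''b4'' (placeholder names -- use whatever your scaffold actually calls them), the regularized cost might be built like this:

<code python>
# A sketch of the L1 regularized cost; W1..W4, b1..b4, and cross_entropy are
# assumed to come from your own computation graph.
import tensorflow as tf

params = [W1, b1, W2, b2, W3, b3, W4, b4]  # all 8 weight and bias variables

# Sum of the absolute values of every individual entry in every variable.
regularizer = tf.add_n([tf.reduce_sum(tf.abs(p)) for p in params])

lam = 0.01  # one of the \lambda values you will sweep over
loss = cross_entropy + lam * regularizer  # this is what you hand to the optimizer
</code>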
  
You should experiment with a few different values of lambda, and generate a similar plot to those in Part 1 and Part 2.  You should test at least the values ''[0.1, 0.01, 0.001]''.
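One way to organize the sweep is a simple loop; ''build_and_train'' below is a hypothetical helper (not in the scaffold) that builds the graph for a given ''lam'', trains on the fixed 1000-image batch, and returns the accuracy curves.

<code python>
# A sketch of sweeping over the required lambda values.
results = {}
for lam in [0.1, 0.01, 0.001]:
    train_accs, test_accs = build_and_train(lam)  # hypothetical helper
    results[lam] = (train_accs, test_accs)

# results[lam] now holds one pair of curves per lambda, ready for plotting.
</code>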
  
Note that you should **not** call your regularization variable "lambda" because that is a reserved keyword in Python.

Remember that the "masks" for both dropout and dropconnect change for **every** step in training.
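In other words, a brand-new Bernoulli mask is sampled on every iteration (in a TensorFlow graph you might feed it in through a placeholder each step).  Here is a pure-NumPy sketch of the idea for dropconnect; the layer sizes, variable names, and the 1/keep_prob rescaling convention are illustrative assumptions, so follow whatever the class slides specify.

<code python>
# A NumPy sketch showing that the mask is redrawn on every training step.
import numpy as np

keep_prob = 0.5
W = np.random.randn(784, 500).astype(np.float32)   # one layer's weights (made-up shape)
x = np.random.rand(1000, 784).astype(np.float32)   # the fixed input batch

for step in range(3):
    # Sample a fresh Bernoulli(keep_prob) mask each step -- never reuse the old one.
    mask = (np.random.rand(*W.shape) < keep_prob).astype(np.float32)
    # Dropconnect masks the *weights*; dropout would instead mask the layer's
    # activations.  The 1/keep_prob factor is the common "inverted" rescaling.
    h = np.maximum(0.0, x.dot(W * mask) / keep_prob)  # ReLU hidden layer
</code>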
  