====Deliverable:====
{{ :cs501r_f2016:lab6_v2.png?direct&200|}}
For this lab, you will need to implement three different regularization methods from the literature, and explore the parameters of each.
To help ensure that everyone is starting off on the same footing, you should download the following scaffold code:

[[http://liftothers.org/byu/lab6_scaffold.py|Lab 6 scaffold code (UPDATED WITH RELUs)]]
For all 3 methods, we will run on a single, deterministic batch of the first 1000 images from the MNIST dataset. This will help us to overfit, and will hopefully be small enough not to tax your computers too much.
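As a minimal sketch of grabbing that fixed batch, assuming the TensorFlow 1.x MNIST tutorial helper (the scaffold may already handle this for you, and the ''MNIST_data/'' path is arbitrary):

<code python>
from tensorflow.examples.tutorials.mnist import input_data

# Download/load MNIST; one_hot=True gives labels as one-hot vectors.
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# A single, deterministic batch: the first 1000 training images.
train_images = mnist.train.images[:1000]   # shape (1000, 784)
train_labels = mnist.train.labels[:1000]   # shape (1000, 10)
</code>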
For this part, you should implement L1 regularization on the weights. This will change your computation graph a bit, and specifically will change your cost function -- instead of optimizing ''cross_entropy'', you must optimize ''cross_entropy + lam*regularizer'', where ''lam'' is the \lambda parameter from the class slides.

You should place an L1 regularizer on each of the weight and bias variables (a total of 8). A different way of saying this is that the regularization term should be the sum of the absolute values of all of the individual entries of all of the weights and biases; that entire sum is then multiplied by \lambda.
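As a concrete illustration, here is a minimal TensorFlow 1.x sketch; ''weights'' and ''biases'' are hypothetical lists standing in for whatever your scaffold actually defines, and ''cross_entropy'' is assumed to be your existing loss node:

<code python>
import tensorflow as tf

# Hypothetical lists of the network's parameter variables; substitute the
# actual weight and bias variables from your scaffold (8 in total).
params = weights + biases

# L1 term: the sum of the absolute values of every individual entry,
# across all of the weight and bias variables.
regularizer = tf.add_n([tf.reduce_sum(tf.abs(p)) for p in params])

lam = 0.01  # one of the lambda values you will sweep over
loss = cross_entropy + lam * regularizer
train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)
</code>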
You should experiment with a few different values of lambda, and generate a similar plot to those in Part 1 and Part 2. You should test at least the values ''[0.1, 0.01, 0.001]''.
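One way to organize that sweep (a sketch only -- ''build_and_train'' is a hypothetical helper standing in for your own training loop, assumed to return one accuracy per training step):

<code python>
import matplotlib.pyplot as plt

for lam in [0.1, 0.01, 0.001]:
    # Rebuild the graph with this lambda and train on the fixed batch;
    # build_and_train is assumed to return a list of per-step accuracies.
    accuracies = build_and_train(lam)
    plt.plot(accuracies, label="lambda = %g" % lam)

plt.xlabel("training step")
plt.ylabel("accuracy")
plt.legend()
plt.show()
</code>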
Note that you should **not** call your regularization variable "lambda" because that is a reserved keyword in Python.
+ | |||
Remember that the "masks" for both dropout and dropconnect change for **every** step in training.
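For reference, here is a minimal TensorFlow 1.x sketch of how that per-step resampling plays out; the layer sizes and names are assumptions, not the scaffold's actual values. ''tf.nn.dropout'' draws a fresh random mask on every ''sess.run'', so dropout changes its mask each step automatically, while dropconnect needs the mask applied to the **weights** instead:

<code python>
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
W1 = tf.Variable(tf.truncated_normal([784, 500], stddev=0.1))
b1 = tf.Variable(tf.zeros([500]))
keep_prob = 0.5

# Dropout: mask the *activations*. tf.nn.dropout samples a new Bernoulli
# mask (and rescales by 1/keep_prob) every time the graph is run, so the
# mask is different on every training step.
h1 = tf.nn.relu(tf.matmul(x, W1) + b1)
h1_drop = tf.nn.dropout(h1, keep_prob)

# Dropconnect: mask the *weights* instead. Applying tf.nn.dropout to the
# weight matrix likewise draws a fresh weight mask each step (note the
# same 1/keep_prob rescaling, which you may or may not want to keep).
W1_dc = tf.nn.dropout(W1, keep_prob)
h1_dc = tf.nn.relu(tf.matmul(x, W1_dc) + b1)
</code>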