cs501r_f2016:tmp, revised 2016/09/24 20:45 by wingated (previous revision: 2016/09/24 20:38)
- You must implement dropout (NOT using the pre-defined Tensorflow layers)
- You must implement dropconnect
- You must implement L1 weight regularization

You should turn in an iPython notebook that shows three plots, one for each of the regularization methods.
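The lab asks for a TensorFlow implementation, but the core of dropout is simple enough to sketch in plain NumPy. The sketch below is illustrative only (the function name and arguments are mine, not from the scaffold code): at train time, zero each activation independently with probability 1 - keep_prob; at test time, scale activations by keep_prob, which is the inference approximation this lab uses.

```python
import numpy as np

def dropout(activations, keep_prob, train=True, rng=np.random):
    # Illustrative NumPy sketch, not the required TensorFlow version.
    # Train time: multiply by a fresh Bernoulli(keep_prob) mask so each
    # activation is dropped independently with probability 1 - keep_prob.
    # Test time: scale by keep_prob so expected magnitudes match training.
    if train:
        mask = rng.uniform(size=activations.shape) < keep_prob
        return activations * mask
    return activations * keep_prob
```

In TensorFlow the same idea works on tensors; the key constraint from the spec above is that you build the mask yourself rather than calling the pre-defined dropout layer.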
- For dropout: a plot showing training / test performance as a function of the "keep probability".
- For dropconnect: the same
- For L1: a plot showing training / test performance as a function of the regularization strength, \lambda

An example of my training/test performance for dropout is shown at the right.
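For the L1 plot, each setting of \lambda changes the loss by an additive penalty on the weights. As a concrete (hypothetical, NumPy-only) sketch of what that term and its effect on the gradient look like:

```python
import numpy as np

def l1_penalty(weights, lam):
    # L1 regularization term added to the data loss:
    # lam * sum of |w| over every entry of every weight matrix.
    return lam * sum(np.abs(w).sum() for w in weights)

def l1_subgradient(W, lam):
    # The penalty's contribution to dLoss/dW is lam * sign(W),
    # which pushes weights toward exactly zero (sparsity).
    return lam * np.sign(W)
```

In TensorFlow you would add the penalty term to your loss tensor and let the optimizer handle the gradient; sweeping \lambda and recording train/test accuracy gives the requested plot.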
----
* 40% Correct implementation of Dropout
* 30% Correct implementation of Dropconnect
* 20% Correct implementation of L1 regularization
* 10% Tidy and legible plots
To help ensure that everyone is starting off on the same footing, you should download the following scaffold code:

[[http://liftothers.org/byu/lab6_scaffold.py|Lab 6 scaffold code]]

For all 4 methods, we will run on a single, deterministic batch of the first 1000 images from the MNIST dataset. This will help us to overfit, and will hopefully be small enough not to tax your computers too much.
**Important note**: the dropconnect paper has a somewhat more sophisticated inference method (that is, the method used at test time). **We will not use that method.** Instead, we will use the same inference approximation used by the Dropout paper -- we will simply scale things by the ''keep_probability''.

You should scan across the same values of ''keep_probability'', and you should generate a similar plot.
Dropconnect seems to want more training steps than dropout, so you should run the optimizer for 1500 iterations. | Dropconnect seems to want more training steps than dropout, so you should run the optimizer for 1500 iterations. |