====Objective:====

Explore careful hyperparameter tuning in PyTorch. Gain experience and confidence in carefully comparing multiple options.

----
====Deliverable:====

For this lab, you will submit an IPython notebook via Learning Suite. Your notebook will contain two parts, as described below.

----
====Grading standards:====

Your notebook will be graded on the following:

  * 35% Part 1: Clearly displayed set of 10 bars (one for the baseline, one for each tweak applied independently)
  * 5% Part 1: Short writeup of conclusions from the independent tweaks
  * 25% Part 2: Clear explanation of your tweaking strategy
  * 25% Part 2: Actually run your tweaking strategy and show the results
  * 10% Tidy and legible figures, including labeled axes where appropriate
  * 10% Extra credit: error bars on your figure in Part 1

----
====Description:====

The goal of this lab is to learn how to explore the combinatorial space of possible hyperparameter settings.

Many deep learning papers present some sort of tweak to standard deep learning, and empirically illustrate that it improves performance (ideally across a wide variety of architectures and datasets). It quickly becomes hard to know which, if any, of these tweaks are truly important, and how they behave when combined.

For this lab, you will explore various tweaks to the basic classifier you coded in Lab 1. There are two parts to the lab.

----
====Part 1====

You must clearly show the individual effect of each tweak compared to the baseline. For this part, you should present a simple bar chart (or possibly two or more, depending on your layout), clearly labeled with the baseline performance, and then the performance of each tweak relative to the baseline. You may plot absolute or relative performance, whichever is clearer.

You must include a few sentences describing what you can conclude from evaluating all of these tweaks.

**Note:** I am not requiring error bars for this lab, because they are computationally intensive. I have made them extra credit, although if we were doing this for real, they would be absolutely required!
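
For the chart (and the extra-credit error bars), something like the following matplotlib sketch may help; the results arrays here are placeholders you would fill with your measured accuracies (means and, for error bars, standard deviations across several runs).

<code python>
import numpy as np
import matplotlib.pyplot as plt

# Placeholder results -- replace with your measured accuracies.
labels = ["baseline", "leakyrelu", "selu", "elu", "hardshrink",
          "batchnorm", "label smooth", "CLR", "dropout", "orthogonal"]
means = np.random.uniform(0.60, 0.80, size=len(labels))  # mean accuracy per variant
stds = np.random.uniform(0.005, 0.02, size=len(labels))  # std. dev. across runs (extra credit)

fig, ax = plt.subplots(figsize=(10, 4))
ax.bar(range(len(labels)), means, yerr=stds, capsize=4)
ax.set_xticks(range(len(labels)))
ax.set_xticklabels(labels, rotation=45, ha="right")
ax.set_ylabel("Test accuracy")
ax.set_title("Effect of each tweak relative to baseline")
fig.tight_layout()
plt.show()
</code>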

----
====Part 2====

You must think about how to find the best combination of tweaks. There is no right answer to this part; I want you to think carefully about how to search the space of possible combinations, and come up with a reasonable method for settling on a final combination of tweaks. I have tried to provide enough tweaks that it should be infeasible to brute-force every possible combination (although that is certainly a valid strategy!).

For this part, you must include in your notebook a simple writeup describing your strategy (just a paragraph or two), and then show the final performance of whatever combination you hit upon.

Note that you will not be graded on the absolute performance of any run; what is important is thinking clearly through which tweaks make a difference.
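
One reasonable strategy (a sketch, not the required approach) is greedy coordinate ascent: start from the baseline, then sweep one tweak at a time and keep a change only if it improves validation accuracy. Here ''train_and_eval()'' is a hypothetical function that builds, trains, and evaluates a classifier for a given configuration.

<code python>
# Greedy coordinate ascent over the tweak space (one possible strategy).
tweak_options = {
    "activation": ["relu", "leakyrelu", "selu", "elu", "hardshrink"],
    "batchnorm": [False, True],
    "label_smoothing": [False, True],
    "lr_schedule": ["constant", "clr"],
    "dropout": [False, True],
    "init": ["default", "orthogonal"],
}

# Start at the baseline (the first option of each tweak).
config = {name: options[0] for name, options in tweak_options.items()}
best_acc = train_and_eval(config)  # hypothetical train/evaluate helper

for name, options in tweak_options.items():
    for option in options[1:]:
        candidate = dict(config, **{name: option})
        acc = train_and_eval(candidate)
        if acc > best_acc:  # keep the change only if it helps
            best_acc, config = acc, candidate

print("Best configuration:", config, "accuracy:", best_acc)
</code>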

----
====The Tweaks====

Your baseline classifier must be a "vanilla" classifier, with none of the features listed below. We will systematically add them in.

You must test the following:

  * Activation functions: ReLU (baseline), LeakyReLU, SELU, ELU, Hardshrink
  * Batchnorm: off (baseline), on (use one batchnorm per residual block)
  * Label smoothing: off (baseline), on
  * Learning rate: constant (baseline), cyclical learning rates (CLR)
  * Regularization: off (baseline), dropout
  * Initialization: Xavier/He (baseline), orthogonal

So, for Part 1, your bar chart should have **10 different bars**.

Some of these tweaks require additional parameters. You should either leave them at their default values, or think of some reasonable way to set them.
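
For the tweaks that do take extra parameters, the library defaults are a reasonable starting point. A sketch (the specific values below are just the PyTorch defaults or common choices, not required settings):

<code python>
import torch.nn as nn
import torch.optim as optim

# Activations, with their default extra parameters spelled out.
activations = {
    "relu": nn.ReLU(),
    "leakyrelu": nn.LeakyReLU(negative_slope=0.01),
    "selu": nn.SELU(),
    "elu": nn.ELU(alpha=1.0),
    "hardshrink": nn.Hardshrink(lambd=0.5),
}

dropout = nn.Dropout(p=0.5)  # a common default drop probability

# CLR needs base and max learning rates; these values are illustrative.
model = nn.Linear(10, 10)  # placeholder model
optimizer = optim.SGD(model.parameters(), lr=1e-3)
scheduler = optim.lr_scheduler.CyclicLR(optimizer, base_lr=1e-4, max_lr=1e-2)
</code>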

Note: PyTorch does not (AFAIK) natively implement label smoothing. In the interests of focusing on hyperparameter searching, **you may verbatim copy any internet code you like to help implement label smoothing.**
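
If you want a starting point, the usual pattern is a smoothed cross-entropy loss like the sketch below; this class is an illustration, not a PyTorch built-in.

<code python>
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelSmoothingLoss(nn.Module):
    """Cross-entropy against smoothed targets: the true class gets
    probability (1 - smoothing), and the remainder is spread
    uniformly over the other classes."""

    def __init__(self, num_classes, smoothing=0.1):
        super().__init__()
        self.num_classes = num_classes
        self.smoothing = smoothing

    def forward(self, logits, target):
        log_probs = F.log_softmax(logits, dim=-1)
        with torch.no_grad():
            smooth = self.smoothing / (self.num_classes - 1)
            true_dist = torch.full_like(log_probs, smooth)
            true_dist.scatter_(1, target.unsqueeze(1), 1.0 - self.smoothing)
        return torch.mean(torch.sum(-true_dist * log_probs, dim=-1))
</code>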

----
====Hints====

Activation functions and dropout can all be found in torch.nn

Initialization functions can be found in torch.nn.init
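
For example, swapping in orthogonal initialization might look like the following sketch (the weight/bias convention here is one reasonable choice, not the only one):

<code python>
import torch.nn as nn
import torch.nn.init as init

def init_orthogonal(module):
    # Apply orthogonal init to the weights of linear/conv layers.
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        init.orthogonal_(module.weight)
        if module.bias is not None:
            init.zeros_(module.bias)

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(),
                      nn.Flatten(), nn.Linear(16 * 30 * 30, 10))
model.apply(init_orthogonal)  # recursively applies to every submodule
</code>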

This lab should be pretty straightforward, with the right script: you should be able to iterate over tweaks and run your classifier in a tidy loop. Ideally, you'll code it up, let it run, and come back in a few hours to find the results!
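
For Part 1, that tidy loop might look something like the sketch below, where ''train_and_eval()'' is the hypothetical helper from the Part 2 sketch and ''baseline_config'' is a dict of your baseline settings.

<code python>
# Run the baseline plus each single-tweak variant for Part 1.
single_tweaks = [
    {},                              # baseline
    {"activation": "leakyrelu"},
    {"activation": "selu"},
    {"activation": "elu"},
    {"activation": "hardshrink"},
    {"batchnorm": True},
    {"label_smoothing": True},
    {"lr_schedule": "clr"},
    {"dropout": True},
    {"init": "orthogonal"},
]

results = {}
for tweak in single_tweaks:
    name = ", ".join(f"{k}={v}" for k, v in tweak.items()) or "baseline"
    results[name] = train_and_eval({**baseline_config, **tweak})
    print(name, results[name])
</code>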

If you find yourself cutting-and-pasting, you might want to rethink your strategy.