cs501r_f2016:lab3 · revised 2016/09/09 17:10 by wingated; current revision 2021/06/30 23:42
  * 10% Tidy and legible visualization of cost function
  * 10% Tidy and legible plot of classification accuracy over time
  * +5% Complete the auto gradient descent lab
  
----
</code>
  
Note a couple of things about this code: first, it is fully vectorized. Second, the ''numerical_gradient'' function accepts a parameter called ''loss_function'' -- ''numerical_gradient'' is a higher-order function that accepts another function as an input. This numerical gradient calculator could be used to calculate gradients for any function. Third, you may wonder why my ''loss_function'' doesn't need the data! Since the data never changes, I curried it into my loss function, resulting in a function that only takes one parameter -- the matrix ''W''.
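To make the higher-order-function and currying ideas concrete, here is a minimal sketch. This is my own simplified, non-vectorized version: the forward-difference scheme, the toy loss, and the stand-in data are assumptions for illustration, not the lab's actual code.

```python
import numpy as np

def numerical_gradient(loss_function, W, delta=0.000001):
    # Higher-order function: estimates dL/dW for ANY scalar loss_function of W
    # by nudging one entry of W at a time (forward differences).
    grad = np.zeros_like(W)
    base = loss_function(W)
    for idx in np.ndindex(W.shape):
        W_step = W.copy()
        W_step[idx] += delta
        grad[idx] = (loss_function(W_step) - base) / delta
    return grad

# Currying: bake the (fixed) data into the loss so it takes only W.
data = np.array([[1.0, 2.0], [3.0, 4.0]])   # stand-in for the real dataset

def make_loss(data):
    def loss_function(W):
        return float(np.sum((data @ W) ** 2))  # toy quadratic loss, not the lab's
    return loss_function

loss = make_loss(data)       # loss now takes only W
W = np.eye(2)
g = numerical_gradient(loss, W)
```

For this toy loss the analytic gradient is ''2 * data.T @ data @ W'', so you can sanity-check the numerical estimate against it.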
  
You should run your code for 1000 epochs. (Here, by epoch, I mean "step in the gradient descent algorithm.") Note, however, that for each step, you have to calculate the gradient, and in order to calculate the gradient, you will need to evaluate the loss function many times.
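To see why "many times" matters, here is a rough cost accounting (my own sketch; the weight-matrix shape below is purely illustrative, not the lab's specification). A forward-difference gradient needs one loss evaluation per entry of ''W'', plus one baseline evaluation, on every step:

```python
# Illustrative shapes only: e.g. 784 input features + 1 bias row, 10 classes.
W_shape = (785, 10)
evals_per_step = W_shape[0] * W_shape[1] + 1   # one nudge per entry + baseline
total_evals = 1000 * evals_per_step            # 1000 gradient-descent steps
print(total_evals)                             # prints 7851000
```

Millions of loss evaluations is exactly why a vectorized loss function pays off here.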
  
You should plot both the loss function and the classification accuracy at each step.
  
**Preparing the data:**
You should use a linear score function, as discussed in class. This should only be one line of code!
  
You should use the log softmax loss function, as discussed in class. For each training instance, you should compute the probability that the instance ''i'' is classified as class ''k'', using ''p(instance i = class k) = exp( s_ik ) / sum_j exp( s_ij )'' (where ''s_ij'' is the score of the i'th instance on the j'th class), and then calculate ''L_i'' as the log of the probability of the correct class. Your overall loss is then the mean of the individual ''L_i'' terms.
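A small sketch of that computation follows. The shapes are my assumption (''scores'' is N x C, ''labels'' holds the correct class index per instance), and I take ''L_i'' to be the //negative// log probability so that lower loss is better -- check that sign convention against the class notes.

```python
import numpy as np

def softmax_loss(scores, labels):
    # Subtract each row's max before exponentiating, for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    # log p(instance i = class k) = s_ik - log( sum_j exp( s_ij ) )
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    L = -log_probs[np.arange(len(labels)), labels]  # L_i for each instance i
    return L.mean()                                 # overall loss: mean of L_i
```

As a sanity check, with all-zero scores every class is equally likely, so the loss should be ''log(C)'' for ''C'' classes.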
  
**Note: you should be careful about numerical underflow!** To help combat that, you should use the **log-sum-exp** trick (or the **exp-normalize** trick):
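As a small illustration (the function name and this scalar-output version are my own simplification):

```python
import numpy as np

# log-sum-exp trick: log( sum_j exp(s_j) ) = m + log( sum_j exp(s_j - m) )
# with m = max_j s_j, so the largest argument ever passed to exp() is 0.
def logsumexp(s):
    m = s.max()
    return m + np.log(np.exp(s - m).sum())

scores = np.array([1000.0, 1001.0])   # naive np.exp(1000.0) would overflow to inf
stable = logsumexp(scores)            # finite, correct answer
```

The naive ''np.log(np.exp(s).sum())'' overflows for large scores; the shifted version gives the same mathematical value without ever exponentiating a large number.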
I used a delta of 0.000001.
  
Please feel free to search around online for resources to understand this better. For example:
  
[[http://www2.math.umd.edu/~dlevy/classes/amsc466/lecture-notes/differentiation-chap.pdf|These lecture notes]] (see eq. 5.1)
  
You may find [[http://matplotlib.org/users/pyplot_tutorial.html|this tutorial on pyplot]] helpful.
----

====Extra credit:====

You may complete the old lab 04 for 5% extra credit. [[http://liftothers.org/dokuwiki/doku.php?id=cs501r_f2016:lab4]]
  
cs501r_f2016/lab3.1473441018.txt.gz · Last modified: 2021/06/30 23:40 (external edit)