To calculate the gradients, you should use numerical differentiation. Because we didn't cover this in class, you will need to read up on it; resources are listed below.
  
Your code should be fully vectorized. <del>There should only be two ''for'' loops in your code: one that iterates over steps in the gradient descent algorithm, and one that loops over parameters to compute numerical gradients.</del> **Clarification:** You will have code for both calculating a score function and a loss function. You are not allowed to use ''for'' loops in either one! However, there may be other places in your code where ''for'' loops are unavoidable - for example, the outermost loop (the one running the gradient descent algorithm) needs a ''for'' loop, and you may also need ''for'' loops to iterate over the parameters as you calculate the numerical gradients (I actually used two ''for'' loops - one to iterate over the rows of the ''W'' matrix, and one to iterate over the columns; see the sketch further below).
  
Your notebook should display two plots: classification accuracy over time, and the loss function over time. **Please cleanly label your axes!**
  * 10% Tidy and legible visualization of cost function
  * 10% Tidy and legible plot of classification accuracy over time
  * +5% Complete the auto gradient descent lab (see the extra credit section below)
  
----
  
Note a couple of things about this code (a sketch appears below): first, it is fully vectorized. Second, the ''numerical_gradient'' function accepts a parameter called ''loss_function'' -- ''numerical_gradient'' is a higher-order function that accepts another function as an input. This numerical gradient calculator could be used to calculate gradients for any function. Third, you may wonder why my ''loss_function'' doesn't need the data! Since the data never changes, I curried it into my loss function, resulting in a function that only takes one parameter -- the matrix ''W''.
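A minimal numpy sketch of such a higher-order gradient routine (the signature, the forward-difference formula, and the toy check are illustrative assumptions, not the reference implementation):

<code python>
import numpy as np

def numerical_gradient(loss_function, W, delta=0.000001):
    """Approximate the gradient of any scalar-valued loss_function at W."""
    grad = np.zeros_like(W)
    base_loss = loss_function(W)
    for i in range(W.shape[0]):        # loop over rows of W...
        for j in range(W.shape[1]):    # ...and over columns
            W_perturbed = W.copy()
            W_perturbed[i, j] += delta
            # forward-difference approximation of dL/dW_ij
            grad[i, j] = (loss_function(W_perturbed) - base_loss) / delta
    return grad

# Currying the (fixed) data into the loss, so it only takes W.
# softmax_loss, X_train, and y_train are hypothetical names:
#   loss_function = lambda W: softmax_loss(W, X_train, y_train)

# Sanity check on a toy loss whose true gradient is 2*W:
W = np.ones((3, 4))
print(numerical_gradient(lambda W: np.sum(W ** 2), W))
</code>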
  
You should run your code for 1000 epochs. (Here, by "epoch," I mean one step of the gradient descent algorithm.) Note, however, that at each step you have to calculate the gradient, and in order to calculate the gradient, you will need to evaluate the loss function many times.
  
You should plot both the loss function and the classification accuracy at each step, as in the sketch below.
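For instance, a hypothetical outer loop (reusing the ''numerical_gradient'' sketch above; ''loss_function'', ''accuracy'', and the learning rate are illustrative stand-ins, not required names or values) might record both quantities like this:

<code python>
losses, accuracies = [], []
learning_rate = 0.1    # illustrative value, not a recommendation

for step in range(1000):                           # one epoch = one descent step
    grad = numerical_gradient(loss_function, W)    # many loss evaluations inside!
    W = W - learning_rate * grad                   # gradient descent update
    losses.append(loss_function(W))                # loss at this step
    accuracies.append(accuracy(W))                 # classification accuracy at this step
</code>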
  
**Preparing the data:**
You should use a linear score function, as discussed in class. This should only be one line of code!
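For example, assuming each row of ''X'' holds one training instance and each column of ''W'' holds one class (the shape convention is an assumption; yours may be transposed):

<code python>
scores = X.dot(W)   # (N, D) @ (D, C) -> an (N, C) matrix of class scores
</code>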
  
You should use the log softmax loss function, as discussed in class. For each training instance, you should compute the probability that instance ''i'' is classified as class ''k'', using ''p(instance i = class k) = exp( s_ik ) / sum_j exp( s_ij )'' (where ''s_ij'' is the score of the i'th instance on the j'th class), and then calculate ''L_i'' as the negative log of the probability of the correct class (so that gradient descent minimizes the loss). Your overall loss is then the mean of the individual ''L_i'' terms.
  
**Note: you should be careful about numerical underflow!** To help combat that, you should use the **log-sum-exp** trick (or the **exp-normalize** trick):
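A minimal sketch of how the trick can look inside a fully vectorized loss (assuming numpy, scores computed as above, and integer labels ''y''; the function name is illustrative):

<code python>
import numpy as np

def softmax_loss(W, X, y):
    scores = X.dot(W)                                  # (N, C) class scores
    # log-sum-exp trick: subtracting each row's max is mathematically a no-op,
    # but it keeps exp() from overflowing or underflowing.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # mean negative log probability of the correct class, with no for loops
    return -log_probs[np.arange(X.shape[0]), y].mean()
</code>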
I used a delta of 0.000001.
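Concretely, using the forward-difference approximation (one common choice; see the resources below), each entry of the gradient can be estimated as ''dL/dW_ij ≈ ( loss(W + delta*E_ij) - loss(W) ) / delta'', where ''E_ij'' is the matrix that is all zeros except for a 1 in entry ''(i,j)''.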
  
Please feel free to search around online for resources to understand this better. For example:
  
[[http://www2.math.umd.edu/~dlevy/classes/amsc466/lecture-notes/differentiation-chap.pdf|These lecture notes]] (see eq. 5.1)
  
You may find [[http://matplotlib.org/users/pyplot_tutorial.html|this tutorial on pyplot]] helpful.
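As a hypothetical example, assuming ''losses'' and ''accuracies'' lists like those recorded in the training-loop sketch above:

<code python>
import matplotlib.pyplot as plt

plt.figure()
plt.plot(losses)
plt.xlabel('gradient descent step')    # cleanly labeled axes!
plt.ylabel('loss')
plt.title('Loss over time')

plt.figure()
plt.plot(accuracies)
plt.xlabel('gradient descent step')
plt.ylabel('classification accuracy')
plt.title('Classification accuracy over time')

plt.show()
</code>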

----

====Extra credit:====

You may complete the old lab 04 for 5% extra credit: [[http://liftothers.org/dokuwiki/doku.php?id=cs501r_f2016:lab4]]
  