cs501r_f2016:lab3 [2016/09/09 17:09] wingated
cs501r_f2016:lab3 [2016/09/09 17:11] wingated
To calculate the gradients, you should use numerical differentiation. Because we didn't cover this in class, you will need to read up on it; resources are listed below.
Your code should be fully vectorized. <del>There should only be two ''for'' loops in your code: one that iterates over steps in the gradient descent algorithm, and one that loops over parameters to compute numerical gradients.</del> **Clarification:** You will have code that calculates both a score function and a loss function. You are not allowed to use ''for'' loops in either one! However, there may be other places in your code where ''for'' loops are unavoidable - for example, the outermost loop (the one running the gradient descent algorithm) needs a ''for'' loop, and you may also need ''for'' loops to iterate over the parameters as you calculate the gradients (I actually used two ''for'' loops: one to iterate over rows of the ''W'' matrix, and one to iterate over columns).
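As one possible reading of the vectorization requirement, here is a minimal sketch, assuming a linear score function and a softmax cross-entropy loss (the lab's actual score and loss functions may differ - these names and shapes are illustrative assumptions). Note that neither function contains a ''for'' loop:

```python
import numpy as np

def score_function(W, X):
    # Fully vectorized: one matrix multiply, no for loops.
    # W is (num_classes, num_features); X is (num_features, num_examples).
    return W @ X  # (num_classes, num_examples)

def loss_function(W, X, y):
    # Vectorized softmax cross-entropy (one possible choice of loss).
    # y holds the correct class index for each example.
    scores = score_function(W, X)
    scores -= scores.max(axis=0, keepdims=True)  # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
    correct_probs = probs[y, np.arange(X.shape[1])]
    return -np.log(correct_probs).mean()
```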
Your notebook should display two plots: classification accuracy over time, and the loss function over time. **Please cleanly label your axes!**
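One way to produce those two labeled plots is sketched below (the function name and figure layout are illustrative assumptions, not a required interface):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

def plot_training_curves(accuracies, losses):
    # Two side-by-side plots, with every axis cleanly labeled.
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(accuracies)
    ax1.set_xlabel("Epoch")
    ax1.set_ylabel("Classification accuracy")
    ax2.plot(losses)
    ax2.set_xlabel("Epoch")
    ax2.set_ylabel("Loss")
    fig.tight_layout()
    return fig
```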
</code>
Note a few things about this code: first, it is fully vectorized. Second, the ''numerical_gradient'' function accepts a parameter called ''loss_function'' -- ''numerical_gradient'' is a higher-order function that accepts another function as an input. This numerical gradient calculator could be used to calculate gradients for any function. Third, you may wonder why my ''loss_function'' doesn't need the data! Since the data never changes, I curried it into my loss function, resulting in a function that takes only one parameter -- the matrix ''W''.
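The lab's actual helper isn't reproduced here, but a centered-finite-difference version matching that description might look like this sketch (the name ''numerical_gradient'' and the ''loss_function'' parameter come from the text above; the step size ''h'' and everything else are assumptions):

```python
import numpy as np

def numerical_gradient(loss_function, W, h=1e-5):
    # Higher-order function: takes the (curried) loss function itself as an
    # argument, so it can estimate the gradient of any scalar function of W.
    grad = np.zeros_like(W)
    # Two for loops over the parameters: rows of W, then columns.
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            old = W[i, j]
            W[i, j] = old + h
            loss_plus = loss_function(W)
            W[i, j] = old - h
            loss_minus = loss_function(W)
            W[i, j] = old  # restore the original value
            grad[i, j] = (loss_plus - loss_minus) / (2 * h)  # centered difference
    return grad

# Currying the data into the loss, so it takes only W:
# loss = lambda W: full_loss(W, X, y)
```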
You should run your code for 1000 epochs.