cs501r_f2016:lab3 [2016/09/09 17:09] wingated
cs501r_f2016:lab3 [2016/09/09 17:11] wingated
To calculate the gradients, you should use numerical differentiation. Because we didn't cover this in class, you will need to read up on it; resources are listed below.
Your code should be fully vectorized. <del>There should only be two ''for'' loops in your code: one that iterates over steps in the gradient descent algorithm, and one that loops over parameters to compute numerical gradients.</del> **Clarification:** You will have code that calculates both a score function and a loss function. You are not allowed to use ''for'' loops in either one! However, there may be other places in your code where ''for'' loops are unavoidable - for example, the outermost loop (the one running the gradient descent algorithm) needs a ''for'' loop, and you may also need ''for'' loops to iterate over the parameters as you calculate the gradients (I actually used two ''for'' loops: one to iterate over rows of the ''W'' matrix, and one to iterate over columns).
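As one possible reading of the vectorization requirement, here is a minimal sketch, assuming a linear score function and a softmax cross-entropy loss (the lab's actual score and loss functions may differ - these names and shapes are illustrative assumptions). Note that neither function contains a ''for'' loop:

```python
import numpy as np

def score_function(W, X):
    # Fully vectorized: one matrix multiply, no for loops.
    # W is (num_classes, num_features); X is (num_features, num_examples).
    return W @ X  # (num_classes, num_examples)

def loss_function(W, X, y):
    # Vectorized softmax cross-entropy (one possible choice of loss).
    # y holds the correct class index for each example.
    scores = score_function(W, X)
    scores -= scores.max(axis=0, keepdims=True)  # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
    correct_probs = probs[y, np.arange(X.shape[1])]
    return -np.log(correct_probs).mean()
```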
Your notebook should display two plots: classification accuracy over time, and the loss function over time. **Please cleanly label your axes!**
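One way to produce those two labeled plots is sketched below (the function name and figure layout are illustrative assumptions, not a required interface):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

def plot_training_curves(accuracies, losses):
    # Two side-by-side plots, with every axis cleanly labeled.
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(accuracies)
    ax1.set_xlabel("Epoch")
    ax1.set_ylabel("Classification accuracy")
    ax2.plot(losses)
    ax2.set_xlabel("Epoch")
    ax2.set_ylabel("Loss")
    fig.tight_layout()
    return fig
```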
</code>
Note a few things about this code: first, it is fully vectorized. Second, the ''numerical_gradient'' function accepts a parameter called ''loss_function'' -- ''numerical_gradient'' is a higher-order function that accepts another function as an input. This numerical gradient calculator could be used to calculate gradients for any function. Third, you may wonder why my ''loss_function'' doesn't need the data! Since the data never changes, I curried it into my loss function, resulting in a function that takes only one parameter -- the matrix ''W''.
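The lab's actual helper isn't reproduced here, but a centered-finite-difference version matching that description might look like this sketch (the name ''numerical_gradient'' and the ''loss_function'' parameter come from the text above; the step size ''h'' and everything else are assumptions):

```python
import numpy as np

def numerical_gradient(loss_function, W, h=1e-5):
    # Higher-order function: takes the (curried) loss function itself as an
    # argument, so it can estimate the gradient of any scalar function of W.
    grad = np.zeros_like(W)
    # Two for loops over the parameters: rows of W, then columns.
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            old = W[i, j]
            W[i, j] = old + h
            loss_plus = loss_function(W)
            W[i, j] = old - h
            loss_minus = loss_function(W)
            W[i, j] = old  # restore the original value
            grad[i, j] = (loss_plus - loss_minus) / (2 * h)  # centered difference
    return grad

# Currying the data into the loss, so it takes only W:
# loss = lambda W: full_loss(W, X, y)
```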
You should run your code for 1000 epochs.