This shows you the differences between two versions of the page.
cs501r_f2016:lab14 [2017/11/20 20:06] jszendre [Deliverable:] |
cs501r_f2016:lab14 [2017/11/20 20:08] jszendre [Notes:] |
||
---|---|---|---|
Line 155: | Line 155: | ||
Debugging in PyTorch is significantly more straightforward than in TensorFlow. Tensors are available at any time to print or log. | Debugging in PyTorch is significantly more straightforward than in TensorFlow. Tensors are available at any time to print or log. | ||
- | Better hyperparameters to come. Started to converge after two hours on a K80. | + | Better hyperparameters to come. Started to converge after two hours on a K80 using Adam. |
<code python> | <code python> | ||
- | learning_rate = .01 # decayed, lowest .0001 | + | learning_rate = .01 # decayed |
batch_size = 40 # effective batch size | batch_size = 40 # effective batch size | ||
- | max_seq_length = 40 # ambitious | + | max_seq_length = 30 |
- | hidden_dim = 1024 # can use larger | + | hidden_dim = 1024 |
</code> | </code> | ||
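The "decayed" comment on learning_rate does not say which schedule was used. As a rough sketch only, a step decay with a floor could look like the following. The initial rate .01 and the floor .0001 are taken from the older revision's comment; the function name, the decay factor of 0.5, and the 1000-step interval are assumptions made for illustration, not values from the lab.

```python
def decayed_learning_rate(step, initial=0.01, floor=0.0001,
                          decay=0.5, decay_every=1000):
    """Hypothetical step-decay schedule.

    Multiplies the rate by `decay` once every `decay_every` steps and
    never returns less than `floor`. `decay` and `decay_every` are
    assumed values, not taken from the lab notes.
    """
    return max(floor, initial * decay ** (step // decay_every))


if __name__ == "__main__":
    # Print the rate at a few training steps to see the schedule.
    for step in (0, 1000, 3000, 10000):
        print(step, decayed_learning_rate(step))
```

The value returned at each step would then be written into the optimizer's learning rate before the update. With these assumed defaults the rate reaches the floor after seven halvings.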