In order to debug this lab, you will probably want to translate **Spanish to English.** That way, you will be able to judge the quality of the sentences that are coming out!

----

====Scaffolding code:====

Some starter code is available for download via Dropbox:

[[https://www.dropbox.com/s/g967xqzkmydatxd/nmt_scaffold_v2.py?dl=0|nmt_scaffold_v2.py]]
  
----
Some of the resources for this lab include [[https://arxiv.org/pdf/1409.3215.pdf|Sequence to Sequence Learning with Neural Networks]] and [[https://arxiv.org/pdf/1409.0473.pdf|D. Bahdanau, 2015]]. The former will be of more use in implementing the lab. State-of-the-art NMT systems use Bahdanau's attention mechanism, but context alone should be enough for our dataset.
  
Seq2seq and encoder/decoder are nearly synonymous architectures, and they represent the first major breakthrough in using RNNs to map between source and target sequences of differing lengths. The encoder maps the input sequence to a fixed-length context vector, and the decoder then maps that vector to the output sequence. The loss is standard cross entropy between the scores output by the decoder and the reference sentence.
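
For concreteness, here is a minimal sketch of that pattern, assuming PyTorch. The GRU cells, layer sizes, and dummy batches below are illustrative assumptions, not the scaffold's actual interface:

<code python>
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, embed_dim, hidden_dim = 1000, 128, 256  # illustrative sizes

class Encoder(nn.Module):
    def __init__(self):
        super(Encoder, self).__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) of token ids
        _, h = self.rnn(self.embed(src))
        return h  # (1, batch, hidden_dim): the fixed-length context vector

class Decoder(nn.Module):
    def __init__(self):
        super(Decoder, self).__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt, context):
        # condition the decoder on the encoder's context vector
        y, _ = self.rnn(self.embed(tgt), context)
        return self.out(y)  # (batch, tgt_len, vocab_size) scores

encoder, decoder = Encoder(), Decoder()
src = torch.randint(0, vocab_size, (4, 7))   # dummy source batch
tgt = torch.randint(0, vocab_size, (4, 9))   # dummy reference batch
scores = decoder(tgt[:, :-1], encoder(src))  # teacher forcing on the reference
loss = F.cross_entropy(scores.reshape(-1, vocab_size),
                       tgt[:, 1:].reshape(-1))
</code>

At test time the decoder would instead consume its own previous prediction one step at a time, rather than the teacher-forced reference.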
  
The hyperparameters used are given below.
</code>
  
----

====Pytorch on the supercomputer:====

The folks at the supercomputer center have installed `pytorch` and `torchvision`. To use pytorch, you'll need to load the following modules in your SLURM file:

<code bash>
    # these are dependencies:
    module load cuda/8.0 cudnn/6.0_8.0
    module load python/27

    module load python-pytorch python-torchvision
</code>
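
For context, these module lines go near the top of the batch script you submit with `sbatch`. Here is a minimal sketch of such a file; the time limit, GPU request, memory, and script name are placeholder values, not course requirements:

<code bash>
#!/bin/bash
#SBATCH --time=01:00:00   # placeholder wall-clock limit
#SBATCH --gres=gpu:1      # request one GPU
#SBATCH --mem=8G          # placeholder memory request

module load cuda/8.0 cudnn/6.0_8.0
module load python/27
module load python-pytorch python-torchvision

# example invocation; substitute your own script
python nmt_scaffold_v2.py
</code>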
 +
If you need more python libraries, you can install them to your home directory with:

<code bash>
    pip install --user libraryname
</code>
  
  
  