In order to debug this lab, you will probably want to translate **Spanish to English.** That way, you will be able to judge the quality of the sentences that are coming out!

----
====Scaffolding code:====

Some starter code is available for download via Dropbox:

[[https://www.dropbox.com/s/g967xqzkmydatxd/nmt_scaffold_v2.py?dl=0|nmt_scaffold_v2.py]]
----
Some of the resources for this lab include [[https://arxiv.org/pdf/1409.3215.pdf|Sequence to Sequence Learning with Neural Networks]] and [[https://arxiv.org/pdf/1409.0473.pdf|D Bahdanau, 2015]]. The former will be of more use in implementing the lab. State-of-the-art NMT systems use Bahdanau's attention mechanism, but the context vector alone should be enough for our dataset.
Seq2seq and encoder/decoder are nearly synonymous architectures and represent the first major breakthrough in using RNNs to map between source and target sequences of differing lengths. The encoder maps the input sequence to a fixed-length context vector, and the decoder then maps that vector to the output sequence. The loss is standard cross entropy between the scores output by the decoder and the reference sentence.
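To make the encoder/decoder split concrete, here is a minimal PyTorch sketch of the pipeline and its cross-entropy loss. The vocabulary sizes, hidden size, and single-layer GRUs are placeholder choices for illustration, not the architecture the lab requires:

<code python>
import torch
import torch.nn as nn

# Placeholder sizes for illustration only.
SRC_VOCAB, TGT_VOCAB, HIDDEN = 100, 120, 32

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(SRC_VOCAB, HIDDEN)
        self.gru = nn.GRU(HIDDEN, HIDDEN, batch_first=True)

    def forward(self, src):
        # Final hidden state serves as the fixed-length context vector.
        _, context = self.gru(self.embed(src))
        return context  # shape: (1, batch, HIDDEN)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(TGT_VOCAB, HIDDEN)
        self.gru = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, TGT_VOCAB)

    def forward(self, tgt_in, context):
        # Condition the decoder by using the context as its initial hidden state.
        out, _ = self.gru(self.embed(tgt_in), context)
        return self.out(out)  # scores over the target vocabulary

src = torch.randint(0, SRC_VOCAB, (4, 7))      # batch of source sequences
tgt_in = torch.randint(0, TGT_VOCAB, (4, 9))   # decoder inputs (shifted right)
tgt_ref = torch.randint(0, TGT_VOCAB, (4, 9))  # reference sentences

scores = Decoder()(tgt_in, Encoder()(src))     # (4, 9, TGT_VOCAB)
loss = nn.CrossEntropyLoss()(scores.reshape(-1, TGT_VOCAB), tgt_ref.reshape(-1))
</code>

In training you would backpropagate through this loss; at translation time the decoder instead feeds its own predictions back in one token at a time.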
The hyperparameters used are given below.
</code>
----
====Pytorch on the supercomputer:====

The folks at the supercomputer center have installed ''pytorch'' and ''torchvision''.
To use pytorch, you'll need to load the following modules in your SLURM file:

<code bash>
# these are dependencies:
module load cuda/8.0 cudnn/6.0_8.0
module load python/27

module load python-pytorch python-torchvision
</code>
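For reference, a complete SLURM submission script using those modules might look something like the sketch below. The job name, time limit, resource requests, and script filename are illustrative guesses, not required values:

<code bash>
#!/bin/bash
#SBATCH --job-name=nmt_lab        # illustrative job name
#SBATCH --time=01:00:00           # illustrative time limit
#SBATCH --gres=gpu:1              # request one GPU
#SBATCH --mem=8G                  # illustrative memory request

# Load the dependencies, then pytorch/torchvision:
module load cuda/8.0 cudnn/6.0_8.0
module load python/27
module load python-pytorch python-torchvision

# Run your lab script (filename is a placeholder):
python nmt_scaffold_v2.py
</code>

Submit it with ''sbatch'' and check its status with ''squeue''.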

If you need more Python libraries, you can install them in your home directory with:

<code bash>
pip install --user libraryname
</code>