====Objective:====
  
To learn about recurrent neural networks, LSTMs, GRUs, and Pytorch's sequence-to-sequence capabilities.
  
----
====Deliverable:====
  
For this lab, you will implement the `char-rnn` model of Karpathy. You will train it on a text corpus that you're interested in, and then show samples from the model.
  
This lab is slightly different from previous labs in that we give you a large portion of the code, and you will just be filling in pieces of classes and functions. If you get stuck, please get help from the TAs or your classmates.
  
You should turn in your Jupyter notebook including novel text samples showing that your code is working: first, samples after training on the provided "alma.txt" dataset, and second, samples from your network trained on a dataset of your choice.
  
An example of my final samples is shown below (more detail in the final section of this writeup), after 150 passes through the data. Please generate about 15 samples for each dataset.
  
<code>
And ifte thin forgision forward thene over up to a fear not your
And freitions, which is great God. Behold these are the loss sub
And ache with the Lord hath bloes, which was done to the holy Gr
And appeicis arm vinimonahites strong in name, to doth piseling
And miniquithers these words, he commanded order not; neither sa
And min for many would happine even to the earth, to said unto m
And mie first be traditions? Behold, you, because it was sound
And from tike ended the Lamanites had administered, and I say bi
</code>
  
Please turn in your samples inside the Jupyter notebook, not in a separate file.
  
----
====Grading standards:====
  
Your notebook will be graded on the following:
  
  * 40% Correct implementation of the sequence-to-sequence class
  * 20% Correct implementation of training and sampling
  * 5% Correct implementation of the GRU cell
  * 20% Training and sampling on a novel text dataset (must be your choice)
  * 15% Good coding style, readable output
  
----
====Description:====
  
For this lab, you will code up the
[[http://karpathy.github.io/2015/05/21/rnn-effectiveness/|char-rnn
model of Karpathy]].  This is a recurrent neural network that is
trained probabilistically on sequences of characters, and that can
then be used to sample new sequences that are like the original.
  
This lab will help you develop several new skills, as well
as understand some best practices needed for building large models.
In addition, we'll be able to create networks that generate neat text!

There are two main parts to this lab: first, wiring up a basic
sequence-to-sequence computation graph, and second, implementing your
own GRU cell.

Your data can be found at the following link; you should download it and look at the files:

[[http://liftothers.org/cs501r_f2018/text_files.tar.gz|Text Files]]
  
----
**Part 0: Readings, data loading, and high level training**

The tutorial and reading below will help you build out scaffolding code and get an understanding of using sequences in Pytorch.

[[https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html|Pytorch sequence-to-sequence tutorial]]

[[http://colah.github.io/posts/2015-08-Understanding-LSTMs/|Understanding LSTM Networks]]

<code bash>
! wget -O ./text_files.tar.gz 'https://piazza.com/redirect/s3?bucket=uploads&prefix=attach%2Fjlifkda6h0x5bk%2Fhzosotq4zil49m%2Fjn13x09arfeb%2Ftext_files.tar.gz'
! tar -xzf text_files.tar.gz
! pip install unidecode
! pip install torch
</code>
  
<code python>
import unidecode
import string
import random
import re

import pdb

all_characters = string.printable
n_characters = len(all_characters)

file = unidecode.unidecode(open('./text_files/lotr.txt').read())
file_len = len(file)
print('file_len =', file_len)
</code>
  
<code python>
chunk_len = 200

def random_chunk():
    start_index = random.randint(0, file_len - chunk_len)
    end_index = start_index + chunk_len + 1
    return file[start_index:end_index]

print(random_chunk())
</code>

<code python>
import torch
from torch.autograd import Variable

# Turn string into list of longs
def char_tensor(string):
    tensor = torch.zeros(len(string)).long()
    for c in range(len(string)):
        tensor[c] = all_characters.index(string[c])
    return Variable(tensor)

print(char_tensor('abcDEF'))
</code>

<code python>
def random_training_set():
    chunk = random_chunk()
    inp = char_tensor(chunk[:-1])
    target = char_tensor(chunk[1:])
    return inp, target
</code>
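To make the input/target relationship concrete: the target is just the input shifted forward by one character, so at every position the network is trained to predict the next character. A tiny illustration (the string here is hypothetical, not from the dataset):

<code python>
# Illustration only: input and target are offset by one character
chunk = 'hello'
inp, target = chunk[:-1], chunk[1:]
print(inp, target)   # hell ello
</code>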
<code python>
import time
import torch.nn as nn  # needed below for the loss function

n_epochs = 2000
print_every = 100
plot_every = 10
hidden_size = 100
n_layers = 1
lr = 0.005

# RNN, train, and evaluate are defined in the sections that follow
decoder = RNN(n_characters, hidden_size, n_characters, n_layers)
decoder_optimizer = torch.optim.Adam(decoder.parameters(), lr=lr)
criterion = nn.CrossEntropyLoss()

start = time.time()
all_losses = []
loss_avg = 0

for epoch in range(1, n_epochs + 1):
    loss_ = train(*random_training_set())
    loss_avg += loss_

    if epoch % print_every == 0:
        print('[%s (%d %d%%) %.4f]' % (time.time() - start, epoch, epoch / n_epochs * 100, loss_))
        print(evaluate('Wh', 100), '\n')

    if epoch % plot_every == 0:
        all_losses.append(loss_avg / plot_every)
        loss_avg = 0
</code>
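If you would like to visualize training progress, a minimal sketch (assuming matplotlib is available in your notebook environment) is to plot the values collected in `all_losses`:

<code python>
import matplotlib.pyplot as plt

# each entry of all_losses is the loss averaged over plot_every epochs
plt.plot(all_losses)
plt.xlabel('plot interval (%d epochs each)' % plot_every)
plt.ylabel('average loss')
plt.show()
</code>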
----
**Part 1: Building a sequence-to-sequence model**

Great! We have the data in a usable form. We can switch out which text file we are reading from and trying to simulate.

We now want to build out an RNN model. In this section, we will use all built-in Pytorch pieces when building our RNN class.

Create an RNN class that extends nn.Module.
  
<code python>
import torch
import torch.nn as nn
from torch.autograd import Variable

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, n_layers=1):
        super(RNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.n_layers = n_layers

        # encode using an embedding layer
        # set up a GRU, passing in the number of layers parameter (nn.GRU)
        # decode the output

    def forward(self, input_char, hidden):
        # by reviewing the documentation, construct a forward function that
        # properly uses the output of the GRU
        # return output and hidden
        pass  # replace with your implementation

    def init_hidden(self):
        return Variable(torch.zeros(self.n_layers, 1, self.hidden_size))
</code>
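If you are unsure how the pieces fit together, below is a hypothetical sketch (the class name `RNNSketch` and the exact wiring are ours, not part of the scaffold) showing one reasonable way to combine nn.Embedding, nn.GRU, and nn.Linear; your own fill-in of the class above may differ:

<code python>
import torch
import torch.nn as nn

# Sketch only -- one possible way to fill in the class above, not the required solution
class RNNSketch(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, n_layers=1):
        super(RNNSketch, self).__init__()
        self.hidden_size = hidden_size
        self.n_layers = n_layers
        self.embedding = nn.Embedding(input_size, hidden_size)  # encode characters
        self.gru = nn.GRU(hidden_size, hidden_size, n_layers)   # recurrent core
        self.decoder = nn.Linear(hidden_size, output_size)      # decode to character scores

    def forward(self, input_char, hidden):
        # input_char is a single character index (a 0-d LongTensor)
        emb = self.embedding(input_char.view(1)).view(1, 1, -1)  # (seq_len=1, batch=1, hidden)
        out, hidden = self.gru(emb, hidden)
        output = self.decoder(out.view(1, -1))                   # (1, output_size)
        return output, hidden

    def init_hidden(self):
        return torch.zeros(self.n_layers, 1, self.hidden_size)
</code>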
  
----
**Part 2: Sample text and training information**
  
We now want to be able to train our network and sample text after training.

The function below outlines how training a sequence-style network goes. Fill in the pieces.
  
<code python>
def train(inp, target):
    ## initialize the hidden state, zero out gradients, and set up the loss
      # your code here
    ## /
    loss = 0
    for c in range(chunk_len):
        output, hidden = ...  # run the forward pass of your rnn with the proper input
        loss += criterion(output, target[c].unsqueeze(0))

    ## calculate backwards loss and step the optimizer (globally)
      # your code here
    ## /

    return loss.item() / chunk_len
</code>
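For comparison, a minimal sketch of the fill-in sections (assuming the `decoder`, `decoder_optimizer`, and `criterion` created in the Part 0 cell, and using the hypothetical name `train_sketch` so it does not collide with your own `train`) might look like this; your implementation may differ:

<code python>
# Hypothetical sketch of how the "your code here" pieces of train() are often filled in
def train_sketch(inp, target):
    hidden = decoder.init_hidden()   # fresh hidden state for this chunk
    decoder.zero_grad()              # clear gradients from the previous step
    loss = 0
    for c in range(chunk_len):
        output, hidden = decoder(inp[c], hidden)
        loss += criterion(output, target[c].unsqueeze(0))
    loss.backward()                  # backpropagate through the whole chunk
    decoder_optimizer.step()         # update the parameters
    return loss.item() / chunk_len
</code>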
You can, at this point, if you choose, also write out your train loop boilerplate that samples random sequences and trains your RNN. This will be helpful to have working before writing your own GRU class.

If you are finished training, or during training, and you want to sample from the network, you may consider using the following function. If your RNN model is instantiated as `decoder`, then this will probabilistically sample a sequence of length `predict_len`.
 + 
<code python>
def evaluate(prime_str='A', predict_len=100, temperature=0.8):
    ## initialize the hidden variable and any other useful variables
      # your code here
    ## /

    prime_input = char_tensor(prime_str)

    # Use the priming string to "build up" the hidden state
    for p in range(len(prime_str) - 1):
        _, hidden = decoder(prime_input[p], hidden)
    inp = prime_input[-1]

    for p in range(predict_len):
        output, hidden = ...  # run your RNN/decoder forward on the input

        # Sample from the network as a multinomial distribution
        output_dist = output.data.view(-1).div(temperature).exp()
        top_i = torch.multinomial(output_dist, 1)[0]

        ## get the character from your list of all characters, add it to your
        ## output string sequence, and set the input for the next pass through the model
         # your code here
        ## /

    return predicted
</code>
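For example, once training has produced a reasonable `decoder`, you might sample at a couple of different temperatures; lower temperatures make the sampling more conservative, while higher temperatures give more varied but noisier text (the priming string here is arbitrary):

<code python>
# Hypothetical usage after training
print(evaluate(prime_str='Wh', predict_len=200, temperature=0.8))
print(evaluate(prime_str='Wh', predict_len=200, temperature=0.5))
</code>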
 + 
----
**Part 3: Creating your own GRU cell**

The cell that you used in Part 1 was a pre-defined Pytorch layer. Now, write your own GRU class using the same parameters as the built-in Pytorch class does.

**Please try not to look at the GRU cell definition.**
The answer is right there in the code, and in theory, you could
just cut-and-paste it.  This bit is on your honor!
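For reference, the standard GRU update equations (the same ones computed by Pytorch's built-in GRU) are given below, where x_t is the input at step t, h_(t-1) is the previous hidden state, sigma is the logistic sigmoid, and * is element-wise multiplication:

<code>
r_t = sigma(W_ir x_t + b_ir + W_hr h_(t-1) + b_hr)          # reset gate
z_t = sigma(W_iz x_t + b_iz + W_hz h_(t-1) + b_hz)          # update gate
n_t = tanh(W_in x_t + b_in + r_t * (W_hn h_(t-1) + b_hn))   # candidate hidden state
h_t = (1 - z_t) * n_t + z_t * h_(t-1)                       # new hidden state
</code>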
----
**Part 4: Run it and generate your final text!**

Assuming everything has gone well, you should be able to run the main
function in the scaffold code, using either your custom GRU cell or the built-in layer, and see
output something like this. I trained on the "lotr.txt" dataset
using chunk_len=200 and hidden_size=100 for 2000 epochs, which gave:
 + 
<code>
[0m 9s (100 5%) 2.2169]
Whaiss Mainde

I

he and the



'od and roulll and Are say the
rere.
'Wor
'Iow anond wes ou

'Yi

[0m 19s (200 10%) 2.0371]
Whimbe.

'Thhe
on not of they was thou hit of
sil ubat thith hy the seare
as sower and of len beda

[0m 29s (300 15%) 2.0051]
Whis the cart. Whe courn!' 'Bu't of they aid dou giter of fintard of the not you ous,
'Thas orntie it

[0m 38s (400 20%) 1.8617]
Wh win took be to the know the gost bing to kno wide dought, and he as of they thin.

The Gonhis gura

[0m 48s (500 25%) 1.9821]
When of they singly call the and thave thing
they the nowly we'tly by ands, of less be grarmines of t

[0m 58s (600 30%) 1.8170]
Whinds to mass of I
not ken we ting and dour
and they.


'Wat res swe Ring set shat scmaid. The
ha

[1m 7s (700 35%) 2.0367]
Whad ded troud wanty agy. Ve tanle gour the gone veart on hear, as dent far of the Ridgees.'

'The Ri

[1m 17s (800 40%) 1.9458]
Whis is brouch Heared this lack and was weself, for on't
abothom my and go staid it
they curse arsh

[1m 27s (900 45%) 1.7522]
Whout bear the
Evening
the pace spood, Arright the spaines beren the and Wish was was on the more yo

[1m 37s (1000 50%) 1.6444]
Whe Swarn. at colk. N(r)rce or they he
wearing. And the on the he was are he said Pipin.

'Yes and i

[1m 47s (1100 55%) 1.8770]
Whing at they and thins the Wil might
happened you dlack rusting and thousting fy them, there lifted

[1m 57s (1200 60%) 1.9401]
Wh the said Frodo eary him that the herremans!

'I the Lager into came and broveener he sanly
for
s

[2m 7s (1300 65%) 1.8095]
When lest
- in sound fair, and
the Did dark he in the gose cilling the stand I in the sight. Frodo y

[2m 16s (1400 70%) 1.9229]
Whing in shade and Mowarse round and parse could pass not a have partainly. ' for as I come of I
le

[2m 26s (1500 75%) 1.8169]
Whese one her of in a lief that,
but. 'We repagessed,
wandere in these fair of long one have here my

[2m 36s (1600 80%) 1.6635]
Where fread in thougraned in woohis, on the the green the
pohered alked tore becaming was seen what c

[2m 46s (1700 85%) 1.7868]
Whil neat
came to
is laked,
and fourst on him grey now they as pass away aren have in the border sw

[2m 56s (1800 90%) 1.6343]
Wh magered.

Then tell some tame had bear that
came as it nome in
to houbbirnen and to heardy.


'

[3m 6s (1900 95%) 1.8191]
Who expey to must away be to the master felkly and for, what shours was alons? I had be the long to fo

[3m 16s (2000 100%) 1.8725]
White, and his of his in before that for brown before can then took on the fainter smass about rifall
</code>
  