BYU CS classes


cs501r_f2018:lab3

Objectives:

• Build and train a deep conv net
• Explore and implement various initialization techniques
• Implement a parameterized module in PyTorch
• Use a principled loss function

Deliverable:

For this lab, you will submit an IPython notebook via Learning Suite. This is where you build your first deep neural network!

For this lab, we'll be combining several different concepts that we've covered during class, including new layer types, initialization strategies, and an understanding of convolutions.

• 30% Part 0: Successfully followed lab video and typed in code
• 20% Part 1: Re-implement Conv2D and CrossEntropy loss function
• 20% Part 2: Implement different initialization strategies
• 10% Part 3: Print parameters, plot train/test accuracy
• 10% Part 4: Convolution parameters quiz
• 10% Tidy and legible figures, including labeled axes where appropriate

Detailed specs:

Part 0: Watch and follow video tutorial

Part 1: Re-implement a Conv2D module with parameters and a CrossEntropy loss function.

You will need to use:

• https://pytorch.org/docs/stable/nn.html#torch.nn.Parameter
• https://pytorch.org/docs/stable/nn.html#torch.nn.functional.conv2d
• https://pytorch.org/docs/stable/torch.html#torch.exp
• https://pytorch.org/docs/stable/torch.html#torch.log
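
The pieces above can be combined roughly as follows. This is a sketch, not the reference solution: the class and function names are our own, and the cross-entropy below is the numerically naive form built directly from `torch.exp` and `torch.log` (a production version would use the log-sum-exp trick).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MyConv2d(nn.Module):
    """Sketch of a Conv2d re-implementation (names are ours, not the spec's)."""
    def __init__(self, in_channels, out_channels, kernel_size, padding=0):
        super().__init__()
        # Wrapping tensors in nn.Parameter registers them with the module,
        # so model.parameters() and the optimizer can see them.
        self.weight = nn.Parameter(
            torch.randn(out_channels, in_channels, kernel_size, kernel_size) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_channels))
        self.padding = padding

    def forward(self, x):
        # Delegate the actual sliding-window computation to the functional API
        return F.conv2d(x, self.weight, self.bias, padding=self.padding)

def my_cross_entropy(logits, targets):
    """Cross-entropy from torch.exp / torch.log (numerically naive sketch)."""
    probs = torch.exp(logits) / torch.exp(logits).sum(dim=1, keepdim=True)
    # Pick out the probability assigned to each example's true class
    return -torch.log(probs[torch.arange(len(targets)), targets]).mean()
```

For small logits this should agree with `F.cross_entropy`, which is a handy sanity check while debugging.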


Part 2: Implement a few initialization strategies, which can include Xavier initialization (sometimes called Glorot initialization), orthogonal initialization, and uniform random initialization. You can specify which strategy to use with a parameter. Helpful links include:

• https://hjweide.github.io/orthogonal-initialization-in-convolutional-layers (or the original paper: http://arxiv.org/abs/1312.6120)
• http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization
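
One way to structure this is a helper that takes the strategy as a parameter and returns a weight tensor. The function name, signature, and the uniform range below are our own choices, offered only as a sketch:

```python
import torch

def init_conv_weight(shape, strategy="xavier"):
    """Hypothetical helper: shape is (out_channels, in_channels, kh, kw)."""
    out_channels, in_channels, kh, kw = shape
    fan_in = in_channels * kh * kw
    fan_out = out_channels * kh * kw
    if strategy == "uniform":
        # Plain uniform random in a small fixed range (the range is our choice)
        return torch.empty(shape).uniform_(-0.1, 0.1)
    if strategy == "xavier":
        # Xavier/Glorot: zero-mean normal with variance 2 / (fan_in + fan_out)
        std = (2.0 / (fan_in + fan_out)) ** 0.5
        return torch.randn(shape) * std
    if strategy == "orthogonal":
        # Orthogonalize a flattened (out_channels, fan_in) matrix via QR,
        # then fold it back into the 4D convolution weight shape
        a = torch.randn(max(out_channels, fan_in), min(out_channels, fan_in))
        q, _ = torch.linalg.qr(a)  # q has orthonormal columns
        flat = q.T if out_channels < fan_in else q
        return flat[:out_channels, :fan_in].reshape(shape)
    raise ValueError(f"unknown strategy: {strategy}")
```

A quick check that orthogonal init worked: flatten the weight to `(out_channels, fan_in)` and verify the rows (or columns) are orthonormal.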

Part 3: Print the number of parameters in your network, and plot the accuracy of your training and validation sets over time. Experiment with some deeper networks and see if you can build a network with close to 1,000,000 parameters.
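
Counting parameters is a one-liner over `model.parameters()`. The network below is only a hypothetical example of a stack that lands near the 1,000,000-parameter target; your own architecture will differ:

```python
import torch.nn as nn

def count_parameters(model):
    # Sum element counts over all trainable parameters (weights and biases)
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# A hypothetical conv stack, chosen only to land near 1,000,000 parameters
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
    nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
    nn.Conv2d(256, 256, 3, padding=1),
)
print(count_parameters(model))  # 960896
```

For the accuracy plot, record train/validation accuracy each epoch and plot both curves with matplotlib, remembering the rubric's labeled axes and a legend distinguishing the two sets.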

Part 4: Learn how convolution layers affect the shape of their outputs, and answer the following quiz questions. Include your answers in a new markdown cell in your Jupyter notebook.

Using a kernel size of 3×3, what settings of your 2D convolution produce the following mappings? (The first answer is given.)

(c=3, h=10, w=10) ⇒ (c=10, h=8, w=8) : (out_channels=10, kernel_size=(3, 3), padding=(0, 0))
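
You can verify a proposed answer empirically by building the layer and inspecting the output shape (with stride 1 and no dilation, `h_out = h + 2*padding - kernel_size + 1`). Checking the given answer, for example:

```python
import torch
import torch.nn as nn

# Build the layer from the proposed answer and inspect the output shape
conv = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=(3, 3), padding=(0, 0))
x = torch.randn(1, 3, 10, 10)  # (batch, c=3, h=10, w=10)
print(conv(x).shape)           # torch.Size([1, 10, 8, 8])
```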

(c=3, h=10, w=10) ⇒ (c=22, h=10, w=10) :

(c=3, h=10, w=10) ⇒ (c=65, h=12, w=12) :

(c=3, h=10, w=10) ⇒ (c=7, h=20, w=20) :

Using a kernel size of 5×5:

(c=3, h=10, w=10) ⇒ (c=10, h=8, w=8) : (out_channels=10, kernel_size=(5, 5), padding=(1, 1))

(c=3, h=10, w=10) ⇒ (c=100, h=10, w=10) :

(c=3, h=10, w=10) ⇒ (c=23, h=12, w=12) :

(c=3, h=10, w=10) ⇒ (c=5, h=24, w=24) :

Using a kernel size of 5×3:

(c=3, h=10, w=10) ⇒ (c=10, h=8, w=8) :

(c=3, h=10, w=10) ⇒ (c=100, h=10, w=10) :

(c=3, h=10, w=10) ⇒ (c=23, h=12, w=12) :

(c=3, h=10, w=10) ⇒ (c=5, h=24, w=24) :

Determine the kernel size that requires the smallest padding to make the following mappings possible:

(c=3, h=10, w=10) ⇒ (c=10, h=9, w=7) :

(c=3, h=10, w=10) ⇒ (c=22, h=10, w=10) :