CS501r, Fall 2018 - Deep Learning: Theory and Practice
As big data and deep learning gain more prominence in both industry
and academia, the time seems ripe for a class focused exclusively on
the theory and practice of deep learning, both to understand why deep
learning has had such a tremendous impact across so many disciplines,
and also to spur research excellence in deep learning at BYU.
Learning activities
This class will be a graduate-level coding class. Students will be
exposed to the theoretical aspects of deep learning (including
derivatives, regularization, and optimization theory), as well as
practical strategies for training large-scale networks, leveraging
hardware acceleration, distributing training across multiple machines,
and coping with massive datasets. Students will engage with the material
primarily through weekly coding labs dedicated to implementing
state-of-the-art techniques, using modern deep learning software
frameworks. The class will culminate in a substantial data
analysis project.
Preliminary Syllabus and topics to be covered:
Neuron-based models of computation
Integrate-and-fire
Hodgkin-Huxley
Population codes
Schematic and organization of visual cortex
HMAX
Basics of DNNs
Convolution / deconvolution layers
Maxpooling layers
ReLU units
Softmax units
Local response normalization / contrast normalization
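As an illustration of two of the units above, here is a minimal NumPy sketch of ReLU and a numerically stable softmax (the function names and example logits are illustrative, not part of the syllabus):

```python
import numpy as np

def relu(x):
    # ReLU: zero out negative activations
    return np.maximum(0.0, x)

def softmax(x):
    # Numerically stable softmax: subtract the max before exponentiating
    z = np.exp(x - np.max(x))
    return z / z.sum()

logits = np.array([2.0, 1.0, -1.0])
probs = softmax(relu(logits))
```

Subtracting the max before exponentiating changes nothing mathematically but prevents overflow for large logits.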
Regularization strategies
Dropout
Dropconnect
Batch normalization
Adversarial networks
Data augmentation
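Of the regularizers listed, dropout is the simplest to sketch. Below is a hedged NumPy sketch of "inverted" dropout (the scaling convention used by most modern frameworks; the function signature is illustrative):

```python
import numpy as np

def dropout(x, p=0.5, train=True):
    # Inverted dropout: randomly zero units with probability p at train
    # time, scaling survivors by 1/(1-p) so no rescaling is needed at test time.
    if not train:
        return x
    mask = np.random.rand(*x.shape) >= p
    return x * mask / (1.0 - p)

h = np.ones(1000)
h_train = dropout(h, p=0.5)   # surviving units become 2.0, dropped units 0.0
```

At test time the layer is an identity, which is why the train-time rescaling matters.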
High-level implementation packages - pros and cons
TensorFlow, Theano, Caffe, Keras, Torch, Mocha
Case studies / existing networks and why they're interesting
AlexNet
VGG
GoogLeNet / Inception
ZFNet
Training & initialization
Initialization strategies: Xavier, Gaussian, Identity, Sparse
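A minimal sketch of the Xavier (Glorot) strategy from the list above, using the uniform variant; the layer sizes are illustrative:

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng=None):
    # Glorot/Xavier uniform: scale by fan-in + fan-out so activation
    # variance stays roughly constant from layer to layer.
    rng = rng or np.random.default_rng(0)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

W = xavier_init(784, 256)
```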
Optimization theory and algorithms
Local minima; saddle points; plateaus
SGD
Rprop
RMSprop
Adagrad
Adam
Higher-order algorithms (L-BFGS; Hessian-free; trust-region)
Nesterov and momentum
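To connect SGD and momentum concretely, here is a hedged sketch of classical (heavy-ball) momentum driving a toy quadratic objective to its minimum; the function names and hyperparameters are illustrative:

```python
import numpy as np

def sgd_momentum(grad_fn, w, lr=0.1, mu=0.9, steps=300):
    # Classical momentum: the velocity v accumulates an exponentially
    # decaying sum of past gradients, smoothing the descent direction.
    v = np.zeros_like(w)
    for _ in range(steps):
        v = mu * v - lr * grad_fn(w)
        w = w + v
    return w

# Toy objective f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w_star = sgd_momentum(lambda w: w, np.array([5.0, -3.0]))
```

Nesterov momentum differs only in evaluating the gradient at the "looked-ahead" point w + mu * v rather than at w.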
Large-scale distributed learning
Parameter servers
Asynchronous vs. synchronous architectures
Temporal networks and how to train them
Basic RNNs and backpropagation through time (BPTT)
LSTMs
Deep Memory Nets
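The basic RNN cell that BPTT unrolls can be sketched in a few lines of NumPy; the dimensions and weight scales below are illustrative:

```python
import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, b):
    # One vanilla RNN step: h_t = tanh(Wxh x_t + Whh h_{t-1} + b)
    return np.tanh(x @ Wxh + h_prev @ Whh + b)

rng = np.random.default_rng(0)
Wxh = rng.normal(scale=0.1, size=(4, 8))   # input-to-hidden weights
Whh = rng.normal(scale=0.1, size=(8, 8))   # hidden-to-hidden weights
b = np.zeros(8)

h = np.zeros(8)
for t in range(5):                          # unroll over a length-5 sequence
    h = rnn_step(rng.normal(size=4), h, Wxh, Whh, b)
```

BPTT backpropagates through this unrolled loop; the repeated multiplication by Whh is what makes gradients vanish or explode, motivating LSTMs.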
Application areas
Deep reinforcement learning
NN models of style vs. content (deepart.io)
ImageNet classification
The Neural Turing Machine
Sentiment classification
Word embeddings
Understanding and visualizing CNNs
t-SNE embeddings
deconvnets
data gradients / inceptionism
Misc
Network compression
Low bit-precision networks
Sum-product networks
Evolutionary approaches to topology discovery
Spatial transformer networks
Network-in-network
Regions-with-CNN