CS501r, Fall 2016 - Deep Learning: Theory and Practice

As big data and deep learning gain more prominence in both industry and academia, the time seems ripe for a class focused exclusively on the theory and practice of deep learning, both to understand why deep learning has had such a tremendous impact across so many disciplines, and also to spur research excellence in deep learning at BYU.

Learning activities

This class will be a graduate-level coding class. Students will be exposed to the theoretical aspects of deep learning (including derivatives, regularization, and optimization theory), as well as practical strategies for training large-scale networks, leveraging hardware acceleration, distributing training across multiple machines, and coping with massive datasets. Students will engage with the material primarily through weekly coding labs dedicated to implementing state-of-the-art techniques using modern deep learning software frameworks. The class will culminate with a substantial data analysis project.

Preliminary Syllabus and topics to be covered:

  1. Neuron-based models of computation
    1. Integrate-and-fire
    2. Hodgkin-Huxley
    3. Population codes
    4. Schematic and organization of visual cortex
    5. HMAX
  2. Basics of DNNs
    1. Convolution / deconvolution layers
    2. Max-pooling layers
    3. ReLU units
    4. Softmax units
    5. Local response normalization / contrast normalization
  3. Regularization strategies
    1. Dropout
    2. Dropconnect
    3. Batch normalization
    4. Adversarial networks
    5. Data augmentation
  4. High-level implementation packages - pros and cons
    1. TensorFlow, Theano, Caffe, Keras, Torch, Mocha
  5. Case studies / existing networks and why they're interesting
    1. AlexNet
    2. VGG
    3. GoogLeNet / Inception
    4. ZFNet
  6. Training & initialization
    1. Initialization strategies: Xavier, Gaussian, Identity, Sparse
    2. Optimization theory and algorithms
    3. Local minima; saddle points; plateaus
    4. SGD
    5. RPROP
    6. RMSProp
    7. Adagrad
    8. Adam
    9. Higher-order algorithms (L-BFGS; Hessian-free; trust-region)
    10. Nesterov and momentum
  7. Large-scale distributed learning
    1. Parameter servers
    2. Asynchronous vs. synchronous architectures
  8. Temporal networks and how to train them
    1. Basic RNNs and Backprop-through-time
    2. LSTMs
    3. Deep Memory Nets
  9. Application areas
    1. Deep reinforcement learning
    2. NN models of style vs. content (deepart.io)
    3. ImageNet classification
    4. The Neural Turing Machine
    5. Sentiment classification
    6. Word embeddings
  10. Understanding and visualizing CNNs
    1. t-SNE embeddings
    2. Deconvnets
    3. Data gradients / inceptionism
  11. Misc
    1. Network compression
    2. Low bit-precision networks
    3. Sum-product networks
    4. Evolutionary approaches to topology discovery
    5. Spatial transformer networks
    6. Network-in-network
    7. Regions-with-CNN
cs501r_f2016/desc.txt · Last modified: 2016/08/29 09:22 by admin