====CS501r, Fall 2016 - Deep Learning: Theory and Practice====

As big data and deep learning gain more prominence in both industry
and academia, the time seems ripe for a class focused exclusively on
the theory and practice of deep learning, both to understand why deep
learning has had such a tremendous impact across so many disciplines
and to spur research excellence in deep learning at BYU.

===Learning activities===

This class will be a graduate-level coding class. Students will be
exposed to the theoretical aspects of deep learning (including
derivatives, regularization, and optimization theory), as well as
practical strategies for training large-scale networks, leveraging
hardware acceleration, distributing training across multiple machines,
and coping with massive datasets. Students will engage with the material
primarily through weekly coding labs dedicated to implementing
state-of-the-art techniques using modern deep learning software
frameworks. The class will culminate with a substantial data
analysis project.
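
To give a flavor of the labs, here is a minimal, hypothetical warm-up in the spirit of a first lab exercise: a linear softmax classifier trained with full-batch gradient descent in plain NumPy. The synthetic data, variable names, and hyperparameters are illustrative assumptions, not actual course material.

<code python>
import numpy as np

# Synthetic data: three Gaussian clusters in 2-D (an assumed toy dataset).
rng = np.random.RandomState(0)
k, n_per, d = 3, 100, 2
centers = rng.randn(k, d) * 3.0
X = np.vstack([centers[c] + rng.randn(n_per, d) for c in range(k)])
y = np.repeat(np.arange(k), n_per)

W = np.zeros((d, k))   # weights
b = np.zeros(k)        # biases
lr = 0.1               # learning rate

for step in range(200):
    # Forward pass: numerically stable softmax over the class logits.
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)

    # Backward pass: gradient of mean cross-entropy w.r.t. the logits.
    dlogits = probs.copy()
    dlogits[np.arange(len(y)), y] -= 1.0
    dlogits /= len(y)

    # Full-batch gradient descent update.
    W -= lr * (X.T @ dlogits)
    b -= lr * dlogits.sum(axis=0)

# Evaluate the trained classifier on the training data.
logits = X @ W + b
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)
print(f"train accuracy: {(probs.argmax(axis=1) == y).mean():.2f}")
</code>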

===Preliminary syllabus and topics to be covered===

  - **Neuron-based models of computation**
    - Integrate-and-fire
    - Hodgkin-Huxley
    - Population codes
    - Schematic and organization of visual cortex
    - HMAX
  - **Basics of DNNs**
    - Convolution / deconvolution layers
    - Max-pooling layers
    - ReLU units
    - Softmax units
    - Local response normalization / contrast normalization
  - **Regularization strategies**
    - Dropout
    - DropConnect
    - Batch normalization
    - Adversarial networks
    - Data augmentation
  - **High-level implementation packages - pros and cons**
    - TensorFlow, Theano, Caffe, Keras, Torch, Mocha
  - **Case studies / existing networks and why they're interesting**
    - AlexNet
    - VGG
    - GoogLeNet / Inception
    - ZFNet
  - **Training & initialization**
    - Initialization strategies: Xavier, Gaussian, Identity, Sparse
    - Optimization theory and algorithms
    - Local minima; saddle points; plateaus
    - SGD
    - RPROP
    - RMSProp
    - Adagrad
    - Adam (a worked sketch follows this list)
    - Higher-order algorithms (L-BFGS; Hessian-free; trust-region)
    - Momentum and Nesterov momentum
  - **Large-scale distributed learning**
    - Parameter servers
    - Asynchronous vs. synchronous architectures
  - **Temporal networks and how to train them**
    - Basic RNNs and backpropagation through time
    - LSTMs
    - Deep Memory Nets
  - **Application areas**
    - Deep reinforcement learning
    - NN models of style vs. content (deepart.io)
    - ImageNet classification
    - The Neural Turing Machine
    - Sentiment classification
    - Word embeddings
  - **Understanding and visualizing CNNs**
    - t-SNE embeddings
    - Deconvnets
    - Data gradients / inceptionism
  - **Misc**
    - Network compression
    - Low bit-precision networks
    - Sum-product networks
    - Evolutionary approaches to topology discovery
    - Spatial transformer networks
    - Network-in-Network
    - Regions with CNN features (R-CNN)
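
As a small taste of the optimization unit, here is a hedged sketch of the Adam update rule on a toy quadratic objective. The hyperparameter values are the commonly cited defaults from the Adam paper; the objective and variable names are our own illustration, not course material.

<code python>
import numpy as np

# Toy objective f(w) = 0.5 * ||w - w_star||^2, with gradient (w - w_star).
w_star = np.array([3.0, -2.0])
w = np.zeros(2)

m = np.zeros_like(w)   # running estimate of the gradient's first moment
v = np.zeros_like(w)   # running estimate of the gradient's second moment
lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8

for t in range(1, 501):
    g = w - w_star                       # exact gradient of the quadratic
    m = beta1 * m + (1 - beta1) * g      # update biased first moment
    v = beta2 * v + (1 - beta2) * g**2   # update biased second moment
    m_hat = m / (1 - beta1**t)           # bias-corrected first moment
    v_hat = v / (1 - beta2**t)           # bias-corrected second moment
    w -= lr * m_hat / (np.sqrt(v_hat) + eps)

print(w)  # converges toward w_star = [3, -2]
</code>

Replacing the update line with ''w -= lr * g'' recovers plain gradient descent, which makes it easy to compare the two optimizers on the same problem.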