=CS501r, Fall 2016 - Deep Learning: Theory and Practice=

As big data and deep learning gain more prominence in both industry
and academia, the time seems ripe for a class focused exclusively on
the theory and practice of deep learning, both to understand why deep
learning has had such a tremendous impact across so many disciplines,
and to spur research excellence in deep learning at BYU.

==Learning activities==

This class will be a graduate-level coding class. Students will be
exposed to the theoretical aspects of deep learning (including
derivatives, regularization, and optimization theory), as well as
practical strategies for training large-scale networks, leveraging
hardware acceleration, distributing training across multiple machines,
and coping with massive datasets. Students will engage with the material
primarily through weekly coding labs dedicated to implementing
state-of-the-art techniques using modern deep learning software
frameworks. The class will culminate in a substantial data
analysis project.

==Preliminary syllabus and topics to be covered==

  - **Basics of DNNs** (a short numpy sketch of these pieces follows this list)
    - Convolution layers
    - Max-pooling layers
    - ReLU units
    - Softmax units
    - Local response normalization / contrast normalization
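
A minimal numpy sketch of these building blocks, to give a feel for how little machinery is involved; the function names are illustrative, and the convolution is the naive single-channel "valid" cross-correlation that most frameworks actually compute:

<code python>
import numpy as np

def relu(x):
    # ReLU: pass positive values through, zero out the rest
    return np.maximum(0.0, x)

def softmax(x):
    # subtract the row max before exponentiating for numerical stability
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

def maxpool_2x2(x):
    # x: (H, W, C) with even H and W; max over non-overlapping 2x2 windows
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def conv2d_valid(x, k):
    # naive "valid" convolution (really cross-correlation); x: (H, W), k: (kh, kw)
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out
</code>
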
  - **Regularization strategies** (see the dropout sketch after this list)
    - Dropout
    - DropConnect
    - Batch normalization
    - Adversarial networks
    - Data augmentation
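
Dropout in particular is only a few lines; here is a sketch of the "inverted dropout" formulation in numpy, where surviving activations are rescaled at training time so that test time needs no change:

<code python>
import numpy as np

def dropout(x, p_drop, train=True, rng=np.random):
    # inverted dropout: zero each unit with probability p_drop during
    # training and scale the survivors by 1/(1-p_drop) so the expected
    # activation is unchanged; at test time this is the identity
    if not train or p_drop == 0.0:
        return x
    mask = (rng.uniform(size=x.shape) >= p_drop) / (1.0 - p_drop)
    return x * mask
</code>
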
  - **High-level implementation packages - pros and cons**
    - TensorFlow, Theano, Caffe, Keras, Torch, Mocha
  - **Case studies / existing networks and why they're interesting**
    - AlexNet
    - VGG
    - GoogLeNet / Inception
    - ZFNet
  - **Training & initialization** (minimal versions of two of the update rules appear after this list)
    - Initialization strategies: Xavier, Gaussian, Identity, Sparse
    - Optimization theory and algorithms
    - Local minima; saddle points; plateaus
    - SGD
    - RPROP
    - RMSprop
    - AdaGrad
    - Adam
    - Higher-order algorithms (L-BFGS; Hessian-free; trust-region)
    - Nesterov and momentum
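
For concreteness, here are minimal numpy versions of two of these update rules, SGD with classical momentum and Adam; optimizer state (the momentum and moment estimates) is passed in and out explicitly, and the names are illustrative:

<code python>
import numpy as np

def sgd_momentum(w, g, v, lr=0.01, mu=0.9):
    # classical momentum: v is a decaying running sum of past gradients
    v = mu * v - lr * g
    return w + v, v

def adam(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: per-parameter step sizes from bias-corrected estimates of
    # the gradient's first moment (m) and second moment (v); t starts at 1
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
</code>
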
  - **Large-scale distributed learning** (a conceptual worker loop follows this list)
    - Parameter servers
    - Asynchronous vs. synchronous architectures
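
A conceptual sketch of what one asynchronous worker does against a parameter server; `ps`, `next_minibatch`, and `compute_gradient` are hypothetical stand-ins rather than any particular framework's API:

<code python>
def async_worker_loop(ps, next_minibatch, compute_gradient, steps):
    # ps.pull() / ps.push() are assumed methods on a parameter-server handle
    for _ in range(steps):
        w = ps.pull()                  # latest (possibly stale) weights
        g = compute_gradient(w, next_minibatch())
        ps.push(g)                     # server applies the gradient at once,
                                       # without waiting for other workers
</code>

In a synchronous architecture the server would instead wait for gradients from all workers and apply their average in lockstep.
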
  - **Temporal networks and how to train them** (a vanilla RNN forward pass is sketched below)
    - Basic RNNs and backprop-through-time
    - LSTMs
    - Deep Memory Nets
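
A minimal numpy forward pass for a vanilla RNN; backprop-through-time is just ordinary backprop applied to this loop after unrolling it over the sequence:

<code python>
import numpy as np

def rnn_forward(x_seq, h0, Wxh, Whh, bh):
    # vanilla RNN: h_t = tanh(Wxh @ x_t + Whh @ h_{t-1} + bh)
    h, hs = h0, []
    for x in x_seq:
        h = np.tanh(Wxh @ x + Whh @ h + bh)
        hs.append(h)
    return hs  # one hidden state per time step
</code>
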
  - **Application areas**
    - Deep reinforcement learning
    - NN models of style vs. content (deepart.io)
    - ImageNet classification
    - The Neural Turing Machine
    - Sentiment classification
    - Word embeddings
  - **Understanding and visualizing CNNs** (a data-gradient sketch follows this list)
    - t-SNE embeddings
    - Deconvnets
    - Data gradients / inceptionism
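
A sketch of the "data gradient" idea (image-specific saliency) in TensorFlow 1.x-style graph mode; `images` and `logits` are assumed to be the input and pre-softmax output tensors of an existing model:

<code python>
import tensorflow as tf

def saliency_op(images, logits, class_index):
    # gradient of one class score with respect to the input pixels:
    # large-magnitude entries mark pixels that score is sensitive to
    score = logits[:, class_index]
    return tf.gradients(score, images)[0]
</code>
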
  - **Misc**
    - Network compression
    - Low bit-precision networks
    - Sum-product networks
    - Evolutionary approaches to topology discovery
    - Spatial transformer networks
    - Network-in-network
    - Regions with CNNs (R-CNN)