=CS501r, Fall 2016 - Deep Learning: Theory and Practice=

As big data and deep learning gain more prominence in both industry
and academia, the time seems ripe for a class focused exclusively on
the theory and practice of deep learning, both to understand why deep
learning has had such a tremendous impact across so many disciplines,
and also to spur research excellence in deep learning at BYU.

==Learning activities==

This class will be a graduate-level coding class. Students will be
exposed to the theoretical aspects of deep learning (including
derivatives, regularization, and optimization theory), as well as
practical strategies for training large-scale networks, leveraging
hardware acceleration, distributing training across multiple machines,
and coping with massive datasets. Students will engage the material
primarily through weekly coding labs dedicated to implementing
state-of-the-art techniques using modern deep learning software
frameworks. The class will culminate in a substantial data analysis
project.

==Preliminary syllabus and topics to be covered==

  - **Basics of DNNs** (see the first NumPy sketch after this list)
    - Convolution layers
    - Max-pooling layers
    - ReLU units
    - Softmax units
    - Local response normalization / contrast normalization
  - **Regularization strategies** (see the dropout sketch after this list)
    - Dropout
    - DropConnect
    - Batch normalization
    - Adversarial networks
    - Data augmentation
  - **High-level implementation packages - pros and cons**
    - TensorFlow, Theano, Caffe, Keras, Torch, Mocha
  - **Case studies / existing networks and why they're interesting**
    - AlexNet
    - VGG
    - GoogLeNet / Inception
    - ZFNet
  - **Training & initialization** (see the optimizer sketch after this list)
    - Initialization strategies: Xavier, Gaussian, identity, sparse
    - Optimization theory and algorithms
    - Local minima; saddle points; plateaus
    - SGD
    - Rprop
    - RMSprop
    - Adagrad
    - Adam
    - Higher-order algorithms (L-BFGS; Hessian-free; trust-region)
    - Momentum and Nesterov acceleration
  - **Large-scale distributed learning**
    - Parameter servers
    - Asynchronous vs. synchronous architectures
  - **Temporal networks and how to train them** (see the LSTM sketch after this list)
    - Basic RNNs and backpropagation through time
    - LSTMs
    - Deep memory networks
  - **Application areas**
    - Deep reinforcement learning
    - NN models of style vs. content (deepart.io)
    - ImageNet classification
    - The Neural Turing Machine
    - Sentiment classification
    - Word embeddings
  - **Understanding and visualizing CNNs**
    - t-SNE embeddings
    - Deconvnets
    - Data gradients / inceptionism
  - **Misc**
    - Network compression
    - Low bit-precision networks
    - Sum-product networks
    - Evolutionary approaches to topology discovery
    - Spatial transformer networks
    - Network-in-network
    - Regions with CNNs (R-CNN)
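
As a concrete starting point for the "Basics of DNNs" unit, here is a
minimal NumPy sketch of three of the units listed above: ReLU, softmax,
and 2x2 max-pooling. The function names are illustrative, not taken
from any particular framework.

<code python>
import numpy as np

def relu(x):
    # Rectified linear unit: elementwise max(0, x).
    return np.maximum(0.0, x)

def softmax(logits):
    # Numerically stable softmax over the last axis: subtracting the
    # max leaves the result unchanged but avoids overflow in exp().
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def maxpool_2x2(x):
    # Non-overlapping 2x2 max-pooling on a single (H, W) feature map;
    # H and W are assumed to be even.
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

print(relu(np.array([-1.0, 2.0])))          # [0. 2.]
print(softmax(np.array([2.0, 1.0, 0.1])))   # entries sum to 1
print(maxpool_2x2(np.arange(16.0).reshape(4, 4)))
</code>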
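
For the regularization unit, a sketch of "inverted" dropout: units are
zeroed with probability p_drop at train time, and the survivors are
scaled by 1/(1 - p_drop) so that activations need no rescaling at test
time. This is a generic sketch, not any framework's implementation.

<code python>
import numpy as np

def dropout(activations, p_drop, rng, train=True):
    # At test time (train=False) dropout is a no-op.
    if not train or p_drop == 0.0:
        return activations
    # Keep each unit with probability 1 - p_drop, then rescale so the
    # expected value of each activation is unchanged.
    mask = rng.uniform(size=activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)

rng = np.random.default_rng(0)
h = np.ones((2, 4))
print(dropout(h, 0.5, rng))   # surviving entries are 2.0, the rest 0.0
</code>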
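
For the training unit, sketches of two of the listed update rules: SGD
with classical momentum, and Adam with bias-corrected moment estimates.
The hyperparameter defaults follow common conventions and are
illustrative only.

<code python>
import numpy as np

def sgd_momentum_step(w, grad, v, lr=0.01, mu=0.9):
    # Accumulate a velocity from past gradients, then step along it.
    v = mu * v - lr * grad
    return w + v, v

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its square,
    # with bias correction for the zero initialization (t starts at 1).
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Illustrative use: minimize f(w) = ||w||^2 with Adam.
w = np.array([5.0, -3.0])
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 201):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.1)
print(w)   # close to the minimum at the origin
</code>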
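
For the temporal-networks unit, a sketch of a single LSTM cell step.
The four gate weight blocks are packed into one matrix, with gates
ordered [input, forget, candidate, output]; both the packing and the
ordering are just one common convention.

<code python>
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    # x: (input_dim,), h_prev/c_prev: (hidden_dim,),
    # W: (input_dim + hidden_dim, 4 * hidden_dim), b: (4 * hidden_dim,)
    H = h_prev.shape[-1]
    z = np.concatenate([x, h_prev]) @ W + b
    i = sigmoid(z[:H])             # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    g = np.tanh(z[2 * H:3 * H])    # candidate cell state
    o = sigmoid(z[3 * H:])         # output gate
    c = f * c_prev + i * g         # new cell state
    h = o * np.tanh(c)             # new hidden state
    return h, c

# Unrolling over time is just a loop over steps; backprop through
# time differentiates through this loop.
rng = np.random.default_rng(0)
D, H = 3, 5
W = rng.normal(scale=0.1, size=(D + H, 4 * H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(10, D)):
    h, c = lstm_step(x, h, c, W, b)
print(h)
</code>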
  