cs501r_f2018:lab9

Objective:

  • To implement the Proximal Policy Optimization algorithm

Deliverable:

For this lab, you will turn in a Colab notebook that implements the Proximal Policy Optimization (PPO) algorithm.


Grading standards:

Your notebook will be graded on the following:

  • 40% Proper design, creation and debugging of a dense prediction network
  • 40% Proper implementation of a loss function and train/test set accuracy measure
  • 10% Tidy visualizations of loss of your dense predictor during training
  • 10% Test image output

Description:

For this lab, you will implement the PPO algorithm and train it on a few simple environments ("worlds").
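Training on any environment requires turning the per-step rewards from a rollout into return targets. The lab text does not prescribe how to do this, so the following is only a minimal sketch of one common choice, the discounted return. The function name `discounted_returns` and the default `gamma=0.99` are assumptions made for illustration, not part of the assignment.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t = r_t + gamma * G_{t+1} for one episode.

    `rewards` is a list of per-step rewards from a single finished episode.
    `gamma` is the discount factor (0.99 is an assumed default, not a lab requirement).
    """
    returns = []
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    # Three steps of reward 1 with gamma = 0.5.
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

Subtracting a learned value estimate from these returns gives a simple advantage estimate. The PPO paper also describes a generalized advantage estimator, which this sketch does not implement.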

Here is the paper with a technical description of the algorithm: Proximal Policy Optimization Algorithms (Schulman et al., 2017).
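The central piece of the paper is the clipped surrogate objective, which limits how far the probability ratio between the new and old policy can move in a single update. As a rough illustration, here is a minimal pure-Python sketch of that objective as a loss to minimize. The function name `ppo_clip_loss`, its argument layout, and the default `clip_eps=0.2` are assumptions for this example. A notebook for this lab would more likely compute the same quantity on tensors in a deep learning framework so that gradients flow through the new log-probabilities.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over samples.

    new_log_probs: log pi_new(a_t | s_t) for each sampled step
    old_log_probs: log pi_old(a_t | s_t) recorded when the data was collected
    advantages:    advantage estimate for each step
    clip_eps:      clipping range epsilon (0.2 is an assumed default)
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio r_t = pi_new / pi_old, computed in log space.
        ratio = math.exp(new_lp - old_lp)
        # Ratio restricted to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    # The objective is maximized, so the loss is its negative mean.
    return -total / len(advantages)


if __name__ == "__main__":
    # The ratio is 2.0 here, so a positive advantage is capped at 1 + eps.
    print(ppo_clip_loss([math.log(2.0)], [0.0], [1.0]))
```

Note the asymmetry: when the advantage is negative and the ratio has grown, the unclipped term is the smaller one, so the penalty is not capped. That is what discourages updates that raise the probability of poor actions.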

Here is a video describing it at a high level: PPO video

cs501r_f2018/lab9.1541799218.txt.gz · Last modified: 2021/06/30 23:40 (external edit)