  * 45% Proper design, creation, and debugging of actor and critic networks (see the sketch after this list)
  * 25% Proper implementation of the PPO loss function and objective on cart-pole ("CartPole-v0")
  * 20% Implementation and demonstrated learning of PPO on another domain of your choice (**except** VizDoom)
  * 10% Visualization of policy return as a function of training
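To make the actor/critic and loss-function rubric items concrete, here is a minimal PyTorch sketch. It is not the graded specification or the reference implementation: the class names, hidden-layer sizes, and the clipping/value/entropy coefficients are illustrative assumptions, and the dimensions shown happen to match CartPole-v0 (4-dimensional observations, 2 discrete actions).

<code python>
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Policy network: maps a state to a categorical distribution over actions."""
    def __init__(self, state_dim, n_actions, hidden=64):  # sizes are placeholders
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        # Return a distribution so callers can sample actions and compute log-probs.
        return torch.distributions.Categorical(logits=self.net(state))

class Critic(nn.Module):
    """Value network: maps a state to a scalar estimate of its return."""
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state):
        return self.net(state).squeeze(-1)

def ppo_loss(actor, critic, states, actions, old_log_probs, returns, advantages,
             clip_eps=0.2, value_coef=0.5, entropy_coef=0.01):  # coefficients assumed
    """Clipped-surrogate PPO objective (negated, so minimizing it maximizes the objective)."""
    dist = actor(states)
    log_probs = dist.log_prob(actions)
    ratio = torch.exp(log_probs - old_log_probs)            # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    policy_loss = -torch.min(unclipped, clipped).mean()     # PPO clipped surrogate
    value_loss = (critic(states) - returns).pow(2).mean()   # critic regression target
    entropy = dist.entropy().mean()                         # exploration bonus
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
</code>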
**Update**: Here is [[https://github.com/joshgreaves/reinforcement-learning|our lab's implementation of PPO]]. NOTE: because this code comes with a complete implementation of running on VizDoom, **you may not use that as your additional test domain.**
Here are some [[https://stackoverflow.com/questions/50667565/how-to-install-vizdoom-using-google-colab|instructions for installing VizDoom on Colab]].

----
Here is some code from our reference implementation. Hopefully it will serve as a good outline of what you need to do.
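As a rough orientation before reading that code, here is a hedged sketch of how the networks and loss above might be driven on CartPole-v0. It assumes the classic gym API (''env.step'' returning observation, reward, done, info); every hyperparameter (learning rate, discount, epoch and iteration counts) is a placeholder, and the plain discounted-return advantage estimate is a simplification (the reference implementation may use something better, e.g. GAE).

<code python>
import gym
import torch

# Assumes the Actor, Critic, and ppo_loss sketched earlier on this page.
env = gym.make("CartPole-v0")
actor = Actor(state_dim=4, n_actions=2)
critic = Critic(state_dim=4)
optimizer = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()),
                             lr=3e-4)  # placeholder learning rate
gamma = 0.99

for iteration in range(200):  # placeholder iteration count
    # 1) Collect one episode of on-policy experience with the current actor.
    states, actions, rewards, log_probs = [], [], [], []
    state = env.reset()
    done = False
    while not done:
        s = torch.as_tensor(state, dtype=torch.float32)
        dist = actor(s)
        action = dist.sample()
        states.append(s)
        actions.append(action)
        log_probs.append(dist.log_prob(action).detach())  # "old" log-probs for the ratio
        state, reward, done, _ = env.step(action.item())
        rewards.append(reward)

    # 2) Discounted returns, and advantages as returns minus the critic's baseline.
    returns, G = [], 0.0
    for r in reversed(rewards):
        G = r + gamma * G
        returns.insert(0, G)
    states = torch.stack(states)
    actions = torch.stack(actions)
    old_log_probs = torch.stack(log_probs)
    returns = torch.as_tensor(returns, dtype=torch.float32)
    with torch.no_grad():
        advantages = returns - critic(states)
        advantages = (advantages - advantages.mean()) / (advantages.std() + 1e-8)

    # 3) Several epochs of clipped-objective updates on the same batch.
    for _ in range(4):  # placeholder epoch count
        optimizer.zero_grad()
        loss = ppo_loss(actor, critic, states, actions,
                        old_log_probs, returns, advantages)
        loss.backward()
        optimizer.step()

    # Total episode reward per iteration is exactly the data you need
    # for the required return-vs-training visualization.
    print(iteration, sum(rewards))
</code>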