EECS 545 Machine Learning Course Project Fall 2019
This project implemented two algorithms: the deep deterministic policy gradient algorithm (DDPG) (Lillicrap et al. [2015]) and the advantage actor-critic method (A2C) (Mnih et al. [2016]). Each algorithm was evaluated on the OpenAI Gym environments LunarLanderContinuous-v2 and MountainCarContinuous-v0.
Videos of learned policies can be viewed at the following links.