reinforcement_learning_algo_evaluation's Introduction

Course project for Advanced Machine Learning (MICRO-570)

We study the theory and mechanisms of two reinforcement learning algorithms, A2C and PPO2, and their application to multiple tasks, e.g., the cart-pole pendulum and lunar landing tasks. To do so, we build a pipeline to train and manage our models and environments.

From the experimental results, we compute multiple reliability metrics to support a discussion of the algorithms' performance.
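As a rough illustration of what such reliability metrics can look like, the sketch below computes two across-runs quantities from the final returns of several training runs: dispersion as an inter-quartile range and risk as the mean of the worst runs (a CVaR-style tail mean). The function names, the `alpha` parameter, and the numbers are hypothetical and chosen only for illustration; this is not the project's own pipeline code.

```python
import statistics


def iqr(values):
    """Inter-quartile range: 75th percentile minus 25th percentile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def lower_tail_mean(values, alpha):
    """Mean of the lowest alpha-fraction of values (a CVaR-style risk measure)."""
    ordered = sorted(values)
    k = max(1, round(len(ordered) * alpha))  # number of values in the tail
    return sum(ordered[:k]) / k


# Hypothetical final returns of five training runs of one algorithm on one task.
final_returns = [180.0, 195.0, 200.0, 120.0, 198.0]

# Dispersion across runs: how spread out the final performance is.
dispersion_across_runs = iqr(final_returns)

# Risk across runs: expected final performance of the worst 20% of runs.
risk_across_runs = lower_tail_mean(final_returns, alpha=0.2)

print(dispersion_across_runs, risk_across_runs)
```

A lower dispersion and a higher tail mean would both indicate a more reliable algorithm under these assumed definitions.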

From the resulting statistics tables, we observe that A2C and PPO2 rank similarly on short-term and long-term risk across time, which means that their average worst-case performance is similar across tasks. The risk values of PPO2 vary less than those of A2C, however, so PPO2 is the more stable of the two across tasks and runs with various settings. PPO2 is also more reliable than A2C with respect to dispersion across runs and risk across runs: A2C shows a larger average variance in performance and a worse average final performance across training runs on the different tasks. A2C, on the other hand, has lower dispersion across time than PPO2, meaning that the step-to-step changes of its training curves are less spread out.
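The across-time quantities discussed above can be sketched for a single training curve as follows. This is a simplified version under stated assumptions: dispersion is taken as the inter-quartile range of the step-to-step differences over the whole curve (no sliding window), short-term risk as the mean of the worst differences, and long-term risk as the mean of the largest drawdowns from the running peak. The helper names, the `alpha` value, and the curve are hypothetical.

```python
import statistics


def iqr(values):
    """Inter-quartile range: 75th percentile minus 25th percentile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def tail_mean(values, alpha, lowest=True):
    """Mean of the lowest (or highest) alpha-fraction of values."""
    ordered = sorted(values, reverse=not lowest)
    k = max(1, round(len(ordered) * alpha))  # number of values in the tail
    return sum(ordered[:k]) / k


def differences(curve):
    """Step-to-step changes of a training curve (a simple detrending)."""
    return [b - a for a, b in zip(curve, curve[1:])]


def drawdowns(curve):
    """Drop of each point below the best value seen so far."""
    peak = curve[0]
    drops = []
    for value in curve:
        peak = max(peak, value)
        drops.append(peak - value)
    return drops


# Hypothetical evaluation returns recorded during one training run.
curve = [10.0, 20.0, 15.0, 30.0, 25.0, 40.0]
diffs = differences(curve)

dispersion_across_time = iqr(diffs)                        # spread of changes
short_term_risk = tail_mean(diffs, 0.4, lowest=True)       # worst single-step drops
long_term_risk = tail_mean(drawdowns(curve), 0.4, lowest=False)  # largest drawdowns

print(dispersion_across_time, short_term_risk, long_term_risk)
```

Computing these values per run and per task, then ranking the algorithms on each metric, would give tables of the kind the discussion refers to.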

reinforcement_learning_algo_evaluation's People

Contributors

juwu-19
