ritchiehuang / deeprl_algorithms

Deep RL algorithm implementations in Pytorch and Tensorflow 2, written to be easy to read and understand (DQN, REINFORCE, VPG, A2C, TRPO, PPO, DDPG, TD3, SAC)

Python 97.57% Shell 2.43%
reinforcement reinforcement-learning-algorithms pytorch-implementation deep-reinforcement-learning dqn policygradient ppo trpo mujoco policy-gradient tensorflow2 td3 pytorch-rl soft-actor-critic

deeprl_algorithms's Introduction

About Deep Reinforcement Learning

The combination of Reinforcement Learning and Deep Learning has produced a series of important algorithms. This project follows the relevant papers and implements as many of these algorithms as possible.

This repo aims to implement Deep Reinforcement Learning algorithms using Pytorch and Tensorflow 2.

1.Why do this?

  • Implementing all of these algorithms from scratch really helps you with your parameter tuning;
  • The coding process helps you better understand the principles behind each algorithm.

2.Lists of Algorithms

2.1 Value based

Value based algorithms include DQNs.

[1]. DQN Pytorch / Tensorflow, Paper: Playing Atari with Deep Reinforcement Learning
[2]. Double DQN Pytorch / Tensorflow, Paper: Deep Reinforcement Learning with Double Q-learning
[3]. Dueling DQN Pytorch / Tensorflow, Paper: Dueling Network Architectures for Deep Reinforcement Learning
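
The three variants above differ mainly in how the Q-value or its bootstrap target is built. The following is a minimal sketch in plain Python, not code taken from this repo; the function names and the toy numbers are assumptions made for illustration only.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """DQN: bootstrap from the largest target-network Q-value."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN: the online network selects the action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling DQN: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


# Made-up Q-values for one next state with three actions.
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [3.0, 1.5, 0.2]

y_dqn = dqn_target(1.0, next_q_target, gamma=0.5)
y_ddqn = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.5)
q_dueling = dueling_q_values(1.0, [1.0, 2.0, 3.0])
print(y_dqn, y_ddqn, q_dueling)
```

In this toy case the DQN target uses the largest target-network value, while Double DQN uses the target-network value of the action the online network prefers, which is how it reduces overestimation.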

2.2 Policy based

Policy based algorithms currently tend to perform better; they include Policy Gradient Methods.

[1]. REINFORCE Pytorch / Tensorflow, Paper: Policy Gradient Methods for Reinforcement Learning with Function Approximation
[2]. VPG(Vanilla Policy Gradient) Pytorch / Tensorflow, Paper: High Dimensional Continuous Control Using Generalized Advantage Estimation
[3]. A2C Pytorch, Paper: Asynchronous Methods for Deep Reinforcement Learning (A2C is the synchronous version of A3C)
[4]. DDPG Pytorch, Paper: Continuous Control With Deep Reinforcement Learning
[5]. TRPO Pytorch / Tensorflow, Paper: Trust Region Policy Optimization
[6]. PPO Pytorch / Tensorflow, Paper: Proximal Policy Optimization Algorithms
[7]. SAC Pytorch, Paper: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
[8]. SAC with Automatically Adjusted Temperature Pytorch, Paper: Soft Actor-Critic Algorithms and Applications
[9]. TD3(Twin Delayed DDPG) Pytorch, Paper: Addressing Function Approximation Error in Actor-Critic Methods
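
Several of the methods above share a few small building blocks: discounted returns (REINFORCE), Generalized Advantage Estimation (used with VPG, TRPO and PPO) and the clipped surrogate of PPO. The following is a rough sketch of these pieces in plain Python, not the code of this repo. The function names are hypothetical, and the sketch assumes a single episode that ends in a terminal state unless `last_value` is given.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return-to-go G_t = r_t + gamma * G_{t+1}, computed backwards."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def gae_advantages(rewards, values, gamma=0.99, lam=0.95, last_value=0.0):
    """GAE: A_t = delta_t + gamma * lam * A_{t+1},
    with delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)."""
    advantages = []
    gae = 0.0
    next_value = last_value
    for r, v in zip(reversed(rewards), reversed(values)):
        delta = r + gamma * next_value - v
        gae = delta + gamma * lam * gae
        advantages.append(gae)
        next_value = v
    advantages.reverse()
    return advantages


def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    """PPO per-sample surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped * advantage)


returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
td_errors = gae_advantages([1.0, 1.0], [0.5, 0.5], gamma=0.5, lam=0.0)
print(returns, td_errors, ppo_clip_objective(1.5, 1.0))
```

With `lam=0.0` the advantages reduce to one-step TD errors, and with `lam=1.0` and zero value estimates they reduce to the discounted returns.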

2.3 Imitation Learning

Imitation learning learns from expert data.

[1]. GAIL Pytorch, Paper: Generative Adversarial Imitation Learning

3.Project Dependencies

  • Python >= 3.6
  • Tensorflow >= 2.4.0
  • Pytorch >= 1.5.0
  • Seaborn >= 0.10.0
  • Click >= 7.0

Full dependencies are listed in the requirements.txt file. Install them with pip:

pip install -r requirements.txt

You can install the project by typing the following command:

pip install -e .

4.Run

Each algorithm is implemented in a single folder including 4 files:

1. main.py # A minimal executable example for algorithm  

2. [algorithm].py # Main body for algorithm implementation  

3. [algorithm]_step.py # Algorithm update core step 

4. test.py # Loading pretrained model and test performance of the algorithm
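
The split between [algorithm].py and [algorithm]_step.py can be pictured roughly as below. This is only a sketch of the idea of keeping the update step in its own function: it uses tabular Q-learning as a stand-in (which is not one of the algorithms in this repo), and `toy_step` and `ToyAgent` are hypothetical names.

```python
# --- would live in toy_step.py: the core update step only ---
def toy_step(q_table, state, action, reward, next_state, lr=0.5, gamma=0.9):
    """One tabular Q-learning update; returns the TD error."""
    td_target = reward + gamma * max(q_table[next_state])
    td_error = td_target - q_table[state][action]
    q_table[state][action] += lr * td_error
    return td_error


# --- would live in toy.py: the main body that owns the state ---
class ToyAgent:
    def __init__(self, num_states, num_actions):
        self.q_table = [[0.0] * num_actions for _ in range(num_states)]

    def act(self, state):
        """Greedy action for the given state."""
        row = self.q_table[state]
        return max(range(len(row)), key=lambda a: row[a])

    def learn(self, transitions):
        """Apply the update step to each (s, a, r, s') transition."""
        return [toy_step(self.q_table, *t) for t in transitions]


agent = ToyAgent(num_states=2, num_actions=2)
errors = agent.learn([(0, 1, 1.0, 1)])
print(agent.q_table, agent.act(0), errors)
```

Keeping the update in a separate function makes it possible to read or test the core step without the surrounding training loop.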

The default main.py is an executable example; its parameters are parsed by click.

You can run an algorithm from its main.py or from the bash scripts.

  • You can simply type python main.py --help in the algorithm package to view all configurable parameters.
  • The directory Scripts provides some bash scripts, which you can modify at will.
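
For readers unfamiliar with click, the sketch below shows how a main.py can declare its parameters so that --help lists them. It is an illustration only: the option names (`--env_id`, `--lr`, `--seed`) and their defaults are assumptions, not the actual options of any main.py in this repo. The command is invoked through click's own test runner so the example runs without a shell.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--env_id", type=str, default="CartPole-v1", show_default=True,
              help="Environment id (hypothetical option)")
@click.option("--lr", type=float, default=3e-4, show_default=True,
              help="Learning rate (hypothetical option)")
@click.option("--seed", type=int, default=1, show_default=True,
              help="Random seed (hypothetical option)")
def main(env_id, lr, seed):
    # A real main.py would build the agent and start training here.
    click.echo(f"env_id={env_id} lr={lr} seed={seed}")


# Invoke the command programmatically instead of from the command line.
runner = CliRunner()
result = runner.invoke(main, ["--lr", "0.001"])
help_result = runner.invoke(main, ["--help"])
print(result.output)
print(help_result.output)
```

In a real script the last lines would be replaced by calling `main()` under an `if __name__ == "__main__":` guard, after which `python main.py --help` prints the generated option list.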

5.Visualization of performance

Utils/plot_util.py provides a simple plot tool based on Seaborn and Matplotlib. All the plots in this project are drawn with it.

5.1 Benchmarks for DQNs

Pytorch Version

bench_dqn

Tensorflow2 Version

bench_dqn_tf2

5.2 Benchmarks for PolicyGradients

Pytorch Version

bench_pg

Tensorflow2 Version

Currently only VPG, PPO and TRPO are available:

bench_pg_tf2
