Async-RL

[Animation: A3C FF playing Breakout] [Animation: A3C LSTM playing Space Invaders]

This is a repository where I attempt to reproduce the results of the paper "Asynchronous Methods for Deep Reinforcement Learning". So far I have only replicated A3C FF/LSTM for Atari.

Any feedback is welcome :)

Current Status

A3C FF

I trained A3C FF on ALE's Breakout with 36 processes (AWS EC2 c4.8xlarge) for 80 million training steps, which took about 17 hours. The mean and median scores of test runs during training are plotted below; ten test runs were performed every 1 million training steps (as counted by the global shared counter). The results seem slightly worse than theirs.
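The evaluation summary described above (ten test runs per evaluation point, reduced to a mean and a median) can be sketched as follows. This is only an illustration: `summarize_scores` is a hypothetical helper, not a function from this repository, and the scores are made-up placeholders rather than actual results.

```python
import statistics


def summarize_scores(scores_by_step):
    """Return {training_step: (mean, median)} for each evaluation point."""
    return {
        step: (statistics.mean(scores), statistics.median(scores))
        for step, scores in sorted(scores_by_step.items())
    }


# Placeholder scores (NOT real results): ten test runs per evaluation point,
# keyed by the value of the global step counter at evaluation time.
example = {
    1000000: [1, 2, 0, 3, 2, 1, 4, 2, 1, 2],
    2000000: [10, 12, 9, 30, 11, 8, 14, 10, 13, 11],
}

for step, (mean, median) in summarize_scores(example).items():
    print("step={} mean={} median={}".format(step, mean, median))
```

Reporting the median alongside the mean is useful here because a single unusually good or bad episode (such as the 30 in the second placeholder row) shifts the mean much more than the median.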

The trained model is uploaded at trained_model/breakout_ff/80000000_finish.h5, so you can make it play Breakout with the following command:

python demo_a3c_ale.py <path-to-rom> trained_model/breakout_ff/80000000_finish.h5

The animated GIF above shows an episode I cherry-picked from 10 demo runs using that model.

A3C LSTM

I also trained A3C LSTM on ALE's Space Invaders in the same manner as A3C FF. Training A3C LSTM took about 24 hours for 80 million training steps.
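For a rough sense of throughput, the reported figures (80 million steps in about 17 hours for A3C FF and about 24 hours for A3C LSTM) can be converted into aggregate steps per second. This is a back-of-the-envelope sketch: the durations are approximate, and `steps_per_second` is a hypothetical helper introduced only for this calculation.

```python
def steps_per_second(total_steps, hours):
    """Aggregate training steps per second across all processes."""
    return total_steps / (hours * 3600)


TOTAL_STEPS = 80000000  # 80 million training steps, as reported above

ff_rate = steps_per_second(TOTAL_STEPS, 17)    # A3C FF: about 17 hours
lstm_rate = steps_per_second(TOTAL_STEPS, 24)  # A3C LSTM: about 24 hours

print("A3C FF:   ~{:.0f} steps/s".format(ff_rate))
print("A3C LSTM: ~{:.0f} steps/s".format(lstm_rate))
print("LSTM / FF wall-clock ratio: ~{:.2f}".format(24 / 17))
```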

The trained model is uploaded at trained_model/space_invaders_lstm/80000000_finish.h5, so you can make it play Space Invaders with the following command:

python demo_a3c_ale.py <path-to-rom> trained_model/space_invaders_lstm/80000000_finish.h5 --use-lstm

The animated GIF above shows an episode I cherry-picked from 10 demo runs using that model.

Implementation details

I received confirmation of their implementation details and some hyperparameters by e-mail from Dr. Mnih. I have summarized them in the wiki: https://github.com/muupan/async-rl/wiki

Requirements

  • Python 3.5.1
  • chainer 1.8.1
  • cached-property 1.3.0
  • h5py 2.5.0
  • Arcade-Learning-Environment

Training

python a3c_ale.py <number-of-processes> <path-to-atari-rom> [--use-lstm]

a3c_ale.py will save best-so-far models and test scores into the output directory.

Unfortunately, a3c_ale.py currently seems to have some bugs. Please see issues #5 and #6. I'm trying to fix them.
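The "best-so-far" bookkeeping mentioned above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual logic of a3c_ale.py: the helper `record_evaluation`, the file names `scores.txt` and `<step>.best`, and the scores are all hypothetical, and a text placeholder stands in for real model serialization.

```python
import os
import tempfile


def record_evaluation(outdir, step, mean_score, best_score):
    """Log a test score and keep a snapshot when it beats the best so far.

    Returns the updated best score.
    """
    # Append every evaluation result to a running log (hypothetical name).
    with open(os.path.join(outdir, "scores.txt"), "a") as f:
        f.write("{}\t{}\n".format(step, mean_score))

    if best_score is None or mean_score > best_score:
        # A real training script would serialize the model weights here;
        # this sketch writes a placeholder file instead.
        with open(os.path.join(outdir, "{}.best".format(step)), "w") as f:
            f.write("placeholder for model weights\n")
        return mean_score
    return best_score


def run_example():
    """Feed three placeholder evaluations (NOT real results) through the helper."""
    with tempfile.TemporaryDirectory() as outdir:
        best = None
        for step, score in [(1000000, 1.8), (2000000, 12.8), (3000000, 9.5)]:
            best = record_evaluation(outdir, step, score, best)
        return best, sorted(os.listdir(outdir))


best, files = run_example()
print(best, files)
```

Only the first two placeholder evaluations improve on the previous best, so only those two produce snapshot files; the third is logged but not saved.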

Evaluation

python demo_a3c_ale.py <path-to-atari-rom> <trained-model> [--use-lstm]

Similar Projects
