Bootstrap DQN

This repo contains our implementation of a Bootstrapped DQN with options to add a Randomized Prior, Dueling, and Double DQN in ALE games.

Deep Exploration via Bootstrapped DQN

Randomized Prior Functions for Deep Reinforcement Learning
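As a rough illustration of the two ideas named in the papers above, here is a minimal tabular sketch, not this repo's network code. It assumes K independent value estimates ("heads"), one head sampled per episode to act greedily, a Bernoulli mask choosing which heads train on each transition, and a fixed random prior added to each head's trainable estimate. The class name, parameter names, and default values are all hypothetical.

```python
import random


class BootstrappedTabularQ:
    """Toy tabular stand-in for a K-head Q-function with fixed random priors."""

    def __init__(self, n_states, n_actions, n_heads=5, prior_scale=1.0,
                 mask_prob=0.5, seed=0):
        self.rng = random.Random(seed)
        self.n_actions = n_actions
        self.n_heads = n_heads
        self.prior_scale = prior_scale  # assumption: scalar weight on the prior
        self.mask_prob = mask_prob      # assumption: Bernoulli(0.5) bootstrap mask
        # Trainable estimates start at zero.
        self.q = [[[0.0] * n_actions for _ in range(n_states)]
                  for _ in range(n_heads)]
        # Priors are drawn once and never updated.
        self.prior = [[[self.rng.gauss(0.0, 1.0) for _ in range(n_actions)]
                       for _ in range(n_states)] for _ in range(n_heads)]
        self.active_head = 0

    def value(self, head, state, action):
        # Head output = trainable part + scaled fixed prior.
        return (self.q[head][state][action]
                + self.prior_scale * self.prior[head][state][action])

    def start_episode(self):
        # Sample one head to follow for the whole episode.
        self.active_head = self.rng.randrange(self.n_heads)
        return self.active_head

    def act(self, state):
        # Act greedily with respect to the active head only.
        values = [self.value(self.active_head, state, a)
                  for a in range(self.n_actions)]
        return max(range(self.n_actions), key=values.__getitem__)

    def sample_mask(self):
        # One Bernoulli draw per head, stored alongside the transition.
        return [self.rng.random() < self.mask_prob for _ in range(self.n_heads)]

    def update(self, state, action, reward, next_state, done, mask,
               lr=0.1, gamma=0.99):
        for h in range(self.n_heads):
            if not mask[h]:
                continue  # this head does not train on this transition
            bootstrap = 0.0 if done else max(
                self.value(h, next_state, a) for a in range(self.n_actions))
            td_error = reward + gamma * bootstrap - self.value(h, state, action)
            # Only the trainable table moves; the prior stays fixed.
            self.q[h][state][action] += lr * td_error
```

In a deep variant the tables would be replaced by network heads on a shared torso, with the prior implemented as a second, frozen network.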

Installation

pip install -r requirements.txt --index-url https://download.pytorch.org/whl/cu118  # For CUDA (GPU)
pip install -r requirements.txt --index-url https://download.pytorch.org/whl/cpu  # For CPU

Some results on Breakout

[Image: gif of an agent playing Breakout in evaluation]

This gif shows the orange agent (see the comparison below) winning its first game of Breakout and eventually winning a second. The agent reaches a high score of 830 in this evaluation. Playback omits several segments to limit file size; we show agent steps [1000-1500], [2400-2600], [3000-4500], and [16000-16300].

Comparison:

  • (blue) DQN with epsilon greedy annealed between 1 and 0.01
  • (orange) Bootstrap with epsilon greedy annealed between 1 and 0.01
  • (green) Bootstrap without epsilon greedy exploration
  • (red) Bootstrap with randomized prior
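For the epsilon-greedy agents in the list above, the annealing could look like the sketch below. The source only gives the endpoints (1 and 0.01); the linear shape and the `anneal_steps` length are assumptions made for illustration, and both function names are hypothetical.

```python
import random


def annealed_epsilon(step, eps_start=1.0, eps_end=0.01, anneal_steps=1_000_000):
    """Linearly interpolate epsilon from eps_start to eps_end, then hold.

    anneal_steps is an assumed schedule length, not a value from this repo.
    """
    fraction = min(max(step / anneal_steps, 0.0), 1.0)
    return eps_start + fraction * (eps_end - eps_start)


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a uniformly random action with probability epsilon, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

The agent without epsilon-greedy exploration corresponds to calling `epsilon_greedy` with `epsilon=0.0`, leaving exploration to the choice of bootstrap head.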

All agents were implemented as Dueling, Double DQNs. The x-axis label in these plots, "steps", refers to the number of states the agent has observed so far in training. With a frame-skip of 4, multiply steps by 4 to get the total number of frames the emulator has advanced.
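The step-to-frame conversion is plain multiplication. A small helper (the name `steps_to_frames` is hypothetical) makes the units explicit when comparing against results reported in emulator frames:

```python
FRAME_SKIP = 4  # frame-skip stated in this README


def steps_to_frames(agent_steps, frame_skip=FRAME_SKIP):
    """Convert agent steps (observed states) to emulator frames."""
    return agent_steps * frame_skip


# 2.5 million agent steps correspond to 10 million emulator frames.
print(steps_to_frames(2_500_000))
```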

Our agents are sent a terminal signal at the end of each life. At the beginning of each episode, a random number (fewer than 30) of no-op steps is taken, after which the state progression is deterministic.
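A minimal sketch of these two conventions follows, written against a toy environment rather than the real emulator. The `ToyLivesEnv` class, its `reset`/`step`/`lives` interface, the helper names, and the choice of action 0 as the no-op are all assumptions for illustration; whether zero no-ops is allowed is also an assumption.

```python
import random

NOOP_ACTION = 0  # assumption: action index 0 is the no-op


class ToyLivesEnv:
    """Hypothetical stand-in for an emulator: each non-no-op step costs a life."""

    def __init__(self, lives=3):
        self.start_lives = lives
        self.lives = lives

    def reset(self):
        self.lives = self.start_lives
        return 0  # dummy observation

    def step(self, action):
        if action != NOOP_ACTION:
            self.lives -= 1
        return 0, 0.0, self.lives == 0  # observation, reward, game_over


def reset_with_noops(env, rng, max_noops=30):
    """Reset, then take a random number (< max_noops) of no-op steps."""
    obs = env.reset()
    n_noops = rng.randrange(max_noops)  # 0 .. max_noops - 1
    for _ in range(n_noops):
        obs, _, game_over = env.step(NOOP_ACTION)
        if game_over:
            obs = env.reset()
    return obs, n_noops


def step_with_life_terminal(env, action):
    """Step the env and flag the transition as terminal when a life is lost."""
    lives_before = env.lives
    obs, reward, game_over = env.step(action)
    life_lost = env.lives < lives_before
    # The agent sees `terminal`; the caller uses `game_over` to decide on reset.
    return obs, reward, game_over or life_lost, game_over
```

Returning both flags keeps the training signal (terminal on life loss) separate from the emulator state (reset only when the game is actually over).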

[Image: Breakout comparison plots, x-axis in steps]

Some results on Pong

Here are some results on Pong with Bootstrap DQN w/ a Randomized Prior. An optimal strategy is learned within 2.5 million steps.

[Image: Pong results]

Pong agent score in evaluation - reward vs steps

[Image: plot of Pong evaluation reward vs steps]

Some results on Freeway

Here are some results on Freeway with Bootstrap DQN w/ a Randomized Prior. The randomized prior allowed us to solve this "hard exploration" problem within 4 million steps.

[Image: Freeway results]

Freeway agent score in evaluation - reward vs steps

[Image: plot of Freeway evaluation reward vs steps]

Dependencies

atari-py installed from https://github.com/kastnerkyle/atari-py
torch='1.0.1.post2'
cv2='4.0.0'

References

We referenced several excellent examples/blogposts to build this codebase:

Discussion and debugging w/ Kyle Kastner

Fabio M. Graetz's DQN

hengyuan-hu's Rainbow

Dopamine's baseline

Contributors

johannah, mazurel