
adityajn105 / flappy-bird-deep-q-learning


Flappy Bird Game trained on a Double Dueling Deep Q Network with Prioritized Experience Replay

Home Page: https://adityajn105.github.io/blogs/deep-q-learning.html

License: MIT License

Jupyter Notebook 80.96% Python 19.04%
q-learning deep-learning image-processing reinforcement-learning dueling-dqn prioritized-experience-replay torch cnn


Flappy Bird with Deep Reinforcement Learning

Flappy Bird game trained on a Double Dueling Deep Q Network with Prioritized Experience Replay, implemented using PyTorch.

Gameplay

See the full 3-minute video

Getting Started

Here I will explain how to run the game, which plays automatically using a saved model. I will also brief you on the basics of Q-learning, Deep Q-learning, the Dueling architecture and Prioritized Experience Replay.

Prerequisites

You will need Python 3.X.X and a few packages, which you can install directly using requirements.txt.

pip install -r requirements.txt

Running The Game

Use the following command to run the game, where '--model' gives the location of the saved DQN model.

python3 play_game.py --model checkpoints/flappy_best_model.dat

Deep Q Learning

Q-learning is an off-policy learning method in reinforcement learning, developed as an extension of the on-policy Temporal Difference control algorithm. Q-learning estimates a state-action value function for a target policy that deterministically selects the action with the highest value.
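To make the off-policy idea concrete, here is a minimal sketch of a single tabular Q-learning update. It is not taken from this repository: the function name, the action names, and the values of the learning rate `alpha` and discount `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict

def q_learning_update(q, state, action, reward, next_state, done,
                      actions, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Off-policy target: bootstrap from the highest-valued next action,
    # regardless of which action the behaviour policy will actually take.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    td_error = td_target - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error

q_table = defaultdict(float)      # unseen (state, action) pairs start at 0
ACTIONS = ("flap", "noop")        # hypothetical action names
err = q_learning_update(q_table, "s0", "flap", 1.0, "s1", False, ACTIONS)
```

The `max` over next actions is what makes the update off-policy: the target follows the greedy policy even if the agent explores with a different one.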

The problem with traditional Q-learning is that it is not suitable for continuous environments (like Flappy Bird), where an agent can be in an infinite number of states. It is not feasible to store every state in the lookup table that traditional Q-learning uses, so we use Deep Q-learning in these environments.

Deep Q-learning is based on a deep neural network that takes the current state, either as an image or as continuous values, and approximates the Q-value of each action in that state.

Deep Q Learning

Take a look at this article which explains Deep Q Learning
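The project description mentions a Double DQN. As a rough sketch of what that usually means (assuming the standard Double DQN formulation, not code from this repository), the online network picks the best next action and a separate target network evaluates it. The function name and the example numbers are hypothetical, and plain lists stand in for network outputs.

```python
def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Return the Double DQN bootstrap target for one transition.

    next_q_online / next_q_target hold the Q-values of the next state
    as predicted by the online and target networks respectively.
    """
    if done:
        # Terminal transition: there is no future value to bootstrap from.
        return reward
    # The online network selects the action ...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ... and the target network evaluates that action.
    return reward + gamma * next_q_target[best_action]

target = double_dqn_target(1.0, False, [0.2, 0.7], [0.5, 0.3], gamma=0.9)
```

Splitting selection from evaluation is meant to reduce the overestimation that comes from taking a max over noisy Q-value estimates.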

Network Architecture (Dueling Architecture)

Here I have used the Dueling architecture to calculate Q-values. A Q-value Q(s,a) measures how good it is to be in state s and take action a there. We can decompose Q(s,a) into the sum of two terms:

  • V(s) - the value of being in that state
  • A(s,a) - the advantage of taking that action in that state (how much better it is to take this action than the other possible actions in that state)

Q(s,a) = V(s) + A(s,a)
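A minimal sketch of how the two streams can be combined is shown below. It assumes the common variant that subtracts the mean advantage before adding V(s), which keeps V and A identifiable; whether this repository's network does the same is an assumption. The function name and the numbers are illustrative only.

```python
def dueling_q_values(value, advantages):
    """Combine a scalar V(s) and per-action A(s, a) into Q(s, a)."""
    # Subtracting the mean advantage is an assumed (but widely used) choice;
    # without it, a constant could shift freely between V and A.
    mean_adv = sum(advantages) / len(advantages)
    return [value + (adv - mean_adv) for adv in advantages]

# Two actions, e.g. "flap" and "do nothing" (hypothetical ordering).
q_values = dueling_q_values(value=2.0, advantages=[1.0, -1.0])
```

In a real network, `value` and `advantages` would be the outputs of two separate heads that share the same convolutional features.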

Dueling Architecture

Prioritized Experience Replay

The idea behind PER is that some experiences may be more important than others for training, but might occur less frequently. If we sample the batch uniformly (selecting experiences at random), these rich but rare experiences have practically no chance of being selected. Instead, we give priority to experiences where there is a big difference between our prediction and the TD target, since that means we still have a lot to learn from them.

pt = |dt| + e
where,
	pt = priority of the experience
	dt = TD error of the experience (|dt| is its magnitude)
	e = small positive constant that ensures no priority becomes 0
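The sketch below shows one way the priority formula can drive sampling. It is an illustration under assumptions, not this repository's implementation: the exponents `alpha` and `beta`, their values, the function names, and the linear scan (real implementations often use a sum tree) are all choices made here for brevity.

```python
import random

def per_probabilities(td_errors, eps=0.01, alpha=0.6):
    """Turn TD errors into sampling probabilities."""
    priorities = [abs(d) + eps for d in td_errors]   # priority = |TD error| + e
    scaled = [p ** alpha for p in priorities]        # alpha=0 gives uniform sampling
    total = sum(scaled)
    return [s / total for s in scaled]

def sample_batch(td_errors, batch_size, beta=0.4, rng=random):
    """Sample indices by priority and return importance-sampling weights."""
    probs = per_probabilities(td_errors)
    n = len(td_errors)
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    # Weights correct the bias from non-uniform sampling; they are
    # normalised by the largest weight in the batch (an assumed choice).
    weights = [(n * probs[i]) ** -beta for i in indices]
    max_w = max(weights)
    return indices, [w / max_w for w in weights]

td_errors = [0.0, 0.1, 2.0, 0.5]
probs = per_probabilities(td_errors)
indices, is_weights = sample_batch(td_errors, batch_size=3, rng=random.Random(0))
```

Note that the experience with a TD error of 0 still gets a non-zero probability thanks to the constant e, so it can be revisited later.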

Take a look at this article which explains Double Dueling and PER

Authors

Licence

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgements

  • The Game has been taken from this repository
  • Thanks to Siraj Raval for the Move37 course on theschool.ai, which helped me understand these concepts.
