drl-flappybird's Introduction

Playing Flappy Bird Using Deep Reinforcement Learning (Based on Deep Q Learning DQN)

Includes both the NIPS 2013 version and the Nature version of DQN.

I rewrote the code from another repo to make it much simpler, so that the Deep Q-Network algorithm from DeepMind is easier to understand.

The code of DQN is only 160 lines long.

To run the code, just type python FlappyBirdDQN.py

Since the DQN code is a self-contained class, you can use it to play other games.

About the code

As this is a reinforcement learning problem, we know we need to obtain observations and output actions, while the 'brain' does the processing work.

With that in mind, the BrainDQN.py code is easy to follow. It exposes three interfaces:

  1. getInitState() for initialization
  2. getAction()
  3. setPerception(nextObservation,action,reward,terminal)

The game interface just needs to be able to feed the action to the game and output observation, reward, and terminal.
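The interaction between the brain and the game can be sketched as a simple loop. This is a minimal illustration, not the repo's code: RandomBrain and ToyGame are hypothetical stand-ins, the frame_step method name and the argument passed to getInitState are assumptions, and the reward values are made up for the example.

```python
import random


class RandomBrain:
    """Hypothetical stand-in for BrainDQN: same three methods, no learning."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.state = None
        self.steps = 0

    def getInitState(self, observation):
        # Assumption: initialization receives the first observation.
        self.state = observation

    def getAction(self):
        # One-hot action chosen at random (a real brain would use Q-values).
        action = [0] * self.n_actions
        action[random.randrange(self.n_actions)] = 1
        return action

    def setPerception(self, nextObservation, action, reward, terminal):
        # A real brain would store the transition and train here.
        self.state = nextObservation
        self.steps += 1


class ToyGame:
    """Hypothetical game: every fifth frame ends an episode."""

    def __init__(self):
        self.t = 0

    def frame_step(self, action):
        self.t += 1
        terminal = self.t % 5 == 0
        reward = -1.0 if terminal else 0.1
        return self.t, reward, terminal


def play(brain, game, n_steps):
    """Run the observe/act/perceive loop and return the summed reward."""
    observation, _, _ = game.frame_step([1, 0])
    brain.getInitState(observation)
    total = 0.0
    for _ in range(n_steps):
        action = brain.getAction()
        observation, reward, terminal = game.frame_step(action)
        brain.setPerception(observation, action, reward, terminal)
        total += reward
    return total
```

Any game that returns an (observation, reward, terminal) triple for a given action could be plugged into the same loop.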

Disclaimer

This work is based on the repo: yenchenlin1994/DeepLearningFlappyBird

drl-flappybird's Issues

Setting Difficulty level of the Game

Hi,

Thanks for your nice code and documentation.

I saw the report by Kevin Chen [http://cs229.stanford.edu/proj2015/362_report.pdf], in which he experimented with three difficulty levels (easy, medium, hard) of the game. Could you tell me which difficulty level the game is set to in your code, and how to change it if I want to?

I guess it is related to the value of PIPEGAPSIZE in wrapped_flappy_bird.py, which is currently set to 100. Is that hard mode? Can I change the difficulty level by increasing or decreasing PIPEGAPSIZE? If so, are there specific values for those modes?

Thanks!

New observation update problem

Hello @songrotek, the code here seems to keep the oldest 3 frames forever, which means the algorithm is not using the newest 4 frames to represent the state.
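The behaviour described in this issue can be illustrated with plain lists standing in for the 4-frame state. This is a sketch under assumptions: update_buggy is one update rule that would produce the reported symptom (it is not quoted from the repo), and update_sliding is the usual sliding-window alternative. Both function names are hypothetical.

```python
def update_buggy(state, new_frame):
    # Put the new frame first and keep state[1:].
    # Slots 1..3 are copied unchanged on every call, so they never refresh.
    return [new_frame] + state[1:]


def update_sliding(state, new_frame):
    # Drop the oldest frame (slot 0) and append the newest at the end,
    # so the state always holds the 4 most recent frames.
    return state[1:] + [new_frame]


if __name__ == "__main__":
    buggy = [0, 0, 0, 0]
    sliding = [0, 0, 0, 0]
    for frame in range(1, 7):
        buggy = update_buggy(buggy, frame)
        sliding = update_sliding(sliding, frame)
    print(buggy)    # [6, 0, 0, 0]
    print(sliding)  # [3, 4, 5, 6]
```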

How are the 1 and -1 rewards used?

I see from here that all transitions, including those with a reward of 1 or -1, are added to the deque, so the 1 and -1 rewards are only used when they happen to be sampled from it. Isn't the probability of sampling them quite low, which would make the feedback slow?

@songrotek Thank you.
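One way to reason about this question is to compute how likely a minibatch is to contain at least one of the rare transitions. The sketch below assumes the minibatch is drawn uniformly without replacement from the replay memory (as random.sample would do); the function name and the sizes in the example are illustrative, not measured from the repo.

```python
from math import comb


def prob_batch_has_rare(memory_size, rare_count, batch_size):
    """Probability that a uniform minibatch (no replacement) contains
    at least one of the `rare_count` rare transitions in memory."""
    if batch_size > memory_size:
        raise ValueError("batch_size cannot exceed memory_size")
    # P(no rare transition) = C(N - K, B) / C(N, B); comb returns 0
    # when B > N - K, which correctly gives probability 1 below.
    p_none = comb(memory_size - rare_count, batch_size) / comb(memory_size, batch_size)
    return 1.0 - p_none


if __name__ == "__main__":
    # Illustrative numbers only: 50,000 stored transitions, 2% of them rare.
    print(prob_batch_has_rare(50_000, 1_000, 32))
```

Even when rare transitions are a small fraction of memory, a batch of a few dozen samples includes at least one of them fairly often, although each individual rare transition is still revisited infrequently.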

Not found: saved_networks/network-dqn-10000

I reused your BrainDQN_Nature module with almost no changes, but when I ran it I hit the following problem:
tensorflow.python.framework.errors.NotFoundError: saved_networks/network-dqn-10000.tempstate14721420171424531239
[[Node: save/save = SaveSlices[T=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/save/tensor_names, save/save/shapes_and_slices, Variable, Variable/Adam, Variable/Adam_1, Variable_1, Variable_1/Adam, Variable_1/Adam_1, Variable_10, Variable_11, Variable_12, Variable_13, Variable_14, Variable_15, Variable_16, Variable_17, Variable_18, Variable_19, Variable_2, Variable_2/Adam, Variable_2/Adam_1, Variable_3, Variable_3/Adam, Variable_3/Adam_1, Variable_4, Variable_4/Adam, Variable_4/Adam_1, Variable_5, Variable_5/Adam, Variable_5/Adam_1, Variable_6, Variable_6/Adam, Variable_6/Adam_1, Variable_7, Variable_7/Adam, Variable_7/Adam_1, Variable_8, Variable_8/Adam, Variable_8/Adam_1, Variable_9, Variable_9/Adam, Variable_9/Adam_1, beta1_power, beta2_power)]]
What is causing this, and where should I fix it? Thanks!
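One possible cause, offered as an assumption rather than a confirmed diagnosis for this thread, is that the saved_networks directory does not exist in the working directory, so the saver cannot write its temporary file there. A minimal sketch of creating the directory before the first save follows; ensure_checkpoint_dir is a hypothetical helper, not part of the repo or of TensorFlow.

```python
import os


def ensure_checkpoint_dir(checkpoint_prefix):
    """Create the parent directory of a checkpoint prefix if it is missing.

    `checkpoint_prefix` is a path such as "saved_networks/network-dqn".
    Returns the directory part of the prefix (possibly an empty string).
    """
    directory = os.path.dirname(checkpoint_prefix)
    if directory:
        # exist_ok=True makes the call safe to repeat on every run.
        os.makedirs(directory, exist_ok=True)
    return directory
```

Calling ensure_checkpoint_dir("saved_networks/network-dqn") once before training starts would rule this cause out.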

Why do you need copyTargetQNetwork?

I don't understand the purpose of copyTargetQNetwork. Why do we need QValueT to evaluate QValue_batch? Is it to make the training process more stable?
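In the Nature version of DQN, the training target is computed from a separate, periodically synchronized copy of the network, which keeps the target from shifting on every update. The sketch below illustrates that idea with a list of Q-values in place of a neural network. TinyQ, UPDATE_EVERY, and the numeric values are hypothetical and chosen only for illustration; they are not taken from the repo.

```python
GAMMA = 0.99        # discount factor (illustrative value)
UPDATE_EVERY = 3    # hypothetical sync interval, in training steps


class TinyQ:
    """Toy single-state Q-function with an online copy and a target copy."""

    def __init__(self, n_actions):
        self.online = [0.0] * n_actions   # updated on every training step
        self.target = list(self.online)   # frozen between synchronizations
        self.steps = 0

    def td_target(self, reward, terminal):
        # The bootstrap term uses the frozen target copy, not the online one.
        if terminal:
            return reward
        return reward + GAMMA * max(self.target)

    def copy_target(self):
        # Analogue of copying the online network's weights into the target.
        self.target = list(self.online)

    def train_step(self, action, reward, terminal, lr=0.5):
        y = self.td_target(reward, terminal)
        self.online[action] += lr * (y - self.online[action])
        self.steps += 1
        if self.steps % UPDATE_EVERY == 0:
            self.copy_target()
```

Between synchronizations the targets depend only on the frozen copy, so an update to the online values does not immediately change the value it is being regressed toward.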
