dldnxks12 / drqn-pytorch-cartpole-v1

This project forked from keep9oing/drqn-pytorch-cartpole-v1

Deep recurrent Q learning on CartPole-v1 environment

Deep Recurrent Q-Learning (DRQN) with PyTorch

Reference: https://arxiv.org/pdf/1507.06527.pdf

PyTorch (1.5.0)
OpenAI Gym (0.17.1)
TensorBoard (2.1.0)

Training environment: OpenAI Gym (CartPole-v1)

POMDP

The CartPole-v1 observation consists of the cart's position and velocity and the pole's angle and angular velocity.

I set the partially observed state to the cart's position and the pole's angle. The agent has no information about the velocities.
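As a rough illustration of this masking, here is a minimal sketch. It assumes the standard CartPole-v1 observation order (cart position, cart velocity, pole angle, pole angular velocity). The names `OBSERVED_INDICES` and `to_partial_observation` are hypothetical and are not taken from this repository.

```python
# Assumed CartPole-v1 observation layout:
# 0 = cart position, 1 = cart velocity, 2 = pole angle, 3 = pole angular velocity
OBSERVED_INDICES = (0, 2)  # hypothetical name: the entries the agent may see


def to_partial_observation(full_obs):
    """Keep only cart position and pole angle, dropping both velocities."""
    return [full_obs[i] for i in OBSERVED_INDICES]


# Example with a made-up observation vector.
full_obs = [0.03, -0.21, 0.04, 0.35]
partial_obs = to_partial_observation(full_obs)
print(partial_obs)
```

With the velocities removed, a single observation no longer determines the state, so the agent has to infer motion from a sequence of observations. That is the reason for using a recurrent network here.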

Stable Recurrent Updates

1. Bootstrapped Sequential Updates

Episodes are selected randomly from the replay memory, and the update starts at the beginning of the episode. The targets at each timestep are generated from the target Q-network. The RNN's hidden state is carried forward throughout the episode.
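The sampling and target steps could look roughly like the sketch below. It is not this repository's code. It assumes the replay memory is a list of episodes, each a list of transitions, and that the target network's maximum next-step Q-values are already computed. The names `sample_sequential` and `td_targets`, and the default `gamma`, are hypothetical.

```python
import random


def sample_sequential(episodes, rng=random):
    """Pick one stored episode at random and return all its transitions in order."""
    episode = rng.choice(episodes)
    return list(episode)


def td_targets(rewards, next_q_max, dones, gamma=0.99):
    """Standard Q-learning targets: y_t = r_t + gamma * max_a Q_target(o_{t+1}, a).

    next_q_max holds the target network's maximum Q-value for each next step
    (assumed to be computed elsewhere); terminal steps get no bootstrap term.
    """
    return [
        r + gamma * q * (0.0 if d else 1.0)
        for r, q, d in zip(rewards, next_q_max, dones)
    ]
```

In a full training loop, the recurrent network would process the sampled episode from its first step, with the hidden state initialized once and passed from each timestep to the next.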

2. Bootstrapped Random Updates

Episodes are selected randomly from the replay memory, and the update starts at a random point in the episode and proceeds for only lookup_step timesteps (the unroll length). The targets at each timestep are generated from the target Q-network. The RNN's initial state is zeroed at the start of the update.
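The window selection could be sketched as follows. This is an illustration under assumptions, not this repository's implementation. The replay memory is assumed to be a list of episodes, each a list of transitions. The function name `sample_random_window` is hypothetical, as is the choice to return a short episode whole.

```python
import random


def sample_random_window(episodes, lookup_step, rng=random):
    """Return lookup_step consecutive transitions from a randomly chosen episode."""
    episode = rng.choice(episodes)
    if len(episode) <= lookup_step:
        # Assumption: episodes shorter than the window are used whole.
        return list(episode)
    # randint is inclusive on both ends, so the window always fits in the episode.
    start = rng.randint(0, len(episode) - lookup_step)
    return list(episode[start:start + lookup_step])


# Example: one stored episode of 20 dummy transitions, window of 5 steps.
window = sample_random_window([list(range(20))], lookup_step=5)
print(len(window))
```

Because each window starts from a zeroed hidden state, the network has to rebuild its estimate of the hidden velocities within lookup_step observations.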

Two parameters configure the DRQN update. random update chooses which update method to use.
lookup_step is the number of timesteps observed in each update. I found that a longer lookup_step works better.

DQN with Fully Observed MDP vs DQN with POMDP vs DRQN with POMDP

(orange) DQN with the fully observed MDP reaches the highest reward.
(blue) DQN with the POMDP never reaches a high reward.
(red) DRQN with the POMDP reaches reasonable performance even though it can observe only the cart's position and the pole's angle.

TODO

Random update of DRQN
