Cartpole-OpenAI-Tensorflow

A TensorFlow implementation of an RL agent that balances a cartpole from OpenAI Gym.

Implementation Details:

I used a policy-gradient agent (as explained in Chapter 13 of the book by Sutton and Barto) to solve the cartpole MDP. The position of the cart is fed as input to a neural network, which then produces the probability of the action to choose (only two in this case: right or left).
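As a rough illustration of the policy-gradient idea described above, the sketch below computes discounted returns for one episode and the REINFORCE-style loss for a two-action policy. It is a minimal sketch and not the code in cartpole.py: the function names, the discount factor gamma=0.99, and the convention that action 1 means "right" (as in Gym's CartPole) are assumptions made for the example.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t.

    gamma is an assumed discount factor, not a value taken from the repo.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(probs_right, actions, returns):
    """Negative return-weighted log-likelihood of the actions taken.

    probs_right[t] is the network's probability of action 1 ("right")
    at step t; action 0 ("left") has probability 1 - probs_right[t].
    Minimizing this loss follows the policy-gradient direction.
    """
    loss = 0.0
    for p, a, g in zip(probs_right, actions, returns):
        p_taken = p if a == 1 else 1.0 - p
        loss -= math.log(p_taken) * g
    return loss / len(actions)
```

In a TensorFlow version, the same loss would be built from tensors and handed to an optimizer; the arithmetic is unchanged.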

The neural net is a simple 3-layer feed-forward network with ReLU activation in the hidden layer and a sigmoid output. The environment resets once the average reward reaches 200. The environment is rendered only after the average reward reaches 100, since an agent below that level is not yet good and rendering only slows training.
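The forward pass of such a network can be sketched in plain Python as follows. This is an illustrative sketch only: the hidden size of 8, the 4-element input (Gym's full CartPole observation), the weight initialization, and all function names are assumptions, not details taken from cartpole.py. The two thresholds mirror the values quoted above.

```python
import math
import random

RENDER_THRESHOLD = 100  # start rendering once the average reward reaches this
RESET_THRESHOLD = 200   # average reward at which the environment resets


def init_params(n_inputs=4, n_hidden=8, seed=0):
    """Small random weights and zero biases (sizes are assumed)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def forward(state, params):
    """Input -> ReLU hidden layer -> sigmoid output, read as P(right)."""
    w1, b1, w2, b2 = params
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, state)) + b)  # ReLU
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


def should_render(average_reward):
    """Render only once the agent is reasonably good."""
    return average_reward >= RENDER_THRESHOLD
```

Because the output is a single sigmoid unit, the probability of the other action is simply one minus the output, which is enough for a two-action problem.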

Usage

Run:

$ python cartpole.py

Fun Side Note:

I came across this while reading about actor-critic methods as part of a review of Generative Adversarial Networks. It turns out that policy gradient methods are often better suited to continuous action spaces than Q-learning or other value-function-based methods.

Edit:

It turns out that policy gradient methods seem to do well in this case because the number of episodes is quite small and the rewards are provided continuously. If the rewards were sparse and the number of episodes were high, the technique would likely fail because of the high variance of the algorithm.
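One common way to soften the variance problem mentioned above is to standardize the returns before they weight the log-probabilities. The sketch below shows that idea under stated assumptions: it is not necessarily what cartpole.py does, and the function name and the eps constant are hypothetical.

```python
import statistics


def normalize_returns(returns, eps=1e-8):
    """Shift returns to zero mean and scale to roughly unit variance.

    Subtracting the mean acts as a simple baseline; dividing by the
    standard deviation keeps gradient magnitudes comparable across
    episodes. eps (an assumed constant) avoids division by zero when
    all returns are equal.
    """
    mean = statistics.fmean(returns)
    std = statistics.pstdev(returns)
    return [(g - mean) / (std + eps) for g in returns]
```

With standardized returns, actions that did better than the episode average are reinforced and those that did worse are discouraged, rather than every action being pushed up by a positive reward.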

Contributors

ashutoshkrjha