
OpenAIGym


Author: Thomas Breloff (@tbreloff)

This package wraps gym, the open-source Python library released by OpenAI. See their website for more information. Collaboration welcome!


Hello world!

using OpenAIGym
env = GymEnv("CartPole-v0")
for i=1:20
    R, T = episode!(env, RandomPolicy())
    info("Episode $i finished after $T steps. Total reward: $R")
end

If everything works, you should see output like this:

INFO: Episode 1 finished after 10 steps. Total reward: 10.0
INFO: Episode 2 finished after 46 steps. Total reward: 46.0
INFO: Episode 3 finished after 14 steps. Total reward: 14.0
INFO: Episode 4 finished after 19 steps. Total reward: 19.0
INFO: Episode 5 finished after 15 steps. Total reward: 15.0
INFO: Episode 6 finished after 32 steps. Total reward: 32.0
INFO: Episode 7 finished after 36 steps. Total reward: 36.0
INFO: Episode 8 finished after 13 steps. Total reward: 13.0
INFO: Episode 9 finished after 62 steps. Total reward: 62.0
INFO: Episode 10 finished after 14 steps. Total reward: 14.0
INFO: Episode 11 finished after 14 steps. Total reward: 14.0
INFO: Episode 12 finished after 28 steps. Total reward: 28.0
INFO: Episode 13 finished after 21 steps. Total reward: 21.0
INFO: Episode 14 finished after 15 steps. Total reward: 15.0
INFO: Episode 15 finished after 12 steps. Total reward: 12.0
INFO: Episode 16 finished after 20 steps. Total reward: 20.0
INFO: Episode 17 finished after 19 steps. Total reward: 19.0
INFO: Episode 18 finished after 17 steps. Total reward: 17.0
INFO: Episode 19 finished after 35 steps. Total reward: 35.0
INFO: Episode 20 finished after 23 steps. Total reward: 23.0

Note: this is equivalent to the following Python code:

import gym
env = gym.make('CartPole-v0')
for i_episode in xrange(20):
    total_reward = 0.0
    observation = env.reset()
    for t in xrange(100):
        # env.render()
        # print observation
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            print "Episode {} finished after {} timesteps. Total reward: {}".format(i_episode, t+1, total_reward)
            break

We're using the RandomPolicy from Reinforce.jl. To do something better, create your own policy by implementing the action method, which takes a reward, a state, and an action set, and returns an action selection:

type RandomPolicy <: AbstractPolicy end
Reinforce.action(policy::RandomPolicy, r, s, A) = rand(A)
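
For example, a hand-crafted (non-learning) policy can ignore the reward and pick an action from the state. This is a hypothetical sketch: the `PoleBalancePolicy` name and its heuristic are invented for illustration, and it assumes CartPole's observation vector carries the pole angle in its third component:

```julia
using Reinforce

# Hypothetical heuristic policy: push the cart toward the side the pole leans.
type PoleBalancePolicy <: AbstractPolicy end

# s is the CartPole observation; s[3] is assumed to be the pole angle (radians).
# CartPole's actions are 0 (push left) and 1 (push right).
Reinforce.action(policy::PoleBalancePolicy, r, s, A) = s[3] < 0 ? 0 : 1
```

Run it with `episode!(env, PoleBalancePolicy())`, just like the random policy above.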

Note: You can override the default behavior of the episode! method by overriding Reinforce.on_step(env, i, sars) or by passing your own stepfunc. You could also just iterate yourself:

ep = Episode(env, policy)
for (s, a, r, s′) in ep
    # do something special?
    OpenAIGym.render(env)
end
R = ep.total_reward
N = ep.niter
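
The on_step hook can be used for per-step side effects such as rendering or logging. Here's a minimal sketch, assuming on_step receives the environment, the iteration counter, and the latest `(s, a, r, s′)` tuple as described above:

```julia
using Reinforce, OpenAIGym

# Hedged sketch: render every step and log the reward every 10th step.
function Reinforce.on_step(env::GymEnv, i::Int, sars)
    s, a, r, s′ = sars
    OpenAIGym.render(env)
    i % 10 == 0 && info("step $i: reward = $r")
end
```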

Install gym

First install gym. Follow the instructions here if you're using a system-wide Python. To use Conda.jl instead:

Pkg.add("PyCall")
withenv("PYTHON" => "") do
   Pkg.build("PyCall")
end

then install gym from the command line:

cd /opt
git clone https://github.com/openai/gym
cd gym
~/.julia/v0.5/Conda/deps/usr/bin/pip install -e .[all]
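
To check that PyCall can see the freshly installed gym, a quick sanity test from the Julia REPL (using PyCall's @pyimport) might look like:

```julia
using PyCall
@pyimport gym            # fails here if gym is not on PyCall's Python path
env = gym.make("CartPole-v0")
```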

For additional environments, use a similar process. For example, here's how I installed Soccer on my Ubuntu machine:

cd /opt
git clone https://github.com/LARG/HFO
cd HFO
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j4
make install

cd /opt
git clone https://github.com/openai/gym-soccer
cd gym-soccer
~/.julia/v0.5/Conda/deps/usr/bin/pip install -e .

and here's how I installed the environments based on Box2D (for example BipedalWalker-v2), since they didn't work out of the box (due to this issue):

~/.julia/v0.5/Conda/deps/usr/bin/pip uninstall box2d-py
sudo apt-get install build-essential python-dev swig
cd /opt/
git clone https://github.com/pybox2d/pybox2d
cd pybox2d/
~/.julia/v0.5/Conda/deps/usr/bin/pip install -e .

Install OpenAIGym and Reinforce

The easiest way to get started is with MetaPkg:

Pkg.clone("https://github.com/tbreloff/MetaPkg.jl")
using MetaPkg
MetaPkg.add("MetaRL")

which will install OpenAIGym, Reinforce.jl, and the JuliaML Learn ecosystem. You might also want to install the Plots ecosystem with MetaPkg.add("MetaPlots").

To do the install manually, add this julia package:

Pkg.clone("https://github.com/JuliaML/OpenAIGym.jl.git")

and until it's registered in METADATA, you'll also need to manually install Reinforce.jl:

Pkg.clone("https://github.com/JuliaML/Reinforce.jl.git")
