evolution

This is a local (not distributed) Go (not Python) implementation of Evolution Strategies as a Scalable Alternative to Reinforcement Learning (Salimans et al.). The original starter code from the paper can be found at openai/evolution-strategies-starter. Under the covers it uses openai/gym-http-api, specifically binding-go, along with unixpickle/anynet and unixpickle/anyvec for efficient high-level vector computation. Enjoy!

instructions

The goal is to solve CartPole-v0, which requires an average reward of 195 over 100 episodes. Install openai/gym and openai/gym-http-api; the latter is a dependency required by the Go source.
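As a minimal sketch of the criterion above, the check can be written as an average over the most recent episodes. The function name `solved` and its parameters are hypothetical and are not taken from this repository; the rewards in `main` are made-up data.

```go
package main

import "fmt"

// solved reports whether the mean of the last `window` episode rewards
// reaches `threshold`. For CartPole-v0 this README uses a threshold of
// 195 over a window of 100 episodes.
// NOTE: hypothetical helper for illustration, not part of this repository.
func solved(rewards []float64, window int, threshold float64) bool {
	if window <= 0 || len(rewards) < window {
		return false // not enough episodes to judge yet
	}
	sum := 0.0
	for _, r := range rewards[len(rewards)-window:] {
		sum += r
	}
	return sum/float64(window) >= threshold
}

func main() {
	// Made-up data: 100 episodes that each scored 200.
	rewards := make([]float64, 100)
	for i := range rewards {
		rewards[i] = 200
	}
	fmt.Println(solved(rewards, 100, 195))
}
```

Returning false when fewer than `window` episodes are available keeps an early lucky streak from being counted as a solution.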

Get the binary. Clone or download the repository, or just run

$ go get github.com/wenkesj/evolution

In a separate terminal, start the gym server from wherever github.com/openai/gym-http-api is located on your filesystem.

$ python gym_http_server.py

Run the trainer and evaluator with whatever concoction of flags you choose.

$ # 200 episodes of "training" by 2 agents and 100
$ # finalepisodes of evaluation with a single agent
$ # Saving results to a directory "~/agents2eps200"
$ evolution --outmonitor ~/agents2eps200 \
  --finalepisodes 100 \
  --episodes 200 \
  --agents 2

example results

cartpole average training example

After 42 episodes, the 2 agents have evolved enough to beat the game on their own. In this simple case, we apply a cutoff of an average reward of 195 or above for both agents, which signifies that the parameters, on average, should be able to solve the game with a single offspring. So we test that claim:

cartpole average evaluation example

And it works! We get 198.5 average reward over 100 episodes!

roadmap

  • Parallelize where needed to avoid embarrassment
  • 32/64 bit support?
  • Support multiple environments (gym-http-api#47)
  • Serialize/deserialize networks the anynet way
    • Goal: $ evolution -net net.proto -env Pong-v0 ...
    • Input network for specific environment (i.e. Pong-v0)
    • Save/load
  • Optimizations
  • Plotting, statistics, performance profiling, uploading

disclaimer

This is a project for my Complex Systems and Networks class. It isn't meant to be comparable to the original work; I'm not a master coder, statistical god, or Andrej Karpathy. I just thought this was a cool idea. This is an implementation with results and interpretation.

Contributors

wenkesj

evolution's Issues

Normally-distributed noise

Theoretically, Evolution Strategies requires normally-distributed noise in order to be an accurate gradient estimator. I see here that noise is generated as a random number from [-1, 1], which is not a normal distribution (rather, it's a uniform distribution). You can sample a normal distribution with r.NormFloat64(). I'd love to see how/if this affects performance.
