Mario-PPO

For the continuous domain (MuJoCo), look at this.

29 / 32 Levels Completed (let the page load fully)!









Usage

experiment = Experiment() # Add your Comet configs!

How to run

usage: main.py [-h] [--world WORLD] [--stage STAGE]
               [--total_iterations TOTAL_ITERATIONS] [--interval INTERVAL]
               [--do_train] [--render] [--train_from_scratch]

Variable parameters based on the configuration of the machine or user's choice

optional arguments:
  -h, --help            show this help message and exit
  --world WORLD         The id number of the mario world.
  --stage STAGE         The id number of the mario world's stage.
  --total_iterations TOTAL_ITERATIONS
                        The total number of iterations.
  --interval INTERVAL   The interval specifies how often different parameters
                        should be saved and printed, counted by iterations.
  --do_train            The flag determines whether to train the agent or play
                        with it.
  --render              The flag determines whether to render each agent or
                        not.
  --train_from_scratch  The flag determines whether to train from scratch or
                        continue previous tries.
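  • To train the agent on your choice of world and stage, run the following (training is the default mode; passing the do_train flag switches from training to testing the agent):
python3 main.py --world=2 --stage=2
  • If you want to keep training your previous run, execute the following (passing the train_from_scratch flag switches from training from scratch to continuing the previous run):
python3 main.py --world=2 --stage=2 --train_from_scratch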
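
The flag behavior described above (training by default, with `--do_train` switching to testing and `--train_from_scratch` resuming a previous run) is what you would get from boolean flags that default to `True` and are switched off when passed. The following is a minimal sketch of such a parser, not the actual contents of main.py: the use of `action="store_false"` / `action="store_true"` is inferred from the text, and the default values for `--world`, `--stage`, `--total_iterations` and `--interval` are hypothetical placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser matching the usage text (assumed, not taken from main.py)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Variable parameters based on the configuration "
                    "of the machine or user's choice",
    )
    # Default values below are hypothetical placeholders.
    parser.add_argument("--world", type=int, default=1,
                        help="The id number of the mario world.")
    parser.add_argument("--stage", type=int, default=1,
                        help="The id number of the mario world's stage.")
    parser.add_argument("--total_iterations", type=int, default=1000,
                        help="The total number of iterations.")
    parser.add_argument("--interval", type=int, default=50,
                        help="How often parameters are saved and printed, in iterations.")
    # Assumption: both flags default to True and become False when passed.
    parser.add_argument("--do_train", action="store_false",
                        help="Whether to train the agent or play with it.")
    parser.add_argument("--train_from_scratch", action="store_false",
                        help="Whether to train from scratch or continue previous tries.")
    # Assumption: rendering is off unless the flag is passed.
    parser.add_argument("--render", action="store_true",
                        help="Whether to render each agent or not.")
    return parser


# Training a fresh agent on world 2, stage 2.
fresh = build_parser().parse_args(["--world=2", "--stage=2"])
# Continuing a previous run on the same level.
resumed = build_parser().parse_args(["--world=2", "--stage=2", "--train_from_scratch"])
# Testing (playing with) a trained agent.
testing = build_parser().parse_args(["--world=2", "--stage=2", "--do_train"])
```

Under these assumptions, `fresh.do_train` and `fresh.train_from_scratch` are both `True`, `resumed.train_from_scratch` is `False`, and `testing.do_train` is `False`, which matches the two commands listed above.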

Hardware requirements

  • The whole training procedure was done on a Quadro P5000 on paperspace.com; thus, a machine with a similar configuration would be sufficient.

Documented Hyper-Parameters for different worlds (W) and stages (S)

I forgot to document hyper-parameters for all environments on comet.ml. 😅

W-S T n_epochs batch_size lr gamma lambda ent_coeff clip_range n_workers grad_clip_norm
1-1 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
1-2 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
2-2 128 10 32 2.5e-4 0.9 0.95 0.01 0.2 8 No Clipping
3-1 128 4 64 2.5e-4 0.9 0.95 0.01 0.2 8 No Clipping
3-2 128 4 64 2.5e-4 0.9 0.95 0.01 0.2 8 No Clipping
3-3 128 8 64 1e-4 0.9 0.95 0.01 0.1 8 1
3-4 128 8 64 1e-4 0.9 0.95 0.01 0.1 8 1
4-1 128 8 64 1e-4 0.9 0.95 0.01 0.1 8 1
4-2 128 8 64 1e-4 0.95 0.95 0.01 0.2 8 No Clipping
4-3 128 8 64 2.5e-4 0.97 0.95 0.01 0.2 8 0.5
5-1 128 8 64 2.5e-4 0.97 0.95 0.01 0.2 8 0.5
5-2 128 8 64 2.5e-4 0.97 0.95 0.01 0.2 8 0.5
5-3 128 8 64 2.5e-4 0.98 0.98 0.03 0.2 8 0.5
6-1 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
6-2 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
6-3 128 8 64 2.5e-4 0.98 0.98 0.03 0.2 8 0.5
6-4 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
7-1 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
7-2 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
7-3 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
8-1 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
8-2 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
8-3 128 8 64 2.5e-4 0.9 0.95 0.01 0.2 8 0.5
  • For world 4 - stage 2, the game score's contribution to the reward was not used, since it distracts Mario from his original goal, which is reaching the flag.
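
Since most rows of the table differ from the 1-1 row in only a few columns, one way to keep such settings in code is a default configuration plus per-level overrides. The sketch below is only an illustration: `PPOConfig`, `ROWS` and `config_for` are hypothetical names that do not come from the repository, `lam` stands for the `lambda` column (a reserved word in Python), `None` encodes "No Clipping", and only five rows of the table are transcribed.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PPOConfig:
    """Hyper-parameters of one table row; defaults are the 1-1 row."""
    T: int = 128
    n_epochs: int = 8
    batch_size: int = 64
    lr: float = 2.5e-4
    gamma: float = 0.9
    lam: float = 0.95          # the "lambda" column
    ent_coeff: float = 0.01
    clip_range: float = 0.2
    n_workers: int = 8
    grad_clip_norm: Optional[float] = 0.5  # None means "No Clipping"


# Differences from the 1-1 row, transcribed from the table (excerpt only).
ROWS = {
    "1-1": {},
    "2-2": {"n_epochs": 10, "batch_size": 32, "grad_clip_norm": None},
    "3-3": {"lr": 1e-4, "clip_range": 0.1, "grad_clip_norm": 1.0},
    "4-2": {"lr": 1e-4, "gamma": 0.95, "grad_clip_norm": None},
    "5-3": {"gamma": 0.98, "lam": 0.98, "ent_coeff": 0.03},
}


def config_for(world: int, stage: int) -> PPOConfig:
    """Return the config for a level; raises KeyError if the row was not transcribed."""
    return replace(PPOConfig(), **ROWS[f"{world}-{stage}"])


cfg = config_for(4, 2)
```

Raising `KeyError` for levels that are not transcribed avoids silently falling back to the 1-1 values for a level whose documented row is different (for example 3-1, which uses 4 epochs).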

Acknowledgment

  1. @OpenAI for Mario Wrapper.
  2. @uvipen for Super-mario-bros-PPO-pytorch.
  3. @roclark for Mario Reward.
