
garage

garage is a toolkit for developing and evaluating reinforcement learning algorithms, and an accompanying library of state-of-the-art implementations built using that toolkit.

The toolkit provides a wide range of modular tools for implementing RL algorithms, including:

  • Composable neural network models
  • Replay buffers
  • High-performance samplers
  • An expressive experiment definition interface
  • Tools for reproducibility (e.g. set a global random seed which all components respect)
  • Logging to many outputs, including TensorBoard
  • Reliable experiment checkpointing and resuming
  • Environment interfaces for many popular benchmark suites
  • Support for running garage in diverse environments, including always up-to-date Docker containers

See the latest documentation for getting started instructions and detailed APIs.

Installation

pip install --user garage

Examples

Starting with v2020.10.0, garage comes packaged with examples. To get a list of examples, run:

garage examples

You can also run garage examples --help, or visit the documentation for even more details.
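To give a sense of the experiment definition interface and the global seeding mentioned above, here is a minimal sketch of a training script in the spirit of the packaged examples. It assumes the PyTorch extras are installed; exact module paths and constructor arguments vary between releases (e.g. Trainer replaced the older LocalRunner), so treat it as illustrative rather than canonical.

#!/usr/bin/env python3
# Minimal sketch of a garage experiment (illustrative; APIs vary by release).
from garage import wrap_experiment
from garage.envs import GymEnv
from garage.experiment.deterministic import set_seed
from garage.sampler import LocalSampler
from garage.torch.algos import PPO
from garage.torch.policies import GaussianMLPPolicy
from garage.torch.value_functions import GaussianMLPValueFunction
from garage.trainer import Trainer


@wrap_experiment  # handles logging, snapshotting, and resuming
def ppo_pendulum(ctxt=None, seed=1):
    set_seed(seed)  # the global seed respected by all components
    env = GymEnv('InvertedDoublePendulum-v2')
    trainer = Trainer(ctxt)
    policy = GaussianMLPPolicy(env.spec, hidden_sizes=(64, 64))
    value_function = GaussianMLPValueFunction(env_spec=env.spec)
    sampler = LocalSampler(agents=policy,
                           envs=env,
                           max_episode_length=env.spec.max_episode_length)
    algo = PPO(env_spec=env.spec,
               policy=policy,
               value_function=value_function,
               sampler=sampler,
               discount=0.99)
    trainer.setup(algo, env)
    trainer.train(n_epochs=100, batch_size=10000)


ppo_pendulum(seed=1)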

Join the Community

Join the garage-announce mailing list for infrequent updates (<1/mo.) on the status of the project and new releases.

Need some help? Want to ask whether garage is right for your project? Have a question which is not quite a bug and not quite a feature request?

Join the community Slack by filling out this Google Form.

Algorithms

The table below summarizes the algorithms available in garage.

Algorithm Framework(s)
CEM numpy
CMA-ES numpy
REINFORCE (a.k.a. VPG) PyTorch, TensorFlow
DDPG PyTorch, TensorFlow
DQN PyTorch, TensorFlow
DDQN PyTorch, TensorFlow
ERWR TensorFlow
NPO TensorFlow
PPO PyTorch, TensorFlow
REPS TensorFlow
TD3 PyTorch, TensorFlow
TNPG TensorFlow
TRPO PyTorch, TensorFlow
MAML PyTorch
RL2 TensorFlow
PEARL PyTorch
SAC PyTorch
MTSAC PyTorch
MTPPO PyTorch, TensorFlow
MTTRPO PyTorch, TensorFlow
Task Embedding TensorFlow
Behavioral Cloning PyTorch

Supported Tools and Frameworks

garage requires Python 3.6+. If you need Python 3.5 support, the last garage release to support Python 3.5 was v2020.06.

The package is tested on Ubuntu 18.04. It is also known to run on Ubuntu 16.04 and 20.04, and on recent versions of macOS using Homebrew. Windows users can install garage via WSL, or by making use of the Docker containers.

We currently support PyTorch and TensorFlow for implementing the neural network portions of RL algorithms, and additions of new framework support are always welcome. PyTorch modules can be found in the package garage.torch and TensorFlow modules can be found in the package garage.tf. Algorithms which do not require neural networks are found in the package garage.np.
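For illustration, that split maps onto imports like the following (class locations reflect recent releases and may shift over time):

from garage.torch.algos import SAC  # PyTorch implementations
from garage.tf.algos import TRPO    # TensorFlow implementations
from garage.np.algos import CEM     # algorithms with no NN framework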

The package is available for download on PyPI, and we ensure that it installs successfully into environments defined using conda, Pipenv, and virtualenv.

Testing

The most important feature of garage is its comprehensive automated unit test and benchmarking suite, which helps ensure that the algorithms and modules in garage maintain state-of-the-art performance as the software changes.

Our testing strategy has three pillars:

  • Automation: We use continuous integration to test all modules and algorithms in garage before adding any change. The full installation and test suite is also run nightly, to detect regressions.
  • Acceptance Testing: Any commit which might change the performance of an algorithm is subjected to comprehensive benchmarks on the relevant algorithms before it is merged.
  • Benchmarks and Monitoring: We benchmark the full suite of algorithms against their relevant benchmarks and widely-used implementations regularly, to detect regressions and improvements we may have missed.

Supported Releases

Release Last date of support
v2021.03 May 31st, 2021

Maintenance releases have a stable API and dependency tree, and receive bug fixes and critical improvements but not new features. We currently support each release for a window of 2 months.

Citing garage

If you use garage for academic research, please cite the repository using the following BibTeX entry. You should update the commit field with the commit or release tag your publication uses.

@misc{garage,
 author = {The garage contributors},
 title = {Garage: A toolkit for reproducible reinforcement learning research},
 year = {2019},
 publisher = {GitHub},
 journal = {GitHub repository},
 howpublished = {\url{https://github.com/rlworkgroup/garage}},
 commit = {be070842071f736eb24f28e4b902a9f144f5c97b}
}

Credits

The earliest code for garage was adopted from a predecessor project called rllab. The garage project is grateful for the contributions of the original rllab authors, and hopes to continue advancing the state of reproducibility in RL research in the same spirit. garage has previously been supported by the Amazon Research Award "Watch, Practice, Learn, Do: Unsupervised Learning of Robust and Composable Robot Motion Skills by Fusing Expert Demonstrations with Robot Experience."




garage's Issues

tf/TD3

Incorporate the TD3 algorithm in garage.

DQN for TensorFlow

Original paper:
https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf

OpenAI baselines implementation:
https://github.com/openai/baselines/tree/master/baselines/deepq

(There are many more resources on Google with DQN implementations in TF)

Sketch:

  • Provide an implementation in sandbox/rocky/tf/dqn.py
  • Add any needed primitives to rllab/ and sandbox/rocky/tf/
  • Provide a regression test (of the reward curve) against the openai/baselines implementation

Imported from ryanjulian/rllab#50

Normalize batch shaping codebase-wide

Currently this is proxied by Policy.recurrent, but there are loss functions for non-recurrent policies which need fixed-length input trajectories/valid variables (i.e. any time you want to differentiate through the loss function).

Permanent fix for 'GLEW initalization error: Missing GL version' on Linux machines

Right now the temporary solution to this issue is to prepend python examples/xxx.py with LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so:/usr/lib/nvidia-384/libGL.so; note that you must replace nvidia-384 with the version installed on your machine (use nvidia-smi to determine the driver version currently in use). Relevant comments are here and here, and this link is a wrapper written as a temporary fix.

However, the more permanent solution would require pre-loading without the use of LD_PRELOAD on the command line. See DeepMind's implementation as a starting point.
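A hypothetical sketch of that idea, following the spirit of DeepMind's approach: load the libraries into the global symbol namespace from Python itself, before mujoco_py is imported (the paths are illustrative and machine-specific, as above):

import ctypes

# Pre-load GLEW and the vendor GL driver with RTLD_GLOBAL so their symbols
# are visible when mujoco_py initializes its rendering context.
ctypes.CDLL('/usr/lib/x86_64-linux-gnu/libGLEW.so', mode=ctypes.RTLD_GLOBAL)
ctypes.CDLL('/usr/lib/nvidia-384/libGL.so', mode=ctypes.RTLD_GLOBAL)

import mujoco_py  # must come after the pre-loading above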

Imported from ryanjulian/rllab#117

Cross-platform support for TF asynchronous plotting

The current implementation of async plotting uses multithreading. However, using multiprocessing throws a segmentation fault on macOS machines, while we want to prioritize multiprocessing for Linux machines. Therefore, write code that supports both implementations.
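A minimal sketch of that switch (the function and its call sites are hypothetical):

import multiprocessing
import platform
import threading

def start_plot_worker(target, args=()):
    # Prefer a subprocess on Linux; fall back to a thread on macOS, where
    # multiprocessing has been observed to segfault.
    if platform.system() == 'Linux':
        worker = multiprocessing.Process(target=target, args=args, daemon=True)
    else:
        worker = threading.Thread(target=target, args=args, daemon=True)
    worker.start()
    return worker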

Fix mujoco env

I found an issue in mujoco_env. The MjSim class of mujoco_py does not have a geom_margin attribute, which is used by _get_full_obs(). This was caused by a PR that refactored rllab.mujoco_py to mujoco_py.

There might be other issues introduced by this PR, so a regression test of the mujoco envs is desired.

Bullet support

We would like to add support for the Bullet physics engine to rllab. Thankfully, the Bullet team has recently provided Python bindings in the form of pybullet, and even provides examples of how to implement the gym.Env interface (from OpenAI Gym) using pybullet.

This task is to add pybullet to the rllab conda environment, and implement a class (similar to GymEnv, e.g. BulletEnv) which allows any rllab algorithm to learn against pybullet environments. You will also need to implement the plot interface, if pybullet does not already, which shows the user a 3D animation of the environment. Essentially, you should duplicate the experience of running one of the MuJoCo-based examples (e.g. trpo_swimmer.py), but using a Bullet environment instead. You should include examples (in examples/ and sandbox/rocky/tf/launchers/) of launcher scripts which use an algorithm (suggestion: TRPO) to train the KukaGymEnv environment.

This is conceptually the same as GymEnv, which allows rllab users to import any OpenAI Gym environment and learn against them. In fact, pybullet environments implement the Gym interface, so in theory we should be done as soon as we can import pybullet. In practice, our constructor for Gym environments only takes the string name (e.g. "Humanoid-v1") of a Gym environment, not the class of a Gym environment. The pybullet environments do not have string shortcuts because they are not part of the official Gym repository. Furthermore, we'd like to use other unofficial Gym environments in rllab, but it is currently difficult for the same reason.

So you might structure this task as two pull requests: (1) adding pybullet to the conda environment, and (2) modifying GymEnv to support arbitrary environments which implement the gym.Env interface (attempted in ryanjulian/rllab#12). A sketch of (2) follows.
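A hypothetical sketch of the constructor change in (2), accepting either an environment ID or an already-constructed gym.Env instance:

import gym

class GymEnv:
    def __init__(self, env):
        if isinstance(env, str):
            self.env = gym.make(env)  # official Gym environments, by ID
        elif isinstance(env, gym.Env):
            self.env = env            # e.g. pybullet's KukaGymEnv
        else:
            raise TypeError('env must be an environment ID or a gym.Env')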

Consider this a professional software engineering task, and provide a high-quality solution which does not break existing users, minimizes change, and is stable. Please always use PEP8 style in your code, and format it using YAPF (with the PEP8 setting). Submit your pull request against the integration branch of this repository.

Some notes:

  • You can find examples of how to launch rllab in examples and sandbox/rocky/tf/launchers. Note that everything must run using the run_experiment_lite wrapper.
  • rllab currently has two parallel implementations of the neural network portions of the library. The original is written in Theano and is found in rllab/. The tree sandbox/rocky/tf re-implements classes from the original tree using TensorFlow, and is backwards-compatible with the Theano tree. We are working towards using only one NN library soon, but for now your implementation needs to work in both trees.
  • rllab is an upstream dependency to many projects, so it is important we do not break the existing APIs. Adding to APIs is fine as long as there is a good reason.

Imported from ryanjulian/rllab#5

Create ray sampler

Right now rllab uses parallelism in an ad-hoc manner through the multiprocessing library, mostly in the sampler. If we use a principled parallelism library (e.g. mpi, ray, or others), we can probably clean up the code while avoiding tricky multiprocessing bugs in the future.
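For flavor, here is a hypothetical minimal rollout worker using ray; this is just the principled-parallelism idea, not a proposed sampler design:

import gym
import ray

ray.init()

@ray.remote
def rollout(env_name, seed, max_steps=200):
    # Collect one random-action episode and report its return.
    env = gym.make(env_name)
    env.seed(seed)
    env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        _, reward, done, _ = env.step(env.action_space.sample())
        total_reward += reward
        if done:
            break
    return total_reward

# Eight rollouts execute in parallel across the ray workers.
returns = ray.get([rollout.remote('CartPole-v1', seed=s) for s in range(8)])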

Imported from ryanjulian/rllab#82

Asynchronous plotting for TensorFlow

Task

Presently asynchronous plotting/3D rendering is not supported in the part of rllab based on TensorFlow (sandbox.rocky.tf), but it is supported in the rllab code which uses Theano.

This means that when you turn plotting on for a Theano training session, the plot does not block the training process. The TensorFlow implementation runs the rendering loop directly in the training algorithm (rather than a worker), so it blocks. This makes training using TensorFlow much slower than Theano when plotting is turned on (they are about the same without plotting).

TensorFlow's notion of a session makes this tricky. I'm not 100% sure that there is a solution. If you figure out that it is impossible, or requires rewriting large parts of the repository, email me with what you tried and some explanations why.

Current behavior
3D plotting with MuJoCo in TensorFlow is synchronous, blocks the training process.

Desired behavior
3D plotting with MuJoCo in TensorFlow is asynchronous and does not block the training process (just as with Theano)
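A hypothetical sketch of the desired shape: the render loop runs in a daemon worker fed through a queue, so training never blocks. A real TF implementation would additionally have to manage the session and graph inside the worker, which is the hard part; the policy.get_action API below follows rllab's convention.

import queue
import threading

class AsyncPlotter:
    def __init__(self):
        self._queue = queue.Queue(maxsize=1)
        thread = threading.Thread(target=self._loop, daemon=True)
        thread.start()

    def _loop(self):
        while True:
            env, policy = self._queue.get()
            obs = env.reset()
            done = False
            while not done:
                env.render()
                action, _ = policy.get_action(obs)
                obs, _, done, _ = env.step(action)

    def update(self, env, policy):
        # Drop the update if the worker is busy, rather than blocking training.
        try:
            self._queue.put_nowait((env, policy))
        except queue.Full:
            pass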


Submission instructions

Fork this repository into your own GitHub account, and implement this feature in its own branch, based on the master branch. When you are done and would like a code review, send me an email with a link to your feature branch. DO NOT SUBMIT A PULL REQUEST TO THIS REPOSITORY.

Consider this a professional software engineering task, and provide a high-quality solution which does not break existing users, minimizes change, and is stable. Tests are welcome where appropriate. Please always use PEP8 style in your code, and format it using YAPF (with the PEP8 setting).

Notes

  • Testing the software requires a freely-available student license for MuJoCo available here. It takes a couple days to get approved, so do it early.
  • rllab has setup instructions at http://rllab.readthedocs.io
  • You can find examples of how to launch rllab in examples and sandbox/rocky/tf/launchers. Note that everything must run using the run_experiment_lite wrapper with the parameter n_parallel greater than 1 (this triggers multiprocess operation).
  • rllab currently has two parallel implementations of the neural network portions of the library. The original is written in Theano and is found in rllab/. The tree sandbox/rocky/tf re-implements classes from the original tree using TensorFlow, and is backwards-compatible with the Theano tree. We are working towards using only one NN library soon, but for now your implementation needs to work in both trees.
  • rllab is an upstream dependency to many projects, so it is important we do not break the existing APIs. Adding to APIs is fine as long as there is a good reason.

Imported from ryanjulian/rllab#1

Add tf GPU options

It should be possible to set the TensorFlow session options whenever a tf.Session is created for training, such as here (other places where a session is constructed might not need to use these options). It is sometimes necessary to limit the memory available to tf running on a GPU, etc.
A possible implementation could allow the user to specify the GPU options via a ConfigProto setting in config_personal.py (set to None by default).
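A hypothetical sketch of what those options could look like with the TF1-era API the issue refers to:

import tensorflow as tf

# Cap GPU memory and let the allocation grow on demand.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3,
                            allow_growth=True)
config = tf.ConfigProto(gpu_options=gpu_options)
sess = tf.Session(config=config)  # pass config wherever the session is built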

Imported from ryanjulian/rllab#123

DDPG for TensorFlow

Original paper:
https://arxiv.org/abs/1509.02971

Implementation in the Theano tree:
https://github.com/ryanjulian/rllab/blob/integration/rllab/algos/ddpg.py

OpenAI baselines implementation:
https://github.com/openai/baselines/tree/master/baselines/ddpg

Blog post (there are many other resources):
http://pemami4911.github.io/blog/2016/08/21/ddpg-rl.html

Sketch:

  • Provide an implementation in sandbox/rocky/tf/ddpg.py
  • Add any needed primitives to rllab/ and sandbox/rocky/tf/
  • Provide a regression test (of the reward curve) against the openai/baselines implementation

Imported from ryanjulian/rllab#26

Replace Distributions with tf.Distributions

To add some details: Distributions are used by policies and other modules to add distribution functionality, such as computing the KL divergence between two distributions, given the parameters of a distribution (which are often output tensors of an NN). tf.Distributions probably implements the same functionality, so we should try to replace our code with the TF counterpart.
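For example, the KL divergence between two diagonal Gaussians parameterized by network outputs reduces to a couple of calls in the TF1-era API (the tensors below are stand-ins for NN outputs):

import tensorflow as tf

mu1, sigma1 = tf.zeros(3), tf.ones(3)
mu2, sigma2 = tf.ones(3), tf.ones(3)

p = tf.distributions.Normal(loc=mu1, scale=sigma1)
q = tf.distributions.Normal(loc=mu2, scale=sigma2)
kl = tf.distributions.kl_divergence(p, q)  # element-wise KL(p || q)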

Imported from ryanjulian/rllab#52

Rewrite the logger

The current logger interface is okay, but the implementation is a bit messy and the TensorBoard integration is kind of a bolt-on afterthought. It's also package-global, which can induce bad implementation decisions.

I'd like to rewrite the logger to be properly encapsulated as a class(es). There can still be a global singleton instance for easy access.

Ideas:

  • Eliminate global scope
  • First-class TensorBoard support
  • Multiprocess-aware logging
  • Make the logger API appropriate for all aspects of rllab (e.g. bring-up, training algo, random debugging, etc.) to avoid code peppered with print statements
  • Decouple logged datapoints from output formats (see the sketch after this list)
  • Decouple checkpointing and logging
  • Sophisticated checkpoint/log destinations (e.g. remote buckets?)
  • Take advantage of a minimalist logging framework rather than hand-crafting logger formats?
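Putting the encapsulation and decoupling ideas together, a minimal hypothetical sketch:

class Logger:
    # Encapsulated logger; a module-level singleton can wrap an instance.

    def __init__(self):
        self._outputs = []

    def add_output(self, output):
        # output is any object with a record(data) method, e.g. a stdout,
        # CSV, or TensorBoard writer.
        self._outputs.append(output)

    def log(self, data):
        for output in self._outputs:
            output.record(data)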

Imported from ryanjulian/rllab#80

Replace conda with a pip package

conda makes it difficult to use rllab as a library.

We would like to transition to using the standard Python package interface. This will require getting all the dependencies to install using pip, plus probably some custom setup scripts in setup.py.
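A minimal hypothetical setup.py shape for that transition (the real dependency list is far longer):

from setuptools import find_packages, setup

setup(
    name='rllab',
    version='0.1.0',
    packages=find_packages(),
    install_requires=['numpy', 'gym'],  # illustrative subset only
)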

Adding wheel compilation to the CI (e.g. appveyor) is also in-scope for this project.

Imported from ryanjulian/rllab#81

Move theano-specific code to garage.theano

Theano should no longer be first-class while tf is second-class. We are aiming for major parts of rllab to be NN-framework agnostic.

This should move theano-specific components into garage.theano, while stripping Theano dependencies from common parts of the code.

Imported from ryanjulian/rllab#83

Move sandbox.rocky.tf to rllab.tf

We are moving towards making the common parts of rllab agnostic of the NN library. TensorFlow should no longer be a second-class citizen.

This change would remove the TensorFlow sandbox and make the TensorFlow tree a first-class rllab citizen.

Imported from ryanjulian/rllab#84

Support multi-modal policies

In order to support visuomotor control learning and other problems, we need to implement a way to use policies that consist of submodules which handle certain input modalities, such as images and vectors. OpenAI Gym already has support for a tuple_space that is a tuple of different spaces. The most common use case for such multi-modal observation spaces is a combination of 2d images and vectors.

Exact specification needs to be done but for now the task items look as follows:

  • add a new space representing 2d images
  • implement a test environment that has a tuple_space as observation space consisting of an image and a vector (e.g. reacher with a top-down view image and 2d end-effector position)
  • additionally a wrapper would be useful that adds a visual output to an existing environment (renders user-defined camera to 2d pixel array and adds it to the tuple space, or makes a tuple space if environment was unimodal before)
  • implement a multi-modal policy that builds convolutional submodules for image spaces and MLPs for vectors, and merges the top layers from these submodules via an MLP that computes the final output (a sketch follows this list)
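A hypothetical PyTorch sketch of that last item: a convolutional trunk for the image modality, an MLP for the vector modality, merged by an MLP head. Shapes and layer sizes are illustrative.

import torch
import torch.nn as nn

class MultiModalPolicy(nn.Module):
    def __init__(self, image_shape, vector_dim, action_dim):
        super().__init__()
        channels, height, width = image_shape
        self.image_trunk = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Flatten())
        # Infer the conv output size with a dummy forward pass.
        with torch.no_grad():
            conv_out = self.image_trunk(
                torch.zeros(1, channels, height, width)).shape[1]
        self.vector_trunk = nn.Sequential(nn.Linear(vector_dim, 64), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(conv_out + 64, 64), nn.ReLU(),
            nn.Linear(64, action_dim))

    def forward(self, image, vector):
        merged = torch.cat([self.image_trunk(image),
                            self.vector_trunk(vector)], dim=-1)
        return self.head(merged)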

Your feedback on this issue is most welcome so that we can split up this feature into smaller tasks.

Imported from ryanjulian/rllab#108

Replace rllab.envs.Env with gym.Env

The community has settled on gym.Env as a de-facto standard environment interface. There's no reason to keep our own around.

The scope of this change is to remove the rllab.envs.Env base interface, and refactor implementing classes to instead implement gym.Env. Note that this explicitly does not mean that we are adopting the physics engine, registration system, benchmarks, etc of OpenAI Gym--just the gym.Env abstract interface.
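For concreteness, a toy environment implementing gym.Env directly, as the refactored classes would (names and dynamics here are hypothetical):

import gym
import numpy as np
from gym import spaces

class PointEnv(gym.Env):
    observation_space = spaces.Box(low=-np.inf, high=np.inf,
                                   shape=(2,), dtype=np.float32)
    action_space = spaces.Box(low=-0.1, high=0.1,
                              shape=(2,), dtype=np.float32)

    def reset(self):
        self._point = np.random.uniform(-1, 1, size=2).astype(np.float32)
        return self._point.copy()

    def step(self, action):
        self._point += np.clip(action, -0.1, 0.1).astype(np.float32)
        distance = float(np.linalg.norm(self._point))
        return self._point.copy(), -distance, distance < 0.01, {}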

Imported from ryanjulian/rllab#85

InvertedDoublePendulumEnv is broken

Traceback (most recent call last):
  File "tests/envs/test_envs.py", line 65, in <module>
    envs = [cls() for cls in simple_env_classes]
  File "tests/envs/test_envs.py", line 65, in <listcomp>
    envs = [cls() for cls in simple_env_classes]
  File "/home/rjulian/code/garage/rllab/envs/mujoco/point_env.py", line 21, in __init__
    super(PointEnv, self).__init__(*args, **kwargs)
  File "/home/rjulian/code/garage/rllab/envs/mujoco/mujoco_env.py", line 85, in __init__
    self.reset()
  File "/home/rjulian/code/garage/rllab/envs/mujoco/mujoco_env.py", line 131, in reset
    return self.get_current_obs()
  File "/home/rjulian/code/garage/rllab/envs/mujoco/mujoco_env.py", line 134, in get_current_obs
    return self._get_full_obs()
  File "/home/rjulian/code/garage/rllab/envs/mujoco/mujoco_env.py", line 138, in _get_full_obs
    cdists = np.copy(self.sim.geom_margin).flat
AttributeError: 'mujoco_py.cymj.MjSim' object has no attribute 'geom_margin'
