kamyargh / rl_swiss

License: MIT License

Python 99.93% TeX 0.06% Shell 0.01%

rl_swiss's Introduction

None of the items listed in the "Implemented" sections below have been ported to the refactored version yet. Hopefully this will be done by the end of the weekend.

Important Notes

This repository (rlswiss) extends the August 2018 version of rlkit. Since then, the design approaches of rlswiss and rlkit have diverged considerably, which is why we are releasing rlswiss as a separate repository. If you find this repository useful for your research or projects, please cite it as well as rlkit.

rlswiss

A Reinforcement Learning (RL) and Learning from Demonstrations (LfD) framework for both single-task and meta-learning settings.

Our goal throughout has been to make it efficient to implement new ideas quickly and cleanly. The core infrastructure is learning-framework-agnostic (PyTorch, TensorFlow, etc.); however, the current implementations of specific algorithms are all in PyTorch.

Implemented RL algorithms:

Soft-Actor-Critic (SAC)

Implemented LfD algorithms:

Adversarial methods for Inverse Reinforcement Learning
- AIRL / GAIL / FAIRL / Discriminator-Actor-Critic
Behaviour Cloning
DAgger

Implemented Meta-RL algorithms:

RL with observed task parameters

Implemented Meta-LfD algorithms:

SMILe
Meta Behaviour Cloning
Meta DAgger

rl_swiss's Issues

scalable meta inverse reinforcement learning Code

Where is the code for the paper titled "SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies"?

Error when running the demo

Hi, I was running the demo code:
python run_experiment.py --nosrun -e exp_specs/sac.yaml

but I get the following error. Do you have any idea why this happens? Thank you!
(I am using PyTorch 1.6.0 and Python 3.7.9.)

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [32, 1]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

rlkit/launchers/config.py missing

Hi, thanks for releasing your code for reproduction. However, rlkit/launchers/config.py is missing, so I do not know how to modify it appropriately and run the experiments. Would you please look into this problem? Thanks!

Wrong implementation of AIRL

I checked the code, and I wonder whether you implement AIRL simply by using the discriminator logit as the reward function? This is different from the original paper, which uses a disentangled discriminator computed as f / (f + \pi), where f is an approximation of "exp(r)" and \pi is the policy.
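
To make the quantities in this question concrete, here is a minimal sketch of the discriminator form described above, D = f / (f + \pi), and of the logit it implies. It is not taken from this repository. The function names and the `log_f` / `log_pi` arguments are hypothetical, and `log_f` is assumed to be the log of the issue's f (so it plays the role of r). Working in log space, D equals sigmoid(log f - log \pi), so its logit log D - log(1 - D) simplifies to log f - log \pi.

```python
import math


def airl_discriminator(log_f: float, log_pi: float) -> float:
    """Return D = f / (f + pi), evaluated as sigmoid(log_f - log_pi).

    log_f  : hypothetical learned term, assumed to be log f (so f ~ exp(r)).
    log_pi : log-probability of the action under the current policy.
    """
    z = log_f - log_pi
    # Numerically stable sigmoid: avoid exponentiating a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def discriminator_logit(log_f: float, log_pi: float) -> float:
    """Return log D - log(1 - D), which algebraically equals log_f - log_pi."""
    d = airl_discriminator(log_f, log_pi)
    return math.log(d) - math.log(1.0 - d)


if __name__ == "__main__":
    log_f, log_pi = 0.5, -1.2
    d = airl_discriminator(log_f, log_pi)
    print("D:", d)
    print("logit of D:", discriminator_logit(log_f, log_pi))
    print("log_f - log_pi:", log_f - log_pi)
```

Under these assumptions, whether a logit-based reward matches this form depends on whether the discriminator's logit is parameterised as log f - log \pi or as an unconstrained network output. The sketch only illustrates the algebra and says nothing about which one this codebase uses.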
