
mbppol's Introduction

MBPPO-Lagrangian

This repository contains the code for the paper "Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm", accepted at NeurIPS 2022. The paper's URL is given in the citation at the end of this README.

1) Requirements -
    a) Python 3.7+
    b) PyTorch==1.10.0 and CUDA 11.3
    c) numpy==1.21.4
    d) gym==0.15.7
    e) Hardware: CUDA-capable GPU with at least 4 GB of memory
2) Install mujoco200 using https://roboti.us/download/mujoco200_linux.zip
3) Install Safety Gym using https://github.com/openai/safety-gym
4) To reproduce the results (only approximately, because of seed randomness) -
    a) Back up /…/safety-gym/safety_gym/envs/suite.py
    b) Copy ./src/env_suite_file/suite.py to the above path. This removes "Vases" and increases "Hazards" from 10 to 15.
    c) Change 'num_steps' to 750 in the 'DEFAULT' dict of class Engine in /…/safety-gym/safety_gym/envs/engine.py
    d) Run for 8 random seeds:
        i) cd src
        ii) python3 mbppo_lagrangian.py --exp_name="experiment_name" --seed=0 --env="<environment_name>" --beta=0.02

Where the environment name is one of [Safexp-PointGoal2-v0, Safexp-CarGoal2-v0]
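
The eight runs in step 4(d) can be scripted. The following is a minimal sketch, assuming the eight seeds are 0 through 7 (the README only shows seed 0) and that it is launched from inside src. The helpers build_commands and run_all and the "mbppol" experiment-name prefix are hypothetical and not part of this repository.

```python
import subprocess

# Environment names listed in this README.
ENVS = ["Safexp-PointGoal2-v0", "Safexp-CarGoal2-v0"]


def build_commands(env, seeds=range(8), beta=0.02, prefix="mbppol"):
    """Return one argument list per seed for mbppo_lagrangian.py.

    Assumption: seeds 0..7 and the experiment-name scheme are illustrative only.
    """
    commands = []
    for seed in seeds:
        exp_name = f"{prefix}_{env}_s{seed}"
        commands.append([
            "python3", "mbppo_lagrangian.py",
            f"--exp_name={exp_name}",
            f"--seed={seed}",
            f"--env={env}",
            f"--beta={beta}",
        ])
    return commands


def run_all(env):
    """Run the seeds one after another; stops at the first failing run."""
    for cmd in build_commands(env):
        subprocess.run(cmd, check=True)
```

Calling run_all("Safexp-PointGoal2-v0") from the src directory would launch the runs sequentially. Running them one at a time keeps GPU memory use to a single process, at the cost of wall-clock time.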

5) Use https://github.com/openai/safety-starter-agents/blob/master/scripts/plot.py for plotting -
    a) python plot.py --logdir='<path to data>' --value=<plot_choice>

Where plot_choice is 'AverageEpRet' for reward performance or 'AverageEpCost' for cost performance.

If you use this in your work, please cite:

@inproceedings{NEURIPS2022_9a8eb202,
  author = {Jayant, Ashish K and Bhatnagar, Shalabh},
  booktitle = {Advances in Neural Information Processing Systems},
  editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},
  pages = {24432--24445},
  publisher = {Curran Associates, Inc.},
  title = {Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm},
  url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/9a8eb202c060b7d81f5889631cbcd47e-Paper-Conference.pdf},
  volume = {35},
  year = {2022}
}

mbppol's Issues

Unable to run mbppo_lagrangian.py

Hi @akjayant ,

I followed the installation pipeline from the README.md but I get the following conflicts when running python3.8 mbppo_lagrangian.py --exp_name="test_1" --seed=0 --env="Safexp-PointGoal2-v0" --beta=0.02:

Traceback (most recent call last):
  File "mbppo_lagrangian.py", line 9, in <module>
    from utils.logx import EpochLogger
  File "/lustre03/project/6002409/cmartel/safe_rl/mbppo/mbppol/src/utils/logx.py", line 16, in <module>
    from .mpi_tools import proc_id, mpi_statistics_scalar
  File "/lustre03/project/6002409/cmartel/safe_rl/mbppo/mbppol/src/utils/mpi_tools.py", line 1, in <module>
    from mpi4py import MPI
ModuleNotFoundError: No module named 'mpi4py'
(mbppo) [cmartel@beluga2 src]$ module load mpi4py
(mbppo) [cmartel@beluga2 src]$ vim ~/.bashrc
connect 192.168.2.224 port 6000: Connection refused
(mbppo) [cmartel@beluga2 src]$ source .bashrc
-bash: .bashrc: No such file or directory
(mbppo) [cmartel@beluga2 src]$ source ~/.bashrc
(mbppo) [cmartel@beluga2 src]$ python3.8 mbppo_lagrangian.py --exp_name="test_1" --seed=0 --env="Safexp-PointGoal2-v0" --beta=0.02
2023-07-03 19:00:17.595867: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Creating environment: Safexp-PointGoal2-v0
Traceback (most recent call last):
  File "mbppo_lagrangian.py", line 795, in <module>
    env = SafetyGymEnv(robot=robot, task="goal", level='2', seed=10, config=env_config)
  File "/lustre03/project/6002409/cmartel/safe_rl/mbppo/mbppol/src/env_utils.py", line 57, in __init__
    self.env = gym.make(env_name)
  File "/lustre03/project/6002409/cmartel/venv/mbppo/lib/python3.8/site-packages/gym/envs/registration.py", line 156, in make
    return registry.make(id, **kwargs)
  File "/lustre03/project/6002409/cmartel/venv/mbppo/lib/python3.8/site-packages/gym/envs/registration.py", line 101, in make
    env = spec.make(**kwargs)
  File "/lustre03/project/6002409/cmartel/venv/mbppo/lib/python3.8/site-packages/gym/envs/registration.py", line 72, in make
    cls = load(self.entry_point)
  File "/lustre03/project/6002409/cmartel/venv/mbppo/lib/python3.8/site-packages/gym/envs/registration.py", line 17, in load
    mod = importlib.import_module(mod_name)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx512/Core/python/3.8.10/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/lustre03/project/6002409/cmartel/safe_rl/mbppo/safety-gym/safety_gym/envs/mujoco.py", line 9, in <module>
    from safety_gym.envs.engine import * # noqa
  File "/lustre03/project/6002409/cmartel/safe_rl/mbppo/safety-gym/safety_gym/envs/engine.py", line 9, in <module>
    import mujoco_py
  File "/lustre03/project/6002409/cmartel/venv/mbppo/lib/python3.8/site-packages/mujoco_py/__init__.py", line 3, in <module>
    from mujoco_py.builder import cymj, ignore_mujoco_warnings, functions, MujocoException
  File "/lustre03/project/6002409/cmartel/venv/mbppo/lib/python3.8/site-packages/mujoco_py/builder.py", line 506, in <module>
    cymj = load_cython_ext(mujoco_path)
  File "/lustre03/project/6002409/cmartel/venv/mbppo/lib/python3.8/site-packages/mujoco_py/builder.py", line 101, in load_cython_ext
    mod = load_dynamic_ext('cymj', cext_so_path)
  File "/lustre03/project/6002409/cmartel/venv/mbppo/lib/python3.8/site-packages/mujoco_py/builder.py", line 125, in load_dynamic_ext
    return loader.load_module()
  File "mujoco_py/cymj.pyx", line 1, in init mujoco_py.cymj
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

It seems to be related to the numpy version required by the different packages. I am currently trying to run using numpy==1.21.4. Would it be possible to provide:

  1. The commit ID of the mujoco-py repository, or the version of the mujoco-py package, used to run the script
  2. The commit ID of the safety_gym repository used to run the script
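
To make version comparisons like this easier, here is a minimal sketch that reports what is installed in the active environment. It assumes Python 3.8+ (for importlib.metadata) and that the distributions are named "mujoco-py" and "safety-gym"; those names, and the helpers installed_versions and format_report, are assumptions for illustration and may need adjusting.

```python
from importlib import metadata

# Assumed distribution names; adjust if pip lists them differently.
PACKAGES = ["numpy", "mujoco-py", "gym", "safety-gym", "torch", "mpi4py"]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_report(versions):
    """Render the mapping as pip-style lines, one per package."""
    lines = []
    for name, version in versions.items():
        if version is None:
            lines.append(f"{name}: not installed")
        else:
            lines.append(f"{name}=={version}")
    return "\n".join(lines)
```

Printing format_report(installed_versions()) gives a list that can be pasted into the issue. It only reports versions; it does not detect whether mujoco-py was compiled against a different numpy than the one currently installed.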

Incorrect calculation

ldv = dist_xy(o[40:],goal_posv)

There seems to be a bug here:
ov,staticv = env.reset()
ldv = dist_xy(o[40:],goal_posv)

The initial observation is stored in ov by ov,staticv = env.reset(), but the next line uses o instead of ov, so the initial distance ldv is incorrect.
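
A minimal sketch of the difference, assuming dist_xy is a planar Euclidean distance and that the slice [40:] holds an x-y position. Both are assumptions for illustration; the dist_xy below is a stand-in and not the repository's function, and the observation values are made up.

```python
import math


def dist_xy(pos, goal):
    """Hypothetical stand-in for the repository's dist_xy: planar Euclidean distance."""
    return math.hypot(pos[0] - goal[0], pos[1] - goal[1])


# Toy observations of length 42; the last two entries play the role of an x-y position.
o = [0.0] * 40 + [10.0, 0.0]   # stale observation left over from earlier code
ov = [0.0] * 40 + [3.0, 4.0]   # fresh observation, standing in for env.reset() output
goal_posv = [0.0, 0.0]

ld_stale = dist_xy(o[40:], goal_posv)   # what the reported line computes
ldv = dist_xy(ov[40:], goal_posv)       # suggested fix: use ov, the fresh observation

print(ld_stale, ldv)
```

With these toy values the two distances differ, which is the point: the first one describes the stale observation rather than the state returned by the reset.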
