younggyoseo / cadm

CaDM: Context-aware Dynamics Model for Generalization in Model-based Reinforcement Learning

Home Page: https://sites.google.com/view/cadm

Context-aware Dynamics Model for Generalization in Model-based Reinforcement Learning

This repository contains code for the paper "Context-aware Dynamics Model for Generalization in Model-based Reinforcement Learning"

Requirements

python3
tensorflow==1.15.0
gym==0.16.0
baselines==0.1.5
mujoco-py==2.0.2.9

Model-based RL

Context-aware Dynamics Model

Vanilla DM + CaDM

python -m run_scripts.run_cadm_pets --dataset halfcheetah --policy_type CEM --n_candidate 200 \
--normalize_flag --ensemble_size 1 --n_particles 1 --deterministic_flag 1 --history_length 10 \
--future_length 10 --seed 0

PE-TS + CaDM

python -m run_scripts.run_cadm_pets --dataset halfcheetah --policy_type CEM --n_candidate 200 \
--normalize_flag --ensemble_size 5 --n_particles 20 --deterministic_flag 0 --history_length 10 \
--future_length 10 --seed 0

Baselines

Vanilla DM

python -m run_scripts.run_pets --dataset halfcheetah --policy_type CEM --n_candidate 200 \
--normalize_flag --ensemble_size 1 --n_particles 1 --deterministic_flag 1 --seed 0

PE-TS

python -m run_scripts.run_pets --dataset halfcheetah --policy_type CEM --n_candidate 200 \
--normalize_flag --ensemble_size 5 --n_particles 20 --deterministic_flag 0 --seed 0
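The four model-based commands above differ only in the module that is run, three flags (--ensemble_size, --n_particles, --deterministic_flag), and the two extra CaDM flags. As a minimal sketch of that pattern, the Python helper below rebuilds each command from a small table. The names build_command, COMMON and VARIANTS are hypothetical and not part of this repository; all flag values are copied from the commands above.

```python
# Flags shared by all four model-based commands in this README.
COMMON = ["--dataset", "halfcheetah", "--policy_type", "CEM",
          "--n_candidate", "200", "--normalize_flag"]

# Per-variant flag values, copied from the commands above.
VARIANTS = {
    "vanilla": [("ensemble_size", 1), ("n_particles", 1), ("deterministic_flag", 1)],
    "pets": [("ensemble_size", 5), ("n_particles", 20), ("deterministic_flag", 0)],
}


def build_command(variant, use_cadm, seed=0):
    """Return the argument list for one of the four model-based runs."""
    module = "run_scripts.run_cadm_pets" if use_cadm else "run_scripts.run_pets"
    cmd = ["python", "-m", module] + COMMON
    for flag, value in VARIANTS[variant]:
        cmd += ["--" + flag, str(value)]
    if use_cadm:
        # Only the CaDM commands pass history and future lengths.
        cmd += ["--history_length", "10", "--future_length", "10"]
    cmd += ["--seed", str(seed)]
    return cmd


if __name__ == "__main__":
    for variant in ("vanilla", "pets"):
        for use_cadm in (False, True):
            print(" ".join(build_command(variant, use_cadm)))
```

The returned list can be passed to subprocess.run unchanged, for example to loop over several seeds.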

Model-free RL

Context-aware Dynamics Model

PPO + (Vanilla + CaDM)

python -m run_scripts.model_free.run_ppo_cadm --entropy_coeff 0.0 --lr 0.0005 \
--num_rollouts 10 --num_steps 200 --num_minibatches 4 --policy_type CEM --n_candidate 200 \
--normalize_flag --deterministic_flag 1 --ensemble_size 1 --n_particles 1 --history_length 10 \
--future_length 10 --load_path [saved_path] --seed 0

PPO + (PE-TS + CaDM)

python -m run_scripts.model_free.run_ppo_cadm --entropy_coeff 0.0 --lr 0.0005 \
--num_rollouts 10 --num_steps 200 --num_minibatches 4 --policy_type CEM --n_candidate 200 \
--normalize_flag --deterministic_flag 0 --ensemble_size 5 --n_particles 20 --history_length 10 \
--future_length 10 --load_path [saved_path] --seed 0

For example, saved_path looks like: data/HALFCHEETAH/NORMALIZED/CaDM/DET/CEM/CAND_200/H_10/F_10/BACK_COEFF_0.5/DIFF/BATCH_256/EPOCH_5/hidden_200_lr_0.001_horizon_30_seed_0/checkpoints/params_epoch_19
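The example path encodes several run settings in its directory names. The sketch below pulls out the numeric ones so they can be compared against the flags passed to run_ppo_cadm. The README does not document what each segment means: reading H_10 and F_10 as --history_length and --future_length, and CAND_200 as --n_candidate, is an assumption based on the matching values, and parse_saved_path is a hypothetical helper, not a function from this repository.

```python
import re

EXAMPLE_PATH = (
    "data/HALFCHEETAH/NORMALIZED/CaDM/DET/CEM/CAND_200/H_10/F_10/"
    "BACK_COEFF_0.5/DIFF/BATCH_256/EPOCH_5/"
    "hidden_200_lr_0.001_horizon_30_seed_0/checkpoints/params_epoch_19"
)


def parse_saved_path(path):
    """Collect the integer settings embedded in a checkpoint path."""
    fields = {}
    for segment in path.split("/"):
        # Directory names of the form NAME_<int>, e.g. CAND_200 or H_10.
        match = re.fullmatch(r"(CAND|H|F|BATCH|EPOCH)_(\d+)", segment)
        if match:
            fields[match.group(1)] = int(match.group(2))
    seed = re.search(r"seed_(\d+)", path)
    if seed:
        fields["seed"] = int(seed.group(1))
    checkpoint = re.search(r"params_epoch_(\d+)$", path)
    if checkpoint:
        fields["checkpoint_epoch"] = int(checkpoint.group(1))
    return fields


if __name__ == "__main__":
    print(parse_saved_path(EXAMPLE_PATH))
```

If the assumed mapping holds, a mismatch between these values and the command-line flags (for instance H_10 with --history_length 5) is worth checking before loading the checkpoint.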

cadm's Issues

Trained agent flies off the ground when the MuJoCo model is visualized

I chose Ant as the environment. After training finished, I wrote a script to load the latest model and visualize it. The trained agent never touched the ground: the torque was too large, so it flew up. After I modified the original XML file, the agent stopped flying under random actions and behaved normally. But when I trained with the modified agent, the algorithm no longer worked and the reward showed no upward trend. How can this problem be solved? Looking forward to your reply, thank you.

Conflict between version of mujoco-py and baselines

Hello, I'm trying to set up an environment for this repo, and I found that the baselines and mujoco-py versions listed in the README conflict.

According to pip, baselines==0.1.5 requires mujoco-py<2.0,>=1.50, while the repo uses mujoco-py==2.0.2.9.

Is this a mistake in the README, or should I install different versions? I'm using Python 3.6 with pip 21.3.1.
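The conflict reported here can be checked without installing anything. The sketch below is a simplified dotted-version comparison, not pip's full version-specifier logic; parse_version and satisfies_baselines_pin are hypothetical helper names, and 1.50.1.68 is used only as an example of a 1.50.x version string.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.0.2.9' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_baselines_pin(mujoco_py_version):
    """Check the range pip reports for baselines==0.1.5: >=1.50 and <2.0."""
    version = parse_version(mujoco_py_version)
    return parse_version("1.50") <= version < parse_version("2.0")


if __name__ == "__main__":
    print(satisfies_baselines_pin("2.0.2.9"))    # version pinned in the README
    print(satisfies_baselines_pin("1.50.1.68"))  # an example 1.50.x version
```

The README pin falls outside the range, which matches the error pip reports in this issue.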

How to visualize the environment?

When I run the code, I don't see any visualization-related settings. How do I enable visualization? Thanks.

Not working after iteration 0

Hi! I cloned this repo and tried to run one of the experiments (Vanilla DM + CaDM). However, it seems to get stuck after iteration 0 and does not run any further. Do you have any idea what causes this?

Thank you! Have a nice day!
