cadm's Introduction

Context-aware Dynamics Model for Generalization in Model-based Reinforcement Learning

This repository contains the code for the paper "Context-aware Dynamics Model for Generalization in Model-based Reinforcement Learning".

Requirements

  • python3
  • tensorflow==1.15.0
  • gym==0.16.0
  • baselines==0.1.5
  • mujoco-py==2.0.2.9

Model-based RL

Context-aware Dynamics Model

  1. Vanilla DM + CaDM
python -m run_scripts.run_cadm_pets --dataset halfcheetah --policy_type CEM --n_candidate 200 \
--normalize_flag --ensemble_size 1 --n_particles 1 --deterministic_flag 1 --history_length 10 \
--future_length 10 --seed 0
  2. PE-TS + CaDM
python -m run_scripts.run_cadm_pets --dataset halfcheetah --policy_type CEM --n_candidate 200 \
--normalize_flag --ensemble_size 5 --n_particles 20 --deterministic_flag 0 --history_length 10 \
--future_length 10 --seed 0
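
The two commands above differ only in their ensemble, particle, and determinism settings, so the argument list can be assembled programmatically when repeating runs over several seeds. The following is a minimal sketch, not part of the repository: `build_cadm_pets_command` and its `use_pets` parameter are hypothetical names, the seed values are illustrative, and only flags that appear in the commands above are used.

```python
import shlex


# Hypothetical helper, not part of the repository. It only reuses the
# flags and values shown in the two README commands above.
def build_cadm_pets_command(seed, use_pets=False, dataset="halfcheetah"):
    """Build the argument list for `python -m run_scripts.run_cadm_pets`."""
    if use_pets:
        # PE-TS + CaDM settings from the README.
        ensemble_size, n_particles, deterministic_flag = 5, 20, 0
    else:
        # Vanilla DM + CaDM settings from the README.
        ensemble_size, n_particles, deterministic_flag = 1, 1, 1
    return [
        "python", "-m", "run_scripts.run_cadm_pets",
        "--dataset", dataset,
        "--policy_type", "CEM",
        "--n_candidate", "200",
        "--normalize_flag",
        "--ensemble_size", str(ensemble_size),
        "--n_particles", str(n_particles),
        "--deterministic_flag", str(deterministic_flag),
        "--history_length", "10",
        "--future_length", "10",
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Print one command per seed; the seed values are illustrative.
    for seed in (0, 1, 2):
        command = build_cadm_pets_command(seed, use_pets=True)
        print(" ".join(shlex.quote(arg) for arg in command))
```

The returned list can be passed directly to `subprocess.run` from the repository root if the runs should be launched rather than printed.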

Baselines

  1. Vanilla DM
python -m run_scripts.run_pets --dataset halfcheetah --policy_type CEM --n_candidate 200 \
--normalize_flag --ensemble_size 1 --n_particles 1 --deterministic_flag 1 --seed 0
  2. PE-TS
python -m run_scripts.run_pets --dataset halfcheetah --policy_type CEM --n_candidate 200 \
--normalize_flag --ensemble_size 5 --n_particles 20 --deterministic_flag 0 --seed 0

Model-free RL

Context-aware Dynamics Model

  1. PPO + (Vanilla + CaDM)
python -m run_scripts.model_free.run_ppo_cadm --entropy_coeff 0.0 --lr 0.0005 \
--num_rollouts 10 --num_steps 200 --num_minibatches 4 --policy_type CEM --n_candidate 200 \
--normalize_flag --deterministic_flag 1 --ensemble_size 1 --n_particles 1 --history_length 10 \
--future_length 10 --load_path [saved_path] --seed 0
  2. PPO + (PE-TS + CaDM)
python -m run_scripts.model_free.run_ppo_cadm --entropy_coeff 0.0 --lr 0.0005 \
--num_rollouts 10 --num_steps 200 --num_minibatches 4 --policy_type CEM --n_candidate 200 \
--normalize_flag --deterministic_flag 0 --ensemble_size 5 --n_particles 20 --history_length 10 \
--future_length 10 --load_path [saved_path] --seed 0

Here, [saved_path] is the path to a saved checkpoint. For example: data/HALFCHEETAH/NORMALIZED/CaDM/DET/CEM/CAND_200/H_10/F_10/BACK_COEFF_0.5/DIFF/BATCH_256/EPOCH_5/hidden_200_lr_0.001_horizon_30_seed_0/checkpoints/params_epoch_19
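
Because the example path encodes several run settings in its directory names, it can be convenient to read them back before passing the path to --load_path. The sketch below is an assumption-laden illustration: `parse_saved_path` is a hypothetical helper, and the layout is inferred from the single example path above, so other runs may not follow it. The keys are kept as the raw directory prefixes; H and F plausibly correspond to --history_length and --future_length, but that mapping is a guess.

```python
import re
from pathlib import PurePosixPath


# Hypothetical helper; the directory layout is inferred from the one
# example path in the README and may differ for other configurations.
def parse_saved_path(saved_path):
    """Extract integer settings encoded in a checkpoint path."""
    path = PurePosixPath(saved_path)
    info = {}

    # The final component looks like "params_epoch_19".
    checkpoint = re.fullmatch(r"params_epoch_(\d+)", path.name)
    if checkpoint:
        info["checkpoint_epoch"] = int(checkpoint.group(1))

    for part in path.parts:
        # Directory names such as "CAND_200", "H_10", "F_10", "BATCH_256".
        setting = re.fullmatch(r"(CAND|H|F|BATCH|EPOCH)_(\d+)", part)
        if setting:
            info[setting.group(1)] = int(setting.group(2))
        # The run directory ends in "seed_<n>".
        seed = re.search(r"seed_(\d+)$", part)
        if seed:
            info["seed"] = int(seed.group(1))
    return info


if __name__ == "__main__":
    example = (
        "data/HALFCHEETAH/NORMALIZED/CaDM/DET/CEM/CAND_200/H_10/F_10/"
        "BACK_COEFF_0.5/DIFF/BATCH_256/EPOCH_5/"
        "hidden_200_lr_0.001_horizon_30_seed_0/checkpoints/params_epoch_19"
    )
    print(parse_saved_path(example))
```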

cadm's People

Contributors

younggyoseo

cadm's Issues

Trained MuJoCo agent flies off the ground when visualized

I chose Ant as the environment. After training finished, I wrote a script to load the latest model and visualize it. The trained agent did not stay on the ground: the torque was too large, so the agent flew up. After I modified the original XML file, the agent stopped flying when performing random actions and behaved normally. However, when I train with the modified agent, the algorithm no longer works and the reward shows no upward trend. How can this problem be solved? Looking forward to your reply, thank you!

Conflict between version of mujoco-py and baselines

Hello, I'm trying to set up an environment for your repo, and I found that the versions of baselines and mujoco-py listed in your README conflict.

According to pip, the baselines==0.1.5 package requires mujoco-py<2.0,>=1.50, while your repo uses mujoco-py==2.0.2.9.

Is there a typo in the README, or should I change the version I install? I'm using Python 3.6 with pip version 21.3.1.
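
The conflict described above can be checked by comparing the pinned version against the range pip reports. This is a minimal sketch with hypothetical helper names (`version_tuple`, `in_range`); it handles only purely numeric dotted versions and is not a full implementation of pip's version rules. The 1.50.x string in the demo is an illustrative input, not a recommendation.

```python
# Hypothetical helpers for illustration; they cover only numeric dotted
# versions such as "2.0.2.9", not the full version syntax pip accepts.
def version_tuple(version):
    """Turn "2.0.2.9" into (2, 0, 2, 9) for component-wise comparison."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower_inclusive, upper_exclusive):
    """Check lower_inclusive <= version < upper_exclusive."""
    v = version_tuple(version)
    return version_tuple(lower_inclusive) <= v < version_tuple(upper_exclusive)


if __name__ == "__main__":
    # Range reported for baselines==0.1.5: mujoco-py<2.0,>=1.50
    for candidate in ("2.0.2.9", "1.50.1.0"):
        ok = in_range(candidate, "1.50", "2.0")
        print(candidate, "satisfies the range" if ok else "violates the range")
```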

Not working after iteration 0

Hi! I cloned this repo and tried to run one of the experiments (Vanilla DM + CaDM). However, it seems to get stuck after iteration 0 and does not proceed any further. Do you have any idea what might cause this?

Thank you! Have a nice day!