i2c's Introduction

Installation

  • Known dependencies: Python (3.5.4), OpenAI gym (0.10.5), tensorflow (1.14.0), numpy (1.18.2)

Environment options

  • --scenario: defines which environment in the MPE is to be used (default: "cn")

  • --max-episode-len: maximum length of each episode for the environment (default: 25)

  • --num-episodes: total number of training episodes (default: 60000)

  • --num-adversaries: number of adversaries in the environment (default: 0)

Core training parameters

  • --lr: learning rate (default: 1e-2)

  • --gamma: discount factor (default: 0.95)

  • --batch-size: batch size (default: 800)

  • --num-units: number of units in the MLP (default: 128)
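
As a rough illustration, the environment and core training flags listed above could be declared with argparse as follows. This is a minimal sketch built only from the documented names and defaults; the function name build_parser and the help strings are assumptions, and the actual train.py may declare its options differently.

```python
import argparse


def build_parser():
    """Declare the documented flags with their documented defaults (illustrative only)."""
    parser = argparse.ArgumentParser(description="I2C training options (sketch)")
    # Environment options
    parser.add_argument("--scenario", type=str, default="cn", help="MPE scenario name")
    parser.add_argument("--max-episode-len", type=int, default=25, help="maximum episode length")
    parser.add_argument("--num-episodes", type=int, default=60000, help="number of training episodes")
    parser.add_argument("--num-adversaries", type=int, default=0, help="number of adversaries")
    # Core training parameters
    parser.add_argument("--lr", type=float, default=1e-2, help="learning rate")
    parser.add_argument("--gamma", type=float, default=0.95, help="discount factor")
    parser.add_argument("--batch-size", type=int, default=800, help="batch size")
    parser.add_argument("--num-units", type=int, default=128, help="number of units in the MLP")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--scenario", "pp", "--lr", "1e-3"])
print(args.scenario, args.lr, args.batch_size)
```

argparse turns hyphens in flag names into underscores on the parsed namespace, so --max-episode-len is read back as args.max_episode_len.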

Training for prior network

  • --prior-buffer-size: prior network training buffer size

  • --prior-num-iter: prior network training iterations

  • --prior-training-rate: prior network training rate

  • --prior-training-percentile: control threshold for KL value to get labels
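
To make the role of --prior-training-percentile more concrete, here is a minimal sketch of percentile-based labelling. It assumes that a KL value is computed per sample, that the threshold is the given percentile of those values, and that samples above the threshold receive label 1. The function name kl_labels and the sample values are hypothetical; the repository's own labelling code may differ.

```python
import numpy as np


def kl_labels(kl_values, percentile):
    """Return (threshold, labels); labels[i] is 1 when kl_values[i] exceeds the threshold."""
    kl_values = np.asarray(kl_values, dtype=np.float64)
    # The threshold is the requested percentile of the observed KL values.
    threshold = np.percentile(kl_values, percentile)
    labels = (kl_values > threshold).astype(np.int32)
    return threshold, labels


# Made-up KL values, for illustration only.
kl = np.array([0.01, 0.05, 0.20, 0.02, 0.90, 0.40, 0.03, 0.10, 0.60, 0.07])
threshold, labels = kl_labels(kl, percentile=60)
print(threshold, labels)
```

Under these assumptions, a higher percentile raises the threshold and marks fewer samples as positive, so the prior network would be trained to communicate less often.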

Checkpointing

  • --exp-name: name of the experiment, used as the file name to save all results (default: None)

  • --save-dir: directory where intermediate training results and model will be saved (default: "/tmp/policy/")

  • --save-rate: model is saved every time this number of episodes has been completed (default: 1000)

  • --load-dir: directory where training state and model are loaded from (default: "")

  • --plots-dir: directory where training curves are saved (default: "./learning_curves/")

  • --restore_all: whether to restore existing I2C network
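
The --save-rate option describes a periodic schedule, which can be sketched as below. The helper should_save is hypothetical and only mirrors the wording "every time this number of episodes has been completed"; it does not write any files and is not taken from the repository.

```python
def should_save(episodes_completed, save_rate=1000):
    """Return True when a checkpoint is due after `episodes_completed` episodes."""
    return episodes_completed > 0 and episodes_completed % save_rate == 0


# Episodes at which a short 3500-episode run would save the model.
save_points = [ep for ep in range(1, 3501) if should_save(ep)]
print(save_points)
```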

Training procedure

I2C can be learned end-to-end or in a two-phase manner. This code implements the end-to-end manner, which may take more training time than the two-phase manner.

For Cooperative Navigation, python3 train.py --scenario 'cn' --prior-training-percentile 60 --lr 1e-2

For Predator Prey, python3 train.py --scenario 'pp' --prior-training-percentile 40 --lr 1e-3
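
When launching several runs, the two commands above can be generated from a small table of per-scenario settings. The sketch below only builds and prints the command line. The names RECOMMENDED and build_command are hypothetical helpers, and the values are copied from the commands above.

```python
import shlex

# Per-scenario settings taken from the two commands above.
# Lists of pairs keep the flag order stable on older Python 3 versions.
RECOMMENDED = {
    "cn": [("--prior-training-percentile", "60"), ("--lr", "1e-2")],
    "pp": [("--prior-training-percentile", "40"), ("--lr", "1e-3")],
}


def build_command(scenario):
    """Return the argument list for training the given scenario."""
    cmd = ["python3", "train.py", "--scenario", scenario]
    for flag, value in RECOMMENDED[scenario]:
        cmd.extend([flag, value])
    return cmd


for name in ("cn", "pp"):
    print(" ".join(shlex.quote(part) for part in build_command(name)))
```

The returned list can be passed directly to subprocess.run to start a training run.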

Citations

If you use this code, please cite our paper.

Ziluo Ding, Tiejun Huang, and Zongqing Lu. Learning Individually Inferred Communication for Multi-Agent Cooperation. NeurIPS'20.

@inproceedings{ding2020learning,
    title={Learning Individually Inferred Communication for Multi-Agent Cooperation},
    author={Ding, Ziluo and Huang, Tiejun and Lu, Zongqing},
    booktitle={NeurIPS},
    year={2020}
}

Acknowledgements

This code is based on the MADDPG source code by Ryan Lowe.


