[AAAI 2023 Oral] Contrastive Identity-Aware Learning for Multi-Agent Value Decomposition

Official codebase for the paper "Contrastive Identity-Aware Learning for Multi-Agent Value Decomposition". The code is built on the open-source PyMARL framework; please refer to that repository for further documentation.

Overview

TLDR: This work is the first to identify the ambiguous credit assignment problem in Value Decomposition (VD), an issue that is highly important for multi-agent diversity yet largely overlooked by existing literature. We further propose a novel contrastive identity-aware learning (CIA) method that promotes diverse behaviors by explicitly encouraging credit-level distinguishability. The proposed CIA module imposes no constraints on the network architecture and serves as a plug-and-play module readily applicable to various VD methods.

Abstract: Value Decomposition (VD) aims to deduce the contributions of agents for decentralized policies in the presence of only global rewards, and has recently emerged as a powerful credit assignment paradigm for tackling cooperative Multi-Agent Reinforcement Learning (MARL) problems. One of the main challenges in VD is to promote diverse behaviors among agents, while existing methods directly encourage the diversity of learned agent networks with various strategies. However, we argue that these dedicated designs for agent networks are still limited by the indistinguishable VD network, leading to homogeneous agent behaviors and thus downgrading the cooperation capability. In this paper, we propose a novel Contrastive Identity-Aware learning (CIA) method, explicitly boosting the credit-level distinguishability of the VD network to break the bottleneck of multi-agent diversity. Specifically, our approach leverages contrastive learning to maximize the mutual information between the temporal credits and identity representations of different agents, encouraging the full expressiveness of credit assignment and further the emergence of individualities. The algorithm implementation of the proposed CIA module is simple yet effective, and it can be readily incorporated into various VD architectures. Experiments on the SMAC benchmarks and across different VD backbones demonstrate that the proposed method yields results superior to the state-of-the-art counterparts.
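
As a rough illustration of the contrastive objective mentioned in the abstract, the sketch below computes a generic InfoNCE-style loss in plain Python. It is not the implementation in this repository: the function name, the dot-product similarity, the temperature parameter, and the use of plain lists for the "credit" and "identity" vectors are all assumptions made for illustration. The only idea it shows is that each agent's credit vector is pulled toward its own identity vector and pushed away from the identity vectors of the other agents.

```python
import math


def info_nce_loss(credits, identities, temperature=1.0):
    """Generic InfoNCE loss (illustrative, not the repository's code).

    credits[i] and identities[i] are equal-length lists of floats for agent i.
    The positive pair for agent i is (credits[i], identities[i]); every other
    identity vector acts as a negative.
    """
    n = len(credits)
    total = 0.0
    for i in range(n):
        # Dot-product similarity between agent i's credit and each identity.
        logits = [
            sum(c * z for c, z in zip(credits[i], identities[j])) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidate identities.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the matching identity.
        total += log_denominator - logits[i]
    return total / n
```

With uninformative credits (all zeros) the loss equals log(n), the value for a uniform guess over n agents; it decreases as each credit vector aligns with its own identity vector.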

Prerequisites

Install dependencies

See the requirments.txt file for details on installing the dependencies.

Install StarCraft II

Please use Blizzard's repository to download the Linux version 4.10 of StarCraft II. By default, the game is expected to be in the ~/StarCraftII/ directory. This can be changed by setting the environment variable SC2PATH.

- Please pay attention to the version of SC2 you are using for your experiments. 
- We use the latest version SC2.4.10 for all SMAC experiments instead of SC2.4.6.2.69232.
- Performance is not comparable across versions.

The SMAC maps used for all experiments are in the CIA/src/envs/starcraft2/maps/SMAC_Maps directory. Place the SMAC_Maps directory in StarCraftII/Maps.
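
The map setup described above can be sketched in Python as follows. This is a minimal helper under stated assumptions, not a script shipped with the repository: the function names `resolve_sc2_dir` and `install_smac_maps` are hypothetical, and the repository checkout is assumed to be passed in as `repo_root`. The fallback to ~/StarCraftII when SC2PATH is unset follows the default described in this section.

```python
import os
import shutil
from pathlib import Path


def resolve_sc2_dir(environ=None):
    """Return the StarCraft II directory: SC2PATH if set, else ~/StarCraftII."""
    env = os.environ if environ is None else environ
    return Path(env.get("SC2PATH") or Path.home() / "StarCraftII")


def install_smac_maps(repo_root, sc2_dir):
    """Copy the bundled SMAC_Maps directory into <sc2_dir>/Maps (hypothetical helper)."""
    source = Path(repo_root) / "src" / "envs" / "starcraft2" / "maps" / "SMAC_Maps"
    target = Path(sc2_dir) / "Maps" / "SMAC_Maps"
    # Create the Maps directory if the game installation does not have it yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok (Python 3.8+) lets the copy be re-run without failing.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

For example, calling `install_smac_maps("CIA", resolve_sc2_dir())` from the directory that contains the checkout would copy the maps into the resolved game directory.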

Usage

Please follow the instructions below to replicate the results in the paper.

Didactic Games: Turn

# QMIX
python src/main.py --config=qmix_turn --env-config=turn with env_args.map_name=turn

# QMIX (CIA)
python src/main.py --config=cia_grad_qmix_turn --env-config=turn with env_args.map_name=turn

SMAC

# QMIX
python src/main.py --config=qmix_<MAP_NAME> --env-config=sc2 with env_args.map_name=<MAP_NAME>

# QPLEX
python src/main.py --config=qplex_<MAP_NAME> --env-config=sc2 with env_args.map_name=<MAP_NAME>

# QMIX (CIA)
python src/main.py --config=cia_grad_qmix_<MAP_NAME> --env-config=sc2 with env_args.map_name=<MAP_NAME>

# QPLEX (CIA)
python src/main.py --config=cia_qplex_<MAP_NAME> --env-config=sc2 with env_args.map_name=<MAP_NAME>
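
When launching several runs, the four SMAC commands above can be generated from one small helper. The sketch below only reproduces the naming pattern shown in those commands; the function name `build_command` is hypothetical, and it assumes that a config file with the resulting name exists in the repository for the chosen map.

```python
def build_command(algorithm, map_name, use_cia=False):
    """Build the argument list for one SMAC run (hypothetical helper).

    algorithm is "qmix" or "qplex"; use_cia selects the CIA variant.
    """
    # Config name prefixes copied from the commands listed in this README.
    prefixes = {
        ("qmix", False): "qmix",
        ("qplex", False): "qplex",
        ("qmix", True): "cia_grad_qmix",
        ("qplex", True): "cia_qplex",
    }
    config = f"{prefixes[(algorithm, use_cia)]}_{map_name}"
    return [
        "python",
        "src/main.py",
        f"--config={config}",
        "--env-config=sc2",
        "with",
        f"env_args.map_name={map_name}",
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root, or joined with spaces to print the command for inspection.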

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{liu2023CIA,
  title     = {Contrastive Identity-Aware Learning for Multi-Agent Value Decomposition},
  author    = {Liu, Shunyu and Zhou, Yihe and Song, Jie and Zheng, Tongya and Chen, Kaixuan and Zhu, Tongtian and Feng, Zunlei and Song, Mingli},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  publisher = {{AAAI} Press},
  volume    = {37},
  number    = {10},
  pages     = {11595-11603},
  year      = {2023},
  month     = {Jun.},
  doi       = {10.1609/aaai.v37i10.26370},
  url       = {https://ojs.aaai.org/index.php/AAAI/article/view/26370}
}

Contact

Please feel free to contact me via email ([email protected]) if you are interested in my research :)
