pytorch-maddpg's Introduction

An implementation of MADDPG

1. Introduction

This is a PyTorch implementation of the multi-agent deep deterministic policy gradient (MADDPG) algorithm.

The experimental environment is a modified version of Waterworld based on MADRL.

2. Environment

The main features (different from MADRL) of the modified Waterworld environment are:

  • Evaders and poisons now bounce off the walls according to physical rules.
  • The evaders, pursuers and poisons are now the same size, so that random actions lead to average rewards around 0.
  • Exactly n_coop agents are needed to catch food.
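
The first and third rules above can be pictured with a minimal sketch. It is not the environment's actual code: the function names `bounce` and `food_caught`, the one-dimensional setting and the wall positions are all assumptions made for illustration.

```python
def bounce(position, velocity, low=0.0, high=1.0):
    """Reflect a 1-D position/velocity pair off walls at `low` and `high`.

    Hypothetical helper: an object that has stepped past a wall is mirrored
    back inside and its velocity component is reversed.
    """
    if position < low:
        return 2 * low - position, -velocity
    if position > high:
        return 2 * high - position, -velocity
    return position, velocity


def food_caught(n_touching, n_coop):
    """Food counts as caught only when exactly `n_coop` agents touch it."""
    return n_touching == n_coop
```

With `n_coop = 2`, a single agent touching the food does not count as a catch under this reading of the rule.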

3. Dependency

  • pytorch
  • visdom
  • python==3.6.1 (using anaconda/miniconda is recommended)
  • opencv, if you need to render the environments

4. Install

  • Install MADRL.
  • Replace the madrl_environments/pursuit directory with the one in this repo.
  • Run python main.py.

If scene rendering is enabled, installing opencv through conda-forge is recommended.

5. Results

Two agents, cooperation = 2

The two agents need to cooperate to catch food for a reward of 10.

PNG/demo.gif

PNG/3.png

the average

PNG/4.png

One agent, cooperation = 1

PNG/newplot.png

6. TODO

  • Reproduce the experiments in the paper with competitive environments.

pytorch-maddpg's People

Contributors

xuehy avatar


pytorch-maddpg's Issues

Question about the way the actor is updated

In my understanding, the actor does not have a loss of its own; the chain rule is needed to update the actor. Why is there a loss for the actor in your code? Could you give me a hint as to why you update the actor this way?
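
One way to see how the two views fit together: minimising an "actor loss" defined as the negated critic value gives the same update as applying the chain rule to the critic. The sketch below is a toy, framework-free illustration and not this repo's code; the linear actor, the quadratic critic and every function name are assumptions chosen so the gradients can be written by hand.

```python
def critic(state, action):
    """Toy critic: Q is highest when action == 2 * state (assumed for illustration)."""
    return -(action - 2.0 * state) ** 2


def dq_daction(state, action):
    """Analytic derivative of the toy critic with respect to the action."""
    return -2.0 * (action - 2.0 * state)


def actor(theta, state):
    """Toy deterministic actor with a single parameter."""
    return theta * state


def actor_loss_grad(theta, state):
    """Gradient of loss = -Q(s, actor(s)) via the chain rule: -(dQ/da) * (da/dtheta)."""
    action = actor(theta, state)
    da_dtheta = state
    return -dq_daction(state, action) * da_dtheta


def train_actor(theta=0.0, states=(0.5, 1.0, 1.5), lr=0.1, steps=200):
    """Gradient descent on the actor loss, i.e. gradient ascent on Q."""
    for _ in range(steps):
        grad = sum(actor_loss_grad(theta, s) for s in states) / len(states)
        theta -= lr * grad
    return theta
```

In an autograd framework the same chain rule is applied automatically when the negated critic output is backpropagated through the actor, which is presumably why a loss term appears for the actor.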

Running on Thread?

I haven't looked at the code. Is it possible to train the model with multiple threads?

About OrnsteinUhlenbeckProcess

Hello! I am studying MADDPG through your code; thank you for it.
I have one question.
I could not find OrnsteinUhlenbeckProcess being used in main.py or maddpg.py.
Is this process not used?
Thanks.
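
For readers unfamiliar with the process being asked about, here is a minimal sketch of Ornstein-Uhlenbeck noise as it is commonly used for exploration in DDPG-style methods. The class name `OUNoise` and its default parameters are hypothetical and are not taken from this repo's OrnsteinUhlenbeckProcess; whether the repo uses such noise at all is exactly what the question asks.

```python
import math
import random


class OUNoise:
    """Discretised Ornstein-Uhlenbeck process (hypothetical illustration).

    Each step pulls the value back toward `mu` at rate `theta` and adds
    Gaussian noise scaled by `sigma`, giving temporally correlated noise.
    """

    def __init__(self, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, x0=0.0, seed=None):
        self.mu = mu
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.x = x0
        self.rng = random.Random(seed)

    def sample(self):
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
        self.x += drift + diffusion
        return self.x
```

A typical use would be to add `noise.sample()` to each action component during training and clip the result to the valid action range.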

ConnectionRefusedError

When I run it, the following appears:
`------------------------------------------------------------
/home/clb/anaconda3/envs/tensorflow/bin/python /home/clb/桌面/pytorch-maddpg-master/main.py
Setting up a new session...
Exception in user code:

Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connection.py", line 159, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connectionpool.py", line 392, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1262, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1308, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1257, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1036, in _send_output
self.send(msg)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 974, in send
self.connect()
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connection.py", line 189, in connect
conn = self._new_conn()
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connection.py", line 171, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fd2d59caf60>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connectionpool.py", line 725, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/util/retry.py", line 439, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=5274): Max retries exceeded with url: /env/main (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd2d59caf60>: Failed to establish a new connection: [Errno 111] Connection refused',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/visdom/__init__.py", line 711, in _send
data=json.dumps(msg),
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/visdom/__init__.py", line 677, in _handle_post
r = self.session.post(url, data=data)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/sessions.py", line 578, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=5274): Max retries exceeded with url: /env/main (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd2d59caf60>: Failed to establish a new connection: [Errno 111] Connection refused',))
[Errno 111] Connection refused
/home/clb/.local/lib/python3.6/site-packages/torch/nn/functional.py:1340: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
Episode: 0, reward = -78.971497
Exception in user code:

Exception in user code:

Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connection.py", line 159, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connectionpool.py", line 392, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1262, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1308, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1257, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1036, in _send_output
self.send(msg)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 974, in send
self.connect()
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connection.py", line 189, in connect
conn = self._new_conn()
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connection.py", line 171, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fd2819846d8>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connectionpool.py", line 725, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/util/retry.py", line 439, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=5274): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd2819846d8>: Failed to establish a new connection: [Errno 111] Connection refused',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/visdom/__init__.py", line 711, in _send
data=json.dumps(msg),
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/visdom/__init__.py", line 677, in _handle_post
r = self.session.post(url, data=data)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/sessions.py", line 578, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=5274): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fd2819846d8>: Failed to establish a new connection: [Errno 111] Connection refused',))
Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connection.py", line 159, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connectionpool.py", line 392, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1262, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1308, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1257, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 1036, in _send_output
self.send(msg)
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/http/client.py", line 974, in send
self.connect()
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connection.py", line 189, in connect
conn = self._new_conn()
File "/home/clb/anaconda3/envs/tensorflow/lib/python3.6/site-packages/urllib3-1.25.8-py3.6.egg/urllib3/connection.py", line 171, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fd281984a58>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:`

It goes on like this, raising errors while it runs, and eventually it stops running. How can I fix this?
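
The traceback shows the visdom client being refused on localhost port 5274, which usually means no server is listening there. A quick way to confirm this before starting training is sketched below; the helper name `server_reachable` is an assumption, and only the standard-library `socket` module is used.

```python
import socket


def server_reachable(host="localhost", port=5274, timeout=1.0):
    """Return True if something accepts TCP connections on (host, port).

    Hypothetical helper; the default port 5274 is the one shown in the
    traceback above.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts and name-resolution failures.
        return False
```

If this returns False, starting a visdom server on the same port in another terminal (for example `python -m visdom.server -port 5274`) before running main.py is the likely fix.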

show

Hello, I have already run this program, but the picture is not shown. Is it because I did not install opencv, or is there some other problem? Is the picture shown in the browser?
Thanks!

port

I use port 8097 instead of 5274, because using 5274 gives a connection error. I would like to know why.

squeeze problem

Thank you for your code.
Traceback (most recent call last):
File "maina.py", line 76, in <module>
c_loss, a_loss = maddpg.update_policy()
File "/content/drive/drive/matd3/MADDPG.py", line 117, in update_policy
self.n_agents * self.n_actions)
AttributeError: 'tuple' object has no attribute 'squeeze'

Updating policy issue

Hello,

I encountered an issue when trying to run your code. I tried to find out why it occurs but have no clue yet.
Could you please take a look at the error below? I would appreciate it.

Traceback (most recent call last):

File "", line 1, in
runfile('/home/zishun/pytorch-maddpg-master/main.py', wdir='/home/zishun/pytorch-maddpg-master')

File "/home/zishun/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
execfile(filename, namespace)

File "/home/zishun/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "/home/zishun/pytorch-maddpg-master/main.py", line 83, in <module>
c_loss, a_loss = maddpg.update_policy()

File "/home/zishun/pytorch-maddpg-master/MADDPG.py", line 111, in update_policy
self.n_agents * self.n_actions))

RuntimeError: expand(torch.FloatTensor{[999, 1]}, size=[999]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)

Negative actor losses

I am sorry if this is a stupid question. I have been getting negative losses for the actors. Is this normal? If not, how should I interpret it?

Thanks!
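
A small worked example may help with the sign. It assumes the actor loss is the negated mean of the critic's Q estimates, which is a common DDPG-style formulation but is an assumption about this repo; the function name `actor_loss` is hypothetical.

```python
def actor_loss(q_values):
    """Negated mean critic estimate for a batch (assumed DDPG-style definition)."""
    return -sum(q_values) / len(q_values)
```

Under that assumption a negative actor loss only means the critic currently predicts positive returns for the actor's actions, so the sign alone does not indicate a problem.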

no positive reward

When the blue ball hits the green ball, it does not get a positive reward, and the green ball does not disappear. Is that correct?

How to install MADRL?

I've downloaded MADRL, but I could not install it successfully.
Does it need the command below?
conda env create -f environment.yml

ResolvePackageNotFound:

  • pygame
  • tensorflow=0.10.0rc0

I've installed pygame using "pip install pygame".

When I run main.py, I got the below error: ModuleNotFoundError: No module named 'madrl_environments'
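
This error means Python cannot find the MADRL package on its import path. One possible workaround, sketched below, is to add the directory that contains the `madrl_environments` folder to `sys.path` before importing it. The helper name `ensure_importable` is an assumption, and the location of the MADRL checkout must be supplied by the reader.

```python
import importlib.util
import sys


def ensure_importable(module_name, search_dir):
    """Make `module_name` importable by adding `search_dir` to sys.path if needed.

    Hypothetical helper: returns True when the module can be located
    afterwards, False otherwise.
    """
    if importlib.util.find_spec(module_name) is not None:
        return True
    if search_dir not in sys.path:
        sys.path.insert(0, search_dir)
    importlib.invalidate_caches()
    return importlib.util.find_spec(module_name) is not None
```

Setting the PYTHONPATH environment variable to the same directory before running main.py has the same effect.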

type error

Thank you for your code.
I encountered this error:
non_final_next_actions.view(-1, self.n_agents * self.n_actions)
TypeError: can't assign a tuple to a torch.cuda.FloatTensor

MultiGPU training

Thanks for the great work. Do you think training could be sped up by using multiple GPUs or some form of parallel training in the policy update, instead of looping through the agents one by one? If so, what kind of approach would you suggest?

Thanks again

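Questions about the network structure

Your profile says you work at NetEase, so I am asking directly in Chinese. I am unclear about the following three points and would appreciate your answers when you have time:
1. Why does the critic network first raise the obs observation to 1024 dimensions?
2. Why is the result concatenated directly with the raw action variable, rather than passing the action through a fully connected layer first?
3. I tested without raising the obs dimension, i.e. ((obs->256)+action)->128->64, and the results were also good. Is there an explanation for this?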
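
To make the layout in question 3 concrete, the sketch below lists the fully connected layer shapes of a critic that first projects the observation, then concatenates the raw action, then applies further layers. It only does shape arithmetic; the function names, the final scalar output layer and the example dimensions are assumptions, not this repo's definition.

```python
def critic_layer_shapes(obs_dim, act_dim, first_width, hidden_widths):
    """Return (in_features, out_features) for each fully connected layer.

    The observation is projected to `first_width`, the raw action is
    concatenated, and the hidden layers follow; a final layer to a single
    Q value is assumed.
    """
    shapes = [(obs_dim, first_width)]
    in_features = first_width + act_dim  # concatenation with the raw action
    for width in hidden_widths:
        shapes.append((in_features, width))
        in_features = width
    shapes.append((in_features, 1))
    return shapes


def parameter_count(shapes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)
```

For example, `critic_layer_shapes(10, 4, 256, (128, 64))` describes the ((obs->256)+action)->128->64 variant for hypothetical joint dimensions of 10 and 4, and `parameter_count` lets the two first-layer widths be compared by size.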

Train result

Hello, I used your program and ran 2000 episodes, but compared with your result, my reward did not show the same obvious improvement; it did not trend upward. I did not change the code, I only set max_steps = 100. I don't know why. Did I miss something?
I ran the program on a virtual machine without a GPU, and 2000 episodes took approximately 50 hours, which is too slow. How much time did you spend on training to get your result?

2000-episode_diff

gif problem

Hello! I have already set up the environment and trained for 20 episodes. I see two panes in visdom, but I do not see the gif (the one where two balls find other balls). Why?

No pursuit directory?

Replace the madrl_environments/pursuit directory with the one in this repo.

But I can't find that directory in madrl_environments.

