
timechamber's Introduction

TimeChamber: A Massively Parallel Large Scale Self-Play Framework


TimeChamber is a large-scale self-play framework built on parallel simulation. Self-play algorithms typically require a lot of hardware, especially in 3D physically simulated environments. We provide a self-play framework that achieves fast training and evaluation with ONLY ONE GPU. TimeChamber is developed with the following key features:

  • Parallel Simulation: TimeChamber is built within Isaac Gym, a fast GPU-based simulation platform that supports running thousands of environments in parallel on a single GPU. For example, on one NVIDIA Laptop RTX 3070Ti GPU, TimeChamber can reach 80,000+ mean FPS by running 4,096 environments in parallel.
  • Parallel Evaluation: TimeChamber can quickly compute the ELO ratings of dozens of policies (a measure of their relative strength), and it supports multi-player ELO calculation via multi-elo. Inspired by Vectorization techniques for fast population-based training, we leverage vectorized models to evaluate different policies in parallel (a sketch of the basic Elo update follows this list).
  • Prioritized Fictitious Self-Play Benchmark: We implement a classic PPO self-play algorithm on top of rl_games, with a prioritized player pool to avoid strategy cycles and improve the diversity of the training policy.
  • Competitive Multi-Agent Tasks: Inspired by OpenAI RoboSumo and ASE, we introduce three competitive multi-agent tasks (Ant Sumo, Ant Battle, and Humanoid Strike) as examples. The efficiency of our self-play framework has been tested on these tasks. After days of training, our agents discover interesting physical skills such as pulling and jumping. Contributions of your own environments are welcome!
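
For reference, the standard two-player Elo update behind such ratings works as sketched below. This is illustrative only; TimeChamber's actual ratings are computed with multi-elo and its own vectorized evaluation code:

# Illustrative standard two-player Elo update (not TimeChamber's actual code).
def elo_update(rating_a, rating_b, score_a, k=32.0):
    # score_a is 1.0 if A wins, 0.5 for a draw, 0.0 if A loses.
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

# Example: a 1200-rated policy beats another 1200-rated policy.
print(elo_update(1200.0, 1200.0, 1.0))  # -> (1216.0, 1184.0)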

Installation


Download and follow the installation instructions of Isaac Gym: https://developer.nvidia.com/isaac-gym
Ensure that Isaac Gym works on your system by running one of the examples from the python/examples directory, such as joint_monkey.py. If you have any trouble running the samples, please follow the troubleshooting steps described in the Isaac Gym Preview Release 3/4 installation instructions.
Then install this repo:

pip install -e .

Quick Start


Tasks

Source code for the tasks can be found in timechamber/tasks. The detailed state/action/reward settings are documented there. More interesting tasks will come soon.

Humanoid Strike

Humanoid Strike is a 3D environment with two simulated humanoid physics characters. Each character is equipped with a sword and shield and has 37 degrees of freedom. The game is restarted if an agent goes outside the arena. The winner is determined at the terminating step by comparing how much damage each player dealt to its opponent with how much damage it received.

Ant Sumo

Ant Sumo is a 3D environment with simulated physics in which pairs of ant agents compete against each other. To win, an agent has to push its opponent out of the ring. Every agent has 100 hp. At each step, if an agent's body touches the ground, its hp is reduced by 1. An agent whose hp reaches 0 is eliminated.
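
As a rough, hypothetical sketch of the hp mechanic described above (tensor names and shapes are assumptions, not the task's actual buffers), the per-step update could look like this:

import torch

num_envs, num_agents = 4096, 2
hp = torch.full((num_envs, num_agents), 100.0)

# Assumed boolean contact flags from the simulator: True where a body touches the ground.
body_touches_ground = torch.zeros((num_envs, num_agents), dtype=torch.bool)

hp -= body_touches_ground.float()   # lose 1 hp per step of ground contact
eliminated = hp <= 0                # agents whose hp reaches 0 are eliminated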

Ant Battle

Ant Battle is an extension of Ant Sumo that supports more than two agents competing against each other. The battle ring shrinks over time, and any agent that goes out of the ring is eliminated.
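
A similarly hedged sketch of the shrinking-ring check (the radius, shrink rate, and tensor shapes are illustrative assumptions):

import torch

ring_radius = 5.0         # assumed initial ring radius
shrink_per_step = 0.001   # assumed shrink rate per simulation step

agent_xy = torch.zeros((4096, 4, 2))   # (num_envs, num_agents, 2) planar positions

ring_radius = max(ring_radius - shrink_per_step, 0.0)
out_of_ring = agent_xy.norm(dim=-1) > ring_radius   # eliminated when outside the ring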

Self-Play Training

To train a policy for one of the tasks, run, for example:

# run self-play training for Humanoid Strike task
python train.py task=MA_Humanoid_Strike headless=True
# run self-play training for Ant Sumo task
python train.py task=MA_Ant_Sumo train=MA_Ant_SumoPPO headless=True
# run self-play training for Ant Battle task
python train.py task=MA_Ant_Battle train=MA_Ant_BattlePPO headless=True

Key arguments to the training script follow the IsaacGymEnvs configuration and command line arguments. Other training arguments follow the rl_games config parameters; you can change them in timechamber/tasks/train/*.yaml. There are some arguments specific to self-play training (a sketch of how they interact follows the list):

  • num_agents: Number of agents in the Ant Battle environment; it should be larger than 1.
  • op_checkpoint: Path to the checkpoint used to load the initial opponent policy. If it is empty, the opponent agent uses a random policy.
  • update_win_rate: Win-rate threshold above which the current policy is added to the opponent's player pool.
  • player_pool_length: Maximum size of the player pool; the oldest policies are evicted first (FIFO).
  • games_to_check: Training warm-up; the player pool is not updated until the current policy has played this many games.
  • max_update_steps: If the current policy's update iterations exceed this number, the current policy is added to the opponent player pool.
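
Taken together, these arguments control when the current policy is snapshotted into the opponent player pool. A hypothetical sketch of that logic (function and variable names are illustrative, not the repository's actual code):

from copy import deepcopy

def maybe_update_player_pool(pool, policy, win_rate, games_played, update_steps,
                             update_win_rate, games_to_check, max_update_steps,
                             player_pool_length):
    warmed_up = games_played >= games_to_check          # games_to_check warm-up
    good_enough = win_rate >= update_win_rate           # update_win_rate threshold
    forced = update_steps >= max_update_steps           # max_update_steps fallback
    if warmed_up and (good_enough or forced):
        pool.append(deepcopy(policy))                   # freeze a copy of the current policy
        if len(pool) > player_pool_length:              # FIFO eviction of the oldest opponent
            pool.pop(0)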

Policies Evaluation

To evaluate your policies, run, for example:

# run testing for Ant Sumo policy
python train.py task=MA_Ant_Sumo train=MA_Ant_SumoPPO test=True num_envs=4 minibatch_size=32 headless=False checkpoint='models/ant_sumo/policy.pth'
# run testing for Humanoid Strike policy
python train.py task=MA_Humanoid_Strike train=MA_Humanoid_StrikeHRL test=True num_envs=4 minibatch_size=32 headless=False checkpoint='models/Humanoid_Strike/policy.pth' op_checkpoint='models/Humanoid_Strike/policy_op.pth'

You can set the opponent agent's policy using op_checkpoint. If it is empty, the opponent agent uses the same policy as checkpoint.
We use vectorized models to accelerate policy evaluation. Put your policies into the checkpoint directory and let them compete against each other in parallel:

# run testing for Ant Sumo policy
python train.py task=MA_Ant_Sumo train=MA_Ant_SumoPPO test=True headless=True checkpoint='models/ant_sumo' player_pool_type=vectorized

There are some arguments specific to self-play evaluation; you can change them in timechamber/tasks/train/*.yaml:

  • games_num: Total number of evaluation episodes.
  • record_elo: Set to True to record the ELO rating of your policies; after evaluation, you can check elo.jpg in your checkpoint directory.
  • init_elo: Initial ELO rating of each policy.

Building Your Own Task

You can build your own task following IsaacGymEnvs. Make sure the obs shape is correct and that info contains win, lose and draw:

import isaacgym
import timechamber
import torch

envs = timechamber.make(
    seed=0,
    task="MA_Ant_Sumo",
    num_envs=2,
    sim_device="cuda:0",
    rl_device="cuda:0",
)
# The obs shape should be (num_agents * num_envs, num_obs);
# the training agent's observations are obs[:num_envs].
print("Observation space is", envs.observation_space)
print("Action space is", envs.action_space)
obs = envs.reset()
for _ in range(20):
    # random actions for num_agents * num_envs = 2 * 2 actors
    obs, reward, done, info = envs.step(
        torch.rand((2 * 2,) + envs.action_space.shape, device="cuda:0")
    )
# info:
# {'win': tensor([Bool, Bool])
# 'lose': tensor([Bool, Bool])
# 'draw': tensor([Bool, Bool])}
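
If you are implementing the task yourself, here is a hypothetical sketch of how a custom task could populate the info dict in its terminating steps (the class name, method, and score buffers are placeholders, not TimeChamber's actual base classes):

import torch

class MyCompetitiveTask:
    # Hypothetical skeleton; only the info structure mirrors the example above.
    def build_info(self, my_score, op_score, done):
        # One bool per environment; the flags are only meaningful where done is True.
        win = done & (my_score > op_score)
        lose = done & (my_score < op_score)
        draw = done & (my_score == op_score)
        return {'win': win, 'lose': lose, 'draw': draw}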

Citing

If you use TimeChamber in your research, please use the following citation:

@misc{InspirAI,
  author = {Huang Ziming and Ziyi Liu and Wu Yutong and Flood Sung},
  title = {TimeChamber: A Massively Parallel Large Scale Self-Play Framework},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/inspirai/TimeChamber}},
}

timechamber's People

Contributors

zeldahuang, ziyiliubird


timechamber's Issues

Maybe a Bug

The following error occurred:

NotADirectoryError: [Errno 20] Not a directory: '/home/lzy/lzy/MARL/self-play/TimeChamber/timechamber/models/ant_sumo/policy.pth/../elo.jpg'

when I run policy evaluation:

python train.py task=MA_Ant_Sumo test=True headless=True checkpoint='models/ant_sumo/policy.pth'

Maybe this line has a problem: TimeChamber/timechamber/learning/ppo_sp_player.py, line 286.

The bug can be fixed by replacing

plt.savefig(self.params['load_path'] + '/../elo.jpg')

with

parent_path = os.path.dirname(self.params['load_path'])
plt.savefig(os.path.join(parent_path, 'elo.jpg'))

How much time does training take?

In the Humanoid Strike task's training config file, it seems that the number of epochs has to reach 100,000, which I assume may take about 10 days or more.
Could you give the specific number of training days for this?

numEnvs

How long does it take to train Humanoid Strike? I calculated that my computer would take about a month to train; is that normal? My GPU is an RTX 4080.
Why is training one epoch slower than before after setting numEnvs=8192?
What parameters should I change to make training faster?
Please let me know, thank you very much.

ModuleNotFoundError: No module named 'timechamber.ase'

When I run the code, there is an error:

Traceback (most recent call last):
  File "train.py", line 47, in <module>
    from timechamber.ase import ase_agent
ModuleNotFoundError: No module named 'timechamber.ase'

I find that there is no 'ase' directory in timechamber; how can I get ase?

Bug

if self.player_pool_type == 'multi_thread':
    return PFSPPlayerProcessPool(max_length=self.max_his_player_num,
elif self.player_pool_type == 'multi_process':
    return PFSPPlayerThreadPool(max_length=self.max_his_player_num,

Maybe lines 58 and 59 should be exchanged with lines 61 and 62? As written, 'multi_thread' returns a process pool and 'multi_process' returns a thread pool.

Suggestion to use RNN for multi-agent tasks

Hello,
Good job on the amazing work. I noticed that for the MA Humanoid Strike task you used a reward design similar to the one used for the boxing agents in this paper:

https://dl.acm.org/doi/abs/10.1145/3450626.3459761

I was thinking that strategic behavior might emerge between the sword fighters, and the results could be better, if you added a memory module (LSTM, GRU, transformer) as in the paper. Also, as far as I understand, in the literature on multi-agent partially observable MDPs, each agent should condition on the history of its observations when taking actions, both to maintain a more accurate belief about the global state and to account for the non-stationary environment.

Thanks

Can't find isaacgym module

Hello,
When I try to run train.py, the following message appears:

Traceback (most recent call last):
  File "train.py", line 33, in <module>
    import isaacgym
ModuleNotFoundError: No module named 'isaacgym'

But the thing is, Isaac Gym is installed and works on my computer; I can run IsaacGymEnvs without problems. Thanks a lot for any help.

Issue with contact forces when computing r damage in strike task

Hello,

When trying to inspect the value of each reward term for the strike task, I noticed that r_damage can be triggered even when the agents are not hitting each other. After further investigation, I found that the contact buffers can also include self-collisions in addition to contacts between different agents; for example, if an agent's hand collides with its own head, that contact is included in the buffer and used in the reward as if the agent had been hit by the opponent, which is not the case.

This could be solved by filtering the contact buffer by the bodies involved in the contact, but I saw that this functionality is not available right now in isaacgym. The workaround I can think of is to first detect whether there was contact between specific bodies by measuring the distance between them; once the distance is lower than some threshold, you assume there was contact, and then you check the magnitude of the contact force.
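
A hedged sketch of that workaround (the tensor names, shapes, and the 0.05 m threshold are assumptions for illustration, not the task's actual buffers):

import torch

sword_pos = torch.zeros((4096, 3))           # opponent sword-tip position per env
body_pos = torch.zeros((4096, 15, 3))        # this agent's rigid-body positions
contact_force = torch.zeros((4096, 15, 3))   # net contact forces from the simulator

near_sword = (body_pos - sword_pos.unsqueeze(1)).norm(dim=-1) < 0.05
hit_force = contact_force.norm(dim=-1) * near_sword.float()   # mask out self-collisions
r_damage_proxy = hit_force.sum(dim=-1)                        # count only plausible opponent hits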

Hope I was clear.

