replicable-marl / marllib
One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)
Home Page: https://marllib.readthedocs.io
License: MIT License
Hi, I saw the previous issue about TuneError and switched to the new APIs, but the bug still persists.
I ran the code below:
from marllib import marl
env = marl.make_env(environment_name="mpe", map_name="simple_spread")
iddpg = marl.algos.iddpg(hyperparam_source="mpe")
model = marl.build_model(env, iddpg, {"core_arch": "mlp", "encode_layer": "128-256"})
iddpg.fit(env, model, stop={"timesteps_total": 1000000}, checkpoint_freq=100, share_policy="group")
The error is as follows:
(pid=125495) File "/home/hjl/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/ddpg/ddpg_tf_policy.py", line 436, in validate_spaces
(pid=125495) raise UnsupportedSpaceException(
(pid=125495) ray.rllib.utils.error.UnsupportedSpaceException: Action space (Discrete(5)) of <ray.rllib.policy.policy_template.IDDPGTorchPolicy object at 0x7f62a9d889d0> is not supported for DDPG.
(IDDPGTrainer pid=125496)
Traceback (most recent call last):
File "/home/hjl/桌面/代码测试/main.py", line 9, in
iddpg.fit(env, model, stop={"timesteps_total": 1000000}, checkpoint_freq=100, share_policy="group")
File "/home/hjl/anaconda3/envs/marllib/lib/python3.8/site-packages/MARLlib-master/marllib/marl/__init__.py", line 309, in fit
run_il(self.config_dict, env_instance, model_class, stop=stop)
File "/home/hjl/anaconda3/envs/marllib/lib/python3.8/site-packages/MARLlib-master/marllib/marl/algos/run_il.py", line 196, in run_il
results = POlICY_REGISTRY[exp_info["algorithm"]](model, exp_info, run_config, env_info, stop_config,
File "/home/hjl/anaconda3/envs/marllib/lib/python3.8/site-packages/MARLlib-master/marllib/marl/algos/scripts/iddpg.py", line 109, in run_iddpg
results = tune.run(IDDPGTrainer,
File "/home/hjl/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/tune.py", line 624, in run
raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [IDDPGTrainer_mpe_simple_spread_cb637_00000])
(RolloutWorker pid=125495)
Could anyone help me?
Thanks!!!
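The traceback above shows RLlib's DDPG policy rejecting a Discrete(5) action space, which is consistent with DDPG-style algorithms expecting continuous (Box) actions. A minimal sketch of a pre-flight check that fails early, before any trainer is built, might look like the following. It assumes only that Gym-style spaces expose `low`/`high` for Box and `n` for Discrete; the stand-in classes and both helper functions are hypothetical and are not part of MARLlib or RLlib.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class FakeDiscrete:
    """Stand-in for gym.spaces.Discrete (hypothetical, for illustration only)."""
    n: int


@dataclass
class FakeBox:
    """Stand-in for gym.spaces.Box (hypothetical, for illustration only)."""
    low: Sequence[float]
    high: Sequence[float]


def is_continuous(space) -> bool:
    """Return True if the space looks like a Box, i.e. it has low/high bounds."""
    return hasattr(space, "low") and hasattr(space, "high")


def check_ddpg_compatible(space) -> None:
    """Raise a ValueError if the action space is not continuous."""
    if not is_continuous(space):
        raise ValueError(
            f"{space!r} is not a continuous action space; "
            "DDPG-style algorithms expect Box actions."
        )


check_ddpg_compatible(FakeBox(low=[-1.0], high=[1.0]))  # passes silently
try:
    check_ddpg_compatible(FakeDiscrete(n=5))
except ValueError as err:
    print(err)
```

If this diagnosis applies, the options would be to pick an algorithm that supports discrete actions, or to configure the scenario for continuous actions if the environment config exposes such an option.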
A comment in marl/algos/hyperparams/finetuned/mpe/maddpg.yaml says:
# Detailed explanation for each hyper parameter can be found in ray/rllib/agents/ddpg/ddpg.py
However, it looks like Ray has since updated its documentation: there are no detailed explanations in ray/rllib/agents/ddpg/ddpg.py.
Hello.
After training, I confirmed that the log was saved.
Can I render the trained model using this log?
How can I do that?
(I want to reload the saved results and reproduce them.)
Hi
Could you please add support for PettingZoo SISL environments such as Waterworld and Multiwalker?
Thanks
I'm new to MARLlib and am currently in the process of understanding all the great things it can do :)
Unfortunately, when executing python load_and_render_model.py from the examples directory, I get the following error:
2023-06-13 16:57:15,259 ERROR trial_runner.py:1124 -- Trial MAPPOTrainer_mpe_simple_spread_95240_00000: Error processing restore.
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/site-packages/ray/tune/trial_runner.py", line 1117, in _process_trial_restore
self.trial_executor.fetch_result(trial)
File "/opt/conda/lib/python3.9/site-packages/ray/tune/ray_trial_executor.py", line 788, in fetch_result
result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
File "/opt/conda/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/ray/worker.py", line 1625, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(TypeError): ray::MAPPOTrainer.restore_from_object() (pid=144888, repr=MAPPOTrainer)
File "/opt/conda/lib/python3.9/site-packages/ray/tune/trainable.py", line 433, in restore_from_object
self.restore(checkpoint_path)
File "/opt/conda/lib/python3.9/site-packages/ray/tune/trainable.py", line 411, in restore
self.load_checkpoint(checkpoint_path)
File "/opt/conda/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 830, in load_checkpoint
self.__setstate__(extra_data)
File "/opt/conda/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 289, in __setstate__
Trainer.__setstate__(self, state)
File "/opt/conda/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 1813, in __setstate__
self.workers.local_worker().restore(state["worker"])
File "/opt/conda/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1274, in restore
objs = pickle.loads(objs)
TypeError: an integer is required (got type bytes)
I'd appreciate any pointers to what might be going wrong. Thank you!
Hi
Currently, the supported version of MuJoCo is v2, which lacks many of the features available in v4. Could you please update MuJoCo to v4?
Thank you.
When I run the command below:
python3 marl/main.py --algo_config=qmix [--finetuned] --env_config=smac with env_args.map_name=3m
I got this error:
(pid=341594) [2023-03-20 15:46:49,756 E 341594 341920] raylet_client.cc:159: IOError: Broken pipe [RayletClient] Failed to disconnect from raylet.
Traceback (most recent call last):
File "marl/main.py", line 53, in
run_vd(config_dict)
File "/media/user/APPS/AI Safety/MultiAgents RL/Algorithms/MARLlib/marl/algos/run_vd.py", line 218, in run_vd
results = POlICY_REGISTRY[config_dict["algorithm"]](config_dict, common_config, env_info_dict, stop)
File "/media/user/APPS/AI Safety/MultiAgents RL/Algorithms/MARLlib/marl/algos/scripts/vdn_qmix_iql.py", line 78, in run_joint_q
results = tune.run(Trainer,
File "/home/user/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/tune.py", line 624, in run
raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [QMIX_grouped_smac_3m_43fcd_00000])
(pid=341594) CloseHandler: 127.0.0.1:51406 disconnected
Could anyone help me with that?
I'm using:
ubuntu: 20.04
python: 3.8.16 (conda)
torch: 1.13.1+cu117
ray: 1.8.0
To replicate the results of your work in the MPE environment, the following packages need to be installed (they are not mentioned in your documentation):
pip install gym==0.21.0
pip install pettingzoo==1.21.0
pip install supersuit==3.3.0
pip install icecream
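As a quick way to confirm that an environment matches the pins listed above, a small check along these lines could help. This is only a sketch: the `find_mismatches` and `installed_version` helpers are hypothetical names, and icecream is left out because no version is given for it.

```python
from importlib import metadata

# Pins taken from the install commands above.
PINS = {"gym": "0.21.0", "pettingzoo": "1.21.0", "supersuit": "3.3.0"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (wanted, found)."""
    mismatches = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(PINS).items():
        print(f"{package}: wanted {wanted}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the comparison logic can be exercised without touching the real environment.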
Hello,
maybe I've overlooked this or it's basic knowledge, but while looking at your example results I noticed how different the CSV looks.
If I understood correctly (and my small training run worked), I "only" get a progress.csv and a result.json, and the progress.csv contains a lot of information that is really hard to read. How did you manage to do that? What data did you take out of the progress.csv?
Thank you, and sorry for any inconvenience.
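How the published result files were produced is not stated here, but one plausible approach is to copy just a few columns out of progress.csv. The sketch below assumes the file contains the columns `training_iteration`, `timesteps_total` and `episode_reward_mean`; the `slim_progress` helper is a hypothetical name and the sample values are made up.

```python
import csv
import io

# Column names assumed to be present in the progress.csv written during training.
KEEP = ["training_iteration", "timesteps_total", "episode_reward_mean"]


def slim_progress(src, dst, keep=KEEP):
    """Copy only the `keep` columns from one CSV stream to another; return the row count."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=keep)
    writer.writeheader()
    rows = 0
    for row in reader:
        # Missing columns are written as empty strings rather than raising.
        writer.writerow({name: row.get(name, "") for name in keep})
        rows += 1
    return rows


# Tiny in-memory stand-in for a real progress.csv (values are made up).
sample = io.StringIO(
    "episode_reward_mean,timesteps_total,training_iteration,time_this_iter_s\n"
    "-120.5,4000,1,3.2\n"
    "-98.1,8000,2,3.1\n"
)
out = io.StringIO()
slim_progress(sample, out)
print(out.getvalue())
```

For a real run, `src` and `dst` would be file objects opened with `newline=""`, as the csv module recommends.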
In local mode, the code does not report an error, but with localmode=False the following error is raised every time the 36th iteration is reached:
Failure # 1 (occurred at 2023-05-09_21-08-13)
Traceback (most recent call last):
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\tune\trial_runner.py", line 890, in _process_trial
results = self.trial_executor.fetch_result(trial)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\tune\ray_trial_executor.py", line 788, in fetch_result
result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\_private\client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\worker.py", line 1625, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::VDA2CTrainer.train() (pid=23324, ip=127.0.0.1, repr=VDA2CTrainer)
File "E:\Linghao\MARLlib-sy_dev_0\marllib\marl\algos\core\VD\vda2c.py", line 65, in value_mix_actor_critic_loss
dist = dist_class(logits, model)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\models\torch\torch_action_dist.py", line 186, in __init__
self.dist = torch.distributions.normal.Normal(mean, torch.exp(log_std))
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\torch\distributions\normal.py", line 50, in __init__
super(Normal, self).__init__(batch_shape, validate_args=validate_args)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\torch\distributions\distribution.py", line 53, in __init__
raise ValueError("The parameter {} has invalid values".format(param))
ValueError: The parameter loc has invalid values
The above exception was the direct cause of the following exception:
ray::VDA2CTrainer.train() (pid=23324, ip=127.0.0.1, repr=VDA2CTrainer)
File "python\ray\_raylet.pyx", line 558, in ray._raylet.execute_task
File "python\ray\_raylet.pyx", line 596, in ray._raylet.execute_task
File "python\ray\_raylet.pyx", line 565, in ray._raylet.execute_task
File "python\ray\_raylet.pyx", line 569, in ray._raylet.execute_task
File "python\ray\_raylet.pyx", line 519, in ray._raylet.execute_task.function_executor
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\_private\function_manager.py", line 576, in actor_method_executor
return method(__ray_actor, *args, **kwargs)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\tracing\tracing_helper.py", line 451, in _resume_span
return method(self, *_args, **_kwargs)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\agents\trainer.py", line 682, in train
raise e
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\agents\trainer.py", line 668, in train
result = Trainable.train(self)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\tune\trainable.py", line 283, in train
result = self.step()
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\tracing\tracing_helper.py", line 451, in _resume_span
return method(self, *_args, **_kwargs)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\agents\trainer_template.py", line 206, in step
step_results = next(self.train_exec_impl)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 756, in __next__
return next(self.built_iterator)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 783, in apply_foreach
for item in it:
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 843, in apply_filter
for item in it:
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 843, in apply_filter
for item in it:
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\util\iter.py", line 791, in apply_foreach
result = fn(item)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\execution\train_ops.py", line 230, in __call__
results = policy.learn_on_loaded_batch(
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\torch_policy.py", line 632, in learn_on_loaded_batch
return self.learn_on_batch(batch)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\utils\threading.py", line 21, in wrapper
return func(self, *a, **k)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\torch_policy.py", line 529, in learn_on_batch
grads, fetches = self.compute_gradients(postprocessed_batch)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\policy_template.py", line 336, in compute_gradients
return parent_cls.compute_gradients(self, batch)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\utils\threading.py", line 21, in wrapper
return func(self, *a, **k)
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\torch_policy.py", line 709, in compute_gradients
tower_outputs = self._multi_gpu_parallel_grad_calc(
File "C:\Users\DELL\miniconda3\envs\marllib4\lib\site-packages\ray\rllib\policy\torch_policy.py", line 1083, in _multi_gpu_parallel_grad_calc
raise last_result[0] from last_result[1]
ValueError: The parameter loc has invalid values
In tower 0 on device cpu
There should be no conflicting packages at the moment
Hi.
I am reading through MARLlib's implementation for a related academic project. In looking at the COMA model implementation, I see functions to update the critic (and actor) models by calling backward() and step() on the gradients. However, I cannot see how and where these are ever called, so it's not clear to me how you are getting RLLib to update the critic weights. Is this done implicitly because cc_rnn subclasses both nn.Module and TorchModel through the parent base_rnn?
Thanks for producing this library as well. It's a mammoth effort :)
Hi,
We have created a custom environment and wrapped it in a gym class. After training with MAPPO, we got the .pkl files. Could you elaborate on how to run inference with the learned policy?
We already have a visualization of the env using pygame and just want to load the learned policies and see them play.
Thanks in advance.
Hi! We are trying to apply MARLlib to fulfill our task. The observation from our custom environment can be a little too complicated for a simple MLP/CNN encoder. We want to apply a pretrained model to improve the feature extraction.
Furthermore, the decision network, which is currently RNN-based or MLP-based, exposes only a few tunable parameters in the config file. It would be great if we could directly use a self-designed torch model (which would also solve the problem of loading a pretrained model), or at least have the full customization ability that Ray offers.
Wondering if there is any plan to enhance these kinds of ability?
It would be great to upgrade Ray to v2.5 and Gym to Gymnasium to ensure compatibility.
According to the Environments documentation, it should be enough to run pip install pettingzoo[magent]. However, the latest version of PettingZoo has moved MAgent to a dedicated project, so this is unusable now.
I've compared the code in envs/base_env/magent.py with PettingZoo's previous version.
MARLlib/envs/base_env/magent.py
Lines 5 to 14 in e1ddcef
It seems we have to use a version before 1.15.0 (e.g. 1.14.0). Yet after running pip install pettingzoo[magent]==1.14.0 and a test, it reports: "cannot import name 'aec_to_parallel' from 'pettingzoo.utils.conversions'"
I've tried a newer version of MAgent and changed magent.py accordingly, but that raises other problems.
So which version exactly has been tested?
Hello,
I am trying to run an algorithm on a MetaDrive environment as follows:
python marl/main.py --algo_config=mappo --env_config=metadrive with env_args.map_name=Roundabout
I have installed the MetaDrive environment as per the documentation:
pip install metadrive-simulator==0.2.3
But I get the following error after training for some time:
2023-01-31 09:56:28,733 ERROR trial_runner.py:924 -- Trial MAPPOTrainer_metadrive_Roundabout_45631_00000: Error processing event.
Traceback (most recent call last):
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 890, in _process_trial
results = self.trial_executor.fetch_result(trial)
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 788, in fetch_result
result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/worker.py", line 1625, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::MAPPOTrainer.train_buffered() (pid=20470, ip=192.168.0.98, repr=MAPPOTrainer)
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 224, in train_buffered
result = self.train()
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 682, in train
raise e
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 668, in train
result = Trainable.train(self)
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 283, in train
result = self.step()
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 240, in step
evaluation_metrics = self.evaluate()
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 958, in evaluate
self.evaluation_workers.local_worker().sample()
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 753, in sample
batches = [self.input_reader.next()]
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 103, in next
batches = [self.get_data()]
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 233, in get_data
item = next(self._env_runner)
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 586, in _env_runner
base_env.poll()
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/env/base_env.py", line 422, in poll
obs[i], rewards[i], dones[i], infos[i] = env_state.poll()
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/env/base_env.py", line 478, in poll
self.reset()
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/env/base_env.py", line 528, in reset
self.last_obs = self.env.reset()
File "/home/oscar/msc/MARLlib/envs/base_env/metadrive.py", line 69, in reset
original_obs = self.env.reset()
File "/home/oscar/anaconda3/envs/marllib/lib/python3.8/site-packages/metadrive/envs/base_env.py", line 291, in reset
self.engine.reset()
AttributeError: 'NoneType' object has no attribute 'reset'
Any guidance would be appreciated,
Thanks.
It seems that your Colab tutorial has some problems in the environment installation step. Which Python version are you using? This is not urgent, but it affects beginners learning the project. (* ̄︶ ̄)
Is the centralized_critic_q postprocessing only implemented for the MADDPG algorithm?
MARLlib/marl/algos/utils/postprocessing.py
Lines 347 to 370 in a0fe513
To the best of my knowledge of RLlib, the implementation above only works under strict conditions. It only works for DDPG / TD3 policies with both normalize_actions and clip_actions set to False.
policy.model doesn't have the attribute policy_model.
SampleBatch.ACTIONS are the real actions taken while interacting with the environment. For example, for discrete action spaces, SampleBatch.ACTIONS are integers, not action logits.
[-1, +1] to [low, high]).
Hi!
Thanks a lot for the nice library that I am still discovering.
I am currently trying to save SMAC replays after having successfully trained a MAPPO algorithm on the "3m" map.
However, I can't figure out where the call to SMAC's "save_replay()" should be done when calling the render() method.
Could you help me, and maybe add this point to the documentation as it may be useful for other users ?
Thank you !
Something goes wrong when I run the example:
python marl/main.py --algo_config=maa2c --env_config=mpe with env_args.map_name=simple_adversary
Here is the log:
/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/pettingzoo/utils/conversions.py:91: UserWarning: The observation_spaces dictionary is deprecated. Use the observation_space function instead.
warnings.warn(
/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/pettingzoo/utils/conversions.py:105: UserWarning: The action_spaces dictionary is deprecated. Use the action_space function instead.
warnings.warn(
use fc encoder
2022-11-10 10:48:46,928 WARNING sample.py:401 -- DeprecationWarning: wrapping <function run_cc.. at 0x7f4908387700> with tune.function() is no longer needed
2022-11-10 10:48:47,099 WARNING worker.py:496 -- ray.get_gpu_ids() will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
:task_name:bundle_reservation_check_func
:actor_name:MAA2CTrainer
2022-11-10 10:48:47,183 WARNING deprecation.py:38 -- DeprecationWarning: simple_optimizer has been deprecated. This will raise an error in the future!
2022-11-10 10:48:47,183 INFO trainer.py:770 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
[2022-11-10 10:48:47,194 E 10507 10507] core_worker.cc:1561: Pushed Error with JobID: 01000000 of type: task with message: ray::MAA2CTrainer.__init__() (pid=10507, ip=10.31.217.80, repr=MAA2CTrainer)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 137, in __init__
Trainer.__init__(self, config, env, logger_creator)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 623, in __init__
super().__init__(config, logger_creator)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 107, in __init__
self.setup(copy.deepcopy(self.config))
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 147, in setup
super().setup(config)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 776, in setup
self._init(self.config, self.env_creator)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 171, in _init
self.workers = self._make_workers(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 858, in _make_workers
return WorkerSet(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 110, in __init__
self._local_worker = self._make_worker(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 406, in _make_worker
worker = cls(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 584, in __init__
self._build_policy_map(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1384, in build_policy_map
self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/policy/policy_map.py", line 143, in create_policy
self[policy_id] = class(observation_space, action_space,
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/policy/policy_template.py", line 241, in __init__
dist_class, logit_dim = ModelCatalog.get_action_dist(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/models/catalog.py", line 287, in get_action_dist
raise NotImplementedError("Unsupported args: {} {}".format(
NotImplementedError: Unsupported args: Discrete(5) None at time: 1.66805e+09
2022-11-10 10:48:47,195 ERROR actor.py:746 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::MAA2CTrainer.__init__() (pid=10507, ip=10.31.217.80)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 137, in __init__
Trainer.__init__(self, config, env, logger_creator)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 623, in __init__
super().__init__(config, logger_creator)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 107, in __init__
self.setup(copy.deepcopy(self.config))
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 147, in setup
super().setup(config)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 776, in setup
self._init(self.config, self.env_creator)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 171, in _init
self.workers = self._make_workers(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 858, in _make_workers
return WorkerSet(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 110, in __init__
self._local_worker = self._make_worker(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 406, in _make_worker
worker = cls(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 584, in __init__
self._build_policy_map(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1384, in build_policy_map
self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/policy/policy_map.py", line 143, in create_policy
self[policy_id] = class(observation_space, action_space,
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/policy/policy_template.py", line 241, in __init__
dist_class, logit_dim = ModelCatalog.get_action_dist(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/models/catalog.py", line 287, in get_action_dist
raise NotImplementedError("Unsupported args: {} {}".format(
NotImplementedError: Unsupported args: Discrete(5) None
[2022-11-10 10:48:47,197 E 10507 10507] core_worker.cc:1561: Pushed Error with JobID: 01000000 of type: task with message: ray::MAA2CTrainer.get_auto_filled_metrics()::Exiting (pid=10507, ip=10.31.217.80, repr=MAA2CTrainer)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 179, in get_auto_filled_metrics
NODE_IP: self._local_ip,
AttributeError: 'MAA2CTrainer' object has no attribute '_local_ip' at time: 1.66805e+09
[2022-11-10 10:48:47,701 E 10507 10507] core_worker.cc:1561: Pushed Error with JobID: 01000000 of type: task with message: ray::MAA2CTrainer.train_buffered()::Exiting (pid=10507, ip=10.31.217.80, repr=MAA2CTrainer)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 224, in train_buffered
result = self.train()
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 682, in train
raise e
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 668, in train
result = Trainable.train(self)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 283, in train
result = self.step()
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 206, in step
step_results = next(self.train_exec_impl)
AttributeError: 'MAA2CTrainer' object has no attribute 'train_exec_impl' at time: 1.66805e+09
Traceback (most recent call last):
File "marl/main.py", line 42, in
run_cc(config_dict)
File "/home/zyy/Documents/rl/MARLlib/marl/algos/run_cc.py", line 182, in run_cc
results = POlICY_REGISTRY[config_dict["algorithm"]](config_dict, common_config, env_info_dict, stop)
File "/home/zyy/Documents/rl/MARLlib/marl/algos/scripts/maa2c.py", line 45, in run_maa2c
results = tune.run(MAA2CTrainer,
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/tune.py", line 603, in run
_report_progress(runner, progress_reporter)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/tune.py", line 68, in _report_progress
reporter.report(trials, done, sched_debug_str, executor_debug_str)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/progress_reporter.py", line 520, in report
print(self._progress_str(trials, done, *sys_info))
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/progress_reporter.py", line 279, in _progress_str
user_metrics = self._infer_user_metrics(trials, self._infer_limit)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/progress_reporter.py", line 325, in _infer_user_metrics
if not t.last_result:
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trial.py", line 433, in last_result
self._get_default_result_or_future()
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trial.py", line 409, in _get_default_result_or_future
self._default_result_or_future = ray.get(
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/worker.py", line 1625, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::MAA2CTrainer.get_auto_filled_metrics()::Exiting (pid=10507, ip=10.31.217.80, repr=MAA2CTrainer)
File "/home/zyy/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 179, in get_auto_filled_metrics
NODE_IP: self._local_ip,
AttributeError: 'MAA2CTrainer' object has no attribute '_local_ip'
Hi, does this lib support running jobs on ray cluster, e.g., ray k8s cluster?
I am trying to run MADDPG in MPE (with simple_adversary). However, the algorithm gets stuck, and the log shows the trial as pending.
I saw a similar issue here: ray-project/ray#16425. It says this is due to resource allocation.
I am using a 16-core CPU and a GTX 1650s GPU, so I tried setting the ray.yaml file as follows:
num_workers: 2
num_gpus: 1
num_cpus_per_worker: 8
num_gpus_per_worker: 0.5
I also tried several different options such as:
num_workers: 2
num_gpus: 1
num_cpus_per_worker: 1
num_gpus_per_worker: 0.3
However, no matter how I alter the settings, the algorithm still gets stuck with the trial pending.
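One way to sanity-check the two settings above is to add up what a single trial would request. The sketch below assumes that these ray.yaml keys are forwarded unchanged to RLlib, and that RLlib requests one driver bundle (`num_cpus_for_driver` CPUs, defaulting to 1, plus `num_gpus` GPUs) and one bundle per rollout worker; the function names are hypothetical.

```python
def requested_resources(num_workers, num_gpus, num_cpus_per_worker,
                        num_gpus_per_worker, num_cpus_for_driver=1):
    """Total (CPUs, GPUs) one trial asks for: driver bundle plus one bundle per worker."""
    cpus = num_cpus_for_driver + num_workers * num_cpus_per_worker
    gpus = num_gpus + num_workers * num_gpus_per_worker
    return cpus, gpus


def fits(request, available):
    """True if every requested amount is within what the machine offers."""
    return all(r <= a for r, a in zip(request, available))


machine = (16, 1)  # 16 CPU cores and 1 GPU, as described above

first = requested_resources(num_workers=2, num_gpus=1,
                            num_cpus_per_worker=8, num_gpus_per_worker=0.5)
second = requested_resources(num_workers=2, num_gpus=1,
                             num_cpus_per_worker=1, num_gpus_per_worker=0.3)

for name, request in (("first", first), ("second", second)):
    print(name, request, "fits" if fits(request, machine) else "does not fit")
```

Under these assumptions both settings ask for more than the single available GPU (and the first also for more than 16 CPUs), which would leave the trial pending. Keeping `num_gpus: 1` and setting `num_gpus_per_worker: 0` would bring the GPU total down to 1.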
Can the configuration files be changed so that each agent uses a different algorithm during training? For example, one agent uses the IPPO algorithm while another agent uses the IQL algorithm.
Hello,
We have recently developed a vectorised version of MPE with more environments and robotics scenarios.
https://github.com/proroklab/VectorizedMultiAgentSimulator
It is by default compatible with the VectorEnv RLLib interface.
Would this work straight away in your framework? Are you interested in adding it to the list of supported envs?
Have a look at the project and let me know.
Best,
Matteo
Hi, in the context of my research I made my own environment, and I am using RLlib to solve it, but without much success so far. I came across this project, find it amazing, and have two questions:
Can the MARLlib algorithms solve a custom environment? From the documentation it seems that they are only available for specific environments.
Are these algorithms implemented following the RLlib API? Would the RLlib team be interested in integrating them into the project?
Best, and thank you
Hello,
In my lab we have created a MARL simulator and benchmarking platform called VMAS:
https://github.com/proroklab/VectorizedMultiAgentSimulator.
It is a vectorized simulator using pytorch which contains all the Multiagent Particle Environments scenarios and an additional set of 12 multi-robot scenarios.
Have a look at the repo, it would be nice to make our environments available in your project and it should be pretty easy since we support the RLLib VectorEnv interface.
Any plan to update the base Ray version in the future?
Great job! I like this comprehensive benchmark
I just developed CoPO, a MARL algorithm explicitly modeling the coordination between self-interested agents, based on latest RLLib ray=2.2.0. If this repo is compatible I would like to contribute my code to enrich this project.
Hello there, I'm really impressed with the work that your project has accomplished so far. However, I noticed that it seems to only support the 15 environments that are currently mentioned in the readme and guide. I'm interested in using your framework for my work and I'm wondering if it would be possible to implement support for additional or custom environments. Alternatively, would it be something that you plan to include in your future work?
Hi,
It seems a lot of the finetuned hyperparameters from the main branch are missing in the sy_dev branch. Are the hyperparameters from the main branch still valid in the new sy_dev branch as well?
Thank you.
After testing HAPPO, I found that happo_surrogate_loss does not consider the other agents when computing the loss for each agent. Is this a problem?
Got this error after installing MARLLib on a fresh conda environment 😥
The script I ran is the first example in README.md. I had also run marllib/patch/add_patch.py beforehand.
PS: It can be fixed by manually downgrading gym to <=0.21.0, but I'm not sure whether that breaks something else.
The error is as follows:
(marllib) [scripts]$ pip install gym==1.21.0
ERROR: Could not find a version that satisfies the requirement gym==1.21.0 (from versions: 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.1.0, 0.1.1, 0.1.2, 0.1.3, 0.1.4, 0.1.5, 0.1.6, 0.1.7, 0.2.0, 0.2.1, 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.2.6, 0.2.7, 0.2.8, 0.2.9, 0.2.10, 0.2.11, 0.2.12, 0.3.0, 0.4.0, 0.4.1, 0.4.2, 0.4.3, 0.4.4, 0.4.5, 0.4.6, 0.4.8, 0.4.9, 0.4.10, 0.5.0, 0.5.1, 0.5.2, 0.5.3, 0.5.4, 0.5.5, 0.5.6, 0.5.7, 0.6.0, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.8.0.dev0, 0.8.0, 0.8.1, 0.8.2, 0.9.0, 0.9.1, 0.9.2, 0.9.3, 0.9.4, 0.9.5, 0.9.6, 0.9.7, 0.10.0, 0.10.1, 0.10.2, 0.10.3, 0.10.4, 0.10.5, 0.10.8, 0.10.9, 0.10.11, 0.11.0, 0.12.0, 0.12.1, 0.12.4, 0.12.5, 0.12.6, 0.13.0, 0.13.1, 0.14.0, 0.15.3, 0.15.4, 0.15.6, 0.15.7, 0.16.0, 0.17.0, 0.17.1, 0.17.2, 0.17.3, 0.18.0, 0.18.3, 0.19.0, 0.20.0, 0.21.0, 0.22.0, 0.23.0, 0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.25.1, 0.25.2, 0.26.0, 0.26.1, 0.26.2)
ERROR: No matching distribution found for gym==1.21.0
Do you mean 0.21.0?
Hi there, I ran the example given in the README, and the model basically doesn't seem to learn at all. At timestep 0 the reward was -116.198, with episode_reward_max -69 and min -191.415. By the end of training (313 iterations, 1001600 timesteps) the reward min is lower (-206.354), the reward max is only marginally higher (-60.4847), and the reward has barely changed (-105.824).
Is this the expected result? I am using the model exactly as provided in the README file.
Please advise.
Thanks.
Hi! I'm glad to see the latest upgrade of MARLlib with up-to-date documentation. But I'm a bit confused about the relationship between this API-based usage and the previous console-based usage, so I want to confirm something:
There is no main.py anymore; does this mean that the console-based usage is completely deprecated?
# build agent model based on env + algorithms + user preference
model = marl.build_model(env, mappo, {"core_arch": "mlp", "encode_layer": "128-256"})
# start training
mappo.fit(env, model, stop={"timesteps_total": 1000000}, checkpoint_freq=100, share_policy="group")
Is there full documentation of what exactly can be configured? Or should I just refer to Ray's documentation?
Hello,
I tried the inference method mentioned in one of the issues, but I am running into problems loading the environment with the config files saved after training. I trained a custom environment with MAPPO and saved the checkpoints. Any help on this would be appreciated.
There is a large mismatch between the README and the docs. The docs seem to be for an older version of the project. For example, the example training script from the docs does not work anymore (I believe it has been replaced by the main.py file so that the code can be run from the terminal?).
Also, what version of PettingZoo are you using? I am trying to run the MAgent examples, but by now there is an MAgent2 library which has taken over the PettingZoo[magent] repo. I have tried to find the older version of PettingZoo that matches your code to run the main.py file, but I cannot find it, and I keep getting errors because of this. Can you provide it, please, and also update your docs/readme asap? This is important for the long-term health of your project.
Thank you.
Hi
Could you please add the following into MARLlib
I want to use the MetaDrive environment, but it shows the same error: "No module named 'metadrive.envs'"
I'm using
gym 0.20.0
metadrive 1.4.9.1
metadrive-simulator 0.2.3 (as written in the document)
Can you give me some help? Thank you.
Hello,
I am trying to run marllib and am running into issues. After following the installation for the PettingZoo MPE environment, I am running into the following error:
ray.tune.error.TuneError: ('Trials did not complete', [IPPOTrainer_mpe_simple_adversary_227a8_00000])
This is caused by the following error:
(pid=8044) File "path/to/.virtualenvs/venv/lib/python3.8/site-packages/ray/rllib/models/catalog.py", line 287, in get_action_dist
(pid=8044) raise NotImplementedError("Unsupported args: {} {}".format(
(pid=8044) NotImplementedError: Unsupported args: Box(0.0, 1.0, (5,), float32) None
I am running it with the following config:
{
"local_mode":false,
"algorithm":"ppo",
"env":"mpe",
"env_args":{
"map_name":"simple_adversary",
"continuous_actions":true,
"max_cycles":25
},
"algo_args":{
"use_gae":true,
"lambda":1.0,
"kl_coeff":0.2,
"batch_episode":10,
"num_sgd_iter":5,
"vf_loss_coeff":1.0,
"lr":0.0005,
"entropy_coeff":0.01,
"clip_param":0.3,
"vf_clip_param":10.0,
"batch_mode":"complete_episodes"
},
"model_arch_args":{
},
"share_policy":"group",
"evaluation_interval":10,
"framework":"torch",
"num_workers":0,
"num_gpus":0,
"num_cpus_per_worker":1,
"num_gpus_per_worker":0,
"stop_iters":9999999,
"stop_timesteps":2000000,
"stop_reward":999999,
"seed":123,
"mask_flag":false,
"global_state_flag":false,
"opp_action_in_cc":true
}
I am using WSL2 with a virtual environment inside. This is my pip list:
Package Version Editable project location
-------------------- ---------- -------------------------------
aiosignal 1.3.1
asttokens 2.2.1
async-timeout 4.0.2
attrs 22.2.0
certifi 2022.12.7
charset-normalizer 3.0.1
click 8.1.3
cloudpickle 2.2.1
colorama 0.4.6
contourpy 1.0.7
cycler 0.11.0
distlib 0.3.6
dm-tree 0.1.8
executing 1.2.0
filelock 3.9.0
fonttools 4.38.0
frozenlist 1.3.3
grpcio 1.51.3
gym 0.21.0
gym-notices 0.0.8
gymnasium 0.27.1
gymnasium-notices 0.0.1
icecream 2.1.3
idna 3.4
imageio 2.26.0
importlib-metadata 4.13.0
importlib-resources 5.12.0
jax-jumpy 0.2.0
jsonschema 4.17.3
kiwisolver 1.4.4
lz4 4.3.2
marllib 1.0.0 /path/to/MARLlib
matplotlib 3.7.0
msgpack 1.0.4
networkx 3.0
numpy 1.24.2
packaging 23.0
pandas 1.5.3
PettingZoo 1.22.3
Pillow 9.4.0
pip 22.3.1
pkgutil_resolve_name 1.3.10
platformdirs 3.0.0
protobuf 3.20.3
pygame 2.1.3.dev8
Pygments 2.14.0
pyparsing 3.0.9
pyrsistent 0.19.3
python-dateutil 2.8.2
pytz 2022.7.1
PyWavelets 1.4.1
PyYAML 6.0
ray 1.8.0
redis 4.5.1
requests 2.28.2
scikit-image 0.19.3
scipy 1.10.1
setuptools 65.5.1
six 1.16.0
SuperSuit 3.7.1
tabulate 0.9.0
tensorboardX 2.6
tifffile 2023.2.3
torch 1.9.1
typing_extensions 4.5.0
urllib3 1.26.14
virtualenv 20.19.0
wheel 0.38.4
zipp 3.15.0
What could cause this error? It seems to be an incompatibility issue with gym.
Thanks in advance!
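The pip list above shows gym 0.21.0 and gymnasium 0.27.1 installed side by side, together with PettingZoo 1.22.3 and ray 1.8.0. My guess (not confirmed in this thread) is that this PettingZoo release builds its action spaces from gymnasium while ray 1.8.0 only recognizes gym spaces, which would explain why an ordinary Box is reported as unsupported. A small sketch for reporting the versions that matter, using only the standard library (the package list is my own choice):

```python
from importlib import metadata


def installed_versions(packages=("gym", "gymnasium", "pettingzoo", "ray")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    found = installed_versions()
    for name, version in found.items():
        print(f"{name:12s} {version or 'not installed'}")
    # Having both packages installed is not an error by itself, but it makes
    # it easy for an env to hand over spaces the trainer does not recognize.
    if found["gym"] and found["gymnasium"]:
        print("Both gym and gymnasium are installed; "
              "check which one the env's spaces come from.")
```

If the guess is right, installing the PettingZoo version named in the MARLlib installation instructions, rather than the latest release, would be the first thing to try.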
When I apply qmix to the mpe environment like this:
python marl/main.py --algo_config=qmix --env_config=mpe with env_args.map_name=simple_spread
it produces the problem below:
ValueError: illegal action space
I checked the value of space_act: it is Discrete(5), which should be an instance of Discrete, but isinstance(space_act, Discrete) returns False.
Why does this happen? Looking forward to your help.
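One possible explanation, assuming the environment builds its spaces with a different package than the one the check imports (for example gymnasium versus gym): two classes can share the name Discrete and print identically while being unrelated types, in which case isinstance returns False. The sketch below reproduces this with two stand-in classes rather than the real libraries, and shows how to print where a space's class actually lives:

```python
def make_discrete(module_name):
    """Build a stand-in Discrete class that claims to live in module_name."""
    class Discrete:
        def __init__(self, n):
            self.n = n

        def __repr__(self):
            return f"Discrete({self.n})"

    Discrete.__module__ = module_name
    Discrete.__qualname__ = "Discrete"
    return Discrete


def origin(obj):
    """Fully qualified class name, which reveals the defining package."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


# Stand-ins only: these are NOT the real gym / gymnasium classes.
GymDiscrete = make_discrete("gym.spaces.discrete")
GymnasiumDiscrete = make_discrete("gymnasium.spaces.discrete")

space_act = GymnasiumDiscrete(5)
print(space_act)                           # same text for either class
print(isinstance(space_act, GymDiscrete))  # the check that fails
print(origin(space_act))                   # where the class really comes from
```

On the real object, print(type(space_act).__module__) gives the same information and should show quickly whether the space and the check come from the same package.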
Hi,
Brilliant repo on MARL benchmarks!
I encountered issues with the action space when running training on MPE maps. On mpe_cooperative, I run
python marl/main.py --algo_config="a2c" --finetuned --env_config="mpe" with env_args.map_name="simple_spread"
and get the following error (I followed the installation instructions, with Python 3.8.0 and gym==0.21.0):
Failure # 1 (occurred at 2022-12-09_06-28-12)
Traceback (most recent call last):
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 890, in _process_trial
results = self.trial_executor.fetch_result(trial)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 788, in fetch_result
result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/worker.py", line 1627, in get
raise value
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::IA2CTrainer.__init__() (pid=25756, ip=10.103.0.40)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 137, in __init__
Trainer.__init__(self, config, env, logger_creator)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 623, in __init__
super().__init__(config, logger_creator)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 107, in __init__
self.setup(copy.deepcopy(self.config))
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 147, in setup
super().setup(config)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 776, in setup
self._init(self.config, self.env_creator)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 171, in _init
self.workers = self._make_workers(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 858, in _make_workers
return WorkerSet(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 110, in __init__
self._local_worker = self._make_worker(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 406, in _make_worker
worker = cls(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 539, in __init__
policy_dict = _determine_spaces_for_multi_agent_dict(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1486, in _determine_spaces_for_multi_agent_dict
raise ValueError(
ValueError: `action_space` not provided in PolicySpec for shared_policy and env does not have an action space OR no spaces received from other workers' env(s) OR no `action_space` specified in config!
In addition, I get the following error when running on mpe_mixed:
Failure # 1 (occurred at 2022-12-09_06-26-45)
Traceback (most recent call last):
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 890, in _process_trial
results = self.trial_executor.fetch_result(trial)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 788, in fetch_result
result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/worker.py", line 1627, in get
raise value
ray.exceptions.RayActorError: The actor died because of an error raised in its creation task, ray::IA2CTrainer.__init__() (pid=24584, ip=10.103.0.40)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 137, in __init__
Trainer.__init__(self, config, env, logger_creator)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 623, in __init__
super().__init__(config, logger_creator)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/tune/trainable.py", line 107, in __init__
self.setup(copy.deepcopy(self.config))
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 147, in setup
super().setup(config)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 776, in setup
self._init(self.config, self.env_creator)
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 171, in _init
self.workers = self._make_workers(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 858, in _make_workers
return WorkerSet(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 110, in __init__
self._local_worker = self._make_worker(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 406, in _make_worker
worker = cls(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 584, in __init__
self._build_policy_map(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1384, in _build_policy_map
self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/policy/policy_map.py", line 143, in create_policy
self[policy_id] = class_(observation_space, action_space,
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/policy/policy_template.py", line 241, in __init__
dist_class, logit_dim = ModelCatalog.get_action_dist(
File "/home/yansong/anaconda3/envs/marllib/lib/python3.8/site-packages/ray/rllib/models/catalog.py", line 287, in get_action_dist
raise NotImplementedError("Unsupported args: {} {}".format(
NotImplementedError: Unsupported args: Discrete(5) None
Any thoughts on these?
Many thanks.
Hello,
We have created our custom environment and wrapped it in a gym class. After training with MAPPO, we got checkpoint files including params and .pkl files.
We now want to use the trained policy to evaluate certain observations that we feed to it. How should we go about this?
Could we have the latest versions of the MuJoCo environments? For example Half_Cheetah_v3, since this one has different properties from the previous versions.
Hi, thank you very much for your work, which has inspired me a lot, but I still have some questions.
First, I want to deploy this project on Windows, but there are some errors. The error at mul_manager = multiprocessing.Manager() in marl\algos\utils\centralized_critic_hetero.py looks like the following:
d:\廖文华\code\MARLlib
Backend TkAgg is interactive backend. Turning interactive mode on.
d:\廖文华\code\MARLlib
Backend TkAgg is interactive backend. Turning interactive mode on.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "d:\廖文华\code\MARLlib\marl\main.py", line 9, in <module>
from marl.algos.run_il import run_il
File "d:\廖文华\code\MARLlib\marl\algos\run_il.py", line 15, in <module>
from marl.algos.scripts import POlICY_REGISTRY
File "d:\廖文华\code\MARLlib\marl\algos\scripts\__init__.py", line 13, in <module>
from marl.algos.scripts.happo import run_happo
File "d:\廖文华\code\MARLlib\marl\algos\scripts\happo.py", line 4, in <module>
from marl.algos.core.CC.happo import HAPPOTrainer
File "d:\廖文华\code\MARLlib\marl\algos\core\CC\happo.py", line 24, in <module>
from marl.algos.utils.centralized_critic_hetero import (
File "d:\廖文华\code\MARLlib\marl\algos\utils\centralized_critic_hetero.py", line 16, in <module>
mul_manager = multiprocessing.Manager()
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\context.py", line 57, in Manager
m.start()
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\managers.py", line 579, in start
self._process.start()
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "C:\ProgramData\Anaconda3\envs\muti_agent\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
I don't know why this happens or how to solve it. Could you please give me some advice?
Second, could you please share your pip list or conda list? I want to see the versions of some libraries used in your project. Thanks!
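On the RuntimeError above: the traceback shows multiprocessing.Manager() being called at import time in centralized_critic_hetero.py. On Windows, child processes are started with spawn, which re-imports the main module, so anything that launches a process during import triggers exactly this message. A minimal sketch of the usual idiom, assuming the manager can be created lazily instead of at module level (get_manager and main are hypothetical names, not MARLlib functions):

```python
import multiprocessing

_manager = None  # created on first use, never at import time


def get_manager():
    """Return a shared Manager, starting its server process only when needed."""
    global _manager
    if _manager is None:
        _manager = multiprocessing.Manager()
    return _manager


def main():
    # Processes are only started from here, after the module has been
    # fully imported, so a spawned child can re-import it safely.
    shared = get_manager().dict()
    shared["ready"] = True
    return dict(shared)


if __name__ == "__main__":
    # The guard keeps spawned children from re-running the entry point.
    multiprocessing.freeze_support()
    print(main())
```

Whether MARLlib's module can be restructured this way is for the maintainers to say. Running under Linux or WSL, where Python 3.8 starts processes with fork, avoids the re-import and may be the quicker workaround.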
Would it be possible to provide the trained models (weights) as well?
Moreover, when looking at the results, I could not find the results for the MAgent Gather game, though it is mentioned in the docs and in the MAgent env file.