Comments (41)

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

Are you running on a server without a display?

If not then you can try DISPLAY="" python maniskill2_learn/...

from maniskill.

xtli12 avatar xtli12 commented on August 23, 2024

Thanks for your immediate reply! I am running on a server with a display, and I checked the value of DISPLAY with

echo $DISPLAY
0.0

But when I try DISPLAY="0.0" python maniskill2_learn/... or DISPLAY="" python maniskill2_learn/... the error is the same:

[2023-04-10 07:15:38.930] [svulkan2] [error] GLFW error: X11: Failed to open display 
[2023-04-10 07:15:38.930] [svulkan2] [warning] Continue without GLFW.
Traceback (most recent call last):
  File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
    main()
  File "maniskill2_learn/apis/run_rl.py", line 452, in main
    run_one_process(0, 1, args, cfg)
  File "maniskill2_learn/apis/run_rl.py", line 374, in run_one_process
    rollout = build_rollout(rollout_cfg)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/builder.py", line 15, in build_rollout
    return build_from_cfg(cfg, ROLLOUTS, default_args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/utils/meta/registry.py", line 136, in build_from_cfg
    return obj_cls(**args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/rollout.py", line 15, in __init__
    self.vec_env = build_vec_env(env_cfg, num_procs, seed=seed, **kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/env_utils.py", line 226, in build_vec_env
    vec_env = VectorEnv(cfgs, **vec_env_kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/vec_env.py", line 318, in __init__
    super(VectorEnv, self).__init__(env_cfgs=env_cfgs, **kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/vec_env.py", line 180, in __init__
    self.env_cfgs, self.single_env, self.num_envs = env_cfgs, build_env(env_cfgs[0]), len(env_cfgs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/env_utils.py", line 205, in build_env
    return build_from_cfg(cfg, ENVS)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/utils/meta/registry.py", line 136, in build_from_cfg
    return obj_cls(**args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/env_utils.py", line 155, in make_gym_env
    env = gym.make(env_name, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 184, in make
    return registry.make(id, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 106, in make
    env = spec.make(**kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 73, in make
    env = self.entry_point(**_kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/utils/registration.py", line 92, in make
    env = env_spec.make(**kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/utils/registration.py", line 34, in make
    return self.cls(**_kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/envs/ms1/base_env.py", line 55, in __init__
    super().__init__(*args, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/envs/sapien_env.py", line 102, in __init__
    self._renderer = sapien.SapienRenderer(**renderer_kwargs)
RuntimeError: vk::PhysicalDevice::createDeviceUnique: ErrorInitializationFailed

But I had already solved the Vulkan problem when I was training on "PickCube-v0" and "StackCube-v0", so how can I solve it here?

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

If you open an interactive Python session and run

import mani_skill2.envs, gym
env=gym.make('MoveBucket-v0', obs_mode='rgbd')
obs=env.reset()

Does it show initialization failed?

xtli12 avatar xtli12 commented on August 23, 2024

When I go to interactive python and run

import mani_skill2.envs, gym
env=gym.make('MoveBucket-v0', obs_mode='rgbd')
obs=env.reset()

it shows:

>>> import mani_skill2.envs, gym
>>> env=gym.make('MoveBucket-v0', obs_mode='rgbd')
Traceback (most recent call last):
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 150, in spec
    return self.env_specs[id]
KeyError: 'MoveBucket-v0'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 184, in make
    return registry.make(id, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 105, in make
    spec = self.spec(path)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 161, in spec
    raise error.DeprecatedEnv(
gym.error.DeprecatedEnv: Env MoveBucket-v0 not found (valid versions include ['MoveBucket-v1'])
>>> obs=env.reset()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'env' is not defined

Does this mean I haven't downloaded MoveBucket-v0? I have already downloaded the assets with

python -m mani_skill2.utils.download_asset partnet_mobility_bucket

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

oh sorry, it should be MoveBucket-v1, not v0
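
The gym error above already lists the valid versions. As a rough illustration only (this is not gym's own API), a hypothetical helper `suggest_versions` can do the same lookup over any list of registered IDs; the ID list below is made up for the example.

```python
import re

def suggest_versions(env_id, registered_ids):
    """Return registered IDs that share env_id's base name but differ in version."""
    match = re.fullmatch(r"(.+)-v(\d+)", env_id)
    if match is None:
        return []
    base = match.group(1)
    return sorted(
        rid for rid in registered_ids
        if rid != env_id and re.fullmatch(re.escape(base) + r"-v\d+", rid)
    )

# Hypothetical registry contents, for illustration only.
registered = ["MoveBucket-v1", "PickCube-v0", "StackCube-v0"]
print(suggest_versions("MoveBucket-v0", registered))  # ['MoveBucket-v1']
```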

xtli12 avatar xtli12 commented on August 23, 2024

it shows initialization failed:

>>> import mani_skill2.envs, gym
>>> env=gym.make('MoveBucket-v1', obs_mode='rgbd')
Segmentation fault (core dumped)

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

Try
DISPLAY="" python

xtli12 avatar xtli12 commented on August 23, 2024

It shows:

>>> DISPLAY=""
>>> 

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

I mean

DISPLAY="" python

then create the MoveBucket env in the interactive Python session

xtli12 avatar xtli12 commented on August 23, 2024

It shows:

>>> DISPLAY=""
>>> import mani_skill2.envs, gym
>>> env=gym.make('MoveBucket-v1', obs_mode='rgbd')
[2023-04-10 08:38:11.157] [svulkan2] [error] GLFW error: X11: Failed to open display “”
[2023-04-10 08:38:11.157] [svulkan2] [warning] Continue without GLFW.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 184, in make
    return registry.make(id, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 106, in make
    env = spec.make(**kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 73, in make
    env = self.entry_point(**_kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/utils/registration.py", line 92, in make
    env = env_spec.make(**kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/utils/registration.py", line 34, in make
    return self.cls(**_kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/envs/ms1/base_env.py", line 55, in __init__
    super().__init__(*args, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/envs/sapien_env.py", line 102, in __init__
    self._renderer = sapien.SapienRenderer(**renderer_kwargs)
RuntimeError: vk::PhysicalDevice::createDeviceUnique: ErrorInitializationFailed

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

https://github.com/haosulab/ManiSkill2/issues/73#issuecomment-1489473119

linking this as a reference

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

Oh, you need to first exit the interactive python, then

DISPLAY="" python

to reenter the interactive Python with display env variable set to empty,

then create movebucket env inside the interactive python
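
To illustrate why typing DISPLAY="" at the >>> prompt changes nothing: that only creates a Python variable, while the renderer reads the process environment. A minimal sketch, assuming a Linux shell and using a hypothetical helper `display_seen_by_child` to show what a freshly started Python process sees:

```python
import os
import subprocess
import sys

def display_seen_by_child(display_value):
    """Start a child Python process with DISPLAY overridden and report what it sees."""
    env = dict(os.environ, DISPLAY=display_value)
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(repr(os.environ.get('DISPLAY')))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

DISPLAY = ""  # a plain Python variable: the process environment is unchanged
print(display_seen_by_child(""))  # the child process sees DISPLAY as ''
```

Running `DISPLAY="" python` from the shell has the same effect as the `env=` argument here: the new interpreter starts with the variable already set.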

xtli12 avatar xtli12 commented on August 23, 2024

It shows:

(sapien) lxt21@ubuntu:~/SAPIEN-master/ManiSkill2-Learn-main$ DISPLAY="" python
Python 3.8.0 (default, Nov  6 2019, 21:49:08) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mani_skill2.envs, gym
>>> env=gym.make('MoveBucket-v1', obs_mode='rgbd')
[2023-04-10 08:43:04.682] [svulkan2] [error] GLFW error: X11: Failed to open display 
[2023-04-10 08:43:04.682] [svulkan2] [warning] Continue without GLFW.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 184, in make
    return registry.make(id, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 106, in make
    env = spec.make(**kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 73, in make
    env = self.entry_point(**_kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/utils/registration.py", line 92, in make
    env = env_spec.make(**kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/utils/registration.py", line 34, in make
    return self.cls(**_kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/envs/ms1/base_env.py", line 55, in __init__
    super().__init__(*args, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/envs/sapien_env.py", line 102, in __init__
    self._renderer = sapien.SapienRenderer(**renderer_kwargs)
RuntimeError: vk::PhysicalDevice::createDeviceUnique: ErrorInitializationFailed

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

Did you check that the 3 nvidia json files exist? See https://haosulab.github.io/ManiSkill2/getting_started/installation.html#vulkan

xtli12 avatar xtli12 commented on August 23, 2024

Yes, the 3 nvidia json files already exist.
[3 screenshots]

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

Do they have the right content?
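
A quick way to check at least that the files exist and parse as JSON is sketched below. The three paths are an assumption based on the linked installation page and should be verified against it; the helper `check_json_files` is hypothetical, and it only tests that each file is valid JSON, not that the values inside are correct.

```python
import json
from pathlib import Path

# Assumed locations; check them against the linked installation page.
NVIDIA_JSON_FILES = [
    "/usr/share/vulkan/icd.d/nvidia_icd.json",
    "/usr/share/glvnd/egl_vendor.d/10_nvidia.json",
    "/etc/vulkan/implicit_layer.d/nvidia_layers.json",
]

def check_json_files(paths):
    """Map each path to 'missing', 'invalid JSON', or 'ok'."""
    report = {}
    for path in paths:
        p = Path(path)
        if not p.is_file():
            report[path] = "missing"
            continue
        try:
            json.loads(p.read_text())
            report[path] = "ok"
        except (json.JSONDecodeError, UnicodeDecodeError):
            report[path] = "invalid JSON"
    return report

for path, status in check_json_files(NVIDIA_JSON_FILES).items():
    print(f"{status:>12}  {path}")
```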

xtli12 avatar xtli12 commented on August 23, 2024

The only difference in content between them is the api_version:
[2 screenshots]
After I changed both of them to "api_version" : "1.2.155", the error still occurred:

(sapien) lxt21@ubuntu:~/SAPIEN-master/ManiSkill2-Learn-main$ DISPLAY="" python
Python 3.8.0 (default, Nov  6 2019, 21:49:08) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mani_skill2.envs, gym
>>> env=gym.make('MoveBucket-v1', obs_mode='rgbd')
[2023-04-10 09:11:02.929] [svulkan2] [error] GLFW error: X11: Failed to open display 
[2023-04-10 09:11:02.929] [svulkan2] [warning] Continue without GLFW.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 184, in make
    return registry.make(id, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 106, in make
    env = spec.make(**kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/envs/registration.py", line 73, in make
    env = self.entry_point(**_kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/utils/registration.py", line 92, in make
    env = env_spec.make(**kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/utils/registration.py", line 34, in make
    return self.cls(**_kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/envs/ms1/base_env.py", line 55, in __init__
    super().__init__(*args, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/envs/sapien_env.py", line 102, in __init__
    self._renderer = sapien.SapienRenderer(**renderer_kwargs)
RuntimeError: vk::PhysicalDevice::createDeviceUnique: ErrorInitializationFailed

xtli12 avatar xtli12 commented on August 23, 2024

Oh, I have found that the error Segmentation fault (core dumped) occurs when there are too many tasks on one GPU. When I stop some tasks, it is solved, but when I rerun:

python maniskill2_learn/apis/run_rl.py configs/mfrl/ppo/maniskill2_pn.py \
--work-dir /data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/Result/MoveBucket-v1/PPO \
--gpu-ids 0 --cfg-options "env_cfg.env_name=MoveBucket-v1" \
"env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "env_cfg.control_mode=pd_joint_delta_pos" "env_cfg.reward_mode=dense" \
"rollout_cfg.num_procs=5" "eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=True" "eval_cfg.num_procs=5"

A new error occurred:

AssertionError: pd_joint_delta_pos not in supported modes: ['base_pd_joint_vel_arm_pd_joint_vel', 
'base_pd_joint_vel_arm_pd_joint_delta_pos', 'base_pd_joint_vel_arm_pd_joint_pos', 
'base_pd_joint_vel_arm_pd_joint_target_delta_pos', 'base_pd_joint_vel_arm_pd_ee_delta_pos', 
'base_pd_joint_vel_arm_pd_ee_delta_pose', 'base_pd_joint_vel_arm_pd_ee_target_delta_pos', 
'base_pd_joint_vel_arm_pd_ee_target_delta_pose', 'base_pd_joint_vel_arm_pd_joint_pos_vel', 
'base_pd_joint_vel_arm_pd_joint_delta_pos_vel']

So I changed "env_cfg.control_mode=pd_joint_delta_pos" to "env_cfg.control_mode=base_pd_joint_vel_arm_pd_joint_vel", but another error occurred:

Traceback (most recent call last):
  File "maniskill2_learn/apis/run_rl.py", line 487, in <module>
    main()
  File "maniskill2_learn/apis/run_rl.py", line 452, in main
    run_one_process(0, 1, args, cfg)
  File "maniskill2_learn/apis/run_rl.py", line 374, in run_one_process
    rollout = build_rollout(rollout_cfg)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/builder.py", line 15, in build_rollout
    return build_from_cfg(cfg, ROLLOUTS, default_args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/utils/meta/registry.py", line 136, in build_from_cfg
    return obj_cls(**args)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/rollout.py", line 15, in __init__
    self.vec_env = build_vec_env(env_cfg, num_procs, seed=seed, **kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/env_utils.py", line 226, in build_vec_env
    vec_env = VectorEnv(cfgs, **vec_env_kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/vec_env.py", line 318, in __init__
    super(VectorEnv, self).__init__(env_cfgs=env_cfgs, **kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/vec_env.py", line 188, in __init__
    self.buffers = create_buffer_for_env(self.single_env, self.num_envs, self.SHARED_NP_BUFFER)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/vec_env.py", line 44, in create_buffer_for_env
    obs = env.reset()
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/wrappers.py", line 97, in reset
    obs = self.env.reset(*args, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/wrappers/time_limit.py", line 27, in reset
    return self.env.reset(**kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/wrappers.py", line 222, in reset
    return self.observation(obs)
  File "/data/home-gxu/lxt21/SAPIEN-master/ManiSkill2-Learn-main/maniskill2_learn/env/wrappers.py", line 377, in observation
    base_pose = observation["agent"]["base_pose"]
KeyError: 'base_pose'

How could I solve the error above?
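
On the earlier AssertionError: the supported names all appear to follow a base_..._arm_... pattern, so the arm-only name pd_joint_delta_pos has a direct counterpart in the list. A small sketch, assuming that naming pattern holds (the helper `matching_control_modes` is hypothetical and not part of ManiSkill2), finds it:

```python
def matching_control_modes(requested, supported):
    """Return supported modes whose arm part equals the requested mode."""
    suffix = "_arm_" + requested
    return [mode for mode in supported if mode.endswith(suffix)]

# Supported modes copied from the AssertionError above.
supported_modes = [
    "base_pd_joint_vel_arm_pd_joint_vel",
    "base_pd_joint_vel_arm_pd_joint_delta_pos",
    "base_pd_joint_vel_arm_pd_joint_pos",
    "base_pd_joint_vel_arm_pd_joint_target_delta_pos",
    "base_pd_joint_vel_arm_pd_ee_delta_pos",
    "base_pd_joint_vel_arm_pd_ee_delta_pose",
    "base_pd_joint_vel_arm_pd_ee_target_delta_pos",
    "base_pd_joint_vel_arm_pd_ee_target_delta_pose",
    "base_pd_joint_vel_arm_pd_joint_pos_vel",
    "base_pd_joint_vel_arm_pd_joint_delta_pos_vel",
]
print(matching_control_modes("pd_joint_delta_pos", supported_modes))
# ['base_pd_joint_vel_arm_pd_joint_delta_pos']
```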

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

I've just fixed ManiSkill2-Learn. Please pull the latest code.

xtli12 avatar xtli12 commented on August 23, 2024

Hi, I can now run my code under "env_name=MoveBucket-v1" with control_mode=base_pd_joint_vel_arm_pd_joint_delta_pos. However, when I evaluated my code locally, I got:

[2023-04-17 09:17:36.973] [svulkan2] [warning] A second renderer will share the same internal context with the first one. Arguments passed to constructor will be ignored.
Traceback (most recent call last):
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/evaluation/run_evaluation.py", line 151, in <module>
    main()
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/evaluation/run_evaluation.py", line 132, in main
    evaluator.setup(args.env_id, UserPolicy, env_kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/evaluation/run_evaluation.py", line 23, in setup
    super().setup(*args, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/mani_skill2/evaluation/evaluator.py", line 30, in setup
    self.policy = policy_cls(
  File "/data/home-gxu/lxt21/SAPIEN-master/0408_1_/user_solution.py", line 35, in __init__
    env_params = get_env_info(cfg.env_cfg)
  File "/data/home-gxu/lxt21/SAPIEN-master/0408_1_/maniskill2_learn/env/env_utils.py", line 83, in get_env_info
    vec_env = build_vec_env(env_cfg.copy()) if vec_env is None else vec_env
  File "/data/home-gxu/lxt21/SAPIEN-master/0408_1_/maniskill2_learn/env/env_utils.py", line 224, in build_vec_env
    vec_env = SingleEnv2VecEnv(cfgs, **vec_env_kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/0408_1_/maniskill2_learn/env/vec_env.py", line 266, in __init__
    self._init_obs_space()
  File "/data/home-gxu/lxt21/SAPIEN-master/0408_1_/maniskill2_learn/env/vec_env.py", line 201, in _init_obs_space
    self.observation_space = convert_observation_to_space(self.reset(idx=np.arange(self.num_envs)))
  File "/data/home-gxu/lxt21/SAPIEN-master/0408_1_/maniskill2_learn/env/vec_env.py", line 279, in reset
    return self._unsqueeze(self._env.reset(*args, **kwargs))
  File "/data/home-gxu/lxt21/SAPIEN-master/0408_1_/maniskill2_learn/env/wrappers.py", line 97, in reset
    obs = self.env.reset(*args, **kwargs)
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/site-packages/gym/wrappers/time_limit.py", line 27, in reset
    return self.env.reset(**kwargs)
  File "/data/home-gxu/lxt21/SAPIEN-master/0408_1_/maniskill2_learn/env/wrappers.py", line 222, in reset
    return self.observation(obs)
  File "/data/home-gxu/lxt21/SAPIEN-master/0408_1_/maniskill2_learn/env/wrappers.py", line 381, in observation
    pose = observation["extra"]["tcp_pose"]
KeyError: 'tcp_pose'

What steps should I take to resolve this issue? (The problem only occurs when I run my code on articulation tasks.)

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

Did you use the latest maniskill2-learn?

Please make sure the maniskill2-learn codebase you are importing from is the latest. I fixed these errors for articulation tasks last week (updated env/wrappers.py)

Also in the latest maniskill2_learn, line 384 in wrappers.py is not pose = observation["extra"]["tcp_pose"].
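
One way to confirm which copy of the package Python actually picks up is to ask the import system for its location. This is a minimal sketch using the standard library's importlib; the helper name `module_origin` is made up for the example.

```python
import importlib.util

def module_origin(name):
    """Return the file a module would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

# "maniskill2_learn" is the package from this thread; this prints None if it is not importable.
print(module_origin("maniskill2_learn"))
```

If the printed path points at an old copy of the code rather than the freshly pulled repository, the fix in env/wrappers.py is not the one being imported.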

xtli12 avatar xtli12 commented on August 23, 2024

Thank you for your generous help, the KeyError: 'tcp_pose' issue has been solved! However, I am facing another problem: I can only use GPU[0] for training. When I changed --gpu-ids 0 to --gpu-ids 1, I got:

RuntimeError: Cannot find cuda device suitable for rendering cuda:1

Even when I change to --gpu-ids 0 --sim-gpu-ids 1 or add CUDA_VISIBLE_DEVICES=1 before my training command, the error persists. What steps should I take to resolve this issue?

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

Try

DISPLAY="" CUDA_VISIBLE_DEVICES={gpu_id} python {cmd}

Put these into a minimal debug file, and replace {cmd} with the file name.

import mani_skill2.envs, gym
env = gym.make('MoveBucket-v1', obs_mode='rgbd')
obs = env.reset()
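
To try the same command on several GPUs in a row, the environment variables can be set per run from a small driver script. This is only a sketch: `probe_gpus` is a hypothetical helper, debug.py is an assumed file name for the three-line script above, and the GPU ids are examples.

```python
import os
import subprocess
import sys

def probe_gpus(cmd, gpu_ids):
    """Run cmd once per GPU id with DISPLAY="" and CUDA_VISIBLE_DEVICES set; return exit codes."""
    results = {}
    for gpu_id in gpu_ids:
        env = dict(os.environ, DISPLAY="", CUDA_VISIBLE_DEVICES=str(gpu_id))
        completed = subprocess.run(cmd, env=env, capture_output=True, text=True)
        results[gpu_id] = completed.returncode
    return results

if __name__ == "__main__":
    # debug.py is the assumed file name for the minimal script above.
    for gpu_id, code in probe_gpus([sys.executable, "debug.py"], [0, 1, 2]).items():
        status = "ok" if code == 0 else f"failed (exit code {code})"
        print(f"GPU {gpu_id}: {status}")
```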

xtli12 avatar xtli12 commented on August 23, 2024

when I try:

DISPLAY="" CUDA_VISIBLE_DEVICES=2 python debug.py 

the output was:
[screenshot]

When I put DISPLAY="" CUDA_VISIBLE_DEVICES=2 before my training script, the error [error] GLFW error: X11: Failed to open display still appears, but training starts on GPU[2]. Does the error [error] GLFW error: X11: Failed to open display matter?

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

It doesn't matter.

xtli12 avatar xtli12 commented on August 23, 2024

Thank you very much! By the way, there is some uncertainty around the timing of verifying the list of entries. Specifically, I'm not sure whether I should submit the list of entries after the prize results are published, or confirm the list of entries during the challenge period.

StoneT2000 avatar StoneT2000 commented on August 23, 2024

@xtli12 if you are referring to the final entry selection on your team page, you must do that before the competition ends.

xtli12 avatar xtli12 commented on August 23, 2024

Got it, thank you very much.

xtli12 avatar xtli12 commented on August 23, 2024

Hi, when I train the code under env_name=AssemblingKits-v0 or env_name=PegInsertionSide-v0, the result is always 0. Therefore, I want to modify the backbone. However, when I set a breakpoint in run_rl.py like this:
[screenshot]
and debug it, the following error message appears:

Traceback (most recent call last):
  File "/data/home-gxu/lxt21/.pycharm_helpers/pydev/pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/data/home-gxu/lxt21/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 11, in execfile
    stream = tokenize.open(file)  # @UndefinedVariable
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/tokenize.py", line 392, in open
    buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/SAPIEN-master/ManiSkill2-Learn-main-old/maniskill2_learn/apis/train_rl.py'
python-BaseException

How can I solve this issue? Alternatively, since one backbone file may be used by many other files, how can I set a breakpoint and debug the backbone in the ManiSkill2 project in a simple way?

I have also copied the class ConvMLP(ExtendedModule) into a single file, but even then, when I try to debug it, the error message still persists.

Traceback (most recent call last):
  File "/data/home-gxu/lxt21/.pycharm_helpers/pydev/pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/data/home-gxu/lxt21/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 11, in execfile
    stream = tokenize.open(file)  # @UndefinedVariable
  File "/data/home-gxu/lxt21/.conda/envs/sapien/lib/python3.8/tokenize.py", line 392, in open
    buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/SAPIEN-master/ManiSkill2-Learn-main-old/maniskill2_learn/networks/backbones/pointnet_test.py'

xuanlinli17 avatar xuanlinli17 commented on August 23, 2024

It says "no such file or directory", so I guess

(1) modify the PyCharm debug config file, since the root directory might be wrong
(2) ensure the IDE and project paths are correct
(3) in maniskill2-learn, pass in rollout_cfg.num_procs=1 and eval_cfg.num_procs=1, to avoid debugging parallel envs.
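
For points (1) and (2), it can help to see exactly which part of the path in the FileNotFoundError does not exist. The sketch below uses a hypothetical helper `first_missing_part` for that; the script path is copied from the traceback above.

```python
from pathlib import Path

def first_missing_part(path):
    """Return the shortest prefix of path that does not exist, or None if path exists."""
    p = Path(path)
    if p.exists():
        return None
    # Walk from the root towards the file and stop at the first missing component.
    for parent in list(p.parents)[::-1]:
        if not parent.exists():
            return str(parent)
    return str(p)

# Path copied from the FileNotFoundError above.
script = "/SAPIEN-master/ManiSkill2-Learn-main-old/maniskill2_learn/apis/train_rl.py"
print(first_missing_part(script))
```

If the first missing part is near the root of the path, the debug configuration is likely pointing at the wrong project root or path mapping rather than at a missing file.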

xtli12 avatar xtli12 commented on August 23, 2024

Hi, does the leaderboard only display 100 past submissions? I uploaded No.101, but it didn't appear on the leaderboard. If it's limited to displaying only 100 past submissions, how can I delete a historical submission?

Also, I'm uncertain about whether the final result is evaluated using a single docker image that can fit all tasks in one track, or if it's a summary of previous submissions based on many individual docker images, as shown on the current leaderboard?

StoneT2000 avatar StoneT2000 commented on August 23, 2024

This seems to be a bug with the pagination. I'll have that fixed over the weekend. @xtli12 thanks for finding this.

And yes you are right, we ask you to put all your best models over two submissions (we will only rigorously evaluate two submissions for second stage). If you find it difficult to pack them all into two docker images we may up the limit.

I will also add a button to clear the final submission selection so you don't need to scroll too far.

xtli12 avatar xtli12 commented on August 23, 2024

Thanks for your prompt reply! I think it would be better to increase the final submission limit so that it can include all tasks (for example, 14 submissions in the rigid-body track and 6 submissions in the soft-body track), because according to the guidelines, it seems like we cannot train on multiple environments, and I am unsure about how to combine all the best models into one Docker image and evaluate them successfully.

xtli12 avatar xtli12 commented on August 23, 2024

Hi, has the submission issue been resolved? @StoneT2000

> Hi, does the leaderboard only display 100 past submissions? I uploaded No.101, but it didn't appear on the leaderboard. If it's limited to displaying only 100 past submissions, how can I delete a historical submission?

StoneT2000 avatar StoneT2000 commented on August 23, 2024

Not just yet, sorry. Some other issues came up that took precedence. I will aim to fix this over the weekend.

StoneT2000 avatar StoneT2000 commented on August 23, 2024

@xtli12 new features have been added. Pagination is now fixed so it'll fetch the right data when you go to the next page, and you can sort most columns across all of the database (e.g. sort your submissions by success rate and it will refetch data accordingly). Moreover, you can clear all final submission selections now in one click to avoid having to go through all the pages (we disable sorting by which submissions are final submissions atm since it is not working with the new pagination). Let me know if you have any issues!

Also, we are still discussing how many docker images to allow for submission, so stay tuned.

xtli12 avatar xtli12 commented on August 23, 2024

Thank you for your kind assistance! I'll keep an eye on the issue regarding how many docker images are allowed for submission. However, there is a small problem with the leaderboard page. When I uploaded the No.101 submission, the No.1 submission disappeared, and the page only shows the latest 100 submissions (No.2-No.101). Therefore, it may not be possible to select the best submission from history when making the final submission. It would be better to add a button to delete historical submissions.

StoneT2000 avatar StoneT2000 commented on August 23, 2024

you should be able to sort by date, does it not show the oldest one? There should also be a button that lets you look at the next page
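
As a toy illustration of why submission No.1 drops off the first page but shows up again after sorting: with 101 entries and an assumed page size of 100, a newest-first listing pushes No.1 to page 2. This sketch is not the leaderboard's actual implementation; `page_of` and the page size are assumptions.

```python
def page_of(items, page, page_size, key=None, reverse=False):
    """Sort all items first, then slice out one 1-indexed page."""
    ordered = sorted(items, key=key, reverse=reverse)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]

submissions = list(range(1, 102))  # hypothetical submission numbers 1..101
newest_first_page1 = page_of(submissions, page=1, page_size=100, reverse=True)
newest_first_page2 = page_of(submissions, page=2, page_size=100, reverse=True)
print(newest_first_page1[0], newest_first_page1[-1])   # 101 2
print(newest_first_page2)                              # [1]
print(page_of(submissions, page=1, page_size=100)[0])  # 1
```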

xtli12 avatar xtli12 commented on August 23, 2024

Yes, it can show the oldest one now, thank you very much!

StoneT2000 avatar StoneT2000 commented on August 23, 2024

> Thanks for your prompt reply! I think it would be better to increase the final submission limit so that it can include all tasks (for example, 14 submissions in the rigid-body track and 6 submissions in the soft-body track), because according to the guidelines, it seems like we cannot train on multiple environments, and I am unsure about how to combine all the best models into one Docker image and evaluate them successfully.

We have bumped up the max number of final submissions per track to 14 now, so you can have one image per task instead.

xtli12 avatar xtli12 commented on August 23, 2024

Ok, thank you very much!
