
Comments (12)

xuanlinli17 commented on August 23, 2024

For MS1 envs, if you run print(env.model_db), it will print out all asset ids:

```python
>>> env = gym.make('OpenCabinetDoor-v1')
>>> print(env.model_db)
{'1000': {'num_target_links': 2, 'partnet_mobility_id': 1000, 'scale': 0.6129399556808358}, '1001': {'num_target_links': 1, 'partnet_mobility_id': 1001, 'scale': 0.512205865781575}, '1002': {'num_target_links': 1, 'partnet_mobility_id': 1002, 'scale': 0.4520109061191429}, ..., '1081': {'num_target_links': 1, 'partnet_mobility_id': 1081, 'scale': 0.43974655521394934}}
```

Then you can choose a specific subset of asset ids when creating the environment:

```python
>>> env = gym.make('OpenCabinetDoor-v1', model_ids=['1000', '1001'])
```

See https://github.com/haosulab/ManiSkill2/blob/main/mani_skill2/envs/ms1/base_env.py#L23 for more details.

For a state-based policy, yes, you should train on only one fixed asset.
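
For example, a minimal sketch of pinning the environment to one fixed asset, building on the model_db snippet above (the filtering condition and variable names are illustrative; any key of env.model_db works):

```python
import gym
import mani_skill2.envs  # registers the ManiSkill2 (incl. MS1) environments

# Inspect the asset database, e.g. keep cabinets with a single target link.
probe = gym.make('OpenCabinetDoor-v1')
single_link_ids = [
    mid for mid, meta in probe.model_db.items()
    if meta['num_target_links'] == 1
]
probe.close()

# Pin the env to one fixed asset so env.reset() never swaps the mesh.
env = gym.make('OpenCabinetDoor-v1', model_ids=single_link_ids[:1])
```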

StoneT2000 commented on August 23, 2024

Thanks for bringing this up @HeegerGao. Testing many other algorithms is a massive effort that we want to undertake; that said, we currently can't test all the new algorithms. Are there any particular ones you have in mind? We do plan to add more baselines in the future (especially CleanRL-style single-file ones).

StoneT2000 commented on August 23, 2024

As for running something like PPO directly, we will post more complete benchmark results once v0.5.0 is released, which contains the upgrade to Gymnasium and some improved reward functions that we will benchmark. @xuanlinli17 has more detailed results on the current version.

xuanlinli17 commented on August 23, 2024

PPO should be able to solve PickCube, LiftCube, and StackCube. For most other tasks, success rates are low or near zero (a state-based policy can solve, e.g., PegInsertionSide and PlugCharger when trained sufficiently long (>10M steps), but vision-based policies cannot). We'll update the rewards in 0.5.0 to be better suited for RL training.

HeegerGao commented on August 23, 2024

> PPO should be able to solve PickCube, LiftCube, and StackCube. For most other tasks, success rates are low or near zero (a state-based policy can solve, e.g., PegInsertionSide and PlugCharger when trained sufficiently long (>10M steps), but vision-based policies cannot). We'll update the rewards in 0.5.0 to be better suited for RL training.

@StoneT2000 @xuanlinli17 Thank you very much for your quick reply! I will wait for the 0.5.0 release.

You said "PPO should be able to solve PickCube, LiftCube, and StackCube." I want to know if I can reproduce these results with the default configs of Maniskill2-Learn, with the env_cfg.obs_frame=ee trick, the env_cfg.n_goal_points=50 trick, and using the pd_ee_delta_pose controller. Are there any other tricks?

xuanlinli17 commented on August 23, 2024

LiftCube is easy.
For PickCube and StackCube, use a large number of steps (25M+). PickCube will start to succeed after roughly 10M steps; StackCube might need even more.

train_cfg.total_steps=2.5e7 or 5e7

You can also modify train_cfg.n_checkpoint.
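
For reference, here is roughly how those overrides could be passed on the ManiSkill2-Learn command line. The run_rl.py entry point and the --cfg-options override syntax follow the ManiSkill2-Learn README, but the config path, work dir, and checkpoint interval below are illustrative, not prescriptive:

```shell
python maniskill2_learn/apis/run_rl.py configs/mfrl/ppo/maniskill2_pn.py \
    --work-dir logs/PickCube-v0-ppo --gpu-ids 0 \
    --cfg-options "env_cfg.env_name=PickCube-v0" \
                  "env_cfg.control_mode=pd_ee_delta_pose" \
                  "env_cfg.obs_frame=ee" \
                  "env_cfg.n_goal_points=50" \
                  "train_cfg.total_steps=2.5e7" \
                  "train_cfg.n_checkpoint=5e6"
```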

HeegerGao commented on August 23, 2024

> LiftCube is easy. For PickCube and StackCube, use a large number of steps (25M+). PickCube will start to succeed after roughly 10M steps; StackCube might need even more.
>
> train_cfg.total_steps=2.5e7 or 5e7
>
> You can also modify train_cfg.n_checkpoint.

Thanks!

sihengz02 commented on August 23, 2024

> PPO should be able to solve PickCube, LiftCube, and StackCube. For most other tasks, success rates are low or near zero (a state-based policy can solve, e.g., PegInsertionSide and PlugCharger when trained sufficiently long (>10M steps), but vision-based policies cannot). We'll update the rewards in 0.5.0 to be better suited for RL training.

Hi! Thanks for the effort the authors have put in! I want to ask whether you have tried state-based policies using PPO/SAC on all tasks (rigid-body tasks only is also fine). Can a vanilla PPO algorithm with the fully observed state and the dense reward functions you provide solve all the tasks?

xuanlinli17 commented on August 23, 2024

Yes.

State-based SAC is capable of solving PickCube, LiftCube, and StackCube easily;
PegInsertionSide needs >10M steps to solve;
PlugCharger needs >15M steps;
For TurnFaucet and PickSingleYCB/EGAD, a state-based policy can solve some objects but not all (since it doesn't have vision and thus doesn't understand object geometry), and similarly for all ManiSkill1 envs;
For AssemblingKits, a state-based policy does not achieve >0% performance as far as I know.

sihengz02 commented on August 23, 2024

> Yes.
>
> State-based SAC is capable of solving PickCube, LiftCube, and StackCube easily; PegInsertionSide needs >10M steps to solve; PlugCharger needs >15M steps; for TurnFaucet and PickSingleYCB/EGAD, a state-based policy can solve some objects but not all (since it doesn't have vision and thus doesn't understand object geometry), and similarly for all ManiSkill1 envs; for AssemblingKits, a state-based policy does not achieve >0% performance as far as I know.

Thanks for your timely reply! How about the 4 mobile manipulation tasks? And can you give some insight into the SAC/PPO parameters you used for these tasks?

xuanlinli17 commented on August 23, 2024

For MS1 envs (mobile manipulation), state-based SAC should be able to solve most of the single-object envs. You can use the ManiSkill2-Learn parameters.
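
As a starting point, a sketch of such a run, assuming ManiSkill2-Learn's run_rl.py entry point and that env_cfg keys are forwarded to the environment constructor (the config path and the model id below are illustrative; check your checkout for the actual state-based SAC config):

```shell
python maniskill2_learn/apis/run_rl.py configs/mfrl/sac/maniskill2_state.py \
    --work-dir logs/OpenCabinetDoor-v1-sac --gpu-ids 0 \
    --cfg-options "env_cfg.env_name=OpenCabinetDoor-v1" \
                  "env_cfg.obs_mode=state" \
                  "env_cfg.model_ids=['1000']"
```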

sihengz02 commented on August 23, 2024

> For MS1 envs (mobile manipulation), state-based SAC should be able to solve most of the single-object envs. You can use the ManiSkill2-Learn parameters.

Thanks! You mention single object; does this mean I should train on only one fixed cabinet mesh (for example), instead of a different random mesh each time? I notice that after every env.reset(), the object mesh changes within its class. If so, how can I fix the object?
