
Comments (12)

xuanlinli17 commented on August 23, 2024

For MS1 envs, if you run print(env.model_db), it will print out all asset ids:

```python
>>> env = gym.make('OpenCabinetDoor-v1')
>>> print(env.model_db)
{'1000': {'num_target_links': 2, 'partnet_mobility_id': 1000, 'scale': 0.6129399556808358}, '1001': {'num_target_links': 1, 'partnet_mobility_id': 1001, 'scale': 0.512205865781575}, '1002': {'num_target_links': 1, 'partnet_mobility_id': 1002, 'scale': 0.4520109061191429}, ..., '1081': {'num_target_links': 1, 'partnet_mobility_id': 1081, 'scale': 0.43974655521394934}}
```

Then you can choose a specific subset of asset ids when creating the environment:

```python
>>> env = gym.make('OpenCabinetDoor-v1', model_ids=['1000', '1001'])
```

See https://github.com/haosulab/ManiSkill2/blob/main/mani_skill2/envs/ms1/base_env.py#L23 for more details.

For a state-based policy, yes, you should train on only one fixed asset.
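
For example, a minimal sketch of pinning the environment to one fixed asset, building on the model_db snippet above (the filtering condition and variable names are illustrative; any key of env.model_db works):

```python
import gym
import mani_skill2.envs  # registers the ManiSkill2 (incl. MS1) environments

# Inspect the asset database, e.g. keep cabinets with a single target link.
probe = gym.make('OpenCabinetDoor-v1')
single_link_ids = [
    mid for mid, meta in probe.model_db.items()
    if meta['num_target_links'] == 1
]
probe.close()

# Pin the env to one fixed asset so env.reset() never swaps the mesh.
env = gym.make('OpenCabinetDoor-v1', model_ids=single_link_ids[:1])
```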

StoneT2000 commented on August 23, 2024

Thanks for bringing this up @HeegerGao. Testing many other algorithms is a massive effort that we want to undertake; that said, we currently can't test all the new algorithms. Are there any particular ones you have in mind? We do plan to add more baselines in the future (especially CleanRL-style single-file ones).

StoneT2000 commented on August 23, 2024

As for running something like PPO directly, we will post more complete benchmark results once v0.5.0 is released, which contains the upgrade to Gymnasium and some improved reward functions that we will benchmark. @xuanlinli17 has more detailed results on the current version.

xuanlinli17 commented on August 23, 2024

PPO should be able to solve PickCube, LiftCube, and StackCube. For most other tasks, success rates are low or near zero (a state-based policy can solve, e.g., PegInsertionSide and PlugCharger when trained sufficiently long (>10M steps), but vision-based policies cannot). We'll update the rewards in 0.5.0 to be better suited for RL training.

HeegerGao commented on August 23, 2024

> PPO should be able to solve PickCube, LiftCube, and StackCube. For most other tasks, success rates are low or near zero (a state-based policy can solve, e.g., PegInsertionSide and PlugCharger when trained sufficiently long (>10M steps), but vision-based policies cannot). We'll update the rewards in 0.5.0 to be better suited for RL training.

@StoneT2000 @xuanlinli17 Thank you very much for your quick reply! I will wait for the 0.5.0 release.

You said "PPO should be able to solve PickCube, LiftCube, and StackCube." I want to know if I can reproduce these results with the default configs of Maniskill2-Learn, with the env_cfg.obs_frame=ee trick, the env_cfg.n_goal_points=50 trick, and using the pd_ee_delta_pose controller. Are there any other tricks?

xuanlinli17 commented on August 23, 2024

LiftCube is easy.
For PickCube and StackCube, use a large number of steps (25M+). PickCube will start to succeed after roughly 10M steps; StackCube might need even more.

train_cfg.total_steps=2.5e7 or 5e7

You can also modify train_cfg.n_checkpoint.
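
For reference, here is roughly how those overrides could be passed on the ManiSkill2-Learn command line. The run_rl.py entry point and the --cfg-options override syntax follow the ManiSkill2-Learn README, but the config path, work dir, and checkpoint interval below are illustrative, not prescriptive:

```shell
python maniskill2_learn/apis/run_rl.py configs/mfrl/ppo/maniskill2_pn.py \
    --work-dir logs/PickCube-v0-ppo --gpu-ids 0 \
    --cfg-options "env_cfg.env_name=PickCube-v0" \
                  "env_cfg.control_mode=pd_ee_delta_pose" \
                  "env_cfg.obs_frame=ee" \
                  "env_cfg.n_goal_points=50" \
                  "train_cfg.total_steps=2.5e7" \
                  "train_cfg.n_checkpoint=5e6"
```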

HeegerGao commented on August 23, 2024

> LiftCube is easy. For PickCube and StackCube, use a large number of steps (25M+). PickCube will start to succeed after roughly 10M steps; StackCube might need even more.
>
> train_cfg.total_steps=2.5e7 or 5e7
>
> You can also modify train_cfg.n_checkpoint.

Thanks!

sihengz02 commented on August 23, 2024

> PPO should be able to solve PickCube, LiftCube, and StackCube. For most other tasks, success rates are low or near zero (a state-based policy can solve, e.g., PegInsertionSide and PlugCharger when trained sufficiently long (>10M steps), but vision-based policies cannot). We'll update the rewards in 0.5.0 to be better suited for RL training.

Hi! Thanks for the effort the authors have put in! I want to ask whether you have tried state-based policies using PPO/SAC on all tasks (rigid-body tasks only is also fine). Can a vanilla PPO algorithm with the fully observed state and the dense reward functions you provide solve all the tasks?

xuanlinli17 commented on August 23, 2024

Yes.

State-based SAC is capable of solving PickCube, LiftCube, and StackCube easily;
PegInsertionSide needs >10M steps to solve;
PlugCharger needs >15M steps;
For TurnFaucet and PickSingleYCB/EGAD, a state-based policy can solve some objects but not all (since it doesn't have vision and thus doesn't understand object geometry), and similarly for all ManiSkill1 envs;
For AssemblingKits, a state-based policy does not achieve >0% performance as far as I know.

sihengz02 commented on August 23, 2024

> Yes.
>
> State-based SAC is capable of solving PickCube, LiftCube, and StackCube easily; PegInsertionSide needs >10M steps to solve; PlugCharger needs >15M steps; for TurnFaucet and PickSingleYCB/EGAD, a state-based policy can solve some objects but not all (since it doesn't have vision and thus doesn't understand object geometry), and similarly for all ManiSkill1 envs; for AssemblingKits, a state-based policy does not achieve >0% performance as far as I know.

Thanks for your timely reply! How about the 4 mobile manipulation tasks? And can you give some insight into the SAC/PPO parameters you used for these tasks?

xuanlinli17 commented on August 23, 2024

For MS1 envs (mobile manipulation), state-based SAC should be able to solve most of the single-object envs. You can use the ManiSkill2-Learn parameters.
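
As a starting point, a sketch of such a run, assuming ManiSkill2-Learn's run_rl.py entry point and that env_cfg keys are forwarded to the environment constructor (the config path and the model id below are illustrative; check your checkout for the actual state-based SAC config):

```shell
python maniskill2_learn/apis/run_rl.py configs/mfrl/sac/maniskill2_state.py \
    --work-dir logs/OpenCabinetDoor-v1-sac --gpu-ids 0 \
    --cfg-options "env_cfg.env_name=OpenCabinetDoor-v1" \
                  "env_cfg.obs_mode=state" \
                  "env_cfg.model_ids=['1000']"
```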

sihengz02 commented on August 23, 2024

> For MS1 envs (mobile manipulation), state-based SAC should be able to solve most of the single-object envs. You can use the ManiSkill2-Learn parameters.

Thanks! You mention single object; does this mean I should train on only one fixed cabinet mesh (for example), instead of a different random mesh each time? I notice that after every env.reset(), the object mesh changes within its class. If so, how can I fix the object?
