pku-marl / dexteroushands

This is a library that provides dual dexterous hand manipulation tasks through Isaac Gym

Home Page: https://pku-marl.github.io/DexterousHands/

License: Apache License 2.0

HTML 0.41% Python 99.30% Shell 0.01% CMake 0.28%

deep-reinforcement-learning dexterous-robotic-hand reinforcement-learning

dexteroushands's Introduction

Hugo Research Group Theme

The Research Group Template empowers your research group to easily create a beautiful website with a stunning homepage, news, academic publications, events, team profiles, and a contact form.

Trusted by 250,000+ researchers, educators, and students. Highly customizable via the integrated no-code, widget-based Wowchemy page builder, making every site truly personalized ⭐⭐⭐⭐⭐

Easily write technical content with plain text Markdown, LaTeX math, diagrams, RMarkdown, or Jupyter, and import publications from BibTeX.

Check out the latest demo of what you'll get in less than 60 seconds, or view the showcase.

The integrated Wowchemy website builder and CMS makes it easy to create a beautiful website for free. Edit your site in the CMS (or your favorite editor), generate it with Hugo, and deploy with GitHub or Netlify. Customize anything on your site with widgets, light/dark themes, and language packs.

👉 Get Started
📚 View the documentation
💬 Chat with the Wowchemy research community or Hugo community
⬇️ Automatically import citations from BibTeX with the Hugo Academic CLI
🐦 Share your new site with the community: @wowchemy @GeorgeCushen #MadeWithWowchemy
🗳 Take the survey and help us improve #OpenSource
🚀 Contribute improvements or suggest improvements
⬆️ Updating? View the Update Guide and Release Notes

We ask you, humbly, to support this open source movement

Today we ask you to defend the open source independence of the Wowchemy website builder and themes 🐧

We're an open source movement that depends on your support to stay online and thriving, but 99.9% of our creators don't give; they simply look the other way.

❤️ Click here to become a GitHub Sponsor, unlocking awesome perks such as exclusive academic templates and widgets

Demo credits

Please replace the demo images with your own.

Female scientist
2 Coders
Cafe
Blog posts
- https://unsplash.com/photos/AndE50aaHn4
- https://unsplash.com/photos/OYzbqk2y26c
Avatars
- https://unsplash.com/photos/5yENNRbbat4
- https://unsplash.com/photos/WNoLnJo7tS8

dexteroushands's Issues

ShadowHandMetaMT4 Exception: Unrecognized task!

Hi, developers of DexterousHands,

When I try the task ShadowHandMetaMT4, it shows Exception: Unrecognized task!
I checked the code on the main branch, and it indeed does not contain such a task. Please take a look, thanks.

config files for ShadowHandMetaML4 & ShadowHandMetaML20

Hi DexterousHands developers,

Could you please provide the configuration files for the tasks ShadowHandMetaML4 & ShadowHandMetaML20?

Best,
Jiahe Xu

Reproducing results

Hello,

I wasn't able to reproduce results for the following tasks using PPO:

ShadowHandCatchAbreast - The policy seems to learn to perform the task. However, the resulting rewards are much lower
Tasks ShadowHandGraspAndPlace and ShadowHandKettle - None of the saved checkpoints resulted in a policy that can perform the task

Thanks

Segmentation fault (core dumped) in Docker

Device: NVIDIA A100 40GB PCIe GPU Accelerator

Method: Docker

Details:

I run

python train.py --task=ShadowHandOver --algo=ppo

and

python train.py --task=ShadowHandOver --algo=happo

in ~\bi-dexhands

In both tasks, the model weights xxx.pt were saved in ~\bi-dexhands\logs correctly.

However, at the end of these tasks, the console shows the following error.

Output:

some episodes done, average rewards:  tensor(16.7454, device='cuda:0')
some episodes done, average rewards:  tensor(14.1145, device='cuda:0')
some episodes done, average rewards:  tensor(15.4696, device='cuda:0')
some episodes done, average rewards:  tensor(15.4252, device='cuda:0')
some episodes done, average rewards:  tensor(14.8325, device='cuda:0')
some episodes done, average rewards:  tensor(19.7192, device='cuda:0')
some episodes done, average rewards:  tensor(15.9727, device='cuda:0')

Algo happo Exp check updates 48825/48828 episodes, total num timesteps 49997824/50000000, FPS 1922.

some episodes done, average rewards:  tensor(14.0804, device='cuda:0')
some episodes done, average rewards:  tensor(17.5084, device='cuda:0')
some episodes done, average rewards:  tensor(18.6891, device='cuda:0')
Segmentation fault (core dumped)

Are there any suggestions for dealing with this error?

Thx in advance!

Do you have one-hand task settings?

I would like to do in-hand manipulation with only one robot hand (such as reorientation or pen spinning). I'm wondering whether you have already defined such tasks. If so, it would be a great help if you could share them.

ModuleNotFoundError Bug

When I tried the command python train.py --task=ShadowHandOver --algo=happo

A ModuleNotFoundError occurs:

Traceback (most recent call last):
File "train.py", line 12, in <module>
from bidexhands.utils.config import set_np_formatting, set_seed, get_args, parse_sim_params, load_cfg
ModuleNotFoundError: No module named 'bidexhands'

But I have already installed bidexhands. This is my conda list:

Could you give me some suggestions?
Thanks :)
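A possible first check for this kind of report is whether the interpreter that runs train.py can locate the package at all. The sketch below uses only the standard library. `bidexhands` is the module name from the traceback; the helper name `describe_module` is made up for illustration.

```python
import importlib.util
import sys


def describe_module(name):
    """Return where top-level module `name` would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Namespace packages have no single origin file; fall back to their search path.
    return spec.origin or str(list(spec.submodule_search_locations or []))


if __name__ == "__main__":
    # The interpreter path shows which environment is actually in use.
    print("interpreter:", sys.executable)
    print("bidexhands :", describe_module("bidexhands"))
```

If the second line prints None, the package is not visible to that interpreter, which can happen when it was installed into a different environment than the one running the script.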

report a small typo

In README.md, in the Plotting subsection, the comment should say "generate", not "geenrate".

RuntimeError

Hi, developers of DexterousHands,

When I tried the command python train.py --task=ShadowHandOver --algo=happo

A RuntimeError occurs:

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: nvrtc: error: invalid value for --gpu-architecture (-arch)

The details are as follows:

Previously, I had version 0.8 of PyTorch installed, but it raised an error at runtime because it was inconsistent with the CUDA version. After I installed PyTorch 1.12, version 0.8 was automatically uninstalled due to compatibility issues. The error above has appeared at runtime since then.
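Errors about the GPU architecture are often worth checking against the compute capability of the device and the architectures the installed PyTorch build targets. The sketch below is a rough diagnostic under that assumption, not a confirmed diagnosis of this report. `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list` are existing PyTorch calls; the helper `is_arch_supported` is hypothetical, and its exact-match test is a simplification that ignores forward compatibility through PTX.

```python
def is_arch_supported(capability, arch_list):
    """Return True if the device's (major, minor) capability appears as sm_XY in arch_list."""
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        capability = torch.cuda.get_device_capability(0)  # e.g. (8, 0)
        arch_list = torch.cuda.get_arch_list()            # e.g. ["sm_70", "sm_80", ...]
        print("device capability:", capability)
        print("build targets    :", arch_list)
        print("exact match      :", is_arch_supported(capability, arch_list))
    else:
        print("PyTorch with a visible CUDA device is required for this check.")
```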

How are eggs simulated, and how realistic are they?

Has this been tested in real-world scenarios?

Report a typo at README.md

In the data collection example, the first command line uses -algo=ppo_collection, which leads to a wrong path when retrieving the file.

isaacgym/python/isaacgym/gymdeps.py", line 21, in _import_deps raise ImportError("PyTorch was imported before isaacgym modules. Please import torch after isaacgym modules.") ImportError: PyTorch was imported before isaacgym modules. Please import torch after isaacgym modules.

isaacgym/python/isaacgym/gymdeps.py", line 21, in _import_deps
raise ImportError("PyTorch was imported before isaacgym modules. Please import torch after isaacgym modules.")
ImportError: PyTorch was imported before isaacgym modules. Please import torch after isaacgym modules.
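The message itself states the rule: `isaacgym` has to be imported before `torch`. A small guard like the one below can help find where the order gets violated in a larger code base. It is a sketch using only `sys.modules`; the function name `import_order_problem` is made up for illustration.

```python
import sys


def import_order_problem(loaded=None):
    """Return True if torch is already loaded while isaacgym is not."""
    loaded = sys.modules if loaded is None else loaded
    return "torch" in loaded and "isaacgym" not in loaded


if __name__ == "__main__":
    # Call this right before the first `import isaacgym` in the entry script.
    if import_order_problem():
        print("torch was imported first; move the isaacgym import above it.")
    else:
        print("import order looks fine so far.")
```

A common source of the problem is an indirect import: a helper module that imports torch at the top and is itself imported before isaacgym.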

Confusion on control mode

Hi,
I am currently reading your implementation of shadow_hand_push_block.py.
I am confused about the control mode. In line 299, you write: asset_options.default_dof_drive_mode = gymapi.DOF_MODE_NONE. But in pre_physics_step, at line 1169, you use self.gym.set_dof_position_target_tensor_indexed, which implies that you are using position (POS) control. I have read the code several times and still cannot find where the control mode is set to POS control.

Parameter sharing of MAPPO

In the paper you provide, it is stated that "Each agent i follows a shared policy". However, in the codebase, I only found implementations that resemble MAPPO's "SeperatedBuffer" and "SeperatedRunner", which are designed for non-parameter-sharing scenarios. This might cause a discrepancy in performance if they are not consistent. I would like to know whether the codebase only supports non-parameter-sharing MAPPO at the moment, or if I have overlooked something.

Offline Dataset download links have expired

As the title says.

Issues about visual observation

hi, yuanpei @cypypccpy. Nice work again!
It seems like something is wrong when I try to use point cloud observation in the task.
Specifically, using the camera through isaacgym API may have some bugs. I print the 'camera_handle', whose value is -1.

Could you please help me? Thanks a lot~

Resume training

How can the code be changed so that training resumes from pre-trained weights?

[compute_observations bug] correctness of self.fingertip_another_pos

Dear authors,

In the shadow hand task's compute_observations() func (e.g. link), there is a code block written as:

self.fingertip_state = self.rigid_body_states[:, self.fingertip_handles][:, :, 0:13]
self.fingertip_pos = self.rigid_body_states[:, self.fingertip_handles][:, :, 0:3]
self.fingertip_another_state = self.rigid_body_states[:, self.fingertip_another_handles][:, :, 0:13]
self.fingertip_another_pos = self.rigid_body_states[:, self.fingertip_another_handles][:, :, 0:3]

I think fingertip_another_handles = fingertip_handles, which might lead to an incorrect observation (i.e., self.fingertip_pos = self.fingertip_another_pos will always hold true).

The reason is that self.fingertip_another_handles and self.fingertip_handles are defined as:

self.fingertip_handles = [self.gym.find_asset_rigid_body_index(shadow_hand_asset, name) for name in self.fingertips]
self.fingertip_another_handles = [self.gym.find_asset_rigid_body_index(shadow_hand_another_asset, name) for name in self.a_fingertips]

I think that since both sets of handles come from gym.find_asset_rigid_body_index() and the two assets are basically the same, self.fingertip_another_handles = self.fingertip_handles.

When I print the variables, my assumption holds.
Could you give me some tips?
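The effect described in this report can be illustrated without Isaac Gym. When an index is looked up per asset, two copies of the same hand model yield the same local indices, so addressing the second hand in a per-environment body array needs an offset. The sketch below is plain Python with a toy five-body hand; the body names, the body count, and the helper names are all assumptions for illustration, not the repository's code.

```python
def asset_local_index(body_names, name):
    """Index of `name` within a single asset's body list."""
    return body_names.index(name)


def env_level_indices(local_indices, body_offset):
    """Shift asset-local indices by the number of bodies that precede the actor."""
    return [i + body_offset for i in local_indices]


# Toy hand model with five bodies; both hands are built from the same description.
hand_bodies = ["palm", "thumb_tip", "index_tip", "middle_tip", "ring_tip"]
fingertips = ["thumb_tip", "index_tip"]

# Per-asset lookups give identical results for the two hands.
right_local = [asset_local_index(hand_bodies, n) for n in fingertips]
left_local = [asset_local_index(hand_bodies, n) for n in fingertips]

# In a per-environment array the second hand's bodies follow the first hand's.
right_env = env_level_indices(right_local, 0)
left_env = env_level_indices(left_local, len(hand_bodies))

print(right_local, left_local)
print(right_env, left_env)
```

Whether the repository applies such an offset elsewhere is not established here; the sketch only shows why identical asset-level indices are expected for identical assets.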

ps: I love your work very much!!

Issues with rl_games

Hi, @cypypccpy nice work!
When I try to run train_rlgames.py, I get the error

It looks like an incompatibility between rlgames and isaacgym, but I can't find any reports of a similar situation.
Furthermore, I found that the key variable is 'env', which is an 'RLgamesVecTaskPython' object that can't be loaded into the 'rl-games Runner'.

Please help me, thx!!!

RuntimeError: CUDA error: no kernel image is available for execution on the device

Why am I getting this error?
The python3 joint_monkey.py command works fine, so Isaac Gym itself works.

Segmentation fault

Hi,

When I run experiments with python train.py --task=ShadowHandOver --algo=ppo, it generates the following error:

Algorithm:  ppo
Python
Averaging factor:  0.01
Obs type: full_state
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: enabled
WARNING: lavapipe is not a conformant vulkan implementation, testing use only.
...
Unhandled descriptor set 433
Unhandled descriptor set 1788307008
Segmentation fault (core dumped)

I know this might not be an issue with the repo but rather a compatibility problem with the NVIDIA GPU driver. I am posting here to see if anyone has a solution.

My test GPU is an NVIDIA A6000 with NVIDIA-SMI 510.85.02, Driver Version: 510.85.02, CUDA Version: 11.6

Potential collision mesh errors in some tasks

Thanks for your great work!
I tested some tasks and found what seem to be mistakes in some environments, such as ShadowHandGraspAndPlace. According to the collision mesh of the bucket in the environment below, the opening of the bucket is closed, so objects cannot be put in. Is this a bug? I think it might be necessary to run convex decomposition on the bucket.

Do we need to use cuda when configuring the environment for this project?

Can you open-source the implementation code of DAPG?

Thanks for your great work! I noticed that your paper mentions that DAPG performs well even with point cloud inputs, comparable to full-state PPO baseline. Can you share the source code? I will cite your paper in my research. Thanks in advance!

Use of YCB dataset in DexterousHands?

The object files downloaded from the YCB dataset are in sdf format.
The YCB object files you provide in DexterousHands are in urdf format.
How did you obtain the YCB files in urdf format? Or how did you convert the sdf files to urdf? Thanks!

Issue with assets

Hi, I am using the pot asset contained in this repo in writing my own environment: "mjcf/pot/mobility.urdf". However, I am having trouble in the reset phase.

# in the initialization of environment

# get root state tensor, useful for obtaining state information about the object
actor_root_state_tensor = self.gym.acquire_actor_root_state_tensor(self.sim)
self.root_state_tensor = gymtorch.wrap_tensor(actor_root_state_tensor)

# get dof state tensor
_dof_states = self.gym.acquire_dof_state_tensor(self.sim)
self.dof_states = gymtorch.wrap_tensor(_dof_states)

self.gym.refresh_actor_root_state_tensor(self.sim)
self.gym.refresh_dof_state_tensor(self.sim)
self.saved_dof_states = self.dof_states.clone()
self.saved_root_state_tensor = self.root_state_tensor.clone()

and then when resetting:

self.gym.set_dof_state_tensor(self.sim, gymtorch.unwrap_tensor(self.saved_dof_states)) 
self.gym.set_actor_root_state_tensor(self.sim, gymtorch.unwrap_tensor(self.saved_root_state_tensor))

However, this leads to very weird physics/simulation errors in which all the pots suddenly fly around/glitching out upon reset. I verified that this is not a problem of my environment because everything loads correctly when I don’t reset things. But when I reset the tensors the environment just glitches crazily. Is there a reason why this might be happening?

As another test, I used another pot asset (the xml file): mjcf/pot/pot.xml, and the issue immediately disappears and everything works correctly. I am wondering what might be the difference between the mobility.urdf vs. pot.xml file, and why one of them leads to the glitch?

Thank you!

Offline Dataset download links have expired

Thanks for your contribution.
The download link for the Shadow Hand open outward dataset can no longer be opened.

Possible mismatch between paper and code?

Hi,

In your paper, you detail the following action space for the BottleCap task:

As can be seen in the attached table, the indices for ShadowHand base translation & rotation are:

Right hand: 20-25
Left hand: 46-51

In the task's python file, the docstring of the pre_physics_step method agrees with the above table:

        """
        The pre-processing of the physics step. Determine whether the reset environment is needed, 
        and calculate the next movement of Shadowhand through the given action. The 52-dimensional 
        action space as shown in below:
        
        Index   Description
        0 - 19 	right shadow hand actuated joint
        20 - 22	right shadow hand base translation
        23 - 25	right shadow hand base rotation
        26 - 45	left shadow hand actuated joint
        46 - 48	left shadow hand base translation
        49 - 51	left shadow hand base rotation

        Args:
            actions (tensor): Actions of agents in the all environment 
        """

However, it seems that you used different indices in the code:

self.apply_forces[:, 1, :] = actions[:, 0:3] * self.dt * self.transition_scale * 100000
self.apply_forces[:, 1 + 26, :] = actions[:, 26:29] * self.dt * self.transition_scale * 100000
self.apply_torque[:, 1, :] = self.actions[:, 3:6] * self.dt * self.orientation_scale * 1000
self.apply_torque[:, 1 + 26, :] = self.actions[:, 29:32] * self.dt * self.orientation_scale * 1000

From the above code snippet, it seems that the actual indices that are used are:

Right hand: 0-5
Left hand: 26-31

Is it just a mismatch between the paper and the implementation or am I missing something?
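The two index layouts in this report can be put side by side. The sketch below slices a dummy 52-dimensional action once by the docstring layout and once by the ranges that appear in the quoted force and torque lines. It is plain Python for illustration; the dictionary and function names are hypothetical, and only the index ranges come from the text above.

```python
# Layout as documented in the pre_physics_step docstring (end index exclusive).
DOCSTRING_LAYOUT = {
    "right_joints": (0, 20),
    "right_base_translation": (20, 23),
    "right_base_rotation": (23, 26),
    "left_joints": (26, 46),
    "left_base_translation": (46, 49),
    "left_base_rotation": (49, 52),
}

# Ranges actually read by the quoted apply_forces / apply_torque lines.
CODE_BASE_SLICES = {
    "right_base_translation": (0, 3),
    "right_base_rotation": (3, 6),
    "left_base_translation": (26, 29),
    "left_base_rotation": (29, 32),
}


def split_action(action, layout):
    """Slice a flat action vector into named parts according to `layout`."""
    return {name: action[start:stop] for name, (start, stop) in layout.items()}


# Each entry equals its own index, so the slices show which indices are read.
action = list(range(52))
by_docstring = split_action(action, DOCSTRING_LAYOUT)
by_code = split_action(action, CODE_BASE_SLICES)

for key in CODE_BASE_SLICES:
    print(key, "docstring:", by_docstring[key], "code:", by_code[key])
```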

How to get the offline dataset?

Hi, I'm wondering how you obtained the offline training dataset. Did you train an expert policy and then copy the replay buffer, or did you create it from human demonstrations?

Why load identical assets?

Hi,
I noticed that the two MJCF robot assets are identical.
I don't understand why you load each asset separately.
Is there a problem with reusing an asset in Isaac Gym?

Packaging the bi-dexhands directory

It would be easier to use if you could rename the folder bi-dexhands to bidexhands and add an __init__.py to turn it into a Python package.
This would allow the package to be used directly in code. Otherwise there will be a ModuleNotFoundError when importing bidexhands.

The setup.py can be updated as follows.

import os
from setuptools import setup, find_packages


def get_version() -> str:
    # Read __version__ from bidexhands/__init__.py.
    # Expects a line of the form: __version__ = "x.y.z"
    with open(os.path.join("bidexhands", "__init__.py"), "r") as f:
        init = f.read().split()
    return init[init.index("__version__") + 2][1:-1]


setup(
    name="bidexhands",
    version=get_version(),
    description="xxx",
    long_description=open("README.md", encoding="utf8").read(),
    long_description_content_type="text/markdown",
    author="xxx",
    author_email="xxx",
    packages=find_packages(),
    keywords="xxx",
    python_requires=">=3.7",
)

You also need to modify several import lines inside the package, e.g., change

from tasks.shadow_hand_over import ShadowHandOver

to

from bidexhands.tasks.shadow_hand_over import ShadowHandOver

If you don't mind, I can open a pull request.
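The version lookup in the proposed setup.py relies on whitespace splitting, so it expects spaces around the equals sign in `__init__.py`. The sketch below isolates that parsing step to make the expectation visible. The function name `parse_version` and the version string "0.1.0" are placeholders for illustration.

```python
def parse_version(init_source):
    """Extract the quoted value two tokens after `__version__` in whitespace-split source."""
    tokens = init_source.split()
    return tokens[tokens.index("__version__") + 2][1:-1]


# Works when the assignment is written with spaces around "=".
spaced = parse_version('__version__ = "0.1.0"\n')
print(spaced)

# Without spaces the whole assignment is one token, so the lookup fails.
try:
    parse_version('__version__="0.1.0"\n')
    unspaced_ok = True
except ValueError:
    unspaced_ok = False
print(unspaced_ok)
```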

The visualization issue in Docker

I would like to run BiDexHands in Docker, where the code can be executed in headless mode. However, visualization is difficult there.

I first tried calling the gym.write_viewer_image_to_file function, but it does not seem to work in an environment without a display.
Then I tried rewriting the code of base_task.py the way IsaacGymEnvs does in its vec_task.py, which uses a virtual display. That did not work either.

I wonder whether there is, or whether you could offer, an interface for visualization in an environment without a display, such as saving images or videos from the Isaac Gym viewer.

Possible bug? (ValueError: The parameter loc has invalid values)

Hello,

First, let me thank you for open-sourcing this great framework. However, I am unable to run the training without getting the following error:

Traceback (most recent call last):
  File "train.py", line 95, in <module>
    train()
  File "train.py", line 47, in train
    sarl.run(num_learning_iterations=iterations, log_interval=cfg_train["learn"]["save_interval"])
  File "/media/data/users/erez/repos/bi-dexhands/main/bi-dexhands/algorithms/rl/ppo/ppo.py", line 142, in run
    actions, actions_log_prob, values, mu, sigma = self.actor_critic.act(current_obs, current_states)
  File "/media/data/users/erez/repos/bi-dexhands/main/bi-dexhands/algorithms/rl/ppo/module.py", line 77, in act
    distribution = MultivariateNormal(actions_mean, scale_tril=covariance)
  File "/home/ubuntu/miniconda3/envs/bidexhands/lib/python3.7/site-packages/torch/distributions/multivariate_normal.py", line 146, in __init__
    super(MultivariateNormal, self).__init__(batch_shape, event_shape, validate_args=validate_args)
  File "/home/ubuntu/miniconda3/envs/bidexhands/lib/python3.7/site-packages/torch/distributions/distribution.py", line 53, in __init__
    raise ValueError("The parameter {} has invalid values".format(param))
ValueError: The parameter loc has invalid values

I've been trying to train using task=ShadowHandPushBlock, and no matter what I try (various versions of Python/PyTorch/CUDA/NVIDIA drivers), this error always appears in a nondeterministic manner (i.e., not at the same iteration).

The environment that produced the attached traceback:

Python 3.7
Pytorch 1.8.1
Torchvision 0.9.1
CudaToolkit 11.1.1
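This kind of validation error typically means the mean passed to the distribution contains NaN or infinite entries, so one way to narrow it down is to test the mean for non-finite values right before the distribution is built. The sketch below shows the idea on plain Python floats. The function name `first_non_finite` is hypothetical, and that the values here are non-finite is an assumption rather than something the traceback proves. For tensors, `torch.isfinite(t).all()` is the corresponding existing PyTorch check.

```python
import math


def first_non_finite(values):
    """Return the index of the first NaN or infinite entry, or None if all are finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return i
    return None


if __name__ == "__main__":
    actions_mean = [0.1, float("nan"), 0.3]  # stand-in for one row of the policy output
    bad = first_non_finite(actions_mean)
    if bad is not None:
        print(f"non-finite value at index {bad}; inspect observations and loss before this step")
```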

Reproducing re-orientation task

Hi,

I tried to play with the re-orientation task by simply running
"python train.py --task=ShadowHandReOrientation --algo=ppo"
without making any further changes. After 6500 iterations, the training ended. But the trained controller could not perform the task well.

The hand only makes small moves (shaking) which could not re-orient the cube.
Is there anything that I'm doing wrong?

Thanks
Zheyu

How does HAPPO train heterogeneous environments?

Max steps for each environment and meaning of Environment Steps

Hello! Thank you for this great work. Two questions:

Max steps

I'm having trouble finding the maximum number of steps each environment is rolled out for per episode (during training and eval). Could you direct me to the place I can see that?

Meaning of Environment Steps

Also, I'm curious about the specific meaning of environment steps in the context of graphs like this one. Does it mean the total number of times env.step() is called during training?
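One data point on this page is the HAPPO log line quoted in the Docker segmentation-fault issue above: "updates 48825/48828 episodes, total num timesteps 49997824/50000000". The arithmetic below only reads those numbers. It assumes the update counter is zero-based, and it does not say how the per-update count splits into number of environments and rollout length, nor how the paper's graphs count steps.

```python
# Numbers copied from the quoted log line.
logged_update = 48825
logged_timesteps = 49_997_824
timestep_budget = 50_000_000
planned_updates = 48_828

# Assumption: the update index starts at 0, so 48825 means 48826 updates are done.
updates_done = logged_update + 1
per_update = logged_timesteps // updates_done

print("timesteps per update:", per_update)
print("updates implied by budget:", timestep_budget // per_update)
```

Under that reading, the timestep counter in the log advances by a fixed amount per update rather than by one per env.step() call.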

ValueError Bug

When I tried the command python train.py --task=ShadowHandOver --algo=ppo

A ValueError occurs:
Traceback (most recent call last):
File "train.py", line 91, in <module>
args = get_args()
File "/home/charlie/PycharmProjects/DexterousHands/bi-dexhands/utils/config.py", line 289, in get_args
raise ValueError("--checkpoint is not supported by rl-pytorch. Please use --resume <iteration number>")
ValueError: --checkpoint is not supported by rl-pytorch. Please use --resume <iteration number>

Could you give me some suggestions?
Thanks :)

Feature Request: migrate from gym to gymnasium

Hi, would it be possible for DexterousHands to be upgraded from gym to gymnasium? Gymnasium is the maintained version of openai gym and is compatible with current RL training libraries (rllib and tianshou have already migrated, and stable-baselines3 will soon).

This repository is currently listed in the gymnasium third party environments but we are cleaning the list up to only include maintained gymnasium-compatible repositories.

For information about upgrading and compatibility, see migration guide and gym compatibility. The main difference is the API has switched to returning truncated and terminated, rather than done, in order to give more information and mitigate edge case issues.
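The API difference mentioned here can be sketched as a small adapter that turns an old-style step result into the five-value form. This is a minimal illustration for a single, non-vectorized environment. The function name `to_gymnasium_step` is hypothetical, and reading truncation from `info["TimeLimit.truncated"]` follows the convention of gym's TimeLimit wrapper; whether this repository's environments set that key is an assumption.

```python
def to_gymnasium_step(old_step_result):
    """Convert (obs, reward, done, info) into (obs, reward, terminated, truncated, info)."""
    obs, reward, done, info = old_step_result
    # An episode cut off by a time limit is truncated, not terminated.
    truncated = bool(done) and bool(info.get("TimeLimit.truncated", False))
    terminated = bool(done) and not truncated
    return obs, reward, terminated, truncated, info


if __name__ == "__main__":
    print(to_gymnasium_step(([0.0], 1.0, True, {})))
    print(to_gymnasium_step(([0.0], 1.0, True, {"TimeLimit.truncated": True})))
```

For batched, tensor-based environments such as the ones in this library, `done` is a batch of flags, so the same split would have to be applied element-wise.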

Comprehensive success metrics

Hi folks, kudos on a great benchmark! Do you happen to have success metrics for each task?
