Lessons from Learning to Spin “Pens”

This repository contains a reference PyTorch implementation of the paper:

Lessons from Learning to Spin “Pens”
Jun Wang*, Ying Yuan*, Haichuan Che*, Haozhi Qi*, Yi Ma, Jitendra Malik, Xiaolong Wang
[Website] [Paper]

Installation

See installation instructions.

Introduction

Our pen-spinning method consists of the following four steps.

  1. Train an oracle policy with RL in simulation, using privileged information, point clouds, and tactile sensor output.
  2. Train a student policy on rollouts of the oracle policy, also in simulation.
  3. Replay trajectories generated by the oracle policy on a real robot, with the initial state distribution matched. Successful trajectories are kept and failures are discarded.
  4. Fine-tune the student policy from step 2 on the successful real-world trajectories.
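The filtering in step 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Trajectory` container, its field names, and the `success` flag are hypothetical and are not taken from this repository, and how a replay is judged successful is not specified here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trajectory:
    """One real-world replay of an oracle-generated trajectory (hypothetical container)."""
    actions: List[List[float]]  # commanded actions, one entry per time step
    qpos: List[List[float]]     # measured joint positions, one entry per time step
    success: bool               # label assigned after the replay (assumed to be available)


def keep_successful(trajectories: List[Trajectory]) -> List[Trajectory]:
    """Return only the replays labelled successful; failed replays are dropped."""
    return [t for t in trajectories if t.success]
```

The list returned by `keep_successful` would then play the role of the fine-tuning dataset in step 4.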

The following sections provide example scripts for our method only. For baselines, check out baselines.

Step 0: Visualize a Pre-trained Oracle Policy

cd outputs/AllegroHandHora
gdown 1LCRFE6lvKSUDPpUfEATOmpDUPDbB7n8d
unzip demo.zip -d ./
cd ../../
scripts/vis_teacher.sh demo

Step 1: Oracle Policy Training

To train an oracle policy $f$ with RL, run

# 0 is the GPU id
# 42 is experiment seed
scripts/train_teacher.sh 0 42 output_name

After training your oracle policy, you can visualize it as follows:

scripts/vis_teacher.sh output_name

Step 2: Student Policy Pretraining

In this section, we train a proprioceptive student policy by distilling from our trained oracle policy $f$.

Note that the student policy is trained on rollouts of the teacher (oracle) policy, in contrast to the DAgger-based distillation used in previous work.

scripts/train_student_sim.sh train.ppo.is_demon=True train.demon_path=ORACLE_CHECKPOINT_PATH 

We have provided a reference teacher checkpoint in Google Drive.
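To make the contrast with DAgger concrete, here is a toy sketch of the two data-collection schemes. It is not the training loop used by `scripts/train_student_sim.sh`: the 1-D dynamics, the linear policies, and all function names are assumptions chosen for illustration. In both schemes the labels come from the teacher. What differs is whose actions drive the environment, and therefore which states end up in the dataset.

```python
from typing import Callable, List, Tuple

Policy = Callable[[float], float]


def step(state: float, action: float) -> float:
    """Toy 1-D dynamics (hypothetical): the action shifts the state."""
    return state + action


def collect(teacher: Policy, actor: Policy, s0: float, horizon: int) -> List[Tuple[float, float]]:
    """Roll out `actor` and label every visited state with the teacher's action."""
    data, s = [], s0
    for _ in range(horizon):
        data.append((s, teacher(s)))  # supervision always comes from the teacher
        s = step(s, actor(s))         # the visited states depend on who acts
    return data


def fit_linear(data: List[Tuple[float, float]]) -> float:
    """Least-squares fit of a = w * s, i.e. behavior cloning for a one-parameter student."""
    num = sum(s * a for s, a in data)
    den = sum(s * s for s, _ in data)
    return num / den if den else 0.0


def teacher(s: float) -> float:
    return -0.5 * s  # stand-in for the oracle policy


def untrained_student(s: float) -> float:
    return 0.0       # stand-in for a student before training


# Teacher rollouts: the teacher acts, so the data covers the teacher's state distribution.
teacher_rollout_data = collect(teacher, actor=teacher, s0=1.0, horizon=5)
# DAgger-style collection, for contrast: the student acts and the teacher only labels.
dagger_style_data = collect(teacher, actor=untrained_student, s0=1.0, horizon=5)

w = fit_linear(teacher_rollout_data)  # student weight fitted on teacher rollouts
```

In this toy setup the teacher-driven rollout visits a decaying sequence of states, while the untrained student never leaves the initial state, so the two datasets cover different parts of the state space even though their labels come from the same teacher.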

Step 3: Open-Loop Replay on Real Hardware

To generate open-loop replay data for the student policy $\pi$, run

python real/robot_controller/teacher_replay.py --data-collect --exp=0 --replay_data_dir=REPLAY_DATA_DIR

where REPLAY_DATA_DIR is the directory to save the replay data.

Then process the replay data.

Step 4: Real-world Fine-tuning

To fine-tune the student policy $\pi$ using real data, run

scripts/finetune_ppo.sh --real-dataset-folder=REAL_DATA_PATH --checkpoint-path=YOUR_CHECKPOINTPATH

Real Data Download

Please download the real reference data from Google Drive.

Real data:
  real_data.h5 is an HDF5 file that contains the following keys:
  - replay_demon_{idx}: the idx-th replay demonstration
    - qpos: the current qpos of the robot
    - action: the delta action applied to the robot
    - current_target_qpos: the target qpos of the robot

  real_data_full.h5 is the full version of real_data.h5 and contains the following keys:
  - replay_demon_{idx}: the idx-th replay demonstration
    - qpos: the current qpos of the robot
    - action: the delta action applied to the robot
    - current_target_qpos: the target qpos of the robot
    - rgb_ori: the original RGB image
    - rgb_c2d: the RGB image after camera2depth image processing
    - depth: the depth image
    - pc: the point cloud
    - obj_ends: the positions of the object ends
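A minimal sketch for inspecting these files, assuming the layout listed above. The helper names `summarize_demos` and `summarize_path` are hypothetical and not part of this repository. The sketch also assumes that each dataset's first axis indexes time steps, which the key list above does not state. `summarize_demos` accepts any mapping-like object, so it works on an open `h5py.File` as well as on plain dictionaries.

```python
from typing import Any, Dict, Mapping

PREFIX = "replay_demon_"                           # group name prefix from the key list above
FIELDS = ("qpos", "action", "current_target_qpos")  # fields shared by both files


def summarize_demos(h5_file: Mapping[str, Any]) -> Dict[str, Dict[str, int]]:
    """Map each replay_demon_{idx} group to the first-axis length of each field."""
    summary: Dict[str, Dict[str, int]] = {}
    for name in h5_file.keys():
        if not name.startswith(PREFIX):
            continue  # ignore anything that is not a replay demonstration
        group = h5_file[name]
        summary[name] = {field: len(group[field]) for field in FIELDS if field in group}
    return summary


def summarize_path(path: str) -> Dict[str, Dict[str, int]]:
    """Open an HDF5 file read-only with h5py (must be installed) and summarize it."""
    import h5py  # imported lazily so summarize_demos works without h5py

    with h5py.File(path, "r") as f:
        return summarize_demos(f)
```

For example, `summarize_path("real_data.h5")` would return one entry per demonstration, which is a quick way to check that `qpos`, `action`, and `current_target_qpos` have matching lengths before fine-tuning.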

Acknowledgement

Note: This repository builds on Hora and IsaacGymEnvs.

Citing

If you find PenSpin or this codebase helpful in your research, please consider citing:

@article{wang2024penspin,
  author={Wang, Jun and Yuan, Ying and Che, Haichuan and Qi, Haozhi and Ma, Yi and Malik, Jitendra and Wang, Xiaolong},
  title={Lessons from Learning to Spin “Pens”},
  journal={arXiv:2405.07391},
  year={2024}
}
