
ndf_robot's Introduction

Neural Descriptor Fields (NDF)

PyTorch implementation for training continuous 3D neural fields that represent dense correspondence across objects, and for using these descriptor fields to mimic demonstrations of a pick-and-place task on a robotic system.



This is the reference implementation for our paper:

Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation


PDF | Video

Anthony Simeonov*, Yilun Du*, Andrea Tagliasacchi, Joshua B. Tenenbaum, Alberto Rodriguez, Pulkit Agrawal**, Vincent Sitzmann** (*Equal contribution, order determined by coin flip. **Equal advising)


Google Colab

If you want a quickstart demo of NDF without installing anything locally, we have prepared a Colab notebook. It runs the same demo as the Quickstart Demo section below, where a local coordinate frame near one object is sampled and the corresponding local frame near a new object (with a different shape and pose) is recovered via our energy optimization procedure.


Setup

Clone this repo

git clone --recursive https://github.com/anthonysimeonov/ndf_robot.git
cd ndf_robot

Install dependencies (using a virtual environment is highly recommended):

pip install -e .

Set up additional tools (Franka Panda inverse kinematics; unnecessary if you are not using the simulated robot for evaluation):

cd pybullet-planning/pybullet_tools/ikfast/franka_panda
python setup.py

Set up environment variables (this script must be sourced in each new terminal where code from this repository is run):

source ndf_env.sh
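
If you prefer not to source the script by hand each time, one option is to source it from your shell startup file. The path below is a placeholder for wherever you cloned the repository:

echo "source /path/to/ndf_robot/ndf_env.sh" >> ~/.bashrc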

Quickstart Demo

Download pretrained weights

./scripts/download_demo_weights.sh

Download data assets

./scripts/download_demo_data.sh

Run example script

cd src/ndf_robot/eval
python ndf_demo.py

The NDFAlignmentCheck class in src/ndf_robot/eval/ndf_alignment.py contains a minimal implementation of our SE(3)-pose energy optimization procedure; this is what the Quickstart demo above uses. For a similar implementation that is integrated with our pick-and-place-from-demonstrations pipeline, see src/ndf_robot/opt/optimizer.py.
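
The core of this energy optimization is a gradient-based search over an SE(3) transform of the reference query points so that their descriptors on the new object match the reference descriptors. The sketch below is only a toy, runnable illustration of that loop: the descriptor function here is a fixed random MLP standing in for the pretrained NDF model, and all shapes, learning rates, and iteration counts are illustrative assumptions rather than the values used in ndf_alignment.py or optimizer.py.

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the pretrained NDF: maps centered query points to per-point descriptors.
# A fixed random MLP is used only so this sketch runs end to end.
_mlp = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 32))
for param in _mlp.parameters():
    param.requires_grad_(False)

def descriptor(pcd, query_pts):
    # Mimics NDF's mean-centering of the queries on the observed point cloud.
    return _mlp(query_pts - pcd.mean(dim=0, keepdim=True))

def axis_angle_to_matrix(w):
    # Rodrigues' formula for a 3-vector axis-angle rotation parameterization.
    theta = w.norm() + 1e-8
    k = w / theta
    zero = torch.zeros((), dtype=w.dtype)
    K = torch.stack([
        torch.stack([zero, -k[2], k[1]]),
        torch.stack([k[2], zero, -k[0]]),
        torch.stack([-k[1], k[0], zero]),
    ])
    return torch.eye(3) + torch.sin(theta) * K + (1.0 - torch.cos(theta)) * (K @ K)

# Reference: query points sampled in a small region near object A and their descriptors.
pcd_a = torch.randn(1000, 3) * 0.1                  # stand-in point cloud of object A
query_a = torch.randn(50, 3) * 0.02 + 0.05          # local frame sampled near object A
target_desc = descriptor(pcd_a, query_a)

# New object B (different pose); recover the corresponding local frame near B by
# optimizing an SE(3) transform of the reference query points.
pcd_b = torch.randn(1000, 3) * 0.1 + torch.tensor([0.3, 0.0, 0.0])
w = (0.01 * torch.randn(3)).requires_grad_(True)    # rotation (axis-angle)
t = pcd_b.mean(dim=0).clone().requires_grad_(True)  # translation, initialized at B's centroid

opt = torch.optim.Adam([w, t], lr=1e-2)
for _ in range(500):
    R = axis_angle_to_matrix(w)
    query_b = (query_a - query_a.mean(dim=0)) @ R.T + t
    energy = F.l1_loss(descriptor(pcd_b, query_b), target_desc)
    opt.zero_grad()
    energy.backward()
    opt.step()
print('final descriptor energy:', energy.item())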

Training

Download all data assets

If you want the full dataset (~150GB for 3 object classes):

./scripts/download_training_data.sh 

If you want just the mug dataset (~50 GB; data for the other object classes can be downloaded with the corresponding scripts):

./scripts/download_mug_training_data.sh 

If you want to generate your own dataset, see the Data Generation section below.

Run training

cd src/ndf_robot/training
python train_vnn_occupancy_net.py --obj_class all --experiment_name ndf_training_exp

More information on training can be found here.
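
After training (or after downloading the pretrained weights), the checkpoint can be loaded back for inference. The constructor arguments and checkpoint path below are assumptions based on how the demo and training scripts appear to use the model; consult ndf_demo.py and train_vnn_occupancy_net.py for the exact interface.

import torch
from ndf_robot.model import vnn_occupancy_net_pointnet_dgcnn as vnn_occupancy_network

# Assumed constructor arguments -- verify against the demo/training scripts.
model = vnn_occupancy_network.VNNOccNet(latent_dim=256,
                                        model_type='pointnet',
                                        return_features=True,
                                        sigmoid=True)
state = torch.load('path/to/checkpoint.pth', map_location='cpu')  # placeholder path
model.load_state_dict(state)
model.eval()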

Evaluation with simulated robot

Make sure you have set up the additional inverse kinematics tools (see Setup section)

Download all the object data assets

./scripts/download_obj_data.sh

Download pretrained weights

./scripts/download_demo_weights.sh

Download demonstrations

./scripts/download_demo_demonstrations.sh

Run evaluation

If you are running this command on a remote machine, be sure to remove the --pybullet_viz flag!

cd src/ndf_robot/eval
CUDA_VISIBLE_DEVICES=0 python evaluate_ndf.py \
        --demo_exp grasp_rim_hang_handle_gaussian_precise_w_shelf \
        --object_class mug \
        --opt_iterations 500 \
        --only_test_ids \
        --rand_mesh_scale \
        --model_path multi_category_weights \
        --save_vis_per_model \
        --config eval_mug_gen \
        --exp test_mug_eval \
        --pybullet_viz

More information on experimental evaluation can be found here.

Data Generation

Download all the object data assets

./scripts/download_obj_data.sh

Run data generation

cd src/ndf_robot/data_gen
python shapenet_pcd_gen.py \
    --total_samples 100 \
    --object_class mug \
    --save_dir test_mug \
    --rand_scale \
    --num_workers 2

More information on dataset generation can be found here.
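
Note that the script above generates point clouds; training the occupancy network also requires ground-truth occupancy labels for query points. The snippet below is a generic sketch of one common way to produce such labels from a watertight mesh using trimesh. It is not the authors' data pipeline, and the file path is a placeholder.

import numpy as np
import trimesh

# Load a (watertight) ShapeNet mesh -- placeholder path.
mesh = trimesh.load('path/to/shapenet/model.obj', force='mesh')

# Sample query points in a box slightly larger than the object's bounding box.
points = np.random.uniform(low=mesh.bounds[0] - 0.05,
                           high=mesh.bounds[1] + 0.05,
                           size=(100000, 3))

# Label each point by whether it falls inside the mesh.
occupancy = mesh.contains(points)
np.savez('occupancy_labels.npz', points=points, occ=occupancy.astype(np.float32))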

Collect new demonstrations with teleoperated robot in PyBullet

Make sure you have downloaded all the object data assets (see Data Generation section)

Run teleoperation pipeline

cd src/ndf_robot/demonstrations
python label_demos.py --exp test_bottle --object_class bottle --with_shelf

More information on collecting robot demonstrations can be found here.

Citing

If you find our paper or this code useful in your work, please cite our paper:

@article{simeonovdu2021ndf,
  title={Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation},
  author={Simeonov, Anthony and Du, Yilun and Tagliasacchi, Andrea and Tenenbaum, Joshua B. and Rodriguez, Alberto and Agrawal, Pulkit and Sitzmann, Vincent},
  journal={arXiv preprint arXiv:2112.05124},
  year={2021}
}

Acknowledgements

Parts of this code were built upon the implementations found in the occupancy networks repo and the vector neurons repo. Check out their projects as well!

ndf_robot's People

Contributors

anthonysimeonov


ndf_robot's Issues

Reproduce the DON experiment results shown in the paper

Hi, thanks for releasing the source code of this excellent work!

I am really interested in why your 2D-correspondence baseline based on DON performs so poorly on the placing task, as described in your paper.

Do you have any plan to release your baseline code with DON?

Thanks a million!

Generating dataset on new categories

Hi there!

NDF works impressively well on mugs, bowls, and bottles, so I'm curious how it will perform on categories with more complex shapes, such as airplanes or chairs.

According to your docs, what I need to train an NDF are generated point clouds and ground-truth occupancy values. I've successfully produced the point clouds via https://github.com/anthonysimeonov/ndf_robot/blob/master/src/ndf_robot/data_gen/shapenet_pcd_gen.py. However, it seems that the code to produce ground-truth occupancy values is not yet available. Is it possible that you could provide the code? Or could you maybe tell me the procedure for producing the occupancy values?

Thanks a lot!!!

Reconstruction performance of the pretrained weights.

I estimated the occupancy of the object point cloud in the demo data and got very small values.

I also estimated with the full object points and with dense grid coordinates (instead of the 1500 n_pts used above). The results were weird too. I just forwarded the centered object points through the model. Did I miss any steps?
I am using multi_category_weights.pth now, and I am curious about the reconstruction performance of the pretrained nets.

Sampling query points

Hello!

I want to use NDF, but instead of using the demonstrations to sample query points, I want to do it by manually specifying a point on the object. How should I approach this? Which file do I need to look at? Any guidance will be appreciated.

Demo alignment test with 'ndf_demo.py' leads to optimized points on the wrong side of the Mug

I was running the ndf_alignment checks on some mug demo objects using the ndf_demo.py file. For some runs, the alignment results show the optimum on the wrong side of the mug. Is this the expected behavior in the current state, or could there be a bug?

By "the wrong side of the mug" I mean, for example, that if the reference points are on the left side of the reference mug's handle, the optimized results end up on the right side of the new mug instance's handle.

Shelves intersect with bowls and bottles in the demo data

I visualized the demo data and found that the shelves intersect with the bowls and bottles. Are they supposed to be like this?

Is the demonstration in the teaser available? I think the rack in the teaser is more complicated than the rack in the simulation demos.

Obtaining a point cloud for the subject object

Hi!

In simulation, you used PyBullet's functionality to acquire segmentation; how did you acquire it on the real hardware?
Please let me know how to acquire only the point cloud of the target object with four cameras on a real robot.
It would be great if you could publish your method or code.
Best regards.

For real-world experiments, what dataset are the networks trained on?

The paper mentions training on a PyBullet-rendered dataset for the simulation experiments, but what about the real world? It seems to require a non-trivial amount of training data (100,000 objects for the simulation experiments). How is such a dataset collected for real-world training? Or is it sim-to-real (and is there any quantitative analysis of how the sim-to-real gap affects performance)?

What is T in DecoderInner

Hello @anthonysimeonov,

I am sorry to disturb you, but in line 241 of src/ndf_robot/model/vnn_occupancy_net_pointnet_dgcnn.py I do not understand what the T dimension of p corresponds to. My understanding is that p is a batch of 3D points, so we have batch_size = batch size, T = ?, D = 3.
Thank you,
Julien.

Question about the paper: the translation equivariance coupled with rotation equivariance

Hi, thanks for sharing this great and interesting work!

I'm a bit curious about how partial the observed input point cloud is. As I understand from the paper, translation equivariance is achieved by subtracting the centre of mass, but this really depends on how complete the point cloud is.
If it's too partial, the centre of mass will shift substantially from the actual object centre. Since translation and rotation equivariance are coupled, from the vector-neuron perspective the network is effectively learning the representation with centre-shift augmentation, which might lead to rotation equivariance error.

Thanks!

Using NDF for assembly task

Hi @anthonysimeonov ,
Congrats for your work, I really appreciate the idea.

I was wondering whether the descriptors could be used for a different but related task. I am working on 3D assembly (see the breaking-bad dataset for a visual explanation), which means trying to assemble two (or more) broken parts of an object.
The idea behind using NDF is that grasping is similar to assembly, because we have two complementary parts (the robot's grasp, or the broken parts of the object). It is also equivariant to rotation, which is great for assembly (if you rotate all the pieces, the solution is still valid).

So I was playing around with the code to do some experiments. I can get the latent vector and forward it to get the descriptors, but I am unsure how exactly to use them. Instead of sampling one random 'batch' (many points close to each other just outside the surface of the mesh), I am sampling around the whole object and creating descriptors, and now I want to match the descriptors from one broken part with the descriptors from the other broken part of the object.

Intuitively, I would expect the descriptors to be complementary (though I am not sure how to define 'complementary' mathematically in this case), but I see in Section II (Method) of the paper that Equation 11 says minimizing the difference between descriptors is the way to recover the transformation. However, trying to find correspondences in a "standard" way (I was using an example from TEASER++ registration) did not work.

To understand better, I was trying to investigate the concept of energy landscapes further, which looks very promising. Can you point me to some code (even part of this repo) to look at to get a better understanding of it?

So, a couple of questions:

  • If I have a descriptor on one point cloud (first broken part) and on a second point cloud (second broken part of the same object), and they belong to the same point, should the descriptors be the same or complementary?
  • How did you create the visualizations of the energy landscapes? Where should I look to create one of my own?
  • Do you think there is a way to use NDF for assembly?

Thanks a lot in advance
