The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation

This is the implementation of our unseen object instance segmentation network. Our instance segmentation algorithm utilizes a two-stage method to explicitly leverage the strengths of depth and RGB separately for stronger instance segmentation. Surprisingly, our framework is able to learn from synthetic RGB-D data where the RGB is non-photorealistic. Details of the algorithm can be found in our CoRL 2019 paper:

Christopher Xie, Yu Xiang, Arsalan Mousavian, Dieter Fox
The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation
In Proceedings of Conference on Robot Learning - CoRL, 2019

Installation

We highly recommend setting up a virtual environment using Anaconda and pip. Here is an example setup using these tools:

git clone https://github.com/chrisdxie/uois.git
cd uois/
conda env create -f env.yml

In env.yml, we set cudatoolkit=10.0 by default. Set the CUDA toolkit version to match the native CUDA version (in /usr/local/cuda/), since you must compile the Hough voting code with the corresponding CUDA compiler (nvcc is not provided with the conda cudatoolkit distribution). You can check the native version with: nvcc --version.

Note: As of 9/24/19, it seems that PyTorch does not support CUDA 10.1. Please use CUDA <= 10.0. The code has been tested with PyTorch 1.0 and 1.2.
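
The version check described above can be scripted. The following is a minimal sketch, not part of this repository: the helper names are hypothetical, and it assumes that nvcc --version prints a line containing "release X.Y" and that torch.version.cuda reports the CUDA version PyTorch was built with.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' release string from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def native_cuda_version():
    """Run `nvcc --version` if nvcc is on PATH and return its release string."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def pytorch_cuda_version():
    """Return the CUDA version PyTorch was built with, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


def same_major_minor(native, built):
    """Compare only the major.minor parts, e.g. '10.0' matches '10.0.130'."""
    return native.split(".")[:2] == built.split(".")[:2]


if __name__ == "__main__":
    native, built = native_cuda_version(), pytorch_cuda_version()
    print("nvcc release:", native)
    print("PyTorch CUDA:", built)
    if native and built and not same_major_minor(native, built):
        print("Mismatch: consider setting cudatoolkit in env.yml to", native)
```

Run this inside the activated uois environment. If the two versions differ, adjust cudatoolkit in env.yml and recreate the environment before compiling the Hough voting code.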

Hough Voting Layer

To install the Hough Voting layer (written in CUDA/C++), first install Eigen, a header-only C++ template library for linear algebra. You can install it from the GitHub source with cmake:

git clone https://github.com/eigenteam/eigen-git-mirror.git
cd eigen-git-mirror/
mkdir build/
cd build/
cmake ..
sudo make install

If you do not have sudo access, do not run sudo make install and simply edit setup.py to point to the directory where the Eigen headers live (e.g. <ROOT_DIR>/eigen-git-mirror/).

Note: The code has been tested with commit 4b2884.
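
If you are unsure where the Eigen headers ended up, a small helper can locate them before you edit setup.py. This is a sketch under assumptions: the function name and the candidate paths are hypothetical, and it relies only on the fact that an Eigen header directory contains the file Eigen/Core.

```python
import os


def find_eigen_include(candidates):
    """Return the first directory in `candidates` that contains Eigen/Core, else None."""
    for directory in candidates:
        expanded = os.path.expanduser(directory)
        if os.path.isfile(os.path.join(expanded, "Eigen", "Core")):
            return expanded
    return None


# Hypothetical search order: a local clone first, then common system-wide locations.
CANDIDATES = [
    "~/eigen-git-mirror",
    "/usr/local/include/eigen3",
    "/usr/include/eigen3",
]

if __name__ == "__main__":
    print("Eigen headers:", find_eigen_include(CANDIDATES))
```

The directory it prints is the one to reference in setup.py when Eigen was not installed system-wide.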

Next, run the following commands:

conda activate uois
cd <ROOT_DIR>/src/hough_voting/
python setup.py install

Make sure that cudatoolkit=... is set to the native CUDA version (in /usr/local/cuda/) which will be used to compile the CUDA code. Otherwise, the compiled CUDA/C++ code may not be compatible with the conda-installed PyTorch.

Models

You can find the models here. We provide a Depth Seeding Network (DSN) model trained on our synthetic Tabletop Object Dataset (TOD), a Region Refinement Network (RRN) model trained on TOD, and an RRN model trained on real data from the Google Open Images Dataset (OID).

How to run the network

See uois_example.ipynb for an example of how to run the network on example images. In order to run this file, Jupyter Notebook must be installed (this is included in env.yml). If you haven't used Jupyter Notebooks before, here is a tutorial to get you up to speed. This repository provides a few images in the example_images folder.

Notes:

  • Make sure to activate the Anaconda environment before running jupyter. This can be done with conda activate uois; jupyter notebook
  • The notebook should be run from the directory in which it lives (<ROOT_DIR>); otherwise, the file paths must be adjusted manually.
  • After downloading and unzipping the models, make sure to update checkpoint_dir in uois_example.ipynb to point to the directory where the models live.
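
The notes above can be checked with a few lines of Python before opening the notebook. This is only a sketch: the helper names are hypothetical, and the .pth/.pt extensions are an assumption about how the downloaded model files are named.

```python
import os


def in_repo_root(directory="."):
    """Return True if `directory` holds the example notebook and the example_images folder."""
    return (
        os.path.isfile(os.path.join(directory, "uois_example.ipynb"))
        and os.path.isdir(os.path.join(directory, "example_images"))
    )


def list_checkpoints(checkpoint_dir, extensions=(".pth", ".pt")):
    """Return sorted checkpoint filenames in `checkpoint_dir`; raise if the directory is missing."""
    if not os.path.isdir(checkpoint_dir):
        raise FileNotFoundError("checkpoint_dir does not exist: " + checkpoint_dir)
    # The extensions are an assumption; adjust them to match the unzipped files.
    return sorted(
        name for name in os.listdir(checkpoint_dir)
        if name.endswith(tuple(extensions))
    )


if __name__ == "__main__":
    print("Running from <ROOT_DIR>:", in_repo_root())
```

Call list_checkpoints with the same path you assign to checkpoint_dir in uois_example.ipynb. An empty list or a FileNotFoundError suggests the path needs to be corrected.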

Train the network

We provide sample training code in train_DSN.ipynb and train_RRN.ipynb. We have not released our dataset; however, we have included data loading and data augmentation code in this release to help users get started training the network on their data.

Citation

Our code is released under GPLv3.

If you find our work helpful in your research, please cite our work.

@inproceedings{xie2019uois,
author    = {Christopher Xie and Yu Xiang and Arsalan Mousavian and Dieter Fox},
title     = {The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation},
booktitle = {Conference on Robot Learning (CoRL)},
year      = {2019}
}
