RM3D

Official Repo of the Project - RM3D: Robust Data-Efficient 3D Scene Parsing via Traditional and Learnt 3D Descriptors-based Semantic Region Merging

Overview

Pseudo Labelling Results

Abstract of this work

This work presents a general and simple framework for point cloud understanding when labels are limited. Our first contribution is an extensive comparison of traditional and learnt 3D descriptors for weakly supervised 3D scene understanding, which validates that our adapted traditional PFH-based 3D descriptors generalize very well across different domains. Our second contribution is a learning-based region merging strategy built on the affinity provided by both the traditional/learnt 3D descriptors and the learnt semantics, so that the merging process takes both low-level geometric and high-level semantic feature correlations into account. Experimental results demonstrate that our framework achieves the best performance on three of the most important weakly supervised point cloud understanding tasks: semantic segmentation, instance segmentation, and object detection.
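To make the idea of affinity-based region merging more concrete, here is a minimal sketch. It is not the method from this repository: the merging strategy described above is learning-based, whereas this example uses a fixed cosine-similarity threshold on per-region descriptors as a stand-in. The function name `merge_regions`, the `threshold` parameter, and the input layout are all hypothetical and chosen for illustration only.

```python
import numpy as np


def merge_regions(descriptors, adjacency, threshold=0.9):
    """Merge adjacent regions whose descriptors are similar.

    Illustrative assumption: affinity is plain cosine similarity with a
    hand-picked threshold, not the learnt affinity used in RM3D.

    descriptors: (n_regions, dim) array, one descriptor per region.
    adjacency:   iterable of (i, j) index pairs of neighbouring regions.
    Returns a list with one merged-region label per input region.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    n = len(descriptors)
    parent = list(range(n))  # union-find forest over region indices

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    unit = descriptors / np.maximum(norms, 1e-12)

    for i, j in adjacency:
        if float(unit[i] @ unit[j]) >= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                # Keep the smaller index as the representative label.
                parent[max(ri, rj)] = min(ri, rj)

    return [find(i) for i in range(n)]


if __name__ == "__main__":
    desc = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
    print(merge_regions(desc, [(0, 1), (1, 2)]))
```

In this toy input, regions 0 and 1 have nearly parallel descriptors and are merged, while region 2 stays separate. A learnt affinity would replace the thresholded cosine similarity with a model prediction.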

Navigation

For the task of 3D Object Detection, please refer to RM3D_Det.

For the task of 3D Semantic Segmentation, please refer to RM3D_Sem_Seg.

For the task of 3D Instance Segmentation, please refer to RM3D_Ins_Seg.

Installation

Please refer to INSTALL.md for the installation of OpenPCDet.

3D Detection

Our codebase of 3D object detection is based on OpenPCDet.

OpenPCDet is a clear, simple, self-contained open source project for LiDAR-based 3D object detection.

It is also the official code release of [PointRCNN], [Part-A^2 net], [PV-RCNN] and [Voxel R-CNN].

Overview

Currently Supported Features

  • Support both one-stage and two-stage 3D object detection frameworks
  • Support distributed training & testing with multiple GPUs and multiple machines
  • Support multiple heads on different scales to detect different classes
  • Support stacked version set abstraction to encode varying numbers of points in different scenes
  • Support Adaptive Training Sample Selection (ATSS) for target assignment
  • Support RoI-aware point cloud pooling & RoI-grid point cloud pooling
  • Support GPU version 3D IoU calculation and rotated NMS

Model Zoo

KITTI 3D Object Detection Baselines

Selected supported methods are shown in the table below. Here we provide the pretrained models and report their 3D detection performance on the val set of the KITTI dataset.

  • All models are trained with 4 RTX 2080 Ti GPUs and are available for download.
  • The training time is measured with 4 2080 Ti GPUs and PyTorch 1.5.

Data Efficient Learning with 3% labels

| Method | Training time | Car@R11 | Pedestrian@R11 | Cyclist@R11 | Download |
|---|---|---|---|---|---|
| PointPillar | ~2.66 hours | 65.43 | 45.08 | 51.88 | model_PointPillar |
| SECOND | ~2.75 hours | 69.56 | 43.29 | 56.66 | model_SECOND |
| SECOND-IoU | - | 68.28 | 45.39 | 57.29 | model_SECOND-IoU |
| PointRCNN | ~5.67 hours | 64.70 | 46.62 | 62.16 | model_PointRCNN |
| PointRCNN-IoU | ~6.12 hours | 67.54 | 47.19 | 60.25 | model_PointRCNN-IoU |
| Part-A^2-Free | ~5.98 hours | 65.92 | 57.83 | 63.18 | model_Part-A^2-Free |
| Part-A^2-Anchor | ~7.87 hours | 69.22 | 50.79 | 58.17 | model_Part-A^2-Anchor |
| PV-RCNN | ~8.78 hours | 74.24 | 47.65 | 60.23 | model_PV-RCNN |
| Voxel R-CNN (Car) | ~3.87 hours | 76.23 | - | - | model_Voxel_R-CNN |
| CaDDN | ~19.83 hours | 19.34 | 11.86 | 8.17 | model_CaDDN |
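For readers unfamiliar with the `@R11` suffix in the table above, the sketch below shows one way to compute an 11-point interpolated average precision. It assumes that R11 refers to the KITTI-style AP sampled at 11 equally spaced recall positions. The function name `average_precision_r11` is hypothetical, and this is not the evaluation code used by this repository.

```python
import numpy as np


def average_precision_r11(recalls, precisions):
    """11-point interpolated AP over recall thresholds 0.0, 0.1, ..., 1.0.

    recalls, precisions: equal-length sequences describing points on a
    precision-recall curve. For each threshold, the best precision among
    points whose recall reaches that threshold is used (0 if none does).
    """
    recalls = np.asarray(recalls, dtype=float)
    precisions = np.asarray(precisions, dtype=float)

    total = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recalls >= t
        total += precisions[mask].max() if mask.any() else 0.0
    return total / 11.0


if __name__ == "__main__":
    # Two points on a toy precision-recall curve.
    print(average_precision_r11([0.55, 1.0], [1.0, 0.5]))
```

The values reported in the table are on a 0 to 100 scale, so a result from this sketch would need to be multiplied by 100 to be comparable.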

Waymo Open Dataset Baselines

We provide the setting of DATA_CONFIG.SAMPLED_INTERVAL on the Waymo Open Dataset (WOD) to subsample partial samples for training and evaluation, so you could also play with WOD by setting a smaller DATA_CONFIG.SAMPLED_INTERVAL even if you only have limited GPU resources.
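As a rough illustration of what interval-based subsampling means, the sketch below keeps every k-th frame from a list of frame records. It assumes that `DATA_CONFIG.SAMPLED_INTERVAL` acts as a stride over the frame list. The helper `subsample_frames` is hypothetical and is not taken from the OpenPCDet codebase.

```python
def subsample_frames(frame_infos, sampled_interval):
    """Keep every `sampled_interval`-th frame, starting from the first.

    frame_infos:      list of per-frame records (any type).
    sampled_interval: positive integer stride; 1 keeps every frame.
    """
    if sampled_interval < 1:
        raise ValueError("sampled_interval must be a positive integer")
    return frame_infos[::sampled_interval]


if __name__ == "__main__":
    frames = [f"frame_{i:04d}" for i in range(10)]
    print(subsample_frames(frames, 5))  # keeps frame_0000 and frame_0005
```

Under this assumption, the interval directly controls how many frames are loaded for training and evaluation, which is what makes it useful when GPU resources are limited.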

By default, all models are trained with 3% data (~4.8k frames) of all the training samples on 4 2080 Ti GPUs, and the results of each cell here are mAP/mAPH calculated by the official Waymo evaluation metrics on the whole validation set (version 1.2).

| Method | Vec_L1 | Vec_L2 | Ped_L1 | Ped_L2 | Cyc_L1 | Cyc_L2 |
|---|---|---|---|---|---|---|
| SECOND | 57.15/66.38 | 49.42/48.67 | 50.75/40.39 | 42.18/36.65 | 47.65/43.98 | 41.22/40.92 |
| Part-A^2-Anchor | 59.82/58.29 | 53.33/52.82 | 52.15/44.76 | 45.12/40.29 | 56.67/55.21 | 50.29/50.88 |
| PV-RCNN | 59.06/55.38 | 55.67/63.38 | 54.23/43.76 | 44.89/38.28 | 53.15/50.94 | 49.87/48.69 |

We cannot provide the above pretrained models due to the Waymo Dataset License Agreement, but you should be able to achieve similar performance by training with the default configs.

Other datasets

More datasets are on the way.

Quick Demo

Please refer to DEMO.md for a quick demo to test with a pretrained model and visualize the predicted results on your custom data or the original KITTI data.

Getting Started

Please refer to GETTING_STARTED.md to learn more about using this project.

License

RM3D is released under the MIT license.

Contact

For questions regarding the 3D semantic segmentation, 3D instance segmentation, and 3D object detection code of RM3D, please contact us by email ([email protected] or [email protected]).

If you find our work helpful, please feel free to give a star to this repo!
