Focal Sparse Convolutional Networks for 3D Object Detection (CVPR 2022, Oral)

This is the official implementation of Focals Conv (CVPR 2022), a new sparse convolution design for 3D object detection that applies to both LiDAR-only and multi-modal settings. For more details, please refer to:

Focal Sparse Convolutional Networks for 3D Object Detection [Paper]
Yukang Chen, Yanwei Li, Xiangyu Zhang, Jian Sun, Jiaya Jia

News

  • [2022-08-24] The code and an example for test-time augmentations have been released here.
  • [2022-07-05] The code for Focals Conv has been merged into the official OpenPCDet codebase.
  • [2022-06-21] Another 3D backbone network design, LargeKernel3D, has been released [Paper | Github].

Experimental results

KITTI dataset

Model                                         Car@R11   Car@R40   Download
PV-RCNN + Focals Conv                         83.91     85.20     Google | Baidu (key: m15b)
PV-RCNN + Focals Conv (multimodal)            84.58     85.34     Google | Baidu (key: ie6n)
Voxel R-CNN (Car) + Focals Conv (multimodal)  85.68     86.00     Google | Baidu (key: tnw9)

nuScenes dataset

Model                                               mAP     NDS     Download
CenterPoint + Focals Conv (multi-modal)             63.86   69.41   Google | Baidu (key: 01jh)
CenterPoint + Focals Conv (multi-modal) - 1/4 data  62.15   67.45   Google | Baidu (key: 6qsc)

Visualization of voxel distribution of Focals Conv on KITTI val dataset:

Getting Started

Installation

a. Clone this repository

git clone https://github.com/dvlab-research/FocalsConv && cd FocalsConv

b. Install the environment

Follow the installation documents of the OpenPCDet or CenterPoint codebase, depending on which one you plan to use.

*spconv 2.x is highly recommended over spconv 1.x.

c. Prepare the datasets.

Download and organize the official KITTI and Waymo datasets following the OpenPCDet documentation, and the nuScenes dataset following the CenterPoint codebase.

*Note that for the nuScenes dataset, we use image-level gt-sampling (copy-paste) in multi-modal training. Please download this dbinfos_train_10sweeps_withvelo.pkl to replace the original one. (Google | Baidu (key: b466))

*Note that for the nuScenes dataset, we conduct ablation studies on a 1/4 data training split. Please download infos_train_mini_1_4_10sweeps_withvelo_filter_True.pkl if you need it for training. (Google | Baidu (key: 769e))
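
Before training, it can help to confirm that the two files named in the notes above are where you expect them. The following is a minimal sketch, not part of the repository: the helper `check_info_files` and the data root `data/nuScenes` (overridable through `NUSC_ROOT`) are hypothetical, so adjust the path to the layout described in the CenterPoint documentation.

```shell
#!/usr/bin/env bash
# check_info_files DIR FILE...
# Prints the status of each file and returns non-zero if any is missing.
check_info_files() {
  local dir="$1"; shift
  local missing=0
  local f
  for f in "$@"; do
    if [ -f "$dir/$f" ]; then
      echo "found:   $dir/$f"
    else
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  return "$missing"
}

# NUSC_ROOT is a hypothetical location; change it to match your setup.
NUSC_ROOT="${NUSC_ROOT:-data/nuScenes}"
check_info_files "$NUSC_ROOT" \
  dbinfos_train_10sweeps_withvelo.pkl \
  infos_train_mini_1_4_10sweeps_withvelo_filter_True.pkl \
  || echo "Some files are missing; download them from the links in the notes above."
```

The second file is only needed if you train on the 1/4 data split, so a "missing" line for it is harmless otherwise.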

d. Download pre-trained models.

If you want to directly evaluate the trained models we provide, please download them first.

If you want to train the models yourself in the multi-modal settings, please first download the pre-trained ResNet model, torchvision-res50-deeplabv3.

Evaluation

We provide trained weight files, so you can evaluate them directly. You can also evaluate a model you trained yourself.

For models in OpenPCDet,

NUM_GPUS=8
cd tools 
bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/voxel_rcnn_car_focal_multimodal.yaml --ckpt path/to/voxelrcnn_focal_multimodal.pth

bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/pv_rcnn_focal_multimodal.yaml --ckpt ../pvrcnn_focal_multimodal.pth

bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/pv_rcnn_focal_lidar.yaml --ckpt path/to/pvrcnn_focal_lidar.pth
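
The three OpenPCDet commands above differ only in the config name and the checkpoint file, so they can be driven from one loop. The following is a minimal sketch under stated assumptions: `CKPT_DIR` (a single directory holding all three checkpoints), `DRY_RUN`, and the helper `eval_cmd` are illustrative names and not part of the repository. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Run from the tools/ directory, as in the commands above.
NUM_GPUS="${NUM_GPUS:-8}"
CKPT_DIR="${CKPT_DIR:-path/to}"   # hypothetical checkpoint directory
DRY_RUN="${DRY_RUN:-1}"           # 1 = only print the commands

# "config name:checkpoint file" pairs, taken from the commands above.
MODELS="
voxel_rcnn_car_focal_multimodal:voxelrcnn_focal_multimodal.pth
pv_rcnn_focal_multimodal:pvrcnn_focal_multimodal.pth
pv_rcnn_focal_lidar:pvrcnn_focal_lidar.pth
"

# Build the evaluation command for one "config:checkpoint" pair.
eval_cmd() {
  local cfg="${1%%:*}"
  local ckpt="${1##*:}"
  echo "bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/${cfg}.yaml --ckpt ${CKPT_DIR}/${ckpt}"
}

for m in $MODELS; do
  cmd="$(eval_cmd "$m")"
  echo "$cmd"
  if [ "$DRY_RUN" != "1" ]; then
    # Word splitting is intended here; it assumes paths without spaces.
    $cmd
  fi
done
```

Setting `DRY_RUN=0` runs each evaluation in turn; keeping the default lets you review the exact commands first.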

For models in CenterPoint,

CONFIG="nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal"
python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} ./tools/dist_test.py configs/nusc/voxelnet/$CONFIG.py --work_dir ./work_dirs/$CONFIG --checkpoint centerpoint_focal_multimodal.pth

Training

For configs in OpenPCDet,

bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/CONFIG.yaml

For configs in CenterPoint,

python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} ./tools/train.py configs/nusc/voxelnet/$CONFIG.py --work_dir ./work_dirs/$CONFIG

  • Note that we use 8 GPUs to train OpenPCDet models and 4 GPUs to train CenterPoint models.
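
The CenterPoint training command depends on `CONFIG` being set, which is easy to forget in a fresh shell. As a small sketch (the helper `train_cmd` is hypothetical, not part of the repository), the command can be built with a guard that fails early when `CONFIG` is unset and a default of 4 GPUs, matching the note above:

```shell
#!/usr/bin/env bash
# Build the CenterPoint training command; abort if CONFIG is unset or empty.
train_cmd() {
  : "${CONFIG:?set CONFIG to a config name under configs/nusc/voxelnet}"
  local gpus="${NUM_GPUS:-4}"   # 4 GPUs for CenterPoint, per the note above
  echo "python -m torch.distributed.launch --nproc_per_node=${gpus} ./tools/train.py configs/nusc/voxelnet/${CONFIG}.py --work_dir ./work_dirs/${CONFIG}"
}

# Config name taken from the evaluation section.
CONFIG="nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal"
train_cmd
```

The function only prints the command, so you can inspect it and then run it from the CenterPoint root.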

TODO List

    • Config files and scripts for the test augs (double-flip and rotation) in nuScenes test submission.
    • Results and models of Focals Conv Networks on 3D Segmentation datasets.

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{focalsconv-chen,
  title={Focal Sparse Convolutional Networks for 3D Object Detection},
  author={Chen, Yukang and Li, Yanwei and Zhang, Xiangyu and Sun, Jian and Jia, Jiaya},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2022}
}

Acknowledgement

  • This work builds on OpenPCDet and CenterPoint. Please refer to their official GitHub repositories for more information.

  • This README follows the style of IA-SSD.

License

This project is released under the Apache 2.0 license.

Related Repos

  1. spconv
  2. Deformable Conv
  3. Submanifold Sparse Conv
