
pseudo-lidar_e2e's Introduction

End-to-end Pseudo-LiDAR for Image-Based 3D Object Detection

This paper has been accepted by the Conference on Computer Vision and Pattern Recognition (CVPR) 2020.

End-to-end Pseudo-LiDAR for Image-Based 3D Object Detection

by Rui Qian*, Divyansh Garg*, Yan Wang*, Yurong You*, Serge Belongie, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger and Wei-Lun Chao

Citation

@inproceedings{qian2020end,
  title={End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection},
  author={Qian, Rui and Garg, Divyansh and Wang, Yan and You, Yurong and Belongie, Serge and Hariharan, Bharath and Campbell, Mark and Weinberger, Kilian Q and Chao, Wei-Lun},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5881--5890},
  year={2020}
}

Abstract

Reliable and accurate 3D object detection is a necessity for safe autonomous driving. Although LiDAR sensors can provide accurate 3D point cloud estimates of the environment, they are also prohibitively expensive for many settings. Recently, the introduction of pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras. PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs. However, so far these two networks have to be trained separately. In this paper, we introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end. The resulting framework is compatible with most state-of-the-art networks for both tasks and in combination with PointRCNN improves over PL consistently across all benchmarks --- yielding the highest entry on the KITTI image-based 3D object detection leaderboard at the time of submission.
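
As background for the conversion described above, the sketch below back-projects a depth map into a pseudo-LiDAR point cloud with the standard pinhole model; the function and parameter names (fx, fy, cx, cy) are illustrative and not this repo's API.

    import numpy as np

    def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
        # depth: (H, W) array of depths in meters; fx, fy, cx, cy: camera intrinsics.
        # Returns an (N, 3) point cloud in the camera frame (x right, y down, z forward).
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
        return points[points[:, 2] > 0]   # drop pixels with no valid depth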

Contents

Root
    | PIXOR
    | PointRCNN

We provide end-to-end modifications for a point-cloud-based detector (PointRCNN) and a voxel-based detector (PIXOR).

The PIXOR folder contains the implementation of Quantization as described in Section 3.1 of the paper, as well as our own implementation of PIXOR.
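
As a rough illustration of that idea (a sketch under simplified assumptions, not the code in the PIXOR folder): each point adds a radial-basis weight to the voxel it falls in, so the resulting occupancy grid stays differentiable with respect to the point coordinates produced by the depth network.

    import torch

    def soft_occupancy(points, grid_size, voxel_size, origin, sigma=0.1):
        # points: (N, 3) tensor inside the grid extent, with gradients flowing from the depth net.
        # grid_size: (D, H, W); voxel_size: scalar edge length; origin: (3,) grid corner.
        idx = torch.floor((points - origin) / voxel_size).long()            # hard voxel index
        centers = (idx.float() + 0.5) * voxel_size + origin                 # voxel centers
        w = torch.exp(-((points - centers) ** 2).sum(dim=1) / sigma ** 2)   # RBF weight
        D, H, W = grid_size
        flat = idx[:, 0] * H * W + idx[:, 1] * W + idx[:, 2]
        occ = torch.zeros(D * H * W, device=points.device)
        occ.index_add_(0, flat, w)                                          # soft point counting
        return occ.view(D, H, W)

In the paper each point also influences neighboring voxels; the single-voxel version above only shows why gradients can reach the depth estimate.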

The PointRCNN folder contains the implementation of Subsampling as described in Section 3.2 of the paper. It is developed on top of Shaoshuai Shi's codebase.
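
For the subsampling side, the repo's own sparsification lives in PointRCNN/lib/net/depth_net.py (see the tracebacks in the issues below); the hedged sketch here only illustrates the underlying idea of keeping the nearest point per angular bin, mimicking a LiDAR beam pattern.

    import numpy as np

    def angular_min_subsample(points, az_bins=512, el_bins=64):
        # points: (N, 3) array, x forward, y left, z up (LiDAR-style frame).
        # Keeps the nearest point in each (azimuth, elevation) bin.
        r = np.linalg.norm(points, axis=1)
        az = np.arctan2(points[:, 1], points[:, 0])
        el = np.arcsin(points[:, 2] / np.maximum(r, 1e-6))
        ai = np.clip(((az + np.pi) / (2 * np.pi) * az_bins).astype(int), 0, az_bins - 1)
        ei = np.clip(((el + np.pi / 2) / np.pi * el_bins).astype(int), 0, el_bins - 1)
        bins = ai * el_bins + ei
        order = np.argsort(r)                                  # nearest points first
        _, keep = np.unique(bins[order], return_index=True)    # first (nearest) hit per bin
        return points[order[keep]]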

Data Preparation

This repo is based on the KITTI dataset. Please download it and prepare the data in the same way as in Pseudo-LiDAR++; refer to its readme for more details.
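
For orientation only, the expected layout is roughly sketched below; the Pseudo-LiDAR++ readme is the authoritative reference and the exact folder names may differ. The depth_map folder holds the sparse depth ground truth used during fine-tuning (see the issues below).

    KITTI/
        training/
            calib/  image_2/  image_3/  label_2/  velodyne/
            depth_map/
        testing/
            calib/  image_2/  image_3/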

Training and evaluation

Please refer to each subfolder for details.

Questions

This repo is currently maintained by Rui Qian and Yurong You. Please feel free to ask any questions.

You can reach us by opening an issue or by email: [email protected], [email protected]


pseudo-lidar_e2e's Issues

code release?

Hi,

I really like this work, congratulations on the great results!
Do you plan to release the code soon? In particular, I'm very interested in the differentiable quantization/voxelization layer.

Thanks!
Z.

experiments with Argoverse

Hi,
This is amazing work, congratulations on the great results! Thank you very much for sharing all your research.
Do you plan to release your code for the experiments with Argoverse, and for converting the Argoverse dataset into KITTI format?
Thanks!

question about SDN fine-tune on KITTI

All the code mentioned below is from the PointRCNN folder.

I noticed in 'kitti_dataset.py' that the dataloader reads the depth ground truth from the "KITTI/training/depth_map" folder. If I understand correctly, we can generate this folder with the following command from the Pseudo_Lidar_V2 repo:
python ./src/preprocess/generate_depth_map.py --data_path path-to-KITTI/ --split_file ./split/trainval.txt

I did so and displayed the file named "001596.npy". The result is a sparse depth map (a rendering was attached to the original issue). I think this sparse depth map is used as ground truth to compute the "Depth loss" in Figure 3.

In Section 4.2 (Depth Estimation), you mention that the pre-trained SDN is refined on the KITTI dataset. Do you simply use such sparse depth maps as ground truth to refine the SDN? Does that mean only about 3%-4% of the depth values actually receive supervision?
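
(A general note rather than the authors' answer: supervising with a sparse, LiDAR-projected depth map usually just means masking the loss to the valid pixels, along the lines of this sketch.)

    import torch

    def sparse_depth_loss(pred, gt):
        # pred, gt: (B, H, W) depth maps; gt is zero where no LiDAR point projected.
        mask = gt > 0
        return torch.abs(pred[mask] - gt[mask]).mean()   # masked L1 over valid pixels only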

RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered

Hello!

I am trying to train PointRCNN and I followed all the steps mentioned in the README files. However, I get the error from the issue title (the full error, and the messages printed just before it, were attached as screenshots to the original issue).

I tried running the whole process on the CPU to get a more detailed error stack, but nothing happened: I waited for more than a day and the training had not even started.

Do you have any idea why this happens? I looked it up and I couldn't find any conclusive answer.
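
(A general PyTorch debugging note, not a repo-specific fix: device-side asserts like this usually produce a usable Python stack trace when CUDA kernels are launched synchronously, which is much faster to try than moving everything to the CPU.)

    CUDA_LAUNCH_BLOCKING=1 python train_rcnn_depth.py ...   # same arguments as before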

cuda out of memory

Can you please tell me what devices (and how much memory) you used for training the end-to-end PointRCNN? I'm using two 1080 Ti GPUs with 11178 MB each, but I still hit "CUDA out of memory" except when I use the "rpn" training mode.

RuntimeError: copy_if failed to synchronize: device-side assert triggered

Hi, I am interested in your work, but when I run the eval code of pseudo-LiDAR_e2e (PointRCNN) I encounter the following problem. Could you help me?

/opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda ->auto::operator()(int)->auto: block: [111,0,0], thread: [94,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda ->auto::operator()(int)->auto: block: [111,0,0], thread: [95,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Traceback (most recent call last):
File "eval_rcnn_depth.py", line 933, in
eval_single_ckpt(root_result_dir)
File "eval_rcnn_depth.py", line 789, in eval_single_ckpt
eval_one_epoch(model, depth_model, test_loader, epoch_id, root_result_dir, logger)
File "eval_rcnn_depth.py", line 707, in eval_one_epoch
ret_dict = eval_one_epoch_joint(model, depth_model, dataloader, epoch_id, result_dir, logger)
File "eval_rcnn_depth.py", line 500, in eval_one_epoch_joint
depth_loss, data = depth_model.eval(data, args.max_high)
File "/media/4T/zdl/pseudo_lidar_serize/pseudo-LiDAR_e2e/PointRCNN/tools/../lib/net/depth_net.py", line 192, in eval
cloud = transform(cloud, calib_info, sparse_type='angular_min', start=2.0)
File "/media/4T/zdl/pseudo_lidar_serize/pseudo-LiDAR_e2e/PointRCNN/tools/../lib/net/depth_net.py", line 252, in transform
points = nearest_sparse_angular(points, start)
File "/media/4T/zdl/pseudo_lidar_serize/pseudo-LiDAR_e2e/PointRCNN/tools/../lib/net/depth_net.py", line 381, in nearest_sparse_angular
sparse_points = depth_map[depth_map[:, 0] != -1.0]
RuntimeError: copy_if failed to synchronize: device-side assert triggered
eval: 0%|

CoR on BEV Based 3D Detection models

Hello :D
How can I use the CoR approach with BEV-based 3D object detectors such as Complex-YOLO or SFA3D?

Should I use quantization, subsampling, or something else?

Pseudo-Lidar Generated point cloud

When I generate a point cloud using the code provided in PointRCNN or PIXOR, the point cloud is somehow distorted and the labels are not aligned with the objects. In Pseudo-LiDAR++ there was GDC (I think, to correct the point cloud), but in Pseudo-LiDAR E2E there is no such correction code. What is happening? Can someone help and explain this to me?

How to view APbev/AP3D in the results

Why is there no AP3D in this “outputs_07.txt” file?

Thank you for participating in our evaluation!
Loading detections...
number of files for evaluation: 3769
done.
save ./saves/pixor_e2e/predicted_label_eval/plot/car_detection.txt
car_detection AP: 0.000000 0.000000 0.000000
Finished 2D bounding box eval.
Going to eval ground for class: car
save ./saves/pixor_e2e/predicted_label_eval/plot/car_detection_ground.txt
car_detection_ground AP: 77.831551 59.772270 54.277523
Finished Birdeye eval.
Finished 3D bounding box eval.
Your evaluation results are available at:
./saves/pixor_e2e/predicted_label_eval

Different mAP results

We used the provided evaluation script and pretrained weights, and the results were about 5% lower than those reported in the paper. We also noticed that these results are the same as PL++. Are the provided weights for PL++ or for PL-E2E?

2021-07-11 17:31:13,277   INFO  Car AP@0.70, 0.70, 0.70:
bbox AP:95.5236, 85.4079, 78.3538
bev  AP:82.0941, 62.9715, 57.2632
3d   AP:67.8891, 50.4661, 45.3872
aos  AP:94.07, 83.04, 75.91
Car AP@0.70, 0.50, 0.50:
bbox AP:95.5236, 85.4079, 78.3538
bev  AP:89.8431, 83.7960, 77.5032
3d   AP:89.6241, 78.5735, 75.0044
aos  AP:94.07, 83.04, 75.91

code of RBF and subsampling

Hi, I am confused about where the code for quantization and subsampling is. Can you point me to the exact paths?

error while training

Traceback (most recent call last):
File "/content/pseudo-LiDAR_e2e/PointRCNN/tools/train_rcnn_depth.py", line 336, in
lr_scheduler_each_iter=(cfg.TRAIN.OPTIMIZER == 'adam_onecycle')
File "/content/pseudo-LiDAR_e2e/PointRCNN/tools/../tools/train_utils/train_utils.py", line 212, in train
loss, tb_dict, disp_dict = self._train_it(batch)
File "/content/pseudo-LiDAR_e2e/PointRCNN/tools/../tools/train_utils/train_utils.py", line 143, in _train_it
loss.backward()
File "/usr/local/lib/python3.7/dist-packages/torch/tensor.py", line 102, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/usr/local/lib/python3.7/dist-packages/torch/autograd/init.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
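
(Again a general note rather than a confirmed fix for this repo: PyTorch's anomaly detection reports which forward operation produced the tensor that was later modified in place, which usually narrows this error down quickly.)

    import torch
    torch.autograd.set_detect_anomaly(True)   # enable before the training loop; backward()
                                              # will then name the offending forward op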

CUDA Version

Hello, thank you for your work. I encountered a problem when stepping into pointnet2.furthest_point_sampling_wrapper: it returned all zeros. I think it might be caused by CUDA. I'd appreciate it if you could help me.

KeyError: 'depth_state_dict' when trying to run PIXOR

Hello,

I downloaded both pixor_checkpoint.pth.tar and depth_checkpoint.pth.tar and followed the documentation in the PIXOR folder.

When I run the evaluation like this:

CUDA_VISIBLE_DEVICES=0 python3 train_pixor_e2e.py -c configs/fusion.cfg --mode eval --eval_ckpt pixor_checkpoint.pth.tar

I get the following error:

[20-08-27 08:47:40 pixor_e2e] => Evaluation mode
[20-08-27 08:47:44 pixor_e2e] Finish cuda loading in 2.870 s
[20-08-27 08:47:45 pixor_e2e] => loading checkpoint 'pixor_checkpoint.pth.tar'
Traceback (most recent call last):
  File "train_pixor_e2e.py", line 823, in <module>
    depth_model.load_state_dict(checkpoint['depth_state_dict'])
KeyError: 'depth_state_dict'

I added the pre-trained depth model to the run command:

CUDA_VISIBLE_DEVICES=0 python3 train_pixor_e2e.py -c configs/fusion.cfg --mode eval --eval_ckpt pixor_checkpoint.pth.tar --depth_pretrain depth_checkpoint.pth.tar

I still get the same error.

I checked the keys inside each weight file.

For pixor_checkpoint.pth.tar:
dict_keys(['state_dict', 'optimizer', 'scheduler', 'epoch'])

For depth_checkpoint.pth.tar:
dict_keys(['epoch', 'arch', 'state_dict', 'best_RMSE', 'scheduler', 'optimizer'])

Neither of them has the 'depth_state_dict' key.
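
(A hedged workaround sketch, based only on the keys listed above: if the released depth checkpoint stores its weights under 'state_dict', they can be loaded manually; whether these are the weights train_pixor_e2e.py expects under 'depth_state_dict' is an assumption.)

    import torch

    ckpt = torch.load('depth_checkpoint.pth.tar', map_location='cpu')
    print(ckpt.keys())                                  # confirm which keys actually exist
    depth_model.load_state_dict(ckpt['state_dict'])     # assumes 'state_dict' holds the depth weights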

How to get depth data

Thank you for this great work! Can you please clarify how to get the depth data corresponding to the KITTI Object data? It seems that the KITTI official website does not offer depth data for the KITTI Object split.

Trained model?

Would you mind uploading the trained depth model and the 3D detection models for PIXOR and PointRCNN via Google Drive or some other way?

How can I apply this method to get a sparse tensor?

@mileyan Thanks for sharing. I can get a dense 3D voxel tensor (a DxHxW grid) with the quantization method, but many detectors nowadays are designed for sparse tensors, which are the input of spconv. How can I apply your quantization method to get a sparse tensor (voxels, coordinates)? Is it possible?
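
(Not an official answer, just a sketch of the usual conversion: the (features, coordinates) pair that spconv-style detectors consume can be read off the non-zero cells of a dense grid, though the thresholding step below is itself not differentiable.)

    import torch

    def dense_to_sparse(voxels, threshold=0.0):
        # voxels: (D, H, W) dense occupancy grid.
        coords = torch.nonzero(voxels > threshold)                             # (M, 3) voxel indices
        feats = voxels[coords[:, 0], coords[:, 1], coords[:, 2]].unsqueeze(1)  # (M, 1) features
        return feats, coords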

error when running PRCNN command (--gt_database path)

Hello, I am trying to re-train the PointRCNN model using the following command:

train_rcnn_depth.py --gt_database "" --cfg_file cfgs/e2e.yaml --batch_size 4 --train_mode end2end --ckpt_save_interval 1 --ckpt ../PointRCNN.pth --epochs 10 --mgpus --finetune

But I get an error saying that an argument must be passed to "--gt_database":

python -u /pseudo-LiDAR_e2e/PointRCNN/tools/train_rcnn_depth.py --gt_database "" --cfg_file cfgs/e2e.yaml --batch_size 4 --train_mode end2end --ckpt_save_interval 1 --ckpt ../PointRCNN.pth --epochs 10 --mgpus --finetune
usage: train_rcnn_depth.py [-h] [--cfg_file CFG_FILE] --train_mode TRAIN_MODE
                           --batch_size BATCH_SIZE --epochs EPOCHS
                           [--workers WORKERS]
                           [--ckpt_save_interval CKPT_SAVE_INTERVAL]
                           [--output_dir OUTPUT_DIR] [--mgpus] [--ckpt CKPT]
                           [--rpn_ckpt RPN_CKPT] [--depth_ckpt DEPTH_CKPT]
                           [--gt_database GT_DATABASE]
                           [--rcnn_training_roi_dir RCNN_TRAINING_ROI_DIR]
                           [--rcnn_training_feature_dir RCNN_TRAINING_FEATURE_DIR]
                           [--train_with_eval]
                           [--rcnn_eval_roi_dir RCNN_EVAL_ROI_DIR]
                           [--rcnn_eval_feature_dir RCNN_EVAL_FEATURE_DIR]
                           [--pseudo_lidar] [--finetune] [--fix_bn]
                           [--max_high MAX_HIGH]
train_rcnn_depth.py: error: argument --gt_database: expected one argument

Process finished with exit code 2

I believe the gt_database folder contains a ground-truth database generated for data augmentation and is not used in this PointRCNN implementation. How can I solve this?
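
(One hedged observation rather than a definitive fix: in the usage message above --gt_database appears in brackets, i.e. it is optional, and argparse prints "expected one argument" when the empty string after the flag gets lost, which some IDE run configurations do. Quoting the empty string in a real shell, or simply dropping the flag, may get past the parser:)

    python train_rcnn_depth.py --cfg_file cfgs/e2e.yaml --batch_size 4 --train_mode end2end --ckpt_save_interval 1 --ckpt ../PointRCNN.pth --epochs 10 --mgpus --finetune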

About code and image fusion

Thank you for your contribution! I have several questions about the code and the paper.
Your code contains image-based fusion in PIXOR, which is not mentioned in the paper.
I wonder whether the experimental results in the paper were produced with this implementation?
Or have you tried vanilla PIXOR with the method? Does it still work?

Implementation of bev_index, img_index

Hi, thank you for your great work.
I want to ask about the bev_index and img_index variables in your PIXOR implementation (bev_index, img_index = fusion.Fusion(depth_map, calib, shift)). I saw that you use bev_index and img_index to map between the image features and the point cloud features in the forward() of PixorNet_Fusion. However, these variables are generated from the ground-truth depth map, even during evaluation. I regenerated them using the estimated depth map and the 3D detection performance is quite low. Could you check and clarify this problem?

missing Lib file in PointRCNN

Thanks for your amazing work! I tried to run your code, but I ran into some trouble. Is the code in PointRCNN complete?

Package Version not specified

Hello, thank you for your work. I am interested in trying your code, but I ran into some problems while installing the dependencies, especially the torch_scatter package.

My problem is:
It always reports a conflict due to incompatible versions. I have already tried installing the packages in separate environments using the different Python versions listed below:

  1. Python version 3.8
  2. Python version 3.6.10
  3. Python version 3.7.7
  4. Python version 3.5.6

Given this issue, may I know which Python version you used? And could you provide the versions of the packages you used? Thank you very much; I would really appreciate your answer.

(A screenshot of the torch_scatter installation failure under Python 3.5.6 was attached to the original issue.)

About the location of the code

May I ask where the 'Change of Representation' module from the paper is implemented in e2e (RCNN)?
Where is the interface between the point cloud and the 3D object detection network in the code (e2e_RCNN)?
I can't find it at the moment. Can you give me a hint?
Thank you very much!
