pseudo_lidar's Introduction

Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving

This paper has been accepted by the Conference on Computer Vision and Pattern Recognition (CVPR) 2019.

Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving

by Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell and Kilian Q. Weinberger

Citation

@inproceedings{wang2019pseudo,
  title={Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving},
  author={Wang, Yan and Chao, Wei-Lun and Garg, Divyansh and Hariharan, Bharath and Campbell, Mark and Weinberger, Kilian},
  booktitle={CVPR},
  year={2019}
}

Update

  • 2nd July 2020: Added a Jupyter notebook to visualize the point clouds. It is in the ./visualization folder.
  • 29th July 2019: submission.py now saves the disparity as a numpy (.npy) file instead of a png file, and generate_lidar.py has been fixed accordingly.
  • We have slightly modified the official AVOD so that you can directly train and test pseudo-LiDAR with AVOD. Please check the code at https://github.com/mileyan/avod_pl.

Contents

Introduction

3D object detection is an essential task in autonomous driving. Recent techniques excel with highly accurate detection rates, provided the 3D input data is obtained from precise but expensive LiDAR technology. Approaches based on cheaper monocular or stereo imagery data have, until now, resulted in drastically lower accuracies --- a gap that is commonly attributed to poor image-based depth estimation. However, in this paper we argue that data representation (rather than its quality) accounts for the majority of the difference. Taking the inner workings of convolutional neural networks into consideration, we propose to convert image-based depth maps to pseudo-LiDAR representations --- essentially mimicking LiDAR signal. With this representation we can apply different existing LiDAR-based detection algorithms. On the popular KITTI benchmark, our approach achieves impressive improvements over the existing state-of-the-art in image-based performance --- raising the detection accuracy of objects within 30m range from the previous state-of-the-art of 22% to an unprecedented 74%. At the time of submission our algorithm holds the highest entry on the KITTI 3D object detection leaderboard for stereo image based approaches.

Usage

1. Overview

We provide guidance and code to train the stereo depth estimator and the 3D object detector on the KITTI object detection benchmark, as well as our pre-trained models.

2. Stereo depth estimation models

We provide our pre-trained PSMNet model, trained using the Scene Flow dataset and the 3,712 training images of the KITTI detection benchmark.

We also directly provide the pseudo-LiDAR point clouds and the ground planes of training and testing images estimated by this pre-trained model.

We also provide code to train your own stereo depth estimator and to prepare the point clouds and ground planes. If you want to use our pseudo-LiDAR data for 3D object detection, you may skip the following contents and move directly on to the object detection models.

2.1 Dependencies

  • Python 3.5+
  • numpy, scikit-learn, scipy
  • KITTI 3D object detection dataset

2.2 Download the dataset

You need to download the KITTI dataset from here, including left and right color images, Velodyne point clouds, camera calibration matrices, and training labels. You also need to download the image set files from here. Then you need to organize the data in the following way.

KITTI/object/
    
    train.txt
    val.txt
    test.txt 
    
    training/
        calib/
        image_2/ #left image
        image_3/ #right image
        label_2/
        velodyne/ 

    testing/
        calib/
        image_2/
        image_3/
        velodyne/

The Velodyne point clouds (by LiDAR) are used ONLY as the ground truths to train a stereo depth estimator (e.g., PSMNet).
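
Before preprocessing, it can save time to verify that every frame index listed in the split files actually has its calibration, stereo images, and Velodyne scan. Below is a minimal sanity-check sketch; it is not part of this repo, and the function name and paths are illustrative.

    import os

    def check_kitti_split(root, split_file):
        """Verify that every frame index in the split file has calib, stereo images, and a Velodyne scan."""
        with open(split_file) as f:
            indices = [line.strip() for line in f if line.strip()]
        missing = []
        for idx in indices:
            for rel in ('calib/%s.txt', 'image_2/%s.png', 'image_3/%s.png', 'velodyne/%s.bin'):
                path = os.path.join(root, 'training', rel % idx)
                if not os.path.isfile(path):
                    missing.append(path)
        print('checked %d frames, %d missing files' % (len(indices), len(missing)))
        return missing

    # Example: check_kitti_split('./KITTI/object', './KITTI/object/train.txt')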

2.3 Generate ground-truth image disparities

Use the script ./preprocessing/generate_disp.py to process all Velodyne files listed in train.txt; the resulting disparities are our training ground truth. Alternatively, you can download them directly from the disparity link. Name this folder disparity and put it inside the training folder.

python generate_disp.py --data_path ./KITTI/object/training/ --split_file ./KITTI/object/train.txt 
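
For reference, the ground-truth disparity is obtained by projecting each LiDAR point into the left image with the calibration and converting its depth z to a disparity d = f_u * b / z, where f_u is the horizontal focal length and b is the stereo baseline (roughly 0.54 m for KITTI). The sketch below illustrates this idea; it is not the repo's generate_disp.py, and it assumes the calibration matrices have already been parsed and padded to 4x4 where needed.

    import numpy as np

    def lidar_to_sparse_disparity(points, P2, R0_4x4, Tr_4x4, baseline, h, w):
        """Project LiDAR points into the left image; write disparity = f_u * baseline / depth."""
        pts = points[points[:, 0] > 0, :3]                      # keep points in front of the car
        pts_h = np.hstack([pts, np.ones((len(pts), 1))]).T      # 4 x N homogeneous coordinates
        cam = R0_4x4 @ Tr_4x4 @ pts_h                           # LiDAR frame -> rectified camera frame
        proj = P2 @ cam                                         # 3 x N projection onto the image plane
        u = np.round(proj[0] / proj[2]).astype(int)
        v = np.round(proj[1] / proj[2]).astype(int)
        depth = cam[2]
        disp = np.zeros((h, w), dtype=np.float32)
        valid = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (depth > 0)
        disp[v[valid], u[valid]] = P2[0, 0] * baseline / depth[valid]
        return disp                                             # pixels without a LiDAR return stay 0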

2.4 Train the stereo model

You can train any stereo disparity model you want. Here we give an example of training PSMNet. The modified code is saved in the subfolder psmnet. Make sure you follow the README inside this folder to install the correct Python version and libraries. We strongly suggest using conda environments to organize the Python setups, since different Python versions are used at different stages. Download the PSMNet model pre-trained on the Scene Flow dataset from here.

# train psmnet with 4 TITAN X GPUs.
python ./psmnet/finetune_3d.py --maxdisp 192 \
     --model stackhourglass \
     --datapath ./KITTI/object/training/ \
     --split_file ./KITTI/object/train.txt \
     --epochs 300 \
     --lr_scale 50 \
     --loadmodel ./pretrained_sceneflow.tar \
     --savemodel ./psmnet/kitti_3d/  --btrain 12
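
For reference, PSMNet-style finetuning supervises the predicted disparity with a smooth L1 loss computed only over pixels that have a valid ground-truth value, since the LiDAR-derived ground truth from step 2.3 is sparse. Below is a minimal sketch of such a masked loss; it is an illustration in PyTorch, not copied from finetune_3d.py.

    import torch
    import torch.nn.functional as F

    def masked_disparity_loss(pred, gt, max_disp=192):
        """Smooth L1 loss over pixels that have a valid (sparse, LiDAR-derived) ground-truth disparity."""
        mask = (gt > 0) & (gt < max_disp)    # pixels without a LiDAR return are zero and are ignored
        if mask.sum() == 0:
            return pred.sum() * 0.0          # no supervision available for this crop
        return F.smooth_l1_loss(pred[mask], gt[mask])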

2.5 Predict the point clouds

Predict the disparities.

# training
python ./psmnet/submission.py \
    --loadmodel ./psmnet/kitti_3d/finetune_300.tar \
    --datapath ./KITTI/object/training/ \
    --save_path ./KITTI/object/training/predict_disparity
# testing
python ./psmnet/submission.py \
    --loadmodel ./psmnet/kitti_3d/finetune_300.tar \
    --datapath ./KITTI/object/testing/ \
    --save_path ./KITTI/object/testing/predict_disparity
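
Each predicted disparity is saved as a numpy array (see the update note above). Before converting the disparities to point clouds, a quick sanity check of one prediction can be useful; the file name below is illustrative and matplotlib is assumed to be installed.

    import numpy as np
    import matplotlib.pyplot as plt

    disp = np.load('./KITTI/object/training/predict_disparity/000000.npy')  # adjust to the actual file name
    print(disp.shape, disp.dtype, disp.min(), disp.max())  # values should be positive and below maxdisp
    plt.imshow(disp, cmap='plasma')
    plt.colorbar(label='disparity (pixels)')
    plt.show()
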
Convert the disparities to point clouds.

# training
python ./preprocessing/generate_lidar.py  \
    --calib_dir ./KITTI/object/training/calib/ \
    --save_dir ./KITTI/object/training/pseudo-lidar_velodyne/ \
    --disparity_dir ./KITTI/object/training/predict_disparity \
    --max_high 1
# testing
python ./preprocessing/generate_lidar.py  \
    --calib_dir ./KITTI/object/testing/calib/ \
    --save_dir ./KITTI/object/testing/pseudo-lidar_velodyne/ \
    --disparity_dir ./KITTI/object/testing/predict_disparity \
    --max_high 1

If you want to generate the point cloud from a depth map (e.g., from DORN) instead of a disparity map, you can add --is_depth to the command.
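
Conceptually, generate_lidar.py turns every valid pixel into a 3D point: the disparity is converted to depth via depth = f_u * baseline / disparity, the pixel is back-projected through the camera intrinsics, the points are transformed into the Velodyne frame, and points higher than --max_high metres are discarded. Below is a simplified sketch of that conversion, not the repo's exact implementation; the camera-to-Velodyne transform C2V_4x4 and the calibration handling are assumptions.

    import numpy as np

    def disparity_to_pseudo_lidar(disp, P2, C2V_4x4, baseline=0.54, max_high=1.0):
        """Back-project a dense disparity map into a pseudo-LiDAR point cloud (Velodyne frame)."""
        f_u, f_v = P2[0, 0], P2[1, 1]
        c_u, c_v = P2[0, 2], P2[1, 2]
        rows, cols = np.indices(disp.shape)
        mask = disp > 0
        depth = f_u * baseline / disp[mask]                     # disparity -> metric depth
        x = (cols[mask] - c_u) * depth / f_u                    # back-project through the intrinsics
        y = (rows[mask] - c_v) * depth / f_v
        pts_cam = np.stack([x, y, depth, np.ones_like(depth)])  # 4 x N homogeneous camera-frame points
        pts_velo = (C2V_4x4 @ pts_cam)[:3].T                    # rectified camera frame -> Velodyne frame
        pts_velo = pts_velo[pts_velo[:, 2] < max_high]          # drop points more than max_high m above the sensor
        ones = np.ones((len(pts_velo), 1), dtype=np.float32)    # pseudo-LiDAR has no real reflectance
        return np.hstack([pts_velo, ones]).astype(np.float32)   # x, y, z, reflectance as in KITTI .bin files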

2.6 Generate ground plane

If you want to train an AVOD model for 3D object detection, you need to generate ground planes from the pseudo-LiDAR point clouds.

# training
python ./preprocessing/kitti_process_RANSAC.py \
    --calib ./KITTI/object/training/calib/ \
    --lidar_dir ./KITTI/object/training/pseudo-lidar_velodyne/ \
    --planes_dir ./KITTI/object/training/pseudo-lidar_planes/
# testing
python ./preprocessing/kitti_process_RANSAC.py \
    --calib ./KITTI/object/testing/calib/ \
    --lidar_dir ./KITTI/object/testing/pseudo-lidar_velodyne/ \
    --planes_dir ./KITTI/object/testing/pseudo-lidar_planes/
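
For reference, the ground plane of each frame is estimated by fitting a plane to the pseudo-LiDAR points with RANSAC. Below is a minimal sketch of such a fit using scikit-learn (already in the dependency list); it is an illustration only and does not reproduce the coordinate frame or the plane-file format that AVOD expects.

    import numpy as np
    from sklearn.linear_model import RANSACRegressor

    def fit_ground_plane(pseudo_lidar_xyz, residual_threshold=0.05):
        """Fit the ground as z = a*x + b*y + c with RANSAC; return (a, b, -1, c) for a*x + b*y - z + c = 0."""
        z = pseudo_lidar_xyz[:, 2]
        ground = pseudo_lidar_xyz[z < np.percentile(z, 30)]  # heuristic: treat the lowest points as ground candidates
        ransac = RANSACRegressor(residual_threshold=residual_threshold)  # default base model is linear regression
        ransac.fit(ground[:, :2], ground[:, 2])
        a, b = ransac.estimator_.coef_
        c = ransac.estimator_.intercept_
        return np.array([a, b, -1.0, c])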

3. Object Detection models

AVOD model

Download the code from https://github.com/kujason/avod and install the Python dependencies.

Follow their README to prepare the data and then replace (1) files in velodyne with those in pseudo-lidar_velodyne and (2) files in planes with those in pseudo-lidar_planes. Note that you should still keep the folder names as velodyne and planes.
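
One convenient, optional way to do this replacement without copying files is to rename the real folders and symlink the pseudo-LiDAR outputs under the names AVOD expects. A small sketch follows; the paths are illustrative, and you should keep a backup of the real LiDAR data.

    import os

    root = './KITTI/object/training'   # repeat for the testing split if needed
    for real, pseudo in [('velodyne', 'pseudo-lidar_velodyne'), ('planes', 'pseudo-lidar_planes')]:
        real_dir = os.path.join(root, real)
        # Keep the original LiDAR data and planes under a *_original suffix.
        if os.path.isdir(real_dir) and not os.path.islink(real_dir):
            os.rename(real_dir, real_dir + '_original')
        # Expose the pseudo-LiDAR outputs under the folder names AVOD expects.
        if not os.path.exists(real_dir):
            os.symlink(os.path.abspath(os.path.join(root, pseudo)), real_dir)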

Follow their README to train the pyramid_cars_with_aug_example model. You can also download our pretrained model and directly evaluate on it. But if you want to submit your result to the leaderboard, you need to train it on trainval.txt.

Frustum-PointNets model

Download the code from https://github.com/charlesq34/frustum-pointnets and install the Python dependencies.

Follow their README to prepare the data and then replace files in velodyne with those in pseudo-lidar_velodyne. Note that you should still keep the folder name as velodyne.

Follow their README to train the v1 model. You can also download our pretrained model and directly evaluate on it.

Results

Figure: the main results of our pseudo-LiDAR method on the KITTI validation dataset.

You can download the AVOD validation results from here.

Contact

If you have any questions, please feel free to email us.

Yan Wang ([email protected]), Harry Chao([email protected]), Div Garg([email protected])

pseudo_lidar's Issues

your pre-trained AVOD model doesn't work

I ran the AVOD code but I get a "no checkpoint" error. I did everything as explained on the AVOD GitHub. Please tell me how to make your AVOD pre-trained checkpoint work. This issue was also encountered in #19 and remains unanswered.

How to display point cloud in image plane?

Hello, with the default kitti velodyne data, I was able to project the point cloud onto the image plane using this code:

    vld = np.fromfile(os.path.join(args.kitti_loc, 'velodyne', velo_files[seg_counter]), dtype=np.float32)
    vld = vld.reshape((-1, 4)).T

    # Reshape calibration files
    P2 = clb['P2']
    R0 = np.eye(4)
    R0[:-1, :-1] = clb['R0_rect']
    Tr = np.eye(4)
    Tr[:-1, :] = clb['Tr_velo_to_cam']

    # Prepare 3d points
    pts3d = vld[:, vld[-1, :] > 0].copy()
    pts3d[-1, :] = 1

    # Project 3d points
    pts3d_cam = np.matmul(np.matmul(R0, Tr), pts3d)
    mask = pts3d_cam[2, :] >= 0  # Z >= 0
    pts2d_cam = np.matmul(P2, pts3d_cam[:, mask])
    pts2d = (pts2d_cam / pts2d_cam[2, :])[:-1, :].T

    # draw points on img
    height, width, channels = mask_img.shape
    blank_image = np.zeros((height, width, channels), np.uint8)
    for x in range(pts2d.shape[0]):
        xcolor = int(pts2d[x, 0])
        ycolor = int(pts2d[x, 1])
        if 0 <= xcolor < img.shape[1] and 0 <= ycolor < img.shape[0]:
            blank_image[ycolor, xcolor, :] = colors['pink']

    cv2.imshow('', blank_image)
    cv2.waitKey(100)

This displays the point cloud correctly on the image plane:
normal_vel
However, when I use the same code to display your point cloud, I get this:
pseudo_lidar_vel

I know that the depth map was generated correctly (and shockingly accurate might I add) so can you please tell me how to project your point cloud onto the image plane? Thanks!

Cannot generate point clouds from disparity

Hi,

I can generate disparity files in .png.npy format using the submission.py script. However, when I try to convert those disparity files into point clouds using generate_lidar.py, it does not read the files. The script exits without an error, but no files are processed. I don't know whether the file format is wrong or the script is.

Any help will be appreciated.

pseudo lidar

Hi, could you please provide the generated pseudo-LiDAR for the training and testing data in another way? The file is too big for me to pull from Google Drive. Could you place it on Baidu Yun disk?
Sorry, I am in China, and apart from Baidu Yun disk I don't have a better option for downloading such large data.

Thanks again for your great work!

If you need PSMNet but want to stay with Python 3, check this repo

Here is the repo: https://github.com/Cli98/PSMNsupLIDAR.

It does exactly the same thing you want to do here, and PSMNet has been modified to support Python 3.

So why would you need this?

When you run inference (by calling submission.py) under Python 3, you will see no error, but the inference result is totally incorrect. You may then waste a lot of time figuring out why and discover some surprising reasons (there are actually a lot of issues, which I will not discuss here). With PSMNsupLIDAR the result is correct, and you save the debugging time.

How to use:

Git clone the repo, move your data folder to its root, and proceed as you normally would.

Feel free to give it a try if you want to save time, or go with Python 2 if you prefer.

Good luck.

pseudo lidar gives wrong converted point cloud

Hello. This is a KITTI image:

000000

And this is its depth map obtained using MADNet

000000

But this is its point cloud when generated using your code:

incorrect

Please tell me what could be going wrong here. The depth map looks reasonably good but the generated point cloud doesn't (I used preprocessing/generate_lidar.py)

Evaluation with pre-trained frustum pointnet

I have a huge memory error when evaluating with pre-trained frustum pointnet.
I downloaded the PseudoLiDAR pointclouds from this repo, as well as the pre-trained FrustumPointNet model.
I used a conda environment to have Python 2.7 to ensure working compatibility with the Frustum PointNet repository. I'm using an Ubuntu 18.04 server with 1 GPU.

For the evaluation of Frustum-PointNet, they make it necessary to generate a pickle file of the data.
When running their script for generating that, my program gets killed for OOM.
On further inspection, I find that while the pickle file for the original LiDAR data is 996MB, my frustum_carpedcyc_train.pickle file is already 4.8GB for just the first 1000 samples.

The input_list for the original data has shape (1526, 4) for the first sample, while the input_list for the pseudo-LiDAR data has shape (18604, 4). This means the pseudo-LiDAR velodyne point clouds are huge, in my understanding.

It would be great if the authors could provide details on how exactly they evaluated on the pseudo-LiDAR data, or provide the pickle files so that we need not generate these huge files ourselves.

Can you make your code work with Python 3?

Hi. To run your code and get the proper depth map we have to switch to Python 2 otherwise the depth map is completely incorrect. Can you make this compatible with Python 3? Thanks!

Training for my own custom data

Hello, I want to obtain 3D bounding boxes from a single image.
It was suggested that I use this repository for this purpose.
However, I have little knowledge of this repository.
Could anyone help me with my questions?
Is this repository suitable for my aim, i.e., can I generate 3D bounding boxes using it?
If yes, how can I build a training dataset from an images-only dataset, and how can I train on it?
Thank you in advance.

There is an imbalance between your GPUs

When executing the step "train psmnet with 4 TITAN X GPUs", there is an annoying warning caused by having to use Python 2.7 with PyTorch. In full:

torch/nn/parallel/data_parallel.py:25: UserWarning:
There is an imbalance between your GPUs. You may want to exclude GPU 1 which
has less than 75% of the memory or cores of GPU 0. You can do so by setting
the device_ids argument to DataParallel, or by setting the CUDA_VISIBLE_DEVICES
environment variable.

I suspect it could prevent the use of multiple GPUs. Luckily, there's an easy fix. Go to the file causing the warning (torch/nn/parallel/data_parallel.py) and add the following line at the top of the file:

from __future__ import division

Try again. In case you run into the following issue:

ImportError: No module named future

Just install future, e.g. using one of the following commands:

conda install future
pip3 install future

AssertionError when running ./preprocessing/generate_disp.py

Hello, I am confused about this AssertionError. I have the "KITTI" folder in my home directory and have organized it as described.

Can someone tell me what the reason for it might be? Thanks!

The related code is the following (in generate_disp.py):
assert os.path.isdir(args.data_path)
lidar_dir = args.data_path + '/velodyne/'
calib_dir = args.data_path + '/calib/'
image_dir = args.data_path + '/image_2/'
disparity_dir = args.data_path + '/disparity/'

Unable to run pre-trained model on GTX 1050Ti

Hello. I get the error:
CUDA out of memory. Tried to allocate 352.00 MiB (GPU 0; 4.00 GiB total capacity; 2.04 GiB already allocated; 302.80 MiB free; 593.57 MiB cached)
when I try to predict disparities. Is there any change I can make to your code to make it possible to predict the disparities?
Full error log:

C:\Users\sarim\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\functional.py:2457: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
C:\Users\sarim\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\functional.py:2539: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
C:\Users\sarim\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\functional.py:2539: UserWarning: Default upsampling behavior when mode=trilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
Traceback (most recent call last):
  File "C:/Users/sarim/PycharmProjects/thesis/DPC/psmnet/submission.py", line 125, in <module>
    main()
  File "C:/Users/sarim/PycharmProjects/thesis/DPC/psmnet/submission.py", line 112, in main
    pred_disp = test(imgL,imgR)
  File "C:/Users/sarim/PycharmProjects/thesis/DPC/psmnet/submission.py", line 83, in test
    output = model(imgL,imgR)
  File "C:\Users\sarim\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\sarim\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\parallel\data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "C:\Users\sarim\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\sarim\PycharmProjects\thesis\DPC\psmnet\models\stackhourglass.py", line 159, in forward
    pred3 = disparityregression(self.maxdisp)(pred3)
  File "C:\Users\sarim\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\sarim\PycharmProjects\thesis\DPC\psmnet\models\submodule.py", line 62, in forward
    disp = self.disp.repeat(x.size()[0],1,x.size()[2],x.size()[3])
RuntimeError: CUDA out of memory. Tried to allocate 352.00 MiB (GPU 0; 4.00 GiB total capacity; 2.04 GiB already allocated; 302.80 MiB free; 593.57 MiB cached)

Process finished with exit code 1

Edit: I got it to work by changing your code in stackhourglass.py (in class PSMNet):

    def forward(self, left, right):

        refimg_fea = self.feature_extraction(left)
        targetimg_fea = self.feature_extraction(right)

        # matching
        cost = Variable(
            torch.FloatTensor(refimg_fea.size()[0], refimg_fea.size()[1] * 2, self.maxdisp // 4, refimg_fea.size()[2],
                              refimg_fea.size()[3]).zero_()).cuda()

        for i in range(self.maxdisp // 4):
            if i > 0:
                cost[:, :refimg_fea.size()[1], i, :, i:] = refimg_fea[:, :, :, i:]
                cost[:, refimg_fea.size()[1]:, i, :, i:] = targetimg_fea[:, :, :, :-i]
            else:
                cost[:, :refimg_fea.size()[1], i, :, :] = refimg_fea
                cost[:, refimg_fea.size()[1]:, i, :, :] = targetimg_fea
        cost = cost.contiguous()
        del refimg_fea
        del targetimg_fea

        cost0 = self.dres0(cost)
        del cost
        cost0 = self.dres1(cost0) + cost0

        out1, pre1, post1 = self.dres2(cost0, None, None)
        out1 = out1 + cost0

        out2, pre2, post2 = self.dres3(out1, pre1, post1)
        del post1
        del pre2
        out2 = out2 + cost0

        out3, pre3, post3 = self.dres4(out2, pre1, post2)
        del post3
        del pre3
        del post2
        del pre1
        out3 = out3 + cost0
        del cost0

        cost1 = self.classif1(out1)
        del out1
        cost2 = self.classif2(out2) + cost1
        del out2
        cost3 = self.classif3(out3) + cost2
        del out3

        if self.training:
            cost1 = F.upsample(cost1, [self.maxdisp, left.size()[2], left.size()[3]], mode='trilinear')
            cost2 = F.upsample(cost2, [self.maxdisp, left.size()[2], left.size()[3]], mode='trilinear')

            cost1 = torch.squeeze(cost1, 1)
            pred1 = F.softmax(cost1, dim=1)
            pred1 = disparityregression(self.maxdisp)(pred1)

            cost2 = torch.squeeze(cost2, 1)
            pred2 = F.softmax(cost2, dim=1)
            pred2 = disparityregression(self.maxdisp)(pred2)

        del cost1
        del cost2
        cost3 = F.upsample(cost3, [self.maxdisp, left.size()[2], left.size()[3]], mode='trilinear')
        cost3 = torch.squeeze(cost3, 1)
        pred3 = F.softmax(cost3, dim=1)
        del cost3
        pred3 = disparityregression(self.maxdisp)(pred3)

        if self.training:
            return pred1, pred2, pred3
        else:
            return pred3

When will the code be released?

Your research is very interesting! I want to talk about it in a presentation for my autonomous driving class. Of course, it would be better if I could get the code and reproduce it. Thanks!

fakepath issue with your AVOD code

Hello. When I try to visualize the results by running demos/show_predictions_2d.py, I get an error:
FileNotFoundError: Dataset path does not exist: /home/sarim/fakepath
Can you tell me where exactly I need to make a change to deal with fakepath? What exactly should I replace fakepath with? I searched your entire AVOD project for instances of fakepath and found that it is used in the following three lines:

avod/protos/kitti_dataset.proto (line 11)
avod/protos/kitti_dataset_pb2.py (lines 23 and 48)

Please tell me what I should replace fakepath with (if I need to).

The test results

Hi,
It is very nice work, and thanks for releasing the code!
I have a question about the test results on the leaderboard. I see that you submitted two results, namely Pseudo-LiDAR (the result in the paper) and Pseudo-LiDAR-V2. Compared with Pseudo-LiDAR, the second version improves a lot, so I am really interested in what you did to make the performance better.
Thanks a lot.

About the calibration problem between the true location and the Pseudo-LiDAR

Hello, Firstly, thank you for the amazing work with this repo.
May I ask how you calibrate between the true location and the Pseudo-LiDAR?
Following the guidance of your repository, I have generated the pseudo-LiDAR, but when I tried to mark the true location of the object, I found that it did not match.

As shown in the following experiment:
The true location: ./KITTI/object/training/label_2/000000.txt
Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01
The true location of the object in this annotation file is: 1.89 0.48 1.20

1

The experimental results are shown in the figure above. The red box represents the picture I tested; the blue box represents the generated Pseudo-LiDAR. It can be seen that the generated Pseudo-LiDAR is fine. The yellow box represents the true location I marked (1.89 0.48 1.20).
From the figure, it is obvious that there is a difference between the pseudo-LiDAR and the true location. How do you handle this calibration in actual operation?

Disparity results.

Thanks for sharing this nice work with code.
Can you share the disparity / depth accuracy of your newly modified PSMNet (on the KITTI Stereo Benchmark)?
I downloaded your code and computed the metrics, but they are quite bad, so I am not sure whether the pre-trained weights are correct.

So far, the disparity and depth metrics that I obtained are below:
2

1

finetune_300.tar

I am unable to decompress the file finetune_300.tar.
When I try to decompress it, I am told it is not a compressed file.
What should I do to open it?

Poor 3D object detection results using MONODEPTH depth estimation module

Hi,
First of all, thanks for sharing this wonderful work.
I am using a monodepth model to directly predict disparity from a single image, and I use that model output (numpy files, normalized by the image width) to create the pseudo-LiDAR. I used your pre-trained Frustum PointNets model v1 weights for the object detection evaluation, but unfortunately the results are very poor (10-15% at most). Can you please help me understand the issue?
Thanks,
Hari

ModuleNotFoundError: No module named 'submodule'

I followed all your instructions but I still get the above error when I run the following command in Anaconda
python ./psmnet/submission.py --loadmodel ./psmnet/kitti_3d/finetune_300.tar --datapath ./media/sarim/7AF2CEB6F2CE7643/datasets/kitti/object/training/ --save_path ./media/sarim/7AF2CEB6F2CE7643/datasets/kitti/object/training/predict_disparity

The full error log is:

Traceback (most recent call last):
  File "./psmnet/submission.py", line 20, in <module>
    from models import *
  File "/home/sarim/pseudo_lidar/psmnet/models/__init__.py", line 1, in <module>
    from .basic import PSMNet as basic
  File "/home/sarim/pseudo_lidar/psmnet/models/basic.py", line 8, in <module>
    from submodule import *
ModuleNotFoundError: No module named 'submodule'

pytorch

Can you tell me which version of PyTorch you are using?

Getting no point clouds when converting disparity map of monocular image

Hello. My monocular image looks like this:
000000000000000
Following is my P2 matrix:

[[2.083091e+03 0.000000e+00 9.572938e+02 0.000000e+00]
 [0.000000e+00 2.083091e+03 6.505698e+02 0.000000e+00]
 [0.000000e+00 0.000000e+00 1.000000e+00 0.000000e+00]]

Following is my V2C matrix:

[[ 0.9999152   0.0083809   0.00996961 -1.565187  ]
 [-0.00828053  0.999915   -0.01006629  0.05696378]
 [-0.01005313  0.00998289  0.9998996  -2.099987  ]]

Following is my C2V matrix:

[[ 0.9999152  -0.00828053 -0.01005313  1.54441452]
 [ 0.0083809   0.999915    0.00998289 -0.02287734]
 [ 0.00996961 -0.01006629  0.9998996   2.11595389]]

Following is my R0 matrix:

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

I run your code in preprocessing/generate_lidar.py, but I get no point clouds in return (I get an empty np.array). Can you please have a look at my data above and see what could be wrong here? The extrinsic and intrinsic parameters are from the Waymo dataset.

Waymo uses the vehicle reference axes, where the y-axis is positive in the opposite direction (it increases from left to right instead of from right to left as in KITTI).

Question about the dataloader

Hello, is it possible not to generate the huge pseudo-LiDAR point files (.bin)? I want to do this conversion on the fly while loading the images. I have written a function that transforms an image into a point cloud with the help of a SOTA monocular depth estimator. Where should I put this function so that the pre-trained weights are not loaded multiple times when loading the image data during training?

Pseudo-LiDAR for training AVOD using monodepth2

Hello
First, thank you for sharing this amazing work on GitHub. I have several questions related to my problem.

  1. I generated the pseudo-LiDAR from your repository by using depth information (NPY files) from monodepth2, but I am not sure whether my pseudo-LiDAR data is correct. My pseudo-LiDAR looks like this:
    10_npy_monodepth2
    Do you think my pseudo-LiDAR is correct? When I try to train with it using your AVOD repository, I get an error while generating the minibatch.

The AVOD error is like this:
File "/media/ferdyan/NewDisk/ITRI_3D/avod_pl-master/wavedata/wavedata/tools/core/voxel_grid_2d.py", line 102, in voxelize_2d
unique_indices[-1])
IndexError: index -1 is out of bounds for axis 0 with size 0
All Done (Parallel)

Thank you very much. Have a nice day.

finetune_300.tar not loading, but pretrained_model_KITTI2015.tar works

Hi, I got a runtime error while running PSMNet using the included pre-trained finetune_300.tar; however, there was no error with the official PSMNet release pretrained_model_KITTI2015.tar. I'm wondering what could cause that?

My system is Python 2.7, Pytorch 0.4, torchvision 0.2.
Thanks.

Error message:

File "./psmnet/submission.py", line 69, in <module>
    model.load_state_dict(state_dict['state_dict'])
  File "/home/jhuang/.virtualenvs/psmnet/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 721, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
        Unexpected key(s) in state_dict: "module.feature_extraction.firstconv.0.1.num_batches_tracked", xxx

Disparity map

The disparity does not seem to be good.
Despite using the provided finetune_300 weights, the predicted disparity is of poor quality.
What's wrong?
Screenshot from 2020-03-06 15-56-08

Evaluating based on the pretrained AVOD weights

Please forgive my naivety here!

Based on the README, there's a link to the weights you have trained from AVOD using the pseudo-LiDAR point clouds.

However, the code provided by the AVOD team only has ways to evaluate or run inference using checkpoints.

Am I missing something here? Or is it just not possible to run their experiments using your weights?

Meaning I would have to either train their model, or write code to load the weights and run the inference that way, rather than the way they've coded it?

Am I missing something?

Using the PL-AVOD pre-trained model to infer on the MOT dataset

Hello,

Your code works perfectly and I have been able to evaluate your pre-trained model. However, I want to use your pseudo-LiDAR pre-trained AVOD model to generate 3D detections on the KITTI MOT dataset rather than the object detection dataset, so my question is:

  1. Do I need to train from scratch in order to generate the new pseudo-LiDAR and planes for the new dataset?

I thought I could proceed as usual and just provide the different images, but I get asked for the planes folder and do not know whether using the previous one is correct, even though I am using a pre-trained model and not training.
Thanks.

Have you attempted to use the depth map output by AnyNet to generate Pseudo-LiDAR?

Related to #12

Just curious whether your team has attempted to use the depth map output by AnyNet to generate Pseudo-LiDAR?

PSMNet's performance is great, but it's also quite slow on a TX2 module, even at 1242x375.

I am going to try this myself tomorrow, but I'm curious whether your team has run any experiments with this.

I was able to train AnyNet, but without SPN. I still need to write the cpp_extension part; I just have not gotten to it yet.

Finally, are there any plans to release the AnyNet pre-trained model? Re: AnyNet #8

Many thanks for the great documentation and excellent work!

Generate bad disparity map

Hi, Yan.
Thanks for your great work.

I am using PyTorch 1.1.0.

I followed the README to generate the disparity and the point cloud.
First, I downloaded your pre-trained PSMNet model.
Then, I ran the following command to generate the disparity:

$ python ./psmnet/submission.py --loadmodel ./finetune_300.tar --datapath /mine/KITTI_DAT/training/ --save_path ./training_predict_disparity/
Number of model parameters: 5224768
/usr/lib/python3.7/site-packages/torch/nn/functional.py:2457: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
/usr/lib/python3.7/site-packages/torch/nn/functional.py:2539: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
/usr/lib/python3.7/site-packages/torch/nn/functional.py:2539: UserWarning: Default upsampling behavior when mode=trilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
time = 1.18
000000.png
time = 0.42
000001.png
time = 0.42
000002.png
...

Next, I ran the following command to generate the point cloud *.bin files:

$ python ./preprocessing/generate_lidar.py --calib_dir /mine/KITTI_DAT/training/calib/ --save_dir ./training_predict_velodyne/ --disparity_dir ./training_predict_disparity/ --max_high 1
Finish Depth 000000
Finish Depth 000001
Finish Depth 000002
Finish Depth 000003
Finish Depth 000004
...

However, I got a disparity map for training/000002 like this:
000002
and point cloud like this:
image
The RGB image is:
000002l

I think the generated disparity is bad, but I can't figure out the reason. Thank you.
Thank you

How can I visualize a bin file?

Currently, I visualize and visually check the .npy files in predict_disparity, but I can't figure out how to visualize the .bin files in pseudo-lidar_velodyne.

Can you tell me how to visualize them?

Camera Calibration for generating Pseudo Lidar Image from Depth Image

Hello,

Firstly, thank you for the amazing work with this repo.

I have a custom image from a monocular camera for which I generated the depth map using the following repo:
https://github.com/nianticlabs/monodepth2

The depth results look good, but now I'm trying to generate the pseudo-LiDAR points for that image. I understand I need camera calibration parameters for each image, so I calibrated my camera using the checkerboard calibration technique. Based on the KITTI and NYU datasets, I observed that a calibration file is needed for every single image. Is my assumption correct? If yes, I'm not sure how such a calibration file is generated for every single image. My understanding was that I would calibrate the camera once and generate one set of relevant calibration parameters for that camera, and then use them along with the previously obtained depth image to generate the pseudo-LiDAR points.

Let me know if my understanding is not correct. Thank you.

Trouble training the stereo model

Hi @mileyan,

Thanks a lot for your code. I get this error when I run finetune_3d.py: RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58, even when I change the batch size to 1 (--btrain 1).
I run it on a GTX 1050 Ti with CUDA 9.0. I hope you can give me some advice.
Thanks!

Full error log:
(psm) longyan@longyan:~/pseudo_lidar-master$ python ./psmnet/finetune_3d.py --maxdisp 192 --model stackhourglass --datapath /media/longyan/longyan/dataset/KITTI/object/training/ --split_file /media/longyan/longyan/dataset/KITTI/object/train.txt --epochs 300 --lr_scale 50 --loadmodel ./pretrained_sceneflow.tar --savemodel ./psmnet/kitti_3d/ --btrain 12
./psmnet/kitti_3d/training.log
[2020-07-21 17:19:19 finetune_3d.py:77] INFO load model ./pretrained_sceneflow.tar
Number of model parameters: 5224768
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "./psmnet/finetune_3d.py", line 211, in
main()
File "./psmnet/finetune_3d.py", line 175, in main
loss = train(imgL_crop, imgR_crop, disp_crop_L)
File "./psmnet/finetune_3d.py", line 107, in train
output1, output2, output3 = model(imgL, imgR)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 112, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/longyan/pseudo_lidar-master/psmnet/models/stackhourglass.py", line 111, in forward
refimg_fea = self.feature_extraction(left)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/longyan/pseudo_lidar-master/psmnet/models/submodule.py", line 123, in forward
output_skip = self.layer4(output)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/longyan/pseudo_lidar-master/psmnet/models/submodule.py", line 36, in forward
out = self.conv2(out)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/modules/batchnorm.py", line 49, in forward
self.training or not self.track_running_stats, self.momentum, self.eps)
File "/home/longyan/anaconda3/envs/psm/lib/python2.7/site-packages/torch/nn/functional.py", line 1194, in batch_norm
training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58

Error(s) in loading state_dict for PSMNet:

Hi,
I faced the following issue while running step 2.4 (Train the stereo model):

python3 ./psmnet/finetune_3d.py --maxdisp 192 \
>      --model stackhourglass \
>      --datapath ./KITTI/object/training/ \
>      --split_file ./KITTI/object/train.txt \
>      --epochs 300 \
>      --lr_scale 50 \
>      --loadmodel ./pretrained_sceneflow.tar \
>      --savemodel ./psmnet/kitti_3d/  --btrain 12
./psmnet/kitti_3d/training.log
[2019-09-20 15:17:27 finetune_3d.py:77] INFO     load model ./pretrained_sceneflow.tar
Traceback (most recent call last):
  File "./psmnet/finetune_3d.py", line 79, in <module>
    model.load_state_dict(state_dict['state_dict'])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 845, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PSMNet:
        Missing key(s) in state_dict: "feature_extraction.firstconv.0.0.weight", "feature_extraction.firstconv.0.1.weight", "feature_extraction.firstconv.0.1.bias", "feature_extraction.firstconv.0.1.running_mean", "feature_extraction.firstconv.0.1.running_var", "feature_extraction.firstconv.2.0.weight", "feature_extraction

Cannot install PyTorch with Python 2.7

Is there a way to conda install pytorch==0.4.1 torchvision==0.2.1 cuda92 -c pytorch with Python 2.7 on Windows 10?
If not, how can I run the disparity step 2.5 (https://github.com/mileyan/pseudo_lidar)? It is possible to conda install pytorch==0.4.1 torchvision==0.2.1 cuda92 -c pytorch under Python 3.6, but then I get a lot of errors when I try to run submission.py (see below), since the code is written for Python 2.7, I think.

Traceback (most recent call last):
File "submission.py", line 57, in
test_left_img, test_right_img = DA.dataloader(args.datapath)
File "\KITTI\psmnet\dataloader\KITTI_submission_loader.py", line 26, in dataloader
image = [img for img in os.listdir(filepath+left_fold)]
FileNotFoundError: [WinError 3] The system cannot find the path specified: '/scratch/datasets/kitti2015/testing/image_2/

Confusion in GT Disparity Map

Hi authors,

Thank you so much for providing the code for your paper. I have a query regarding the GT Disparity map that you generate using ./preprocessing/generate_disp.py

I downloaded the GT disparity that you provided. However, it looks more like a LiDAR point cloud than a disparity image. Is this how it is supposed to be? I plotted the GT below using matplotlib.
WhatsApp Image 2019-07-08 at 12 59 31 PM

RGB image for the above GT disparity map is given below
WhatsApp Image 2019-07-08 at 11 55 33 AM

Please let me know. Thank you!

AVOD pre-trained weights for mono pseudo-lidar

Hi @mileyan,

Thanks a lot for your code. I was trying out the monocular pipeline using your code and wanted to know whether the pre-trained AVOD weights you mention can also be used in the monocular case, as described in your avod_pl repository, or whether I would need to train the AVOD model separately on the outputs from monocular depth and pseudo-LiDAR.

Thanks
