DBNet

DBNet is a large-scale driving behavior dataset providing high-quality point clouds scanned by Velodyne lasers, high-resolution videos recorded by dashboard cameras, and standard drivers' behaviors (vehicle speed, steering angle) collected by real-time sensors.

Extensive experiments demonstrate that the extra depth information indeed helps networks determine driving policies. We hope DBNet will become a useful resource for the autonomous driving research community.

Created by Yiping Chen*, Jingkang Wang*, Jonathan Li, Cewu Lu, Zhipeng Luo, HanXue and Cheng Wang. (*equal contribution)

The resources of our work are available: [paper], [code], [video], [website], [challenge], [prepared data]

Contents

  1. Introduction
  2. Requirements
  3. Quick Start
  4. Baseline
  5. Contributors
  6. Citation
  7. License

Introduction

This work is based on our research paper, which appears in CVPR 2018. We propose a large-scale dataset for driving behavior learning, namely, DBNet. You can also check our dataset webpage for a deeper introduction.

In this repository, we release demo code and a subset of the prepared data for training with only images, as well as for leveraging feature maps or point clouds. The prepared data are accessible here. (More demo models and scripts will be released soon!)

Requirements

  • Tensorflow 1.2.0
  • Python 2.7
  • CUDA 8.0+ (For GPU)
  • Python Libraries: numpy, scipy and laspy

The code has been tested with Python 2.7, Tensorflow 1.2.0, CUDA 8.0 and cuDNN 5.1 on Ubuntu 14.04. It may also work on other configurations (directly or with minor modifications); pull requests and test reports are welcome.
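Assuming a standard pip setup for Python 2.7, the Python dependencies can typically be installed as follows (package names are the usual PyPI ones; matplotlib is optional and only needed for plotting in evaluate.py):

pip install numpy scipy laspy matplotlib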

Quick Start

Training

To train a model to predict vehicle speeds and steering angles:

python train.py --model nvidia_pn --batch_size 16 --max_epoch 125 --gpu 0

The names of the models are consistent with our paper. Log files and network parameters will be saved to the logs folder by default.

To see HELP for the training script:

python train.py -h

We can use TensorBoard to view the network architecture and monitor the training progress.

tensorboard --logdir logs

Evaluation

After training, you can evaluate the performance of a model using evaluate.py. To plot figures or calculate AUC, you need the matplotlib library installed.

python evaluate.py --model_path logs/nvidia_pn/model.ckpt

Prediction

To get the predictions of test data:

python predict.py

The results are saved in results/results (every segment) and results/behavior_pred.txt (merged) by default. To change the storage location:

python predict.py --result_dir specified_dir

The result directory will be created automatically if it doesn't exist.
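For a quick sanity check of the merged predictions, the sketch below loads results/behavior_pred.txt with numpy. The file layout is an assumption here (one whitespace-separated row of predicted values per frame), not a documented format:

import numpy as np

# Assumed layout: one row per frame, whitespace-separated predicted values.
pred = np.loadtxt("results/behavior_pred.txt")
print(pred.shape)   # (num_frames, num_outputs)
print(pred[:5])     # inspect the first few predictions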

Baseline

Method      Setting                 Target   Accuracy      AUC      ME      AE     AME
nvidia-pn   Videos + Laser Points   angle    70.65% (<5)   0.7799   29.46   4.23   20.88
                                    speed    82.21% (<3)   0.8701   18.56   1.80    9.68

This baseline is run on the dbnet-2018 challenge data, and only nvidia_pn is tested. To measure different architectures comprehensively, several metrics are used: accuracy under different thresholds, area under curve (AUC), max error (ME), mean error (AE), and mean of max errors (AME).

The implementations of these metrics can be found in evaluate.py.
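For intuition, here is a minimal sketch of how such metrics can be computed from predictions and ground truth; it is a simplified illustration, not the exact implementation in evaluate.py:

import numpy as np

def accuracy_under_threshold(pred, gt, threshold):
    # Fraction of predictions whose absolute error is below the threshold,
    # e.g. threshold=5 for angle or threshold=3 for speed (per the table above).
    return np.mean(np.abs(pred - gt) < threshold)

def max_error(pred, gt):
    # ME: the single largest absolute error over the test set.
    return np.max(np.abs(pred - gt))

def mean_error(pred, gt):
    # AE: the mean absolute error over the test set.
    return np.mean(np.abs(pred - gt))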

Contributors

DBNet was developed by MVIG, Shanghai Jiao Tong University* and SCSC Lab, Xiamen University* (alphabetical order).

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{DBNet2018,
  author = {Yiping Chen and Jingkang Wang and Jonathan Li and Cewu Lu and Zhipeng Luo and HanXue and Cheng Wang},
  title = {LiDAR-Video Driving Dataset: Learning Driving Policies Effectively},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2018}
}

License

Our code is released under the Apache 2.0 License. The copyright of DBNet can be checked here.


dbnet's Issues

Restoring resnet152_pn: Key not found error

Hi! When I run evaluate.py with
python evaluate.py --model_path logs_Res_G/resnet152_pn/model.ckpt --test True
to restore the resnet152_pn model, I get this error:
NotFoundError (see above for traceback): Key conv1/bn/conv1/bn/moments/Squeeze_1/ExponentialMovingAverage not found in checkpoint

Thanks!

Wheel Angle Problem

Hi DBNet team,
thanks for the great dataset; here come some questions.

According to common sense and Figure 4 in your paper, the wheel angle should be between -90 and 90. However, the angles in behavior.csv are distributed between -180 and 180. It is difficult to predict such a wide range; could you tell me the reason for that?
Also, the LSTM results are far from those in the paper.

Cheers!

How to get higher frame rate images

After downloading and viewing your open-source dataset, I would like to ask you the following two questions:

  1. The paper says you use a camera frame rate of 30 frames per second, but the downloaded dataset has a frame rate of 1 frame per second. Will you provide higher frame rate image data?
  2. A large part of the downloaded dataset consists of cropped 66x200 images. Where can I download the original data?
    Thank you!

the output is strange after training on my own dataset

Hi friend, it's me again. <_>
I used my own data to train the network. Because I only recorded the steering angle, I modified the model to output a single float, the steering angle. However, something strange happened: the mean loss during training stays high. I then checked the output on the validation set and found that the values are all positive. Though my dataset is not as big as yours, it still contains thousands of samples from 4 scenes. Some of my outputs are shown in the picture below. I tried normalizing the data, but it did not work. What should I do to solve this problem?
[image]

Annotations besides behaviors

Hello DBNet team,
Thanks for releasing this dataset.

May I know if there are annotations in the dataset for object detection, such as semantic labels or bounding boxes for each point cloud frame?

Many thanks

DNN-LSTM problem

Thanks for the paper and your work.
I have a question about the DNN-LSTM structure. You say in the paper that your stacked LSTM design captures temporal information and outperforms the DNN-only setting. But what I see in the code is that you shuffle the data in the provider file, which means the temporal information is lost. I also didn't see you feed time-sequence data into the network, which confuses me. So I want to know how your cnn_lstm_block works exactly.
Hoping for a reply. Thanks.
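A generic sketch of how a CNN-LSTM block can consume time-sequence data, for reference in this discussion; it illustrates the design the questioner expects (per-frame CNN features fed through an LSTM over the time axis, TensorFlow 1.x API) and is not the repository's actual cnn_lstm_block:

import tensorflow as tf

def cnn_lstm_sketch(frame_features, hidden_size=64):
    # frame_features: [batch, time, feature] tensor of per-frame CNN features.
    # The LSTM runs over the time axis, so temporal order is preserved inside
    # each sequence; any shuffling would then happen at the sequence level only.
    cell = tf.contrib.rnn.BasicLSTMCell(hidden_size)
    outputs, _ = tf.nn.dynamic_rnn(cell, frame_features, dtype=tf.float32)
    last_step = outputs[:, -1, :]           # output at the final time step
    return tf.layers.dense(last_step, 2)    # predict (angle, speed)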

About reproducing the paper

Hi DBNet team, thanks for the great paper; here come some questions.

  1. The paper says the NVIDIA net ([3,3] convolution size in the last two layers) is used, but the network you provide in https://github.com/driving-behavior/DBNet/blob/master/models/nvidia_io.py is not the same ([5,5] convolution size in the last two layers). Could you tell me the reason for that?

  2. May I confirm whether you used all the raw data from http://www.dbehavior.net/download.aspx to train the networks in the paper?

Cheers!

problem about your lstm block

Hi, I read your code and found a problem with your LSTM block. Usually we don't shuffle the input data of an LSTM, but in your provider module you shuffle the file names in provider.read_from().

Won't shuffling break the data sequence, or am I missing something?
Thank you~

problem about your pointnet code

Hi, I noticed some differences between your PointNet code and the original one. It seems that you did not use the T-Net. According to the PointNet paper, the T-Net is important, so why did you make that change?

not enough values to unpack

Traceback (most recent call last):
  File "C:/Users/叶志伟/PycharmProjects/untitled/DBNet-master/predict.py", line 140, in <module>
    predict()
  File "C:/Users/叶志伟/PycharmProjects/untitled/DBNet-master/predict.py", line 56, in predict
    data_input = provider.Provider()
  File "C:\Users\叶志伟\PycharmProjects\untitled\DBNet-master\provider.py", line 34, in __init__
    self.read()
  File "C:\Users\叶志伟\PycharmProjects\untitled\DBNet-master\provider.py", line 66, in read
    self.read_from(train_sub, filename, "train")
  File "C:\Users\叶志伟\PycharmProjects\untitled\DBNet-master\provider.py", line 106, in read_from
    self.X_train1, self.X_train2, self.Y_train1, self.Y_train2 = zip(*c)
ValueError: not enough values to unpack (expected 4, got 0)
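For reference, this error is what an empty file list produces: zip(*c) over an empty c yields nothing, so the four-way unpack fails. A likely cause (an assumption based on the code flow, not a confirmed diagnosis) is that the dataset path is wrong or empty. Minimal reproduction:

c = []                      # the list of data tuples read from disk is empty
X1, X2, Y1, Y2 = zip(*c)    # ValueError: not enough values to unpack (expected 4, got 0)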

Question about DBNet/tools/las2fmap.py

Hi, it's me again, and here come two more questions, this time about las2fmap.py.

  1. line 164: W_ij_XOY = 1.414 * GSD / (Z_ij)
     This is meant to compute the distance to the center of cell (i, j), following the method from "Automated Extraction of Road Markings from Mobile Lidar Point Clouds" [file1]. But that method uses D_kij rather than Z_ij in line 164; Z_ij is just the height of the point in cell (i, j), and we also didn't find any code for calculating D_kij in las2fmap.py.

  2. line 165: W_ij_H = H_ij * (h_min - z_min) / (z_max - h_max) / (Z_ij)
     In "Automated Extraction of Road Markings from Mobile Lidar Point Clouds" [file2], there is no division by (Z_ij).

Could you help me with these?
Thanks.

About the coordinates in the las files

Hi, DBNet team!
In las2fmap.py (lines 30-33):

def lasReader(filename):
    """
    Read xyz points from single las file
    :param filename: path of single point cloud
    """
    f = File(filename, mode='r')
    x_max, x_min = np.max(f.x), np.min(f.x)
    y_max, y_min = np.max(f.y), np.min(f.y)
    z_max, z_min = np.max(f.z), np.min(f.z)
    return np.transpose(np.asarray([f.x, f.y, f.z])), \
           [(x_min, x_max), (y_min, y_max), (z_min, z_max)], f.header

you use the "x y z" of the las file as the coordinate information.
But in the PointNet input script provider.py (lines 142-143):

infile = laspy.file.File(self.X_train2[i])
data = np.vstack([infile.X, infile.Y, infile.Z]).transpose()

you use the "X Y Z" of the las file as the coordinate information.

Could you tell me the difference between "x y z" and "X Y Z"? Their magnitudes are not the same ("x y z" values are < 200, while "X Y Z" values are around 10^9).

Thanks!
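For reference, in laspy 1.x the uppercase X/Y/Z attributes are the raw integers stored in the LAS records, while the lowercase x/y/z attributes apply the scale and offset from the file header, which explains the difference in magnitude. A small check (the file path is illustrative):

import laspy

f = laspy.file.File("example.las", mode="r")
# Lowercase coordinates are the scaled, offset real-world values:
#   f.x == f.X * f.header.scale[0] + f.header.offset[0]
print(f.X[:3])   # raw integers, e.g. on the order of 10^9
print(f.x[:3])   # scaled coordinates, e.g. < 200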

Request driver behavior data download with high sampling rate

Hi DBNet team, thank you for your patience with my earlier questions. Now I have a new one.

I downloaded the original video dataset, but it contains no data about the driver's behavior.
Could you open up the download of driver behavior data with a high sampling rate?

Thank you very much.

Error when using las2fmap.py

Hi, thanks for fixing the bugs; las2fmap.py works great with examples.las.
But when I use the las file you provide in ftp://user1:[email protected]/dataD/dbnet-2018.zip, which is the prepared data (I used dbnet-2018\train\1\points_16384\0.las for the test), this error occurs:

Traceback (most recent call last):
  File "J:/DBNet-master/tools/las2fmap.py", line 290, in <module>
    main()
  File "J:/DBNet-master/tools/las2fmap.py", line 274, in main
    if get_fmap(p):
  File "J:/DBNet-master/tools/las2fmap.py", line 233, in get_fmap
    rotate_about_center(fmap, 180, 1.0))
  File "J:/DBNet-master/tools/las2fmap.py", line 98, in rotate_about_center
    return cv2.warpAffine(src, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv2.INTER_LANCZOS4)
cv2.error: OpenCV(3.4.4) C:\projects\opencv-python\opencv\modules\imgproc\src\imgwarp.cpp:2611: error: (-215:Assertion failed) src.cols > 0 && src.rows > 0 in function 'cv::warpAffine'
which means I cannot generate the feature maps myself. Could you please help me?

Request for camera parameters

Dear DBNet team:
I am using other tools to process the image data in the dataset, but I need the intrinsic and extrinsic parameters of the camera. Can you provide them? Thank you.

problem about the prepared dataset

Dear DBNet team,

Thanks for your great DBNet paper. Sorry for also sending an email with the same content; this is really important for my experiments.

After downloading the data and running the quick start from your GitHub, two problems came up:

  1. In the val data, from \dbnet-2018\val\18\dvr_66x200 to \dbnet-2018\val\20\dvr_66x200, the car is completely stopped, but in Behaviors.csv the speed is not zero, hovering around 40 km/h. I don't think this could be such a huge measurement error.
  2. In the quick training you provide, the validation loss is smaller than the training loss; could you tell me the reason for this? I also tried the nvidia_io code, and it is the same.

If you modify the dataset or the baseline, I hope for a reply.
Thanks
