DBNet

DBNet is a large-scale driving behavior dataset providing high-quality point clouds scanned by Velodyne lasers, high-resolution videos recorded by dashboard cameras, and standard drivers' behaviors (vehicle speed, steering angle) collected by real-time sensors.

Extensive experiments demonstrate that the extra depth information indeed helps networks determine driving policies. We hope DBNet will become a useful resource for the autonomous driving research community.

Created by Yiping Chen*, Jingkang Wang*, Jonathan Li, Cewu Lu, Zhipeng Luo, HanXue and Cheng Wang. (*equal contribution)

The resources of our work are available: [paper], [code], [video], [website], [challenge], [prepared data]

Contents

  1. Introduction
  2. Requirements
  3. Quick Start
  4. Baseline
  5. Contributors
  6. Citation
  7. License

Introduction

This work is based on our research paper, which appears in CVPR 2018. We propose a large-scale dataset for driving behavior learning, namely, DBNet. You can also check our dataset webpage for a deeper introduction.

In this repository, we release demo code and a subset of the prepared data for training with only images, as well as for leveraging feature maps or point clouds. The prepared data are accessible here. (More demo models and scripts will be released soon!)

Requirements

  • Tensorflow 1.2.0
  • Python 2.7
  • CUDA 8.0+ (For GPU)
  • Python Libraries: numpy, scipy and laspy

The code has been tested with Python 2.7, Tensorflow 1.2.0, CUDA 8.0 and cuDNN 5.1 on Ubuntu 14.04. It may also work on other configurations (directly or with minor modifications); pull requests and test reports are welcome.
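Assuming a standard pip setup for Python 2.7, the Python dependencies can typically be installed as follows (package names are the usual PyPI ones; matplotlib is optional and only needed for plotting in evaluate.py):

pip install numpy scipy laspy matplotlib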

Quick Start

Training

To train a model to predict vehicle speeds and steering angles:

python train.py --model nvidia_pn --batch_size 16 --max_epoch 125 --gpu 0

The names of the models are consistent with our paper. Log files and network parameters will be saved to the logs folder by default.

To see HELP for the training script:

python train.py -h

We can use TensorBoard to view the network architecture and monitor the training progress.

tensorboard --logdir logs

Evaluation

After training, you can evaluate the performance of a model using evaluate.py. To plot figures or calculate AUC, you need the matplotlib library installed.

python evaluate.py --model_path logs/nvidia_pn/model.ckpt

Prediction

To get the predictions of test data:

python predict.py

The results are saved in results/results (every segment) and results/behavior_pred.txt (merged) by default. To change the storage location:

python predict.py --result_dir specified_dir

The result directory will be created automatically if it doesn't exist.
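For a quick sanity check of the merged predictions, the sketch below loads results/behavior_pred.txt with numpy. The file layout is an assumption here (one whitespace-separated row of predicted values per frame), not a documented format:

import numpy as np

# Assumed layout: one row per frame, whitespace-separated predicted values.
pred = np.loadtxt("results/behavior_pred.txt")
print(pred.shape)   # (num_frames, num_outputs)
print(pred[:5])     # inspect the first few predictions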

Baseline

Method      Setting                 Target   Accuracy      AUC      ME      AE     AME
nvidia-pn   Videos + Laser Points   angle    70.65% (<5)   0.7799   29.46   4.23   20.88
                                    speed    82.21% (<3)   0.8701   18.56   1.80    9.68

This baseline is run on the dbnet-2018 challenge data, and only nvidia_pn is tested. To measure different architectures comprehensively, several metrics are used: accuracy under different thresholds, area under curve (AUC), max error (ME), mean error (AE), and mean of max errors (AME).

The implementations of these metrics can be found in evaluate.py.
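For intuition, here is a minimal sketch of how such metrics can be computed from predictions and ground truth; it is a simplified illustration, not the exact implementation in evaluate.py:

import numpy as np

def accuracy_under_threshold(pred, gt, threshold):
    # Fraction of predictions whose absolute error is below the threshold,
    # e.g. threshold=5 for angle or threshold=3 for speed (per the table above).
    return np.mean(np.abs(pred - gt) < threshold)

def max_error(pred, gt):
    # ME: the single largest absolute error over the test set.
    return np.max(np.abs(pred - gt))

def mean_error(pred, gt):
    # AE: the mean absolute error over the test set.
    return np.mean(np.abs(pred - gt))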

Contributors

DBNet was developed by MVIG, Shanghai Jiao Tong University* and SCSC Lab, Xiamen University* (alphabetical order).

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{DBNet2018,
  author = {Yiping Chen and Jingkang Wang and Jonathan Li and Cewu Lu and Zhipeng Luo and HanXue and Cheng Wang},
  title = {LiDAR-Video Driving Dataset: Learning Driving Policies Effectively},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2018}
}

License

Our code is released under the Apache 2.0 License. The copyright of DBNet can be checked here.


dbnet's Issues

Restoring resnet152_pn: Key not found error

Hi! When I run evaluate.py with
python evaluate.py --model_path logs_Res_G/resnet152_pn/model.ckpt --test True
to restore the resnet152_pn model, I get this error:
NotFoundError (see above for traceback): Key conv1/bn/conv1/bn/moments/Squeeze_1/ExponentialMovingAverage not found in checkpoint

Thanks!

Wheel Angle Problem

Hi DBNet team,
thanks for the great dataset; here come some questions.

According to common sense and Figure 4 in your paper, the wheel angle should be between -90 and 90. However, the angles in behavior.csv are distributed between -180 and 180. It is difficult to predict such a wide range; could you tell me the reason for that?
Also, the LSTM results are far from those in the paper.

Cheers!

How to get higher frame rate images

After downloading and viewing your open-source dataset, I would like to ask you the following two questions:

  1. The paper says you use a camera frame rate of 30 frames per second, but the downloaded dataset has a frame rate of 1 frame per second. Will you provide higher frame rate image data?
  2. A large part of the downloaded dataset consists of cropped 66x200 images. Where can I download the original data?
    Thank you!

the output is strange after training on my own dataset

Hi friend, it's me again. <_>
I used my own data to train the network. Because I only recorded the steering angle, I modified the model to output a single float, the steering angle. However, something strange happened: the mean loss during training stays high. I then checked the output on the validation set and found that the values are all positive. Though my dataset is not as big as yours, it still contains thousands of samples from 4 scenes. Some of my outputs are shown in the picture below. I tried normalizing the data, but it did not work. What should I do to solve this problem?
[image]

Annotations besides behaviors

Hello DBNet team,
Thanks for releasing this dataset.

May I know if there are annotations in the dataset for object detection, such as semantic labels or bounding boxes for each point cloud frame?

Many thanks

DNN-LSTM problem

Thanks for the paper and your work.
I have a question about the DNN-LSTM structure. You say in the paper that your stacked LSTM design captures temporal information and outperforms the DNN-only setting. But what I see in the code is that you shuffle the data in the provider file, which means the temporal information is lost. I also didn't see you feed time-sequence data into the network, which confuses me. So I want to know how your cnn_lstm_block works exactly.
Hoping for a reply. Thanks.
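A generic sketch of how a CNN-LSTM block can consume time-sequence data, for reference in this discussion; it illustrates the design the questioner expects (per-frame CNN features fed through an LSTM over the time axis, TensorFlow 1.x API) and is not the repository's actual cnn_lstm_block:

import tensorflow as tf

def cnn_lstm_sketch(frame_features, hidden_size=64):
    # frame_features: [batch, time, feature] tensor of per-frame CNN features.
    # The LSTM runs over the time axis, so temporal order is preserved inside
    # each sequence; any shuffling would then happen at the sequence level only.
    cell = tf.contrib.rnn.BasicLSTMCell(hidden_size)
    outputs, _ = tf.nn.dynamic_rnn(cell, frame_features, dtype=tf.float32)
    last_step = outputs[:, -1, :]           # output at the final time step
    return tf.layers.dense(last_step, 2)    # predict (angle, speed)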

About reproducing the paper

Hi DBNet team, thanks for the great paper; here come some questions.

  1. The paper says the NVIDIA net ([3,3] convolution size in the last two layers) is used, but the network you provide in https://github.com/driving-behavior/DBNet/blob/master/models/nvidia_io.py is not the same ([5,5] convolution size in the last two layers). Could you tell me the reason for that?

  2. May I confirm whether you used all the raw data from http://www.dbehavior.net/download.aspx to train the networks in the paper?

Cheers!

problem about your lstm block

Hi, I read your code and found a problem with your LSTM block. Usually we don't shuffle the input data of an LSTM, but in your provider module you shuffle the file names in provider.read_from().

Won't shuffling break the data sequence, or am I missing something?
Thank you~

problem about your pointnet code

Hi, I noticed some differences between your PointNet code and the original one. It seems that you did not use the T-Net. According to the PointNet paper, the T-Net is important, so why did you make that change?

not enough values to unpack

Traceback (most recent call last):
  File "C:/Users/叶志伟/PycharmProjects/untitled/DBNet-master/predict.py", line 140, in <module>
    predict()
  File "C:/Users/叶志伟/PycharmProjects/untitled/DBNet-master/predict.py", line 56, in predict
    data_input = provider.Provider()
  File "C:\Users\叶志伟\PycharmProjects\untitled\DBNet-master\provider.py", line 34, in __init__
    self.read()
  File "C:\Users\叶志伟\PycharmProjects\untitled\DBNet-master\provider.py", line 66, in read
    self.read_from(train_sub, filename, "train")
  File "C:\Users\叶志伟\PycharmProjects\untitled\DBNet-master\provider.py", line 106, in read_from
    self.X_train1, self.X_train2, self.Y_train1, self.Y_train2 = zip(*c)
ValueError: not enough values to unpack (expected 4, got 0)
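For reference, this error is what an empty file list produces: zip(*c) over an empty c yields nothing, so the four-way unpack fails. A likely cause (an assumption based on the code flow, not a confirmed diagnosis) is that the dataset path is wrong or empty. Minimal reproduction:

c = []                      # the list of data tuples read from disk is empty
X1, X2, Y1, Y2 = zip(*c)    # ValueError: not enough values to unpack (expected 4, got 0)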

Question about DBNet/tools/las2fmap.py

Hi, it's me again, and here come two more questions, this time about las2fmap.py.

  1. line 164: W_ij_XOY = 1.414 * GSD / (Z_ij)
     This is meant to compute the distance to the center of cell (i, j), following the method from "Automated Extraction of Road Markings from Mobile Lidar Point Clouds" [file1]. But that method uses D_kij rather than Z_ij in line 164; Z_ij is just the height of the point in cell (i, j), and we also didn't find any code for calculating D_kij in las2fmap.py.

  2. line 165: W_ij_H = H_ij * (h_min - z_min) / (z_max - h_max) / (Z_ij)
     In "Automated Extraction of Road Markings from Mobile Lidar Point Clouds" [file2], there is no division by (Z_ij).

Could you help me with these?
Thanks.

About the coordinates in the las files

Hi, DBNet team!
In las2fmap.py (lines 30-33):

def lasReader(filename):
    """
    Read xyz points from single las file
    :param filename: path of single point cloud
    """
    f = File(filename, mode='r')
    x_max, x_min = np.max(f.x), np.min(f.x)
    y_max, y_min = np.max(f.y), np.min(f.y)
    z_max, z_min = np.max(f.z), np.min(f.z)
    return np.transpose(np.asarray([f.x, f.y, f.z])), \
           [(x_min, x_max), (y_min, y_max), (z_min, z_max)], f.header

you use the "x y z" of the las file as the coordinate information.
But in the PointNet input script provider.py (lines 142-143):

infile = laspy.file.File(self.X_train2[i])
data = np.vstack([infile.X, infile.Y, infile.Z]).transpose()

you use the "X Y Z" of the las file as the coordinate information.

Could you tell me the difference between "x y z" and "X Y Z"? Their magnitudes are not the same ("x y z" values are < 200, while "X Y Z" values are around 10^9).

Thanks!
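For reference, in laspy 1.x the uppercase X/Y/Z attributes are the raw integers stored in the LAS records, while the lowercase x/y/z attributes apply the scale and offset from the file header, which explains the difference in magnitude. A small check (the file path is illustrative):

import laspy

f = laspy.file.File("example.las", mode="r")
# Lowercase coordinates are the scaled, offset real-world values:
#   f.x == f.X * f.header.scale[0] + f.header.offset[0]
print(f.X[:3])   # raw integers, e.g. on the order of 10^9
print(f.x[:3])   # scaled coordinates, e.g. < 200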

Request driver behavior data download with high sampling rate

Hi DBNet team, thank you for your patience with my earlier questions. Now I have a new one.

I downloaded the original video dataset, but it contains no data about the driver's behavior.
Could you open up the download of driver behavior data with a high sampling rate?

Thank you very much.

Error when using las2fmap.py

Hi, thanks for fixing the bugs; las2fmap.py works great with examples.las.
But when I use the las file you provide in ftp://user1:[email protected]/dataD/dbnet-2018.zip, which is the prepared data (I used dbnet-2018\train\1\points_16384\0.las for the test), this error occurs:

Traceback (most recent call last):
  File "J:/DBNet-master/tools/las2fmap.py", line 290, in <module>
    main()
  File "J:/DBNet-master/tools/las2fmap.py", line 274, in main
    if get_fmap(p):
  File "J:/DBNet-master/tools/las2fmap.py", line 233, in get_fmap
    rotate_about_center(fmap, 180, 1.0))
  File "J:/DBNet-master/tools/las2fmap.py", line 98, in rotate_about_center
    return cv2.warpAffine(src, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv2.INTER_LANCZOS4)
cv2.error: OpenCV(3.4.4) C:\projects\opencv-python\opencv\modules\imgproc\src\imgwarp.cpp:2611: error: (-215:Assertion failed) src.cols > 0 && src.rows > 0 in function 'cv::warpAffine'
which means I cannot generate the feature maps myself. Could you please help me?

Request for camera parameters

Dear DBNet team:
I am using other tools to process the image data in the dataset, but I need the intrinsic and extrinsic parameters of the camera. Can you provide them? Thank you.

problem about the prepared dataset

Dear DBNet team,

Thanks for your great DBNet paper. Sorry for also sending an email with the same content; this is really important for my experiments.

After downloading the data and running the quick start from your GitHub, two problems came up:

  1. In the val data, from \dbnet-2018\val\18\dvr_66x200 to \dbnet-2018\val\20\dvr_66x200, the car is completely stopped, but in Behaviors.csv the speed is not zero, hovering around 40 km/h. I don't think this could be such a huge measurement error.
  2. In the quick training you provide, the validation loss is smaller than the training loss; could you tell me the reason for this? I also tried the nvidia_io code, and it is the same.

If you modify the dataset or the baseline, I hope for a reply.
Thanks
