
Comments (13)

caizhongang commented on August 21, 2024

Hi, thank you for your question.

We are able to get overall mAPH > 0.4 for vehicle.

Although how to use the converted data is out of the scope of this repo, I would suggest visualizing the converted data with tools/kitti_label_visualizer.py (still under development but should work properly) and comparing the converted data to other datasets.
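For reference, a converted label file in the standard KITTI layout can be parsed with something like this (a minimal, untested sketch assuming the usual 15-field KITTI label format; it is not this repo's visualizer):

```python
# Minimal KITTI label parser (hypothetical helper, not part of this repo).
# Each ground-truth line has 15 space-separated fields:
# type truncated occluded alpha bbox(4) dimensions(h w l) location(x y z) rotation_y
def read_kitti_labels(label_path):
    objects = []
    with open(label_path) as f:
        for line in f:
            fields = line.strip().split(" ")
            objects.append({
                "type": fields[0],                               # e.g. "Car"
                "bbox_2d": [float(v) for v in fields[4:8]],      # left, top, right, bottom (px)
                "dimensions": [float(v) for v in fields[8:11]],  # height, width, length (m)
                "location": [float(v) for v in fields[11:14]],   # x, y, z in camera frame (m)
                "rotation_y": float(fields[14]),                 # yaw around the camera y-axis
            })
    return objects
```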

Wish you all the best.


tianweiy commented on August 21, 2024

Thanks for your reply. That's great to hear. I will try the converter, train a model, and see...


xjjs commented on August 21, 2024

> Hi, thank you for your question.
>
> We are able to get overall mAPH > 0.4 for vehicle.
>
> Although how to use the converted data is out of the scope of this repo, I would suggest visualizing the converted data with tools/kitti_label_visualizer.py (still under development but should work properly) and comparing the converted data to other datasets.
>
> Wish you all the best.

In your experiment for mAP > 0.4, how many frames are sampled in each segment? Do you use all of the training set and the validation set?


caizhongang commented on August 21, 2024

> In your experiment for mAP > 0.4

We got mAPH > 0.4.

> how many frames are sampled in each segment?

We used all frames in each segment.

> Do you use all of the training set and the validation set?

We used only the training set.


xjjs commented on August 21, 2024

> In your experiment for mAP > 0.4
>
> We got mAPH > 0.4.
>
> how many frames are sampled in each segment?
>
> We used all frames in each segment.
>
> Do you use all of the training set and the validation set?
>
> We used only the training set.

[Screenshot: vehicle detection results from the paper "Scalability in Perception for Autonomous Driving: Waymo Open Dataset"]

Do you use the same mAPH metric as in the paper?


caizhongang commented on August 21, 2024

> Do you use the same mAPH metric as in the paper?

We are using the official evaluation tool and also submitting to the leaderboard, so I believe the metrics are the same as in the paper.
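For context, APH is AP with each true positive down-weighted by its heading error; my rough understanding from the paper, written out as an illustrative sketch (the authoritative definition is in the official evaluation tool):

```python
import math

def heading_accuracy(pred_heading, gt_heading):
    """APH-style weight: 1.0 for a perfect heading, 0.0 when the
    prediction points in the opposite direction (off by pi).
    Sketch of my reading of the paper, not the official code."""
    diff = abs(pred_heading - gt_heading) % (2.0 * math.pi)
    diff = min(diff, 2.0 * math.pi - diff)  # wrap into [0, pi]
    return 1.0 - diff / math.pi

# Each true positive contributes this weight (instead of 1) to the
# precision/recall curve that is integrated into APH.
```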


xjjs commented on August 21, 2024

> We are using the official evaluation tool and also submitting to the leaderboard, so I believe the metrics are the same as in the paper.

Isn't mAPH > 0.4 still far from the APH > 55 reported in the paper?
In addition, have you tried the KITTI metrics? With the converted data (label_0/image_0, first return of the top lidar) from two training segments, I only got mAP 1.5~4.5 with the PointRCNN project.


caizhongang commented on August 21, 2024

> Isn't mAPH > 0.4 still far from the APH > 55 reported in the paper?

As we are working on a paper, please allow me not to disclose our results just yet, but we are achieving a much higher mAPH than 0.4.

> In addition, have you tried the KITTI metrics? With the converted data (label_0/image_0, first return of the top lidar) from two training segments, I only got mAP 1.5~4.5 with the PointRCNN project.

There can be many reasons, such as the point cloud range used for voxelization, the voxel resolution, etc. Also, some ground-truth bounding boxes contain only second-return points (the converter fetches all return points by default now). Finally, note that in KITTI the lidar frame origin is on top of the self-driving car, but in Waymo it is at the bottom of the car (this follows the official Waymo convention, but I'm considering converting it to align with KITTI as well; what do you think?).
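For reference, first- vs. second-return points can be pulled out with the official waymo_open_dataset utilities, roughly like this (sketch based on the API around 2020; newer releases may have different signatures):

```python
# Sketch: extract points for a single lidar return using the official
# waymo_open_dataset utilities (API circa 2020; treat as a starting point).
from waymo_open_dataset.utils import frame_utils

def get_points_by_return(frame, ri_index):
    """ri_index=0 -> first return, ri_index=1 -> second return."""
    range_images, camera_projections, range_image_top_pose = (
        frame_utils.parse_range_image_and_camera_projection(frame))
    points, _ = frame_utils.convert_range_image_to_point_cloud(
        frame, range_images, camera_projections, range_image_top_pose,
        ri_index=ri_index)
    return points  # one (N_i, 3) array per lidar, in the vehicle frame

# Concatenating ri_index=0 and ri_index=1 approximates the converter's
# current "all returns" default.
```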

> mAP 1.5~4.5 with the PointRCNN project

mAP should be in the range 0-1, so I guess it is a typo here?


xjjs commented on August 21, 2024

> There can be many reasons, such as the point cloud range used for voxelization, the voxel resolution, etc. Also, some ground-truth bounding boxes contain only second-return points (the converter fetches all return points by default now). Finally, note that in KITTI the lidar frame origin is on top of the self-driving car, but in Waymo it is at the bottom of the car (this follows the official Waymo convention, but I'm considering converting it to align with KITTI as well; what do you think?).

According to the visualization results, the converted point clouds display well with the KITTI tool. Whether the origin is at the top or at the bottom doesn't affect the relative positions between points and labels. However, if the origin were at the bottom with y pointing down, the converted y should be < 0, yet in the label files the converted y is > 0, so is the origin of the camera frame also at the top?

> mAP 1.5~4.5 with the PointRCNN project
>
> mAP should be in the range 0-1, so I guess it is a typo here?

It's AP 1.5~4.5 with PointRCNN, and I think the training data is insufficient.


tianweiy commented on August 21, 2024

@xjjs I also did some visualization, and the labels / lidar points / my detector's output look quite good, so I think at least the format is correct. However, I guess there are some differences between Waymo and nuScenes or KITTI that you need to figure out to get a good model: probably the anchor definition, training schedule, etc. I am also struggling to get a good model on Waymo (currently in the 20-30 mAPH range), so I would prefer to wait for someone else's reference implementation on Waymo. If I remember correctly, PV-RCNN's author said he will release code after the challenge here.


caizhongang commented on August 21, 2024

> According to the visualization results, the converted point clouds display well with the KITTI tool. Whether the origin is at the top or at the bottom doesn't affect the relative positions between points and labels. However, if the origin were at the bottom with y pointing down, the converted y should be < 0, yet in the label files the converted y is > 0, so is the origin of the camera frame also at the top?

In KITTI, the labels are in the "reference image frame", which does not exist in WOD. Hence, we create a "virtual reference image frame" (same as the front camera frame) so that the calib files and label files can be generated in alignment with KITTI. I will give a detailed explanation in the README when I have time. You can also take a look at the code if you are interested.
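Roughly speaking, going from the Waymo axis convention (x forward, y left, z up) to the KITTI camera convention (x right, y down, z forward) is a fixed axis permutation; a hypothetical sketch, not necessarily the exact matrices the converter writes out:

```python
import numpy as np

# Axis permutation from the Waymo convention (x forward, y left, z up)
# to the KITTI camera convention (x right, y down, z forward).
WAYMO_TO_KITTI_CAM = np.array([
    [0.0, -1.0,  0.0],  # KITTI x (right)   = -Waymo y (left)
    [0.0,  0.0, -1.0],  # KITTI y (down)    = -Waymo z (up)
    [1.0,  0.0,  0.0],  # KITTI z (forward) =  Waymo x (forward)
])

def waymo_to_kitti_cam(points_xyz):
    """Map an (N, 3) array of points from Waymo axes to KITTI camera axes."""
    return points_xyz @ WAYMO_TO_KITTI_CAM.T
```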

> It's AP 1.5~4.5 with PointRCNN, and I think the training data is insufficient.

Do you mean 1.5%-4.5%? That would indicate something is seriously wrong. A quick guess: the score threshold for output boxes is set too high while the model is not properly trained on the limited data, so most boxes are filtered out.
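As a quick sanity check, you could sweep the score threshold before evaluation (hypothetical helper; the names are made up):

```python
# If a high threshold (e.g. 0.7) leaves almost no boxes while a low one
# (e.g. 0.1) recovers reasonable recall, the model is producing
# low-confidence boxes rather than no boxes at all.
def filter_detections(boxes, scores, score_threshold=0.1):
    keep = [i for i, s in enumerate(scores) if s >= score_threshold]
    return [boxes[i] for i in keep], [scores[i] for i in keep]
```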


xjjs commented on August 21, 2024

> @xjjs However, I guess there are some differences between Waymo and nuScenes or KITTI that you need to figure out to get a good model: probably the anchor definition, training schedule, etc. I am also struggling to get a good model on Waymo (currently in the 20-30 mAPH range)

With PointRCNN and two training segments, I only got AP 1.5~4.5; maybe it comes from the lack of data and the differences between KITTI and Waymo.


xjjs commented on August 21, 2024

> In KITTI, the labels are in the "reference image frame", which does not exist in WOD. Hence, we create a "virtual reference image frame" so that the calib files and label files can be generated in alignment with KITTI. I will give a detailed explanation in the README when I have time. You can also take a look at the code if you are interested.

I have read the converter code and comments, and I realize that the labels are in the camera frame and the origin of the "front camera frame" is at the top.

> It's AP 1.5~4.5 with PointRCNN, and I think the training data is insufficient.
>
> Do you mean 1.5%-4.5%? That would indicate something is seriously wrong. A quick guess: the score threshold for output boxes is set too high while the model is not properly trained on the limited data, so most boxes are filtered out.

Yes, it should be 1.5%-4.5%; the threshold is set to 0.7. With a low threshold such as 0.1, the recall can reach roughly 30%. The visualization results show that the locations are good but the heading directions are poor.

