Hi, I am planning to do a fusion with Yolov4 and Seconds/PointPillar. Would you be pro

Preparation for CLOCs about clocs HOT 12 CLOSED

pangsu0613 commented on August 10, 2024

Preparation for CLOCs

from clocs.

Comments (12)

pangsu0613 commented on August 10, 2024

Hello. For SECOND-V1.5 (newer versions of SECOND should be very similar), check the file 'voxelnet.py' at https://github.com/traveller59/second.pytorch/blob/v1.5/second/pytorch/models/voxelnet.py. At line 377, batch_box_preds = preds_dict["box_preds"] holds the raw output of the SECOND network, i.e. the bounding box encodings before NMS. First, you need to decode them (line 387). The decoded boxes are [x, y, z, w, l, h, r] in lidar coordinates. If you need them in camera coordinates, use the functions in https://github.com/traveller59/second.pytorch/blob/v1.5/second/pytorch/core/box_torch_ops.py to transform them. 'box_torch_ops.py' also provides many other useful 2D/3D bounding box and coordinate transformation functions.
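
To illustrate the lidar-to-camera step for KITTI-style calibration, here is a minimal NumPy sketch that maps box centers only. It is not the code from 'box_torch_ops.py'; the function name `lidar_to_camera` and the matrices below are assumptions made up for illustration (an idealized axis swap with zero translation, not real calibration values). A full box conversion also has to handle the box dimensions and the rotation angle, which the repository functions take care of.

```python
import numpy as np

def lidar_to_camera(points_lidar, r0_rect, tr_velo_to_cam):
    """Map (N, 3) lidar points into the rectified camera frame.

    r0_rect and tr_velo_to_cam are assumed to be 4x4 homogeneous
    matrices (KITTI calibration padded to 4x4).
    """
    n = points_lidar.shape[0]
    homogeneous = np.hstack([points_lidar, np.ones((n, 1))])  # (N, 4)
    points_cam = homogeneous @ (r0_rect @ tr_velo_to_cam).T   # (N, 4)
    return points_cam[:, :3]

# Made-up, idealized calibration: lidar x (forward) -> camera z,
# lidar y (left) -> camera -x, lidar z (up) -> camera -y.
tr_velo_to_cam = np.array([
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])
r0_rect = np.eye(4)

centers_lidar = np.array([[10.0, 2.0, -1.0]])  # x, y, z of one decoded box
centers_cam = lidar_to_camera(centers_lidar, r0_rect, tr_velo_to_cam)
print(centers_cam)  # camera-frame x, y, z
```

With real data, the two matrices would come from the per-frame KITTI calibration file rather than being hard-coded.
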
As for Yolov4, I am not very familiar with its codebase, but in my experience, simply setting the NMS score threshold to 0 also works fine (if you are using sigmoid scores, 0 means no threshold).
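
Why a threshold of 0 amounts to "no threshold" for sigmoid scores can be shown with a small sketch. This is not the Yolov4 API; `keep_detections` and the sample detections are hypothetical and only illustrate the filtering logic.

```python
def keep_detections(detections, score_threshold=0.0):
    """Keep (box, score) pairs whose score reaches the threshold.

    Sigmoid scores lie strictly between 0 and 1, so a threshold of 0
    keeps every candidate.
    """
    return [det for det in detections if det[1] >= score_threshold]

# Hypothetical raw candidates: ([x1, y1, x2, y2], sigmoid score)
raw = [
    ([100, 120, 180, 200], 0.91),
    ([105, 118, 182, 204], 0.07),
    ([300, 150, 340, 190], 0.002),
]

all_candidates = keep_detections(raw, score_threshold=0.0)
confident_only = keep_detections(raw, score_threshold=0.25)
print(len(all_candidates), len(confident_only))
```

The low-scoring candidates that a typical threshold would discard are exactly the extra raw outputs that the fusion stage is meant to see.
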

CodeDragon18 commented on August 10, 2024

Hi,
When will you update your code? I am waiting to try it.

shaunkheng97 commented on August 10, 2024

Currently I have a yolov4 model trained on the BDD dataset and a SECOND v1.6 model trained on the Kitti dataset. My questions are:

1. If I am planning to evaluate CLOCs on the Kitti dataset, is it recommended to train yolov4 on the Kitti dataset to optimize the performance?
2. From my understanding, I do not need to retrain the network, but I would have to run inference without NMS on a dataset to produce the input for CLOCs. Am I correct?

pangsu0613 commented on August 10, 2024

@shaunkheng97

1. Yes, it is recommended to train the 2D detector (here, your yolov4) on the Kitti dataset. If the 2D detector performs poorly on Kitti, it will spoil the fusion.
2. Yes, you are right. You don't need to retrain the network; just run inference without NMS, or without NMS score thresholding. The point is to get more raw outputs from the network.

pangsu0613 commented on August 10, 2024

@CodeDragon18
Thank you for your interest. I have been really busy these days, but I have started working on it and will upload an early version as soon as possible.

shaunkheng97 commented on August 10, 2024

Alright I’ll work on it in the meantime! Thanks!

shaunkheng97 commented on August 10, 2024

Hi, just curious and confused about the training.

I used 90% of the 7481 Kitti training frames for training and 10% for validation. If I were to run inference without NMS, would I have to reuse the training set as the inference set? Wouldn't it be contradictory to use the same dataset for training and inference?

pangsu0613 commented on August 10, 2024

Yes, you are right. Ideally, one should divide the dataset into 3 parts: part 1 for training the 3D and 2D detectors, part 2 for training CLOCs, and part 3 for validation only. But for KITTI there are two issues. First, the split into a 3712-frame mini-training set and a 3769-frame validation set is so popular, and used by so many researchers for their experiments, that it is good to show results on the 3769-frame validation set for comparison. Second, KITTI is a relatively small dataset, and I think it is too small to divide into 3 parts. So I just use the popular 3712-frame mini-training set to train both the 3D/2D detectors and CLOCs, and validate on the 3769-frame validation set. This is NOT the best or most principled way to train, but even so, I still get some improvements. For larger datasets (such as nuScenes, Waymo and Argoverse), I think dividing the data into 3 parts would be a better choice.
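
The ideal three-part division described above could look roughly like the following sketch. The function name `three_way_split`, the 50/25/25 proportions, and the fixed seed are all assumptions for illustration; none of them come from CLOCs.

```python
import random

def three_way_split(frame_ids, fractions=(0.5, 0.25, 0.25), seed=0):
    """Split frame IDs into three disjoint parts:
    detector training, fusion (CLOCs) training, and validation.
    """
    ids = sorted(frame_ids)
    random.Random(seed).shuffle(ids)  # deterministic shuffle for a given seed
    n_detector = int(len(ids) * fractions[0])
    n_fusion = int(len(ids) * fractions[1])
    detector_train = ids[:n_detector]
    fusion_train = ids[n_detector:n_detector + n_fusion]
    validation = ids[n_detector + n_fusion:]  # whatever remains
    return detector_train, fusion_train, validation

# Hypothetical dataset of 1000 frames with zero-padded IDs
frames = [f"{i:06d}" for i in range(1000)]
detector_train, fusion_train, validation = three_way_split(frames)
print(len(detector_train), len(fusion_train), len(validation))
```

Keeping the three parts disjoint means the fusion stage is trained on detector outputs for frames the detectors never saw, which is the situation it faces at validation time.
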

shaunkheng97 commented on August 10, 2024

So for now, should I retrain yolov4 on any random 3712 frames and run inference again on all 7481 frames for the CLOCs input? How was SECOND trained? I believe it would be best to train yolov4 the same way as the SECOND model that you used.

I might try training on nuScenes if I manage to train CLOCs on Kitti successfully.

pangsu0613 commented on August 10, 2024

Yes, it would be better to train YOLO-V4 on the 3712 frames, provided 3712 frames are enough to train YOLO-V4.
Also, the 3712 + 3769 split is NOT random. It is a fixed, well-known conventional split used by many researchers, so that people can compare different networks on the same validation set. As far as I remember, the split was proposed in a 2015 paper named '3D object proposals for accurate object class detection'. You can find the split under /CLOCs/second/data/ImageSet, which contains multiple text files: 'train.txt' lists all the frame numbers of the 3712-frame mini-training set, and 'val.txt' lists all the frame numbers of the 3769-frame validation set. SECOND uses the 3712-frame mini-training set for training.
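
A quick way to load such split files and check that the two sets do not overlap is sketched below. It assumes each file holds one frame number per line; the helper `read_split` is hypothetical, and the tiny temporary files with made-up contents only stand in for the real 'train.txt' and 'val.txt' so that the snippet runs on its own.

```python
import tempfile
from pathlib import Path

def read_split(path):
    """Return the frame IDs listed in a split file, one per non-empty line."""
    text = Path(path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]

# Stand-in files with made-up contents; point these paths at the real
# 'train.txt' and 'val.txt' when working with the actual split.
with tempfile.TemporaryDirectory() as tmp:
    train_file = Path(tmp) / "train.txt"
    val_file = Path(tmp) / "val.txt"
    train_file.write_text("000000\n000003\n000007\n")
    val_file.write_text("000001\n000002\n")
    train_ids = read_split(train_file)
    val_ids = read_split(val_file)

overlap = set(train_ids) & set(val_ids)
print(len(train_ids), len(val_ids), len(overlap))
```

On the real files, the two lengths should come out as 3712 and 3769, with an empty overlap.
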

shaunkheng97 commented on August 10, 2024

Alright. I'll attempt to train yolov4 with the 3712 mini-training set first, and will get back to CLOCs soon!

FaFaLiu commented on August 10, 2024

Did you succeed? I also want to do this...
