
Comments (12)

pangsu0613 commented on August 10, 2024

Hello, for SECOND-V1.5 (newer versions of SECOND should be very similar), check the file 'voxelnet.py' (https://github.com/traveller59/second.pytorch/blob/v1.5/second/pytorch/models/voxelnet.py), line 377: batch_box_preds = preds_dict["box_preds"]. These are the raw outputs of the SECOND network (encodings of bounding boxes before NMS). First, you need to decode them (line 387); the decoded boxes are [x, y, z, w, l, h, r] in lidar coordinates. If you need them in camera coordinates, use the functions in https://github.com/traveller59/second.pytorch/blob/v1.5/second/pytorch/core/box_torch_ops.py to transform them. 'box_torch_ops.py' also provides many other useful 2D/3D bounding box and coordinate transformation functions.
As for Yolov4, I am not very familiar with its codebase, but in my experience, simply setting the NMS score threshold to 0 (if you are using sigmoid scores, 0 means no threshold) also works fine.
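
To make the decoding step above more concrete, here is a minimal NumPy sketch of residual box decoding as it is commonly done in anchor-based 3D detectors. It is an illustration only: `decode_boxes` is a hypothetical helper, it assumes the encodings are residuals relative to anchors laid out as [x, y, z, w, l, h, r], and the actual decoder in 'box_torch_ops.py' may differ in details such as how the z center is handled.

```python
import numpy as np

def decode_boxes(encodings, anchors):
    """Decode residual box encodings against anchors (simplified sketch).

    Both arrays have shape (N, 7), laid out as [x, y, z, w, l, h, r].
    This layout and the residual scheme are assumptions for illustration.
    """
    xa, ya, za, wa, la, ha, ra = np.split(anchors, 7, axis=-1)
    xt, yt, zt, wt, lt, ht, rt = np.split(encodings, 7, axis=-1)

    # x/y offsets are normalized by the anchor's ground-plane diagonal.
    diagonal = np.sqrt(la ** 2 + wa ** 2)
    xg = xt * diagonal + xa
    yg = yt * diagonal + ya
    zg = zt * ha + za

    # Sizes are encoded as log ratios, so exp() recovers the scale factor.
    wg = np.exp(wt) * wa
    lg = np.exp(lt) * la
    hg = np.exp(ht) * ha

    # Rotation is a plain additive residual.
    rg = rt + ra
    return np.concatenate([xg, yg, zg, wg, lg, hg, rg], axis=-1)

# Hypothetical single anchor; an all-zero encoding decodes to the anchor itself.
anchors = np.array([[10.0, 2.0, -1.0, 1.6, 3.9, 1.56, 0.0]])
zero_enc = np.zeros((1, 7))
decoded = decode_boxes(zero_enc, anchors)
```

With an all-zero encoding the decoded box equals the anchor, which is a quick sanity check before feeding real network outputs through a decoder like this.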

from clocs.

CodeDragon18 commented on August 10, 2024

Hi,
When will you update your code? I am waiting to try it.

shaunkheng97 commented on August 10, 2024

Currently I have a yolov4 model trained on the BDD dataset and a SECOND v1.6 model trained on the KITTI dataset. My questions are:

  1. If I am planning to evaluate CLOCs on the KITTI dataset, is it recommended to train yolov4 on KITTI to optimize performance?
  2. From my understanding, I do not need to retrain the network, but I would have to run inference without NMS on a dataset to produce the input for CLOCs. Am I correct?

pangsu0613 commented on August 10, 2024

@shaunkheng97

  1. Yes, it is recommended to train the 2D detector (in your case, yolov4) on the KITTI dataset. If the 2D detector performs poorly on KITTI, it will spoil the fusion.
  2. Yes, you are right: you don't need to retrain the network. Just run inference without NMS, or without NMS score thresholding. The point is to get more raw outputs from the network.
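
As a rough illustration of point 2, the sketch below shows why a score threshold of 0 keeps essentially every candidate when scores come from a sigmoid. The function `filter_by_score`, the box layout, and the threshold value 0.3 are all hypothetical and do not come from the yolov4 or SECOND codebases.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def filter_by_score(boxes, logits, score_thresh):
    """Keep only the boxes whose sigmoid score exceeds score_thresh."""
    scores = sigmoid(logits)
    keep = scores > score_thresh
    return boxes[keep], scores[keep]

# Four hypothetical 2D boxes (x1, y1, x2, y2) with raw classification logits.
boxes = np.arange(16, dtype=float).reshape(4, 4)
logits = np.array([-6.0, -1.0, 0.5, 3.0])

# A typical non-zero threshold drops the low-confidence candidates...
kept_default, _ = filter_by_score(boxes, logits, score_thresh=0.3)
# ...while a threshold of 0 keeps them all, because sigmoid scores are positive.
kept_all, _ = filter_by_score(boxes, logits, score_thresh=0.0)
```

Here `kept_default` holds two boxes and `kept_all` holds all four, which is the "more raw outputs" behaviour the fusion stage wants.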

pangsu0613 commented on August 10, 2024

@CodeDragon18
Thank you for your interest. I have been really busy these days, but I have started working on it and will upload an early version as soon as possible.

shaunkheng97 commented on August 10, 2024

Alright I’ll work on it in the meantime! Thanks!

shaunkheng97 commented on August 10, 2024

Hi, just curious and confused about the training.

I have used 90% of the 7481-frame KITTI training dataset for training and 10% for validation. If I were to run inference without NMS, would I have to reuse the training dataset as the inference set? Wouldn't it be contradictory to use the same dataset for both training and inference?

pangsu0613 commented on August 10, 2024

Yes, you are right. Ideally, one should divide the dataset into 3 parts: part 1 for training the 3D and 2D detectors, part 2 for training CLOCs, and part 3 for validation only. But for KITTI there are two issues. First, the split into a 3712-frame mini-training set and a 3769-frame validation set is so popular that many researchers use it for their experiments, so it is good to report results on the 3769-frame validation set for comparison. Second, KITTI is a relatively small dataset, and I think it is too small to divide into 3 parts. So I just use the popular 3712-frame mini-training set to train both the 3D/2D detectors and CLOCs, and validate on the 3769-frame validation set. This is NOT the best or most principled way to train, but even so I still get some improvements. For other, larger datasets (such as nuScenes, Waymo and Argoverse), I think dividing the data into 3 parts would be a better choice.
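
For a larger dataset, the three-part division described above could be sketched roughly as follows. This is only an illustration: `three_way_split`, the 40/30/30 fractions, and the random shuffling are assumptions, not something prescribed by CLOCs.

```python
import random

def three_way_split(frame_ids, fractions=(0.4, 0.3, 0.3), seed=0):
    """Split frame ids into detector-training, fusion-training and validation parts.

    The fractions are hypothetical; the third part takes whatever remains.
    """
    ids = list(frame_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible

    n_det = int(len(ids) * fractions[0])
    n_fus = int(len(ids) * fractions[1])

    detector_train = ids[:n_det]               # part 1: train the 3D and 2D detectors
    fusion_train = ids[n_det:n_det + n_fus]    # part 2: train the fusion network
    validation = ids[n_det + n_fus:]           # part 3: validation only
    return detector_train, fusion_train, validation

det, fus, val = three_way_split(range(1000))
```

The three parts are disjoint by construction, so the fusion network never trains on frames the detectors have already seen.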

shaunkheng97 commented on August 10, 2024

So for now, should I retrain yolov4 with any random 3712 frames, and run inference again on all 7481 frames for CLOCs input? How was SECOND trained? I believe it would be ideal to train yolov4 in the same way as the SECOND model that you used.

Might try to train on nuScenes if I am able to successfully train CLOCs on Kitti.

pangsu0613 commented on August 10, 2024

Yes, it would be better to train YOLO-V4 with the 3712 frames, provided 3712 frames are enough to train YOLO-V4.
Also, the 3712 + 3769 split is NOT random. It is a fixed, well-known conventional split used by many researchers, so that people can compare different networks on the same validation set. As I remember, the split was proposed in a 2015 paper named '3D object proposals for accurate object class detection'. You can find the split under /CLOCs/second/data/ImageSet, which contains multiple text files: 'train.txt' contains all the frame numbers for the 3712-frame mini-training set, and 'val.txt' contains all the frame numbers for the 3769-frame validation set. SECOND uses the 3712-frame mini-training set for training.
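
To reuse the same split when training another detector, the split files can be read with a few lines of Python. This is a minimal sketch that assumes each file lists one frame number per line; `load_frame_ids` and `check_disjoint` are hypothetical helpers, and the temporary files below only stand in for the real 'train.txt' and 'val.txt'.

```python
import tempfile
from pathlib import Path

def load_frame_ids(split_file):
    """Read frame ids from a split file, assuming one id per line."""
    lines = Path(split_file).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]

def check_disjoint(train_ids, val_ids):
    """Return True if no frame appears in both splits."""
    return set(train_ids).isdisjoint(val_ids)

# Stand-in files with made-up frame numbers, only to make the example runnable.
with tempfile.TemporaryDirectory() as tmp:
    train_path = Path(tmp) / "train.txt"
    val_path = Path(tmp) / "val.txt"
    train_path.write_text("000000\n000003\n000007\n")
    val_path.write_text("000001\n000002\n000004\n")

    train_ids = load_frame_ids(train_path)
    val_ids = load_frame_ids(val_path)

splits_ok = check_disjoint(train_ids, val_ids)
```

Pointing `load_frame_ids` at the real split files would give the frame lists to use when selecting images for 2D detector training and for validation.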

shaunkheng97 commented on August 10, 2024

Alright. I'll attempt to train yolov4 with the 3712 mini-training set first, and will get back to CLOCs soon!

FaFaLiu commented on August 10, 2024

Did you succeed? I also want to do this...
