
Comments (12)

chenyilun95 commented on July 17, 2024

Yes, we have compared it with our method, and at least in our setting it gets better performance than OFT. Please take a look at Table 3 in the paper.

fangchengji commented on July 17, 2024

> Yes, we have compared it with our method, and at least in our setting it gets better performance than OFT. Please take a look at Table 3 in the paper.

OK, I see. An AP_BEV of 19.92 ranks 3rd on the KITTI BEV leaderboard among monocular methods, which is very good performance. Could you also open-source the code for this experiment? Thanks!

chenyilun95 commented on July 17, 2024

As mentioned in the evaluation metric section of the paper, the results reported on the validation set are evaluated under the old evaluation metric, which tends to be higher than the new metric. This is an old project. As I remember, the monocular experiment was run by simply dropping the right view and projecting the features to BEV directly. The projection code is available in the repository. I currently have other projects at hand, so I suggest you try it yourself.
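For illustration, a minimal sketch of what "projecting the features to BEV directly" could look like in the monocular case (hypothetical function and variable names, assuming a pinhole camera model; this is not the actual DSGN code):

```python
# Sketch: sample monocular image features at the projection of every voxel
# centre, producing a feature volume that can later be collapsed to BEV.
import torch
import torch.nn.functional as F

def project_features_to_voxels(feat, K, voxel_centers):
    """feat: (1, C, H, W) image features; K: (3, 3) camera intrinsics;
    voxel_centers: (X, Y, Z, 3) grid of 3D points in the camera frame."""
    _, C, H, W = feat.shape
    X, Y, Z, _ = voxel_centers.shape

    pts = voxel_centers.reshape(-1, 3)                        # (N, 3)
    uvd = pts @ K.T                                           # pinhole projection
    uv = uvd[:, :2] / uvd[:, 2:3].clamp(min=1e-6)             # pixel coordinates

    # Normalise to [-1, 1] and bilinearly sample the image features.
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                        uv[:, 1] / (H - 1) * 2 - 1], dim=-1).view(1, 1, -1, 2)
    sampled = F.grid_sample(feat, grid, align_corners=True)   # (1, C, 1, N)

    # Voxels projecting outside the image are zero-padded here; a real
    # implementation would keep an explicit validity mask.
    return sampled.view(C, X, Y, Z)
```

Collapsing the resulting volume along the height axis (for example with a few convolutions or a simple reduction) would then give the BEV feature map used for detection.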

fangchengji commented on July 17, 2024

> As mentioned in the evaluation metric section of the paper, the results reported on the validation set are evaluated under the old evaluation metric, which tends to be higher than the new metric. This is an old project. As I remember, the monocular experiment was run by simply dropping the right view and projecting the features to BEV directly. The projection code is available in the repository. I currently have other projects at hand, so I suggest you try it yourself.

I didn't notice that the old metric was used in the ablation study. The results may drop by about 4% to 5% under the new metric. Thanks!

fangchengji commented on July 17, 2024

> As mentioned in the evaluation metric section of the paper, the results reported on the validation set are evaluated under the old evaluation metric, which tends to be higher than the new metric. This is an old project. As I remember, the monocular experiment was run by simply dropping the right view and projecting the features to BEV directly. The projection code is available in the repository. I currently have other projects at hand, so I suggest you try it yourself.

One more question: how do you apply depth supervision to the 3D Volume when there is no Plane Sweep Volume?

chenyilun95 commented on July 17, 2024

Its implementation is a bit tricky, since only the sparse rays are visible when supervising the voxel grid. You need to project the depth map back to a point cloud and ignore the voxels that do not lie on any projection ray.
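A minimal sketch of the back-projection step (NumPy, with an assumed KITTI-style intrinsics layout; hypothetical names, not the exact code in this repository):

```python
# Sketch: back-project a sparse depth map to a point cloud; pixels without a
# valid depth are simply dropped, which is what makes the supervision sparse.
import numpy as np

def depth_to_points(depth, K):
    """depth: (H, W) array with 0 where depth is unknown; K: (3, 3) intrinsics."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    v, u = np.nonzero(depth > 0)          # only pixels that actually have depth
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)   # (N, 3) points in the camera frame
```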

fangchengji commented on July 17, 2024

> Its implementation is a bit tricky, since only the sparse rays are visible when supervising the voxel grid. You need to project the depth map back to a point cloud and ignore the voxels that do not lie on any projection ray.

I mean that we cannot apply a softmax to the 3D Volume along the z axis to get depth, as we would with a Plane Sweep Volume, because of the camera frustum.
So you apply binary classification to the 3D Volume in an occupancy manner: first, project the image to a point cloud according to the sparse depth map, as pseudo-LiDAR does; second, set a voxel to positive if it contains a 3D point and to negative if it does not. Is that right?

chenyilun95 commented on July 17, 2024

Right. But you need to ignore some voxels as they have no depth.
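Putting the two comments above together, a rough sketch of how such occupancy labels could be assigned (the names, grid layout, and ray-sampling step are assumptions, not the paper's exact procedure): a voxel containing a back-projected point is positive, voxels in the observed free space along that camera ray are negative, and everything else is ignored.

```python
# Sketch: build per-voxel occupancy labels from a sparse point cloud, with an
# ignore label for voxels that were never observed by any depth ray.
import numpy as np

IGNORE, NEG, POS = -1, 0, 1

def occupancy_labels(points, grid_min, voxel_size, grid_shape, n_ray_samples=64):
    """points: (N, 3) in the camera frame; returns an (X, Y, Z) label grid."""
    labels = np.full(grid_shape, IGNORE, dtype=np.int64)
    grid_min = np.asarray(grid_min, dtype=np.float64)
    shape = np.asarray(grid_shape)

    def to_index(p):
        idx = np.floor((p - grid_min) / voxel_size).astype(np.int64)
        valid = np.all((idx >= 0) & (idx < shape), axis=-1)
        return idx[valid]

    # Negative labels: free space sampled along each camera ray (origin -> point),
    # stopping short of the endpoint.
    t = np.linspace(0.05, 0.95, n_ray_samples)[:, None, None]
    ray_idx = to_index((t * points[None, :, :]).reshape(-1, 3))
    labels[ray_idx[:, 0], ray_idx[:, 1], ray_idx[:, 2]] = NEG

    # Positive labels: voxels that actually contain a point (set last so they win).
    pos_idx = to_index(points)
    labels[pos_idx[:, 0], pos_idx[:, 1], pos_idx[:, 2]] = POS
    return labels
```

Voxels left at IGNORE would simply be excluded from the loss, which corresponds to ignoring the voxels that have no depth.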

fangchengji commented on July 17, 2024

> Right. But you need to ignore some voxels as they have no depth.

OK, so the labels are only set along the sparse LiDAR rays and most voxels are ignored. I think it is a little complicated to compute the 3D Volume's occupancy depth labels.
Could I instead apply some 3D convolutions after the 3D Volume and add a depth head? Then I could reduce along the z axis to get an (H', W') feature map and apply depth supervision on that head. What do you think of this implementation? I hope the depth supervision can improve the 3D detection performance.
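A rough sketch of this alternative (PyTorch; the module, names, and the soft-argmax over depth bins are assumptions for illustration, not something confirmed by the authors):

```python
# Sketch: a few 3D convolutions on the 3D Volume, then the depth axis is
# collapsed with a soft-argmax over D depth bins to give an (H', W') depth map
# that can be supervised with the sparse ground-truth depth.
import torch.nn as nn
import torch.nn.functional as F

class DepthHead(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv3d(in_channels, in_channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(in_channels, 1, kernel_size=1),   # one logit per voxel
        )

    def forward(self, volume, depth_bin_values):
        """volume: (B, C, D, H', W'); depth_bin_values: (D,) bin centres in metres."""
        logits = self.convs(volume).squeeze(1)                       # (B, D, H', W')
        prob = logits.softmax(dim=1)                                 # distribution over depth bins
        depth = (prob * depth_bin_values.view(1, -1, 1, 1)).sum(1)   # (B, H', W') expected depth
        return depth

def depth_loss(pred_depth, gt_depth):
    """gt_depth: (B, H', W') sparse metric depth, 0 where unknown."""
    mask = gt_depth > 0
    return F.smooth_l1_loss(pred_depth[mask], gt_depth[mask])
```

This avoids building per-voxel occupancy labels entirely: the supervision is applied to the reduced (H', W') depth map instead.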

chenyilun95 commented on July 17, 2024

I think it is OK. I would be glad to hear about your progress.

fangchengji commented on July 17, 2024

> I think it is OK. I would be glad to hear about your progress.

Thanks for your detailed reply.

fangchengji commented on July 17, 2024

I found an interesting paper that answers my question well: https://arxiv.org/pdf/2103.01100.pdf
