supermhp / GUPNet
License: MIT License
Thanks for your work! Could you give me an evaluation script for AP40? I don't know where to download it.
Hey!
Great work!
I was trying train_val.py in evaluation mode using the model you provided. However, I see that the detections are all 0. Any idea why this might be happening?
Thanks!
Following an answer on CSDN, I downgraded the Python extension in VS Code to the 2022.08 version in order to debug Python 3.6 programs; debugging other Python 3.6 programs works fine.
But when debugging this repo, VS Code freezes every time the pre-training loss is computed. I can't even close VS Code or do anything else, and have to force a shutdown. Has anyone else run into this?
May I ask roughly when the GUPNet++ code will be open-sourced?
GUPNet/code/lib/models/gupnet.py
Line 165 in f4e2660
h3d = self.mean_size[cls_ids[mask].long()] + size3d_offset * h3d_std
thx
Hello, I only care about the Car results on the test set. Can I train only the Car class on trainval and then submit the required files to the official KITTI server, or must I train all three classes (Car, Pedestrian, Cyclist) on trainval and then just look at the Car entry in the official KITTI results?
Also: is the IoU threshold in your code the default 0.7? If I want to see the Car results at a 0.5 threshold, do I need to modify the code?
@SuperMHP
Hi, thanks for your work.
I want to know whether the mAP=16.46 result uses only the Car category during training.
I trained GUPNet on ['Car', 'Pedestrian', 'Cyclist'].
Unfortunately, I tried training three times and could only reach up to 15.5!
So I wonder how you trained a model with mAP=16.46. (Is a KITTI training variance of more than 1 point too large?)
Wishing for your reply.
I obtained different results across multiple runs with the same training configuration and unmodified code. How can I resolve this?
I have tried to reproduce the result with batch size 8, but it is hard to get close to the result of your released checkpoint. Could I ask for the training log of this checkpoint, for research purposes? I'm new to this, so please tell me if I did something wrong. Thank you.
Hello. In the horizontal-flip data augmentation, flipping the image changes the camera calibration. In your code this is handled by calib.flip(img_size) in kitti.py, but I don't quite understand why the function constructs the cos_matrix and uses singular value decomposition to solve for the coefficients. Could you point me to the mathematical derivation behind this function? Looking forward to your reply, thank you very much!
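For context, the SVD here matches the standard DLT (direct linear transform) recipe: after a horizontal flip, pixel u becomes W - 1 - u while the 3D point's x coordinate flips sign, so the new 3x4 projection matrix must satisfy these flipped correspondences. Stacking them yields a homogeneous system A·p = 0 whose least-squares solution is the right singular vector of A with the smallest singular value. A hedged sketch of that idea (names and point sampling are illustrative, not the repo's exact cos_matrix construction):

```python
import numpy as np

rng = np.random.default_rng(0)
W = 1242.0  # image width in pixels (KITTI-like)
P = np.array([[721.5, 0.0, 609.6, 44.9],
              [0.0, 721.5, 172.9, 0.2],
              [0.0, 0.0, 1.0, 0.003]])  # an example 3x4 projection matrix

# Sample 3D points, project them, then flip both sides of the correspondence.
X = np.hstack([rng.uniform(-10, 10, (20, 2)), rng.uniform(5, 50, (20, 1))])
uvw = (P @ np.hstack([X, np.ones((20, 1))]).T).T
uv = uvw[:, :2] / uvw[:, 2:3]
uv_f = np.column_stack([W - 1 - uv[:, 0], uv[:, 1]])   # flipped pixels
X_f = np.column_stack([-X[:, 0], X[:, 1], X[:, 2]])    # mirrored 3D points

# DLT: two linear equations per correspondence in the 12 entries of P'.
rows = []
for (x, y, z), (u, v) in zip(X_f, uv_f):
    Xh = [x, y, z, 1.0]
    rows.append(Xh + [0.0] * 4 + [-u * c for c in Xh])
    rows.append([0.0] * 4 + Xh + [-v * c for c in Xh])
A = np.array(rows)
_, _, Vt = np.linalg.svd(A)
P_flip = Vt[-1].reshape(3, 4)
P_flip /= P_flip[2, 2]  # fix the overall scale (SVD solves only up to scale)
```

This also hints at the P[2,2] question elsewhere in this thread: the homogeneous solution is defined only up to a scalar, so the recovered matrix must be renormalized (e.g. by its [2,2] entry) before use.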
When random crop happens, GUPNet still regresses 3D information. Is this a bug?
After reading the GUPNet and GUPNet++ papers and part of the GUPNet code, I have a few questions:
depth_geo = size_3d[:,0]/box2d_height.squeeze()*roi_calibs[:,0,0]
When an object is truncated, e.g. a bus or a truck, box2d_height differs considerably from the actual projected height. Can the network generalize in this case?
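For reference, the quoted line is the pinhole relation depth = H_3D · f / h_2D. A minimal standalone sketch (function name is illustrative, not from the repo):

```python
def geometric_depth(h3d_m: float, h2d_px: float, focal_px: float) -> float:
    """Pinhole-model depth: physical height (m) times focal length (px),
    divided by projected pixel height, mirroring
    depth_geo = size_3d[:, 0] / box2d_height * roi_calibs[:, 0, 0]."""
    return h3d_m * focal_px / h2d_px

# A car ~1.5 m tall that spans 100 px with f = 720 px sits at
# 1.5 * 720 / 100 = 10.8 m.
print(geometric_depth(1.5, 100.0, 720.0))  # → 10.8

# The truncation concern raised above: if the visible 2D box covers only
# half of the true projected height (50 px instead of 100 px), the
# estimate doubles to 21.6 m, so the network must learn to compensate.
print(geometric_depth(1.5, 50.0, 720.0))   # → 21.6
```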
HI, @SuperMHP
Thanks for your codes!
When evaluating the mAP|40, I used the C++ script you provided to get results.
As shown in the following figure, I wonder what the difference is between car_detection_ground and car_detection?
I think the P[2,2] component of the new_calib_matrix P2 should be equal to 1, but when I print this element it is 0 instead of 1.
bias_depth = 1. / (depth_net_out[:,0:1].sigmoid() + 1e-6) - 1.
This expression lies in (0, +inf),
but bias_depth needs to be negative when geo_depth is bigger than real_depth.
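The range concern can be checked numerically. A small sketch of the quoted mapping (a standalone scalar reimplementation, not the repo's tensor code):

```python
import math

def bias_depth(x: float) -> float:
    """The quoted mapping: 1 / (sigmoid(x) + 1e-6) - 1."""
    sig = 1.0 / (1.0 + math.exp(-x))
    return 1.0 / (sig + 1e-6) - 1.0

# sigmoid(x) lies in (0, 1), so 1 / sigmoid(x) lies in (1, +inf) and the
# bias in roughly (0, +inf): it can dip below zero only by about the 1e-6
# epsilon, i.e. it cannot meaningfully correct geo_depth downwards.
print(bias_depth(0.0))    # ≈ 1.0
print(bias_depth(-5.0))   # large and positive (≈ 148)
print(bias_depth(20.0))   # ≈ -1e-6, the most negative it can get
```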
Hello, thanks for your great work! We hope to follow it.
We retrained your code on the KITTI train split (3dop) and evaluated it on the val set, getting a Car AP of [22.698555, 15.741446, 13.477293], close to your paper's report.
Then we used this checkpoint (trained on the train split) to run inference on KITTI's test set and submitted the result to the KITTI benchmark, getting the following result on the test set:
The 3D AP is very low (Car 3D AP: 14.93%, 10.15%, 8.22%). I wonder if this is normal, considering the model is trained on the train split (not on trainval).
We want to reproduce your paper's numbers on the KITTI test set, close to the Car AP (22.26%, 15.02%, 13.12% for your released checkpoint, or 20.11%, 14.20%, 11.77% in the original paper). Besides switching to the trainval set, what else do we need to do?
And if I train on the trainval set, how do I choose the best checkpoint, given that the val set is included in trainval?
Thank you very much for your reply!
I used the evaluation code you provided but got the error below. Has anyone met this issue before, and how can it be fixed?
Thank you for participating in our evaluation!
Loading detections...
number of files for evaluation: 3769
ERROR: Couldn't read: 005218.txt of ground truth. Please write me an email!
An error occured while processing your results.
Hi, @SuperMHP
I have a question about the affine transformation in
GUPNet/code/lib/datasets/kitti_utils.py
Line 398 in f4e2660
To get the affine matrix, we need to know three pixel pairs.
Why is dst_dir = np.array([0, dst_w * -0.5], np.float32)
rather than dst_dir = np.array([dst_w * -0.5, 0], np.float32)?
Namely, why add a [height-axis, width-axis] offset to a [width-axis, height-axis] point? This seems very strange!
In my opinion, the second point should be the left pixel, so we would need np.array([dst_w * 0.5, dst_h * 0.5], np.float32) + np.array([dst_w * -0.5, 0], np.float32).
I am not sure whether I have misunderstood this transformation; wishing for your reply!
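On the axis question: points in that file are stored as [x, y], so np.array([0, dst_w * -0.5]) is a direction with zero horizontal offset and a vertical offset of half the width, i.e. a point straight above the center (the half-width serves only as a convenient length for the direction vector, which the augmentation later rotates). A minimal sketch, with illustrative names, of building an affine matrix from three such point pairs:

```python
import numpy as np

def third_point(a, b):
    """Complete the right angle: rotate (a - b) by 90 degrees around b."""
    d = a - b
    return b + np.array([-d[1], d[0]], dtype=np.float32)

def affine_from_3pts(src, dst):
    """Solve [x, y, 1] @ M.T = dst for the 2x3 affine matrix M."""
    A = np.hstack([src, np.ones((3, 1), dtype=np.float32)])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M.T

w, h = 200.0, 100.0
up = np.array([0.0, -0.5 * w], dtype=np.float32)  # zero x-offset, negative
                                                  # y-offset: straight "up"
src_c = np.array([320.0, 240.0], dtype=np.float32)  # crop center in the image
dst_c = np.array([0.5 * w, 0.5 * h], dtype=np.float32)  # output center
src = np.stack([src_c, src_c + up, third_point(src_c, src_c + up)])
dst = np.stack([dst_c, dst_c + up, third_point(dst_c, dst_c + up)])
M = affine_from_3pts(src, dst)
print(M @ np.array([320.0, 240.0, 1.0]))  # source center -> output center (100, 50)
```

With no rotation or scaling, M reduces to a pure translation; the same three-point construction also covers the rotated and scaled cases.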
Hello, good job.
I notice many methods use depth pre-training for better performance;
is your network pre-trained on a depth dataset first?
Excellent work! I'd like to ask whether there is still a plan to open-source GUPNet++; the address given in the paper, https://github.com/SuperMHP/GUPNet_Plus, can no longer be opened.
Hello, when reproducing your code I noticed in the log that size3d_loss becomes negative around epoch 7-8 and stays negative afterwards. Is this a problem?
Hello, @SuperMHP
Thanks for your great work on mono3d detection and congrats for its ICCV2021 acceptance. I am wondering when will you release the code?
Thanks for your great work!
I used the released code to retrain the network, but the results are strange.
I obtain:
AP_BEV: [29.223935810025136, 21.975801299792906, 19.0762136467218]
AP_3D: [17.863160821111062, 12.961739635817185, 10.802839248636912]
where the AP_BEV is OK, but the AP_3D is considerably low. I tried three times and got similar results.
Have you tried a ResNet backbone with a matching neck? How does it perform?
Dear author,
Thanks for your wonderful work. I am following your repo to build a 3D detection framework. Would you mind telling me the difference between the results of the original paper and the released checkpoint (especially on the test set), and what causes the performance gap between them?
Why use the 2D bbox to compute depth instead of 2D keypoint detection?
By the pinhole model, depth = object's physical height * f / object's pixel height, but the paper uses the height of the object's 2D bbox as the pixel height. Doesn't this introduce error? Is that why the paper adds an offset to the computed depth? Why not detect keypoints, compute the pixel height from them, and then compute depth? What advantage does GUPNet's computation have over keypoint-based methods?
Among the three improvements mentioned in the paper (GeP, UnC, GeU), what exactly does GeP refer to?
UnC is easy to understand: use the uncertainty as a confidence and combine it with the heatmap confidence to score objects. GeU can arguably be seen as the whole paper put together (geometric projection using the computed uncertainty). But I don't understand what GeP refers to. Since the paper has no direct depth-regression branch like MonoFlex or MonoCon, depth must come from projection, so isn't GeP mandatory? Why does the ablation study still list GeP's improvement separately?
Please help explain, thanks!
May I ask how to build the KITTI dataset structure, and in which folder it should be constructed?
Hello, you have done a great job!
I am somewhat confused about the lib.datasets.kitti_utils.Calibration.flip function used in your KITTI dataloader. I am not sure about its purpose and result: when I try to use the flipped calib to back-project some points from the flipped image into the camera coordinate frame, I get very wrong results.
I wonder whether the lib.datasets.kitti_utils.Calibration class was written by yourself or adapted from some other codebase. Thank you very much!
Thanks!
Great work! I am wondering when you will release your code? Thanks!
Hi, I have some questions about the uncertainty loss. Will the uncertainty loss be negative?
depth[i] = objects[i].pos[-1]
whereas monodle uses
depth[i] = objects[i].pos[-1] * aug_scale
Why is depth not scaled proportionally here? monodle scales depth during augmentation; why the difference?
Hello, when reproducing your code I found that size3d_loss stays negative later in training. Does this have any impact? Why does it become negative?
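A negative value is expected for this style of loss: GUPNet's size/depth heads are trained with an aleatoric-uncertainty loss of the Laplacian negative-log-likelihood family. A hedged sketch of the general form (exact constants and parameterization in the repo may differ):

```python
import math

def laplace_uncertainty_loss(err: float, sigma: float) -> float:
    """Laplacian NLL-style loss: sqrt(2)/sigma * |error| + log(sigma).
    The log(sigma) term is negative whenever sigma < 1, so once the
    predictions are accurate and the network becomes confident (small
    sigma), the total loss legitimately goes below zero."""
    return math.sqrt(2.0) / sigma * abs(err) + math.log(sigma)

print(laplace_uncertainty_loss(0.5, 1.0))   # positive, ≈ 0.707
print(laplace_uncertainty_loss(0.01, 0.1))  # negative, ≈ -2.16
```

So a negative size3d_loss alone is not a bug; it simply means the predicted uncertainty has shrunk below 1.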
Thanks for your great work!
I am now training the code under PyTorch 1.10 and CUDA 11.0 because I don't have a GPU that satisfies the environment in the README. However, I got a much lower AP40 moderate result of 13.69, compared with 16.23 for the given checkpoint.
Do you have any ideas about why the performance deteriorates so sharply under a different environment?
Thanks very much
I'd like to ask: after changing the projection matrix, can a GUPNet model trained on KITTI run inference on other datasets such as nuScenes or Waymo?
Hello, can you provide a nuScenes pre-trained model?
I found only KITTI in this repo; could you provide a nuScenes version?
thank you