supermhp / GUPNet
License: MIT License
Thanks for your work! Could you give me an evaluation script for AP40? I don't know where to download it.
Hey!
Great work!
I was trying train_val.py in evaluation mode using the model you provided. However, I see that the detections are all 0. Any idea why this might be happening?
Thanks!
Following an answer on CSDN, I downgraded the Python extension in VS Code to the 2022.08 version in order to debug Python 3.6 programs; debugging other Python 3.6 programs works fine.
But when debugging this repo, VS Code freezes every time the pre-training loss is computed. I can't even close VS Code or do anything else, and have to force a shutdown. Has anyone else run into this?
May I ask roughly when the GUPNet++ code will be open-sourced?
GUPNet/code/lib/models/gupnet.py
Line 165 in f4e2660
h3d = self.mean_size[cls_ids[mask].long()] + size3d_offset * h3d_std
thx
Hello, I only care about the Car results on the test set. Can I train only the Car class on trainval and then submit the required files to the official KITTI server, or must I train all three classes (Car, Pedestrian, Cyclist) on trainval and then just look at the Car entry in the official KITTI results?
Also: is the IoU threshold in your code the default 0.7? If I want to see the Car results at a 0.5 threshold, do I need to modify the code?
@SuperMHP
Hi, thanks for your work.
I want to know whether the mAP=16.46 result uses only the Car category during training.
I trained GUPNet on ['Car', 'Pedestrian', 'Cyclist'].
Unfortunately, I tried training three times and could only reach up to 15.5!
So I wonder how you trained a model with mAP=16.46. (Is a KITTI training variance of more than 1 point too large?)
Wishing for your reply.
I obtained different results across multiple runs with the same training configuration and unmodified code. How can I resolve this?
I have tried to reproduce the result with batch size 8, but it is hard to get close to the result of your released checkpoint. Could I ask for the training log of this checkpoint, for research purposes? I'm new to this, so please tell me if I did something wrong. Thank you.
Hello. In the horizontal-flip data augmentation, flipping the image changes the camera calibration. In your code this is handled by calib.flip(img_size) in kitti.py, but I don't quite understand why the function constructs the cos_matrix and uses singular value decomposition to solve for the coefficients. Could you point me to the mathematical derivation behind this function? Looking forward to your reply, thank you very much!
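For context, the SVD here matches the standard DLT (direct linear transform) recipe: after a horizontal flip, pixel u becomes W - 1 - u while the 3D point's x coordinate flips sign, so the new 3x4 projection matrix must satisfy these flipped correspondences. Stacking them yields a homogeneous system A·p = 0 whose least-squares solution is the right singular vector of A with the smallest singular value. A hedged sketch of that idea (names and point sampling are illustrative, not the repo's exact cos_matrix construction):

```python
import numpy as np

rng = np.random.default_rng(0)
W = 1242.0  # image width in pixels (KITTI-like)
P = np.array([[721.5, 0.0, 609.6, 44.9],
              [0.0, 721.5, 172.9, 0.2],
              [0.0, 0.0, 1.0, 0.003]])  # an example 3x4 projection matrix

# Sample 3D points, project them, then flip both sides of the correspondence.
X = np.hstack([rng.uniform(-10, 10, (20, 2)), rng.uniform(5, 50, (20, 1))])
uvw = (P @ np.hstack([X, np.ones((20, 1))]).T).T
uv = uvw[:, :2] / uvw[:, 2:3]
uv_f = np.column_stack([W - 1 - uv[:, 0], uv[:, 1]])   # flipped pixels
X_f = np.column_stack([-X[:, 0], X[:, 1], X[:, 2]])    # mirrored 3D points

# DLT: two linear equations per correspondence in the 12 entries of P'.
rows = []
for (x, y, z), (u, v) in zip(X_f, uv_f):
    Xh = [x, y, z, 1.0]
    rows.append(Xh + [0.0] * 4 + [-u * c for c in Xh])
    rows.append([0.0] * 4 + Xh + [-v * c for c in Xh])
A = np.array(rows)
_, _, Vt = np.linalg.svd(A)
P_flip = Vt[-1].reshape(3, 4)
P_flip /= P_flip[2, 2]  # fix the overall scale (SVD solves only up to scale)
```

This also hints at the P[2,2] question elsewhere in this thread: the homogeneous solution is defined only up to a scalar, so the recovered matrix must be renormalized (e.g. by its [2,2] entry) before use.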
When random crop happens, GUPNet still regresses 3D information. Is this a bug?
After reading the GUPNet and GUPNet++ papers and part of the GUPNet code, I have a few questions:
depth_geo = size_3d[:,0]/box2d_height.squeeze()*roi_calibs[:,0,0]
When an object is truncated, e.g. a bus or a truck, box2d_height differs considerably from the actual projected height. Can the network generalize in this case?
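For reference, the quoted line is the pinhole relation depth = H_3D · f / h_2D. A minimal standalone sketch (function name is illustrative, not from the repo):

```python
def geometric_depth(h3d_m: float, h2d_px: float, focal_px: float) -> float:
    """Pinhole-model depth: physical height (m) times focal length (px),
    divided by projected pixel height, mirroring
    depth_geo = size_3d[:, 0] / box2d_height * roi_calibs[:, 0, 0]."""
    return h3d_m * focal_px / h2d_px

# A car ~1.5 m tall that spans 100 px with f = 720 px sits at
# 1.5 * 720 / 100 = 10.8 m.
print(geometric_depth(1.5, 100.0, 720.0))  # → 10.8

# The truncation concern raised above: if the visible 2D box covers only
# half of the true projected height (50 px instead of 100 px), the
# estimate doubles to 21.6 m, so the network must learn to compensate.
print(geometric_depth(1.5, 50.0, 720.0))   # → 21.6
```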
HI, @SuperMHP
Thanks for your codes!
When evaluating the mAP|40, I used the C++ script you provided to get results.
As shown in the following figure, I wonder what the difference is between car_detection_ground and car_detection?
I think the P[2,2] component of the new_calib_matrix P2 should be equal to 1, but when I print this element it is 0 instead of 1.
bias_depth = 1. / (depth_net_out[:,0:1].sigmoid() + 1e-6) - 1.
This expression lies in (0, +inf),
but bias_depth needs to be negative when geo_depth is bigger than real_depth.
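The range concern can be checked numerically. A small sketch of the quoted mapping (a standalone scalar reimplementation, not the repo's tensor code):

```python
import math

def bias_depth(x: float) -> float:
    """The quoted mapping: 1 / (sigmoid(x) + 1e-6) - 1."""
    sig = 1.0 / (1.0 + math.exp(-x))
    return 1.0 / (sig + 1e-6) - 1.0

# sigmoid(x) lies in (0, 1), so 1 / sigmoid(x) lies in (1, +inf) and the
# bias in roughly (0, +inf): it can dip below zero only by about the 1e-6
# epsilon, i.e. it cannot meaningfully correct geo_depth downwards.
print(bias_depth(0.0))    # ≈ 1.0
print(bias_depth(-5.0))   # large and positive (≈ 148)
print(bias_depth(20.0))   # ≈ -1e-6, the most negative it can get
```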
Hello, thanks for your great work! We hope to follow it.
We retrained your code on the KITTI train split (3dop) and evaluated it on the val set, getting a Car AP of [22.698555, 15.741446, 13.477293], close to your paper's report.
Then we used this checkpoint (trained on the train split) to run inference on KITTI's test set and submitted the result to the KITTI benchmark, getting the following result on the test set:
The 3D AP is very low (Car 3D AP: 14.93%, 10.15%, 8.22%). I wonder if this is normal, considering the model is trained on the train split (not on trainval).
We want to reproduce your paper's numbers on the KITTI test set, close to the Car AP (22.26%, 15.02%, 13.12% for your released checkpoint, or 20.11%, 14.20%, 11.77% in the original paper). Besides switching to the trainval set, what else do we need to do?
And if I train on the trainval set, how do I choose the best checkpoint, given that the val set is included in trainval?
Thank you very much for your reply!
I used the evaluation code you provided but got the error below. Has anyone met this issue before, and how can it be fixed?
Thank you for participating in our evaluation!
Loading detections...
number of files for evaluation: 3769
ERROR: Couldn't read: 005218.txt of ground truth. Please write me an email!
An error occured while processing your results.
Hi, @SuperMHP
I have a question about the affine transformation in
GUPNet/code/lib/datasets/kitti_utils.py
Line 398 in f4e2660
To get the affine matrix, we need to know three pixel pairs.
Why is dst_dir = np.array([0, dst_w * -0.5], np.float32)
rather than dst_dir = np.array([dst_w * -0.5, 0], np.float32)?
Namely, why add a [height-axis, width-axis] offset to a [width-axis, height-axis] point? This seems very strange!
In my opinion, the second point should be the left pixel, so we would need np.array([dst_w * 0.5, dst_h * 0.5], np.float32) + np.array([dst_w * -0.5, 0], np.float32).
I am not sure whether I have misunderstood this transformation; wishing for your reply!
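On the axis question: points in that file are stored as [x, y], so np.array([0, dst_w * -0.5]) is a direction with zero horizontal offset and a vertical offset of half the width, i.e. a point straight above the center (the half-width serves only as a convenient length for the direction vector, which the augmentation later rotates). A minimal sketch, with illustrative names, of building an affine matrix from three such point pairs:

```python
import numpy as np

def third_point(a, b):
    """Complete the right angle: rotate (a - b) by 90 degrees around b."""
    d = a - b
    return b + np.array([-d[1], d[0]], dtype=np.float32)

def affine_from_3pts(src, dst):
    """Solve [x, y, 1] @ M.T = dst for the 2x3 affine matrix M."""
    A = np.hstack([src, np.ones((3, 1), dtype=np.float32)])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M.T

w, h = 200.0, 100.0
up = np.array([0.0, -0.5 * w], dtype=np.float32)  # zero x-offset, negative
                                                  # y-offset: straight "up"
src_c = np.array([320.0, 240.0], dtype=np.float32)  # crop center in the image
dst_c = np.array([0.5 * w, 0.5 * h], dtype=np.float32)  # output center
src = np.stack([src_c, src_c + up, third_point(src_c, src_c + up)])
dst = np.stack([dst_c, dst_c + up, third_point(dst_c, dst_c + up)])
M = affine_from_3pts(src, dst)
print(M @ np.array([320.0, 240.0, 1.0]))  # source center -> output center (100, 50)
```

With no rotation or scaling, M reduces to a pure translation; the same three-point construction also covers the rotated and scaled cases.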
Hello, good job.
I notice many methods use depth pre-training for better performance;
is your network pre-trained on a depth dataset first?
Excellent work! I'd like to ask whether there is still a plan to open-source GUPNet++; the address given in the paper, https://github.com/SuperMHP/GUPNet_Plus, can no longer be opened.
Hello, when reproducing your code I noticed in the log that size3d_loss becomes negative around epoch 7-8 and stays negative afterwards. Is this a problem?
Hello, @SuperMHP
Thanks for your great work on mono3d detection and congrats for its ICCV2021 acceptance. I am wondering when will you release the code?
Thanks for your great work!
I used the released code to retrain the network, but the results are strange.
I obtain:
AP_BEV: [29.223935810025136, 21.975801299792906, 19.0762136467218]
AP_3D: [17.863160821111062, 12.961739635817185, 10.802839248636912]
where the AP_BEV is OK, but the AP_3D is considerably low. I tried three times and got similar results.
Have you tried a ResNet backbone with a matching neck? How does it perform?
Dear author,
Thanks for your wonderful work. I am following your repo to build a 3D detection framework. Would you mind telling me the difference between the results of the original paper and the released checkpoint (especially on the test set), and what causes the performance gap between them?
Why use the 2D bbox to compute depth instead of 2D keypoint detection?
By the pinhole model, depth = object's physical height * f / object's pixel height, but the paper uses the height of the object's 2D bbox as the pixel height. Doesn't this introduce error? Is that why the paper adds an offset to the computed depth? Why not detect keypoints, compute the pixel height from them, and then compute depth? What advantage does GUPNet's computation have over keypoint-based methods?
Among the three improvements mentioned in the paper (GeP, UnC, GeU), what exactly does GeP refer to?
UnC is easy to understand: use the uncertainty as a confidence and combine it with the heatmap confidence to score objects. GeU can arguably be seen as the whole paper put together (geometric projection using the computed uncertainty). But I don't understand what GeP refers to. Since the paper has no direct depth-regression branch like MonoFlex or MonoCon, depth must come from projection, so isn't GeP mandatory? Why does the ablation study still list GeP's improvement separately?
Please help explain, thanks!
May I ask how to build the KITTI dataset structure, and in which folder it should be constructed?
Hello, you have done a great job!
I am somewhat confused about the lib.datasets.kitti_utils.Calibration.flip function used in your KITTI dataloader. I am not sure about its purpose and result: when I try to use the flipped calib to back-project some points from the flipped image into the camera coordinate frame, I get very wrong results.
I wonder whether the lib.datasets.kitti_utils.Calibration class was written by yourself or adapted from some other codebase. Thank you very much!
Thanks!
Great work! I am wondering when you will release your code? Thanks!
Hi, I have some questions about the uncertainty loss. Will the uncertainty loss be negative?
depth[i] = objects[i].pos[-1]
whereas monodle uses
depth[i] = objects[i].pos[-1] * aug_scale
Why is depth not scaled proportionally here? monodle scales depth during augmentation; why the difference?
Hello, when reproducing your code I found that size3d_loss stays negative later in training. Does this have any impact? Why does it become negative?
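A negative value is expected for this style of loss: GUPNet's size/depth heads are trained with an aleatoric-uncertainty loss of the Laplacian negative-log-likelihood family. A hedged sketch of the general form (exact constants and parameterization in the repo may differ):

```python
import math

def laplace_uncertainty_loss(err: float, sigma: float) -> float:
    """Laplacian NLL-style loss: sqrt(2)/sigma * |error| + log(sigma).
    The log(sigma) term is negative whenever sigma < 1, so once the
    predictions are accurate and the network becomes confident (small
    sigma), the total loss legitimately goes below zero."""
    return math.sqrt(2.0) / sigma * abs(err) + math.log(sigma)

print(laplace_uncertainty_loss(0.5, 1.0))   # positive, ≈ 0.707
print(laplace_uncertainty_loss(0.01, 0.1))  # negative, ≈ -2.16
```

So a negative size3d_loss alone is not a bug; it simply means the predicted uncertainty has shrunk below 1.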
Thanks for your great work!
I am now training the code under PyTorch 1.10 and CUDA 11.0 because I don't have a GPU that satisfies the environment in the README. However, I got a much lower AP40 moderate result of 13.69, compared with 16.23 for the given checkpoint.
Do you have any ideas about why the performance deteriorates so sharply under a different environment?
Thanks very much
I'd like to ask: after changing the projection matrix, can a GUPNet model trained on KITTI run inference on other datasets such as nuScenes or Waymo?
Hello, can you provide a nuScenes pre-trained model?
I found only KITTI in this repo; could you provide a nuScenes version?
thank you