Comments (8)

yexiguafuqihao commented on August 11, 2024

As described in the paper, a proposal generally needs to infer two different detections, and there are two ways to pair the detections with their targets: det_loss((B_0, G_0)) + det_loss((B_1, G_1)) and det_loss((B_0, G_1)) + det_loss((B_1, G_0)). A minimum operation is then applied to these two losses, and the smaller one is used to update the parameters. We had not observed what you hypothesize, namely that one detection head produces a loss much higher than the other. This can be attributed to the normalizer, which scales the loss in the same way as in general detection loss functions.

However, what we have observed is that sometimes the first head produces the majority of true positives while the second head infers the supplemental ones, and sometimes the second head predicts the majority and the first head supplements the rest. The reason is that the number of ground-truth boxes used to train the first head and the second head differs under the label assignment strategy, while the EMD loss is permutation-invariant.
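The two-pairing minimum described above can be sketched in a few lines. This is only an illustration, not the repository's implementation: `emd_loss` and `l1_box_loss` are hypothetical names, and a plain L1 box distance stands in for the real per-detection loss (classification plus regression, with the normalizer).

```python
def l1_box_loss(box, gt):
    """Stand-in per-detection loss (assumption): L1 distance between boxes."""
    return sum(abs(p - q) for p, q in zip(box, gt))


def emd_loss(preds, targets, det_loss=l1_box_loss):
    """Minimum over the two ways of pairing (B_0, B_1) with (G_0, G_1)."""
    b0, b1 = preds
    g0, g1 = targets
    direct = det_loss(b0, g0) + det_loss(b1, g1)   # (B_0, G_0) + (B_1, G_1)
    swapped = det_loss(b0, g1) + det_loss(b1, g0)  # (B_0, G_1) + (B_1, G_0)
    return min(direct, swapped)


preds = ((0, 0, 10, 10), (20, 20, 30, 30))
targets = ((20, 20, 30, 30), (0, 0, 10, 10))  # same boxes, opposite order
loss = emd_loss(preds, targets)
```

Because the minimum is taken over both pairings, swapping the order of the targets (or of the predictions) leaves the loss unchanged, which is why either head can end up carrying the majority of true positives.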

from crowddet.

Eagle104fred commented on August 11, 2024

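So the key part is how G_0 and G_1 are assigned. I see that you split them by IoU score: G_0 is the gt box with the highest IoU and G_1 is the gt box with the second-highest IoU. In the yolov5 family of algorithms, however, whether a gt box is kept is decided directly by the wh (width/height) ratio, which intuitively separates the gt boxes into two groups. I have one more question: if G_0 and G_1 are split directly by IoU, can the same label still end up being predicted by both heads at the same time? I want this system to work with Deepsort, and overlapping detection boxes would cause me trouble.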

yexiguafuqihao commented on August 11, 2024

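When objects of the same class occlude each other, the two detection heads predict the same label. The labels are identical in that case, but the two boxes correspond to two different objects of that class. If there is no occlusion, only one head predicts a label, and the label for the other head is background.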
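A minimal sketch of the assignment discussed in this thread, assuming targets are chosen by ranking ground-truth boxes by IoU with the proposal. The function names, the `BACKGROUND` marker, and the positive threshold of 0.5 are assumptions for illustration, not values taken from the repository.

```python
BACKGROUND = -1  # hypothetical marker for a background target


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def assign_top2(proposal, gts, pos_thresh=0.5):
    """Return indices (G_0, G_1) of the two gt boxes with the highest IoU.

    A slot falls back to BACKGROUND when no gt box reaches pos_thresh.
    """
    ranked = sorted(range(len(gts)), key=lambda i: iou(proposal, gts[i]), reverse=True)
    targets = [i if iou(proposal, gts[i]) >= pos_thresh else BACKGROUND for i in ranked[:2]]
    targets += [BACKGROUND] * (2 - len(targets))
    return tuple(targets)


proposal = (0, 0, 10, 10)
occluded = assign_top2(proposal, [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)])
isolated = assign_top2(proposal, [(0, 0, 10, 10), (50, 50, 60, 60)])
```

With two heavily overlapping gt boxes, both slots receive a real target; with a single nearby object, the second slot becomes background, matching the behaviour described above.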

Eagle104fred commented on August 11, 2024

But under the setNMS mechanism, if one box in a set is selected, won't the other (background) box from that set also be forcibly revived?

yexiguafuqihao commented on August 11, 2024

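That is why set NMS applies a confidence threshold of 0.05, and in most cases the other background box is removed at that step. If the background box is not removed by the threshold, it can only be forcibly revived and ends up as a false positive (FP). That is why we designed the refinement module. But this situation rarely occurs.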
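The interaction between the confidence threshold and set NMS can be sketched as follows. This is a simplified greedy version written for illustration, not the repository's code: the `(box, score, proposal_id)` tuple layout, the function names, and the IoU threshold of 0.5 are assumptions; only the 0.05 confidence threshold comes from the discussion above.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def set_nms(dets, iou_thresh=0.5, score_thresh=0.05):
    """Greedy NMS in which boxes from the same proposal never suppress each other.

    dets: list of (box, score, proposal_id) tuples (assumed layout).
    """
    # Low-confidence boxes (typically the background head) are dropped first.
    candidates = sorted((d for d in dets if d[1] >= score_thresh),
                        key=lambda d: d[1], reverse=True)
    kept = []
    for box, score, pid in candidates:
        suppressed = any(kpid != pid and iou(box, kbox) > iou_thresh
                         for kbox, _, kpid in kept)
        if not suppressed:
            kept.append((box, score, pid))
    return kept


dets = [
    ((0, 0, 10, 10), 0.90, 0),  # head 0 of proposal 0
    ((1, 0, 11, 10), 0.03, 0),  # head 1 of proposal 0, background-like score
    ((0, 1, 10, 11), 0.70, 1),  # overlapping box from a different proposal
]
kept_ids = [(pid, score) for _, score, pid in set_nms(dets)]
revived = [(pid, score) for _, score, pid in set_nms(dets, score_thresh=0.0)]
```

With the 0.05 threshold, the low-scoring sibling box is filtered out before NMS runs. With the threshold disabled, the same box survives because its only overlapping kept box comes from the same proposal, which is the false-positive case described above.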

Eagle104fred commented on August 11, 2024

Understood. Thank you very much for answering my questions.

xarryon commented on August 11, 2024

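If a proposal's label assignment follows this scheme, could there be a latent problem? Namely, the features and output of the proposal's first part may actually be more strongly related to GT2 while the label assigned to it is GT1, which would hurt model training.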

Eagle104fred commented on August 11, 2024
