
Comments (12)

yizt avatar yizt commented on June 18, 2024 1

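@hcnhatnam My implementation does not work that way. The Ground Truth itself is also split: if a GT has x-axis coordinates [x1, x2] = [5.3, 68.7], it is split into the following 5 GTs: [5.3, 16.], [16., 32.], [32., 48.], [48., 64.], [64., 68.7]. For anchors matched to the middle 3 GTs, the side-refinement regression target is 0; only anchors matched to the leftmost and rightmost GTs get a side-refinement regression target, respectively:
dx = ((5.3+16)-(0+16))/16; for anchors matched to [5.3, 16.]
dx = ((64.+68.7)-(64+72))/16; for anchors matched to [64., 68.7]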
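The splitting and target assignment described above can be sketched as follows. This is a minimal illustration, not the repository's code: the helper names `split_gt` and `side_targets` are hypothetical, anchors are assumed to sit on a fixed 16-pixel grid, and the last anchor is taken as [64, 80], following the correction given later in this thread.

```python
import math

ANCHOR_WIDTH = 16  # assumed fixed anchor width (pixels)


def split_gt(x1, x2, width=ANCHOR_WIDTH):
    """Split the x-range [x1, x2] at every grid line (multiple of `width`)."""
    edge = math.floor(x1 / width) * width + width  # first grid line right of x1
    cuts = [x1]
    while edge < x2:
        cuts.append(float(edge))
        edge += width
    cuts.append(x2)
    return list(zip(cuts[:-1], cuts[1:]))


def side_targets(x1, x2, width=ANCHOR_WIDTH):
    """Return one dx target per split GT; only the two end pieces are non-zero."""
    segments = split_gt(x1, x2, width)
    targets = []
    for i, (gx1, gx2) in enumerate(segments):
        ax1 = math.floor(gx1 / width) * width  # anchor column holding this piece
        ax2 = ax1 + width
        is_side = i == 0 or i == len(segments) - 1
        # Sign convention of the worked example: GT sum minus anchor sum.
        dx = ((gx1 + gx2) - (ax1 + ax2)) / width if is_side else 0.0
        targets.append(dx)
    return targets


if __name__ == "__main__":
    print(split_gt(5.3, 68.7))
    print(side_targets(5.3, 68.7))
```

For the GT [5.3, 68.7] this yields five pieces, a target of about 0.331 for the first piece, about -0.706 for the last, and 0 for the three in between.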

from keras-ctpn.

hcnhatnam avatar hcnhatnam commented on June 18, 2024 1

Sorry, but the Ground Truth is [x1, y1, x2, y2]. Why is the Ground Truth [5.3, 68.7]? I don't understand.


yizt avatar yizt commented on June 18, 2024 1

@hcnhatnam Right, the Ground Truth is a quadrilateral with coordinates [lt_x, lt_y, rt_x, rt_y, rb_x, rb_y, lb_x, lb_y]. Side-refinement only involves the x-axis coordinates, so I omitted the y-axis coordinates.


yizt avatar yizt commented on June 18, 2024 1

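@hcnhatnam You are right; the implementation here does not follow the paper's logic exactly.
My understanding: the paper says x_side predicts the x-coordinate of the horizontal side nearest to the current anchor. That is ambiguous in itself, since the nearest side may be the left one or the right one, and the logic is complicated and roundabout. So I followed the idea of center-point regression and shift the anchor's center directly toward the GT's center, which is simpler and more consistent. The shift distance is (anchor_cx-gt_cx) * 2: anchor_cx - gt_cx is the offset between the centers, and (anchor_cx-gt_cx) * 2 is the distance the anchor's side must move to coincide with the GT. The final scale-invariant regression target is
dx = (anchor_cx-gt_cx) * 2/w, which is identical to ((anchor_x1+anchor_x2) - (gt_x1 + gt_x2))/w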
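A small numerical check of this identity, plus the decode step it implies, might look like the sketch below. The function names are hypothetical, and the sign follows the worked example earlier in the thread (GT minus anchor); the formula as written in this comment has the opposite sign, so which one the code base uses is left as an assumption here.

```python
def center_dx(anchor, gt):
    """Twice the center offset, normalised by the anchor width."""
    ax1, ax2 = anchor
    gx1, gx2 = gt
    w = ax2 - ax1
    anchor_cx = (ax1 + ax2) / 2
    gt_cx = (gx1 + gx2) / 2
    return 2 * (gt_cx - anchor_cx) / w


def sum_dx(anchor, gt):
    """The same target written with coordinate sums instead of centers."""
    ax1, ax2 = anchor
    gx1, gx2 = gt
    return ((gx1 + gx2) - (ax1 + ax2)) / (ax2 - ax1)


def refine_side(anchor, dx, side):
    """Move only the free side of the anchor by dx * w (assumed decode rule)."""
    ax1, ax2 = anchor
    shift = dx * (ax2 - ax1)
    return (ax1 + shift, ax2) if side == "left" else (ax1, ax2 + shift)


if __name__ == "__main__":
    left_anchor, left_gt = (0.0, 16.0), (5.3, 16.0)
    right_anchor, right_gt = (64.0, 80.0), (64.0, 68.7)
    print(center_dx(left_anchor, left_gt), sum_dx(left_anchor, left_gt))
    print(refine_side(left_anchor, center_dx(left_anchor, left_gt), "left"))
    print(refine_side(right_anchor, center_dx(right_anchor, right_gt), "right"))
```

Because the inner side of an end anchor already coincides with the split GT, the center offset is half the side offset, which is why the factor of 2 moves the free side exactly onto the GT side.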


yizt avatar yizt commented on June 18, 2024

@hcnhatnam Thanks for the correction; it has been fixed. The term is now translated as "侧边细化" (side-refinement).


hcnhatnam avatar hcnhatnam commented on June 18, 2024

I asked earlier but have not received an answer yet: did you implement side-refinement like this?
[Image: screenshot from 2019-04-08 11-23-57]


hcnhatnam avatar hcnhatnam commented on June 18, 2024

I understand now. I really appreciate your help. But I think it should be dx = ((64.+68.7)-(64+70))/16, not 72.


yizt avatar yizt commented on June 18, 2024

@hcnhatnam It should be dx = ((64.+68.7)-(64+80))/16 ;(* ̄︶ ̄)


hcnhatnam avatar hcnhatnam commented on June 18, 2024

ohh... ok ok


hcnhatnam avatar hcnhatnam commented on June 18, 2024

@yizt I think you were a bit confused.
[Image: screenshot from 2019-04-08 21-07-38]
In the paper, we are considering O*:

  • dx (of the first anchor) = O* = (5.3-8)/16; (8 = (0+16)/2 = Cax = center of the anchor on the x-axis)
  • dx (of the last anchor) = O* = (68.7-72)/16; (72 = (64+80)/2 = Cax = center of the anchor on the x-axis)
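To compare the two encodings discussed in this thread, here is a rough sketch. `paper_offset` follows the paper's definition o* = (x_side* - c_x^a) / w^a, while `edge_offset` reproduces the repository author's worked example, in which the shared inner edge cancels and only the free edge matters. Both function names are hypothetical, and the anchors [0, 16] and [64, 80] are taken from the numbers above.

```python
ANCHOR_WIDTH = 16.0  # assumed fixed anchor width (pixels)


def paper_offset(x_side, anchor_x1, width=ANCHOR_WIDTH):
    """Offset of the GT side relative to the anchor center (paper's O*)."""
    anchor_cx = anchor_x1 + width / 2
    return (x_side - anchor_cx) / width


def edge_offset(x_side, anchor_x1, side, width=ANCHOR_WIDTH):
    """Offset of the GT side relative to the anchor's own left/right edge."""
    anchor_edge = anchor_x1 if side == "left" else anchor_x1 + width
    return (x_side - anchor_edge) / width


if __name__ == "__main__":
    # First anchor [0, 16], GT left side at 5.3
    print(paper_offset(5.3, 0.0), edge_offset(5.3, 0.0, "left"))
    # Last anchor [64, 80], GT right side at 68.7
    print(paper_offset(68.7, 64.0), edge_offset(68.7, 64.0, "right"))
```

Under these assumptions the two targets differ only by a constant half anchor width: the edge-based value is the paper's value plus 0.5 for a left side and minus 0.5 for a right side.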


NamNguyenThanh avatar NamNguyenThanh commented on June 18, 2024

Hi @yizt,
I understand what you implemented for side-refinement. But in your results on ICDAR 2015, I think the refinement affects not only the head and tail anchors of the text-line ground truth (a shift of less than 16 pixels); some sides appear to move by more than 16 pixels (e.g., the picture below).
[Image: 56456128_1988716384756644_1062434923161321472_n]


yizt avatar yizt commented on June 18, 2024

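@NamNguyenThanh Thank you for your feedback! There are two reasons:
a) The true x-coordinate offset should lie in (-16, 16), and the regression targets of all training samples do, so in theory the probability of exceeding 16 pixels should be small. However, the network has no explicit constraint limiting the offset to 16 pixels, so a prediction may exceed 16 pixels.
b) The network input is 720*720, while the visualization here is 1600*1600 after being saved with pyplot. The width of 16 is relative to 720*720, so the shift in the example image probably does not exceed 16 either.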
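Point b) is a matter of scale, which a short sketch can make concrete. The helper names below are hypothetical; the only inputs are the 720 and 1600 pixel sizes and the 16-pixel anchor width mentioned above.

```python
def to_display_pixels(offset_px, input_size=720, display_size=1600):
    """Convert a distance in network-input pixels to saved-figure pixels."""
    return offset_px * display_size / input_size


def to_input_pixels(offset_px, input_size=720, display_size=1600):
    """Convert a distance measured on the saved figure back to input pixels."""
    return offset_px * input_size / display_size


if __name__ == "__main__":
    # One anchor width as it appears on the 1600*1600 figure
    print(to_display_pixels(16))
    # A 30-pixel shift measured on the figure, expressed in input pixels
    print(to_input_pixels(30))
```

One anchor width of 16 input pixels spans roughly 35.6 pixels on the saved figure, so a shift that looks larger than 16 pixels there can still be within one anchor width at the network's resolution.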

