Comments (12)
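@hcnhatnam My implementation does not work that way. The ground truth (GT) itself is also split. If a GT spans [x1, x2] = [5.3, 68.7] on the x-axis, it is split into the following five GTs: [5.3, 16.], [16., 32.], [32., 48.], [48., 64.], [64., 68.7]. For anchors matched to the three middle GTs, the side-refinement regression target is 0; only anchors matched to the leftmost and rightmost GTs get a side-refinement regression target, namely:
dx = ((5.3+16)-(0+16))/16 for anchors matched to [5.3, 16.]
dx = ((64.+68.7)-(64+72))/16 for anchors matched to [64., 68.7]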
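The splitting scheme described above can be sketched roughly as follows. This is an illustration only, not the code from keras-ctpn: the function names `split_gt` and `side_targets` are hypothetical, and it assumes 16-pixel-wide anchors aligned to the grid, so the last anchor spans [64, 80] (the value settled on later in this thread).

```python
import math

STRIDE = 16  # anchor width in pixels, as used throughout this thread


def split_gt(x1, x2, stride=STRIDE):
    """Split a GT x-extent [x1, x2] at every multiple of `stride` (hypothetical helper)."""
    edges = [x1]
    next_edge = (math.floor(x1 / stride) + 1) * stride
    while next_edge < x2:
        edges.append(float(next_edge))
        next_edge += stride
    edges.append(x2)
    return list(zip(edges[:-1], edges[1:]))


def side_targets(x1, x2, stride=STRIDE):
    """Return one side-refinement target per GT piece (hypothetical helper)."""
    pieces = split_gt(x1, x2, stride)
    targets = []
    for i, (gx1, gx2) in enumerate(pieces):
        # Anchor assumed to be the grid cell that contains this piece.
        ax1 = math.floor(gx1 / stride) * stride
        ax2 = ax1 + stride
        if i == 0 or i == len(pieces) - 1:
            # Only the leftmost and rightmost pieces get a target.
            dx = ((gx1 + gx2) - (ax1 + ax2)) / stride
        else:
            dx = 0.0
        targets.append(dx)
    return targets


pieces = split_gt(5.3, 68.7)
targets = side_targets(5.3, 68.7)
```

For the example GT [5.3, 68.7], `pieces` holds the five segments listed above, and only the first and last entries of `targets` are nonzero.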
from keras-ctpn.
Sorry, but the ground truth is [x1, y1, x2, y2]. Why is the ground truth [5.3, 68.7]? I don't understand.
@hcnhatnam Right. The ground truth is a quadrilateral with coordinates [lt_x, lt_y, rt_x, rt_y, rb_x, rb_y, lb_x, lb_y]. Side-refinement depends only on the x-coordinates, so I omitted the y-coordinates.
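@hcnhatnam You are right; the implementation here does not follow the paper's logic exactly.
My understanding: the paper says x_side is the predicted x-coordinate of the horizontal side nearest to the current anchor. That is somewhat ambiguous in itself: the nearest side could be the left one or the right one, which makes the logic complicated and convoluted. So I followed the idea of center-point regression and shift the anchor's center directly toward the GT's center, which is simpler and more consistent. The shift distance is (anchor_cx - gt_cx) * 2: anchor_cx - gt_cx is the offset between the two centers, and (anchor_cx - gt_cx) * 2 is the distance the anchor has to move to coincide with the GT. The final scale-invariant regression target is
dx = (anchor_cx - gt_cx) * 2 / w, which is identical to ((anchor_x1 + anchor_x2) - (gt_x1 + gt_x2)) / w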
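The identity between the center form and the corner form of dx can be checked numerically with a small sketch. The function names `dx_center_form` and `dx_corner_form` are made up for this illustration. Note that the formula as written here uses anchor minus GT, whereas the numeric examples earlier in the thread compute GT minus anchor; the sketch follows the formula as written, and the identity holds under either sign convention.

```python
def dx_center_form(anchor_x1, anchor_x2, gt_x1, gt_x2, w=16.0):
    """dx written with box centers: (anchor_cx - gt_cx) * 2 / w."""
    anchor_cx = (anchor_x1 + anchor_x2) / 2.0
    gt_cx = (gt_x1 + gt_x2) / 2.0
    return (anchor_cx - gt_cx) * 2.0 / w


def dx_corner_form(anchor_x1, anchor_x2, gt_x1, gt_x2, w=16.0):
    """dx written with box corners: ((ax1 + ax2) - (gx1 + gx2)) / w."""
    return ((anchor_x1 + anchor_x2) - (gt_x1 + gt_x2)) / w


# Leftmost piece of the example GT, matched to the anchor [0, 16].
center_value = dx_center_form(0.0, 16.0, 5.3, 16.0)
corner_value = dx_corner_form(0.0, 16.0, 5.3, 16.0)
```

Both values agree up to floating-point rounding, because each center is just half the sum of its box's two x-coordinates.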
@hcnhatnam Thanks for pointing that out; it has been corrected. It is now translated as "侧边细化" (side-refinement).
I asked but have not gotten an answer yet: did you implement side-refinement like that?
I understand. I really appreciate it. But I think it should be dx = ((64.+68.7)-(64+70))/16, not 72.
@hcnhatnam It should be dx = ((64.+68.7)-(64+80))/16 (* ̄︶ ̄)
ohh... ok ok
@yizt I think you were a bit confused.
In the paper, we are considering O*:
- dx (first anchor) = O* = (5.3-8)/16 ; (8 = (0+16)/2 = Cax = center of the anchor on the x-axis)
- dx (last anchor) = O* = (68.7-72)/16 ; (72 = (64+80)/2 = Cax = center of the anchor on the x-axis)
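For comparison, the paper-style target O* = (x_side - Cax) / w can be sketched as below. The function name `paper_side_offset` is hypothetical, and the sketch assumes 16-pixel anchors, taking the first and last anchors as [0, 16] and [64, 80] with centers 8 and 72.

```python
def paper_side_offset(x_side, anchor_x1, anchor_w=16.0):
    """Paper-style side-refinement target: (x_side - c_x^a) / w^a."""
    anchor_cx = anchor_x1 + anchor_w / 2.0  # Cax, the anchor center on the x-axis
    return (x_side - anchor_cx) / anchor_w


# Left side of the example GT against the first anchor [0, 16].
first_offset = paper_side_offset(5.3, 0.0)
# Right side of the example GT against the last anchor [64, 80].
last_offset = paper_side_offset(68.7, 64.0)
```

With these inputs the two offsets come out to (5.3 - 8) / 16 and (68.7 - 72) / 16, which differ by a constant 0.5 from the GT-minus-anchor center-based targets in the earlier numeric examples (0.33125 and -0.70625 once the last anchor is taken as [64, 80]).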
Hi @yizt,
I understand what you implemented for side-refinement. But in your results on ICDAR 2015, I think it affects not only the head and tail anchors of the text-line ground truth (a refinement of less than 16 pixels) but also shifts some boxes by more than 16 pixels (e.g., the picture below).
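@NamNguyenThanh Thank you for the feedback! There are two reasons:
a) The true x-offset should lie in (-16, 16), and the regression targets of all training samples are in that range, so in theory an offset beyond 16 pixels should be unlikely. However, the network has no explicit constraint that limits the offset to 16 pixels, so a prediction can still exceed 16 pixels.
b) The network input is 720*720, while the image saved with pyplot for this visualization is 1600*1600. The width of 16 is relative to the 720*720 input, so the offset in the example image probably does not exceed 16 either.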
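Point b) comes down to simple rescaling, sketched below. The helper name `scale_px` is hypothetical, and the sketch assumes the saved figure is a uniform resize of the network input.

```python
def scale_px(length_px, src_size, dst_size):
    """Rescale a pixel length from a src_size-wide image to a dst_size-wide one."""
    return length_px * dst_size / src_size


# A 16-pixel anchor width at 720*720, as it would appear in a 1600*1600 figure.
anchor_width_in_plot = scale_px(16, 720, 1600)
```

Under that assumption, a 16-pixel shift at the network's input resolution appears as roughly 35.6 pixels in the saved figure, so a shift that looks larger than 16 pixels in the picture can still be within 16 pixels at 720*720.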