Comments (13)
@tonghe90 I have another question. If I use ICDAR2015, how do I generate the "mask_gt" and "mask_iou_angle" data? Looking forward to your reply.
from textspotter.
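@tonghe90 After reading the code, I found that the text labels for the recognition part are also contained in gt_bbox. For a given gt_bbox, the first 8 elements are the bbox coordinates, the 9th element is the length of the text label, and the label_len elements starting from the 10th are the text label itself. The full label text is split into individual elements here; what is their type, and how are they converted?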
layer {
name: 'iou_maps_angles'
type: 'Python'
bottom: 'gt_bbox'
top: 'rois'
top: 'sample_gt_cont'
top: 'sample_gt_label_input'
top: "sample_gt_label_output"
......
}
from textspotter.
mask_gt is generated only for datasets that have character-level annotation, i.e. SynthText. See section 2.3 of the paper for the training strategy.
mask_iou_angle is generated from the output of the EAST proposals in the RBOX (rotated rectangle bounding box) case: EAST outputs the distances from each pixel to the sides of the box, plus the angle, in 5 channels.
sample_gt_cont is a vector with the shape of the gt labels, holding zeros and ones. It controls the continuity of the LSTM hidden state: the hidden state is multiplied by 0 when prediction of a new box starts, and by 1 for the remaining steps.
sample_gt_label_input: a one-hot encoding or character embedding of each ground-truth label. Its shape is also used to pad sequences that are shorter than the maximum length of 25.
sample_gt_label_output: similar to the above, but for inference time. It is used to keep track of how many decoder steps to predict, as fed into the previous input.
Please correct me if I'm wrong.
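As a rough illustration of the description above, the continuation indicators and a padded label sequence for a single box could be built as follows. This is only a sketch: the maximum length of 25 comes from the comment, while the padding index `0`, the function name `build_sequence_targets`, and the use of integer label ids are assumptions for illustration and are not taken from the textspotter code.

```python
MAX_LEN = 25  # maximum sequence length mentioned in the comment above
PAD = 0       # hypothetical padding index (an assumption, not from the repo)


def build_sequence_targets(label_ids, max_len=MAX_LEN, pad=PAD):
    """Return (cont, padded_labels) for one text box.

    cont[0] is 0 so the LSTM hidden state is reset when a new box starts;
    every later step is 1 so the hidden state carries over.
    """
    if len(label_ids) > max_len:
        raise ValueError("label is longer than max_len")
    cont = [0] + [1] * (max_len - 1)
    padded = list(label_ids) + [pad] * (max_len - len(label_ids))
    return cont, padded


# Example with four hypothetical label ids.
cont, padded = build_sequence_targets([20, 5, 24, 20])
```

Whether padded steps should keep a continuation value of 1 or use an ignore label depends on how the loss layer is configured, which this thread does not settle.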
from textspotter.
@crazysal Thanks for your reply. I think you are right, and it helps me a lot.
from textspotter.
@crazysal Could you tell me how to handle the text labels, and what the format of the text label in gt_bbox is?
from textspotter.
@crazysal Have you managed to reproduce the training code? I tried to reproduce the training part based on @tonghe's code, but ran into a segmentation fault.
from textspotter.
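@chunhui999 @crazysal Looking closely at the code, I found that the first 8 elements are the coordinates, the tenth is the label length, and the ninth is not used. I'm not sure whether I've got this wrong; note that element indices in Python start from 0.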
from textspotter.
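@crazysal For the data layer, I modified @argman's EAST Python data layer. I commented out both loss_4s and iou_loss and trained only the softmax loss for text recognition, but for some reason I get a memory overflow. What code did you use for your data layer, and how did you write it? Starting from the code provided by @tonghe, is adding my own data layer and an iou_loss layer enough to train successfully?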
from textspotter.
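@wenston2006 You are right about the index offset; I overlooked it earlier. So, assuming the 9th element is ignored and the others shift forward, is your gt_label format like this? (x1, y1, x2, y2, x3, y3, x4, y4, len, 't', 'e', 'x', 't')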
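For reference, the layout proposed in this comment, (x1, y1, ..., x4, y4, len, chars...), can be sketched in Python as below. The layout itself is unconfirmed by the author, and an earlier comment reads the code differently (ninth element unused, tenth holding the length). The character set `CHARSET`, the 1-based character ids, and the function names are hypothetical and only serve the illustration.

```python
import string

# Hypothetical character set; the real mapping used by textspotter is not
# confirmed anywhere in this thread.
CHARSET = string.ascii_lowercase + string.digits


def pack_gt_row(quad, text):
    """Pack one box as [x1, y1, ..., x4, y4, len, c1, ..., cn]."""
    if len(quad) != 8:
        raise ValueError("quad must hold 8 coordinates")
    # Map each character to a 1-based id, leaving 0 free (an assumption).
    ids = [CHARSET.index(ch) + 1 for ch in text.lower()]
    return [float(v) for v in quad] + [float(len(ids))] + [float(i) for i in ids]


def unpack_gt_row(row):
    """Inverse of pack_gt_row: return (quad, text)."""
    quad = row[:8]
    n = int(row[8])  # 0-based index 8 holds the label length
    ids = row[9:9 + n]
    text = "".join(CHARSET[int(i) - 1] for i in ids)
    return quad, text


row = pack_gt_row([10, 10, 90, 10, 90, 40, 10, 40], "text")
quad, text = unpack_gt_row(row)
```

With 0-based indexing, the coordinates occupy indices 0 to 7, the length sits at index 8, and the characters start at index 9, which is the off-by-one point raised above.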
from textspotter.
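@chunhui999 That is my understanding, but I currently hit a memory overflow (segmentation fault) during training, and it is not yet clear whether the problem lies in the data layer or in another layer.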
from textspotter.
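@wenston2006 I ran into the memory overflow problem as well. It seems to be caused by the input image size: I halved the resize size (following the memory overflow I had hit earlier during testing), and after that training worked.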
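If the resize target is reduced like this, the box coordinates have to be rescaled by the same factors. A minimal sketch, assuming quads stored as (x1, y1, ..., x4, y4) and sizes given as (width, height); the function name `resize_annotations` and the example sizes are hypothetical and not part of the textspotter data layer.

```python
def resize_annotations(quads, orig_size, new_size):
    """Rescale quad coordinates when an image is resized.

    quads: list of [x1, y1, ..., x4, y4]; sizes are (width, height).
    Even positions are x values, odd positions are y values.
    """
    sx = new_size[0] / orig_size[0]
    sy = new_size[1] / orig_size[1]
    return [[v * (sx if i % 2 == 0 else sy) for i, v in enumerate(q)]
            for q in quads]


orig = (1280, 720)
half = (orig[0] // 2, orig[1] // 2)  # halve the resize target
scaled = resize_annotations(
    [[100, 50, 300, 50, 300, 120, 100, 120]], orig, half)
```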
from textspotter.
@wenston2006 Did you manage to train successfully? How were the results?
from textspotter.
Could anyone share a script for converting the SynthText format to the ICDAR format? Thanks!
from textspotter.