opconty / pytorch_ctpn

This is a PyTorch implementation of CTPN (Detecting Text in Natural Image with Connectionist Text Proposal Network). You may want to fine-tune from: https://drive.google.com/open?id=1JHhI4sEIXfs5gDa1I9AgJBY477HTzAd0

Home Page: https://mp.weixin.qq.com/s/VO42GzwwJBOabpPJOWVn4g

Python 100.00%

pytorch_ctpn's Issues

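Recognition quality

How well does your final Android OCR implementation work when it is given a whole waybill directly? On my side, cropping a small region as in your demo works reasonably well, but on a full waybill, with creases, blur, and stains, the results are unusable. Have you solved this kind of problem?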

ctpn_train: changing batch size from 1 to 128 raises an error

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
checkpoints_weight = args['pretrained_weights']
if os.path.exists(checkpoints_weight):
    pretrained = False

dataset = VOCDataset(args['image_dir'], args['labels_dir'])
dataloader = DataLoader(dataset, batch_size=128, shuffle=True, num_workers=args['num_workers'])
model = CTPN_Model()
model.to(device)

Traceback (most recent call last):
  File "ctpn_train.py", line 91, in <module>
    for batch_i, (imgs, clss, regrs) in enumerate(dataloader):
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 232, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 232, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 209, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 562 and 600 in dimension 2 at /pytorch/aten/src/TH/generic/THTensorMoreMath.cpp:1333
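
The traceback above ends in torch.stack inside default_collate, which cannot stack image tensors whose heights differ (562 vs 600). One possible workaround, sketched here under assumptions, is to pad every image in a batch to the largest height and width before stacking, for example inside a custom collate_fn. The helper name padding_to_batch_max and the 800-pixel width are hypothetical. The per-anchor targets (clss, regrs) would also have to be generated for the padded size, which this sketch does not cover.

```python
def padding_to_batch_max(shapes):
    """Given (height, width) pairs, return the common target size and,
    for each image, the (pad_bottom, pad_right) needed to reach it."""
    max_h = max(h for h, _ in shapes)
    max_w = max(w for _, w in shapes)
    pads = [(max_h - h, max_w - w) for h, w in shapes]
    return (max_h, max_w), pads


# Two images with the heights from the error message; the width is made up.
target, pads = padding_to_batch_max([(562, 800), (600, 800)])
```

The resulting amounts could be passed to a padding routine of your choice before stacking; resizing every image to one fixed size in the dataset is another option.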

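Failed to reproduce on VOC2007

I trained for 19 epochs and then tested: a complete failure, not a single line of text was detected...
https://s1.ax1x.com/2020/04/19/JMiPyV.png

The dataset link is dead; could you post it again?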

Converting to ONNX and MNN

Hello, have you tried converting the model to ONNX and MNN? Is that supported?

RuntimeError: index 24910 is out of bounds for dimension 0 with size 7840

shape is: torch.Size([3, 448, 448]) torch.Size([1, 25000]) torch.Size([25000, 3])
Traceback (most recent call last):
  File "ctpn_train.py", line 105, in <module>
    loss_cls = critetion_cls(out_cls, clss)
  File "/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/lustre/chenjinsheng/project/text_detection/pytorch_ctpn/ctpn_model.py", line 54, in forward
    cls_pred = input[0][cls_keep]

RuntimeError: index 24910 is out of bounds for dimension 0 with size 7840

The indices do not match. How can I fix this?
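
The two sizes in this report are consistent with labels generated for a different image size than the tensor fed to the network. A minimal sketch of the arithmetic, assuming a feature stride of 16 and 10 anchors per feature-map position as described in the CTPN paper (the function name is hypothetical, and 800x800 is only one image size that would yield 25000 anchors):

```python
def anchor_count(height, width, stride=16, anchors_per_cell=10):
    """Number of anchors for an image, assuming one feature-map cell
    per `stride` pixels and `anchors_per_cell` anchors at each cell."""
    return (height // stride) * (width // stride) * anchors_per_cell


image_anchors = anchor_count(448, 448)  # 28 * 28 * 10
label_anchors = anchor_count(800, 800)  # 50 * 50 * 10
```

Under these assumptions a 448x448 input yields 7840 predictions, so any label index at or above 7840 is out of range. Generating the labels from the image after it has been resized would keep the two counts equal.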

android ocr

Could you share the Android code?
Thank you!

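VS2019: after five epochs of debugging, ep05_0.0278_0.0990_0.1268 barely decreases any further

I made some changes to the code. I found that with the original code the loss hardly decreases: the dimensions are wrong when computing the loss (the two tensors have different shapes), so convergence stalls from the second epoch. Even after my changes, it will not go below 0.0278_0.0990_0.1268.
May I ask, did you really get it below 0.07?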
ctpn_ep01_0.0411_0.1523_0.1934.pth
ctpn_ep02_0.0309_0.1167_0.1477.pth
ctpn_ep03_0.0282_0.1071_0.1353.pth
ctpn_ep04_0.0279_0.1046_0.1325.pth
ctpn_ep05_0.0278_0.0990_0.1268.pth
These are the results of five epochs (^_^)

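Asking for help

If a document such as an ID card contains some text that I want to recognize and I do not care about the rest of the text, do I only need to annotate the text I am interested in when preparing the training labels?
Another question: the annotation coordinates are based on the original image size, but the images are resized to a uniform size when fed to the network. Is that a problem? Do I need to bring all images to the same size before annotating?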
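
On the resize question, a common approach is to keep the annotations in original-image coordinates and scale them by the same factors as the image at load time. A minimal sketch, assuming boxes are stored as (xmin, ymin, xmax, ymax) and sizes as (width, height); the function name is hypothetical and this is not confirmed to be what the repository does:

```python
def scale_box(box, orig_size, new_size):
    """Scale an (xmin, ymin, xmax, ymax) box from the original image
    size to the resized image size. Sizes are (width, height)."""
    xmin, ymin, xmax, ymax = box
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    sx = new_w / orig_w  # horizontal scale factor
    sy = new_h / orig_h  # vertical scale factor
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


# Halving a 1000x500 image halves every box coordinate.
scaled = scale_box((100, 50, 300, 90), orig_size=(1000, 500), new_size=(500, 250))
```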

number of epochs?

How many epochs was the pre-trained model trained for?

Cannot log in to Dropbox! Can you provide a Baidu Yunpan link?

loss=0.44 on the SROIE dataset

@opconty
I have been training for 1 day on the SROIE dataset, and the loss is still 0.44!
It works well on other datasets, so why not on SROIE?
Am I doing something wrong?
dataset download link

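Is everyone's batch_size 1?

Why is the batch size 1? Changing it to any other value raises an error. What else needs to be modified? Thanks for any guidance.

12 GB of GPU memory ran out during training..

RT. Orz

What should I do when GPU memory runs out?

I am a beginner. The links provided by the author are all dead, so I used ICDAR2013 converted to VOC format as the dataset. What could the problem be, and could you provide the original dataset?

What are the data labels? Are they just the character regions boxed with labelImg?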

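A question about the loss function

Hello, the final loss in the paper is the sum of three different loss functions, but in your code I see only two, RPN_CLS_Loss and RPN_REGR_Loss, added together. Is there something I am misunderstanding?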
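
For reference, the multi-task objective in the CTPN paper has roughly the following form (notation as in the paper, reproduced here as best understood): a text/non-text classification term, a vertical-coordinate regression term, and a side-refinement regression term. The two losses named in the question would correspond to the first two terms; the third belongs to the side-refinement branch.

```latex
L(s_i, v_j, o_k) =
    \frac{1}{N_s} \sum_i L_s^{cl}(s_i, s_i^*)
  + \frac{\lambda_1}{N_v} \sum_j L_v^{re}(v_j, v_j^*)
  + \frac{\lambda_2}{N_k} \sum_k L_o^{re}(o_k, o_k^*)
```

Here $s_i$ is the predicted text score of anchor $i$, $v_j$ the predicted vertical coordinates of anchor $j$, $o_k$ the predicted side offset of anchor $k$, starred symbols are the ground truth, and $\lambda_1$, $\lambda_2$ are weights balancing the tasks.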

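What is the image size of the dataset used? Predict fails, no text boxes

I could not find the original dataset, so I used ICDAR2013. Running this code at the original image size causes OOM, so I resized the images to 640*480. After training I reached this level: ctpn_ep18_0.0710_0.1520_0.2228.pth.tar.
Then, when I run predict on images, no text boxes are output; print(text) in the code prints [ ] (empty?). Did training fail, so that no text boxes are detected? Perhaps it is related to the dataset image size.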

pretrained weights are unavailable

Could you share your trained model again?

the dropbox link is invalid now

Could you update it once again? Many thanks.

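How is the performance?

Hello, how does the PyTorch version of CTPN perform? Combined with CRNN, roughly how long does it take to recognize one image, and is it computed on the GPU or on the CPU?
When I use CRNN, I find the CPU version very slow: for one ID card image, CRNN takes more than ten seconds (32-core E5 v3 CPU), and I do not know why.
In actual use, roughly what FPS can you reach?

O(∩_∩)O Thanks

Could you choose an open-source license to make sharing easier?

Where can I find the dataset?

Has anyone run into this problem? People online say it is caused by out-of-bounds labels; the error is raised inside the loss function.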

`RPN_REGR_Loss Exception: copy_if failed to synchronize: device-side assert triggered
Traceback (most recent call last):
  File "./PyTorch/ctpn_/utils/loss/loss_utils.py", line 31, in forward
    reg_keep = (cls == 1).nonzero()[:, 0]
RuntimeError: copy_if failed to synchronize: device-side assert triggered

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./PyTorch/ctpn_/train.py", line 98, in <module>
    train()
  File "./PyTorch/ctpn_/train.py", line 75, in train
    loss_reg = criterion_reg(out_reg, reg_s)

loss = torch.tensor(0.0)

RuntimeError: CUDA error: device-side assert triggered`

ctpn_train.py: error: argument --num-workers: invalid int value: 'num_workers'
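
This message is what argparse prints when an option declared with type=int receives a value that cannot be parsed as an integer, here the literal string 'num_workers'. A minimal sketch of the behavior, assuming the option is declared roughly as below (the actual parser in ctpn_train.py is not shown in the report, and the default of 0 is hypothetical):

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the training script's argument parser.
    parser = argparse.ArgumentParser(prog="ctpn_train.py")
    parser.add_argument("--num-workers", type=int, default=0,
                        help="number of DataLoader worker processes")
    return parser


# An integer value parses fine.
args = build_parser().parse_args(["--num-workers", "4"])
num_workers = args.num_workers

# A placeholder string is rejected: argparse prints the error to stderr
# and exits by raising SystemExit.
try:
    build_parser().parse_args(["--num-workers", "num_workers"])
    rejected = False
except SystemExit:
    rejected = True
```

Passing an actual number on the command line, for example --num-workers 4, avoids the error.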

training-data sample

@opconty
What type of labels are required for the training-data?
Can you provide a sample of the training data?

Waiting for your reply

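Could you provide the trained weight file?

Why is there no side-refinement branch?

Dataset

Hello, the dataset link is dead. Could you please post it again?

Hello author! Could you provide the trained weights? Looking forward to your reply~

Hello author! Could you provide the trained weights? Looking forward to your reply! If possible, please send the trained weights to [email protected]. Thank you.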

Reason the loss does not fall

Thank you for sharing your project!
There is no problem with the code logic. The loss may fail to decrease because the positive and negative samples are not balanced; it is unclear whether the author adjusts the ratio of positive to negative samples.
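
To illustrate the balancing idea raised above: the CTPN paper describes sampling a fixed-size mini-batch of anchors per image with a target positive fraction, filling the remainder with negatives. A minimal sketch of such sampling, assuming labels of 1 (text), 0 (background), and -1 (ignored); the function name and defaults are illustrative, and whether the repository does this is not confirmed:

```python
import random


def sample_balanced(labels, batch_size=128, pos_fraction=0.5, seed=None):
    """Pick up to batch_size anchors: at most pos_fraction of them
    positive, the rest negative. Ignored anchors (-1) are never picked."""
    rng = random.Random(seed)
    pos = [i for i, v in enumerate(labels) if v == 1]
    neg = [i for i, v in enumerate(labels) if v == 0]
    n_pos = min(len(pos), int(batch_size * pos_fraction))
    # Negatives fill whatever the positives leave free.
    n_neg = min(len(neg), batch_size - n_pos)
    return rng.sample(pos, n_pos), rng.sample(neg, n_neg)


# 20 positives, 500 negatives, 30 ignored anchors.
labels = [1] * 20 + [0] * 500 + [-1] * 30
pos_idx, neg_idx = sample_balanced(labels, seed=0)
```

Only the sampled indices would then contribute to the classification loss, so the many background anchors cannot dominate it.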

Can you upload a requirements.txt, which you can create using the command pip freeze?
Also an environment.yml, which you can create using conda env export > environment.yml

x1 = x.permute(0,2,3,1).contiguous() # channels last

Question about gt processing

When processing the gt, doesn't it need to be split? That is, cut into small boxes of width 16 that serve as the gt.
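
The splitting described in this question can be sketched as follows: a ground-truth text box is cut into fixed-width vertical strips aligned to a 16-pixel grid. This assumes integer pixel coordinates and half-open strips [x, x + 16); the function name is hypothetical, and whether the repository splits boxes this way or matches anchors against whole boxes is not confirmed here.

```python
def split_gt_box(xmin, ymin, xmax, ymax, stride=16):
    """Cut a box into vertical strips of width `stride`, aligned to
    multiples of `stride`, that together cover [xmin, xmax)."""
    strips = []
    x = (xmin // stride) * stride  # snap the left edge down to the grid
    while x < xmax:
        strips.append((x, ymin, x + stride, ymax))
        x += stride
    return strips


# A box spanning x = 35..100 is covered by five grid-aligned strips.
strips = split_gt_box(35, 10, 100, 40)
```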
