pytorch_ctpn's Issues

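Recognition quality

How well does your final Android OCR implementation perform when it is given an entire waybill directly? On my side, cropping a small region as in your demo works reasonably well, but on a full waybill with creases, blur, and stains the results are unusable. Have you found a solution to this?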

ctpn_train: changing the batch size from 1 to 128 raises an error

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
checkpoints_weight = args['pretrained_weights']
if os.path.exists(checkpoints_weight):
    pretrained = False

dataset = VOCDataset(args['image_dir'], args['labels_dir'])
# batch_size changed from 1 to 128
dataloader = DataLoader(dataset, batch_size=128, shuffle=True, num_workers=args['num_workers'])
model = CTPN_Model()
model.to(device)

Traceback (most recent call last):
  File "ctpn_train.py", line 91, in <module>
    for batch_i, (imgs, clss, regrs) in enumerate(dataloader):
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 232, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 232, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/zhanglijun/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 209, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 562 and 600 in dimension 2 at /pytorch/aten/src/TH/generic/THTensorMoreMath.cpp:1333
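
The final RuntimeError comes from the default collate function trying to stack tensors whose sizes differ (562 vs. 600), which cannot happen at batch size 1 but can at any larger batch size. One option is to resize every image to a fixed size in the dataset and compute its targets afterwards; another is to batch only samples that share a shape. Below is a minimal sketch of the second option in plain Python. `bucket_batches` and the `shapes` list are hypothetical names, not part of this repo, and the dataset would have to expose each sample's shape for this to work.

```python
import random
from collections import defaultdict


def bucket_batches(shapes, batch_size, seed=0):
    """Group sample indices so that every batch contains a single image shape.

    shapes: list of (height, width) tuples, one per dataset sample
            (hypothetical input, not provided by the repo).
    Returns a list of index lists, each at most batch_size long.
    """
    buckets = defaultdict(list)
    for index, shape in enumerate(shapes):
        buckets[shape].append(index)

    rng = random.Random(seed)
    batches = []
    for indices in buckets.values():
        rng.shuffle(indices)  # shuffle samples within a bucket
        for start in range(0, len(indices), batch_size):
            batches.append(indices[start:start + batch_size])
    rng.shuffle(batches)  # shuffle the order of the batches
    return batches


if __name__ == "__main__":
    # Illustrative shapes only; real values would come from the images.
    shapes = [(562, 800), (600, 800), (562, 800), (600, 800), (600, 800)]
    for batch in bucket_batches(shapes, batch_size=2):
        print(batch, [shapes[i] for i in batch])
```

The returned index lists could be passed to `DataLoader` through its `batch_sampler` argument, in which case `batch_size` and `shuffle` are left at their defaults.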

Converting to ONNX and MNN

Hello, have you tried converting the model to ONNX and MNN? Is that supported?

RuntimeError: index 24910 is out of bounds for dimension 0 with size 7840

shape is: torch.Size([3, 448, 448]) torch.Size([1, 25000]) torch.Size([25000, 3])
Traceback (most recent call last):
  File "ctpn_train.py", line 105, in <module>
    loss_cls = critetion_cls(out_cls, clss)
  File "/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/lustre/chenjinsheng/project/text_detection/pytorch_ctpn/ctpn_model.py", line 54, in forward
    cls_pred = input[0][cls_keep]
RuntimeError: index 24910 is out of bounds for dimension 0 with size 7840

The indices do not match. How can I fix this?
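
The printed shapes suggest that the labels (25000 entries) were generated for a different image size than the 448x448 tensor that produced 7840 predictions. A quick arithmetic check, assuming a feature stride of 16 and 10 anchors per feature-map location as in the CTPN paper (both are assumptions about this repo's configuration, and `num_anchors` is a hypothetical helper):

```python
def num_anchors(height, width, stride=16, anchors_per_location=10):
    """Number of anchors for an image of the given size.

    stride and anchors_per_location are assumed values (CTPN paper defaults),
    not read from the repo.
    """
    feat_h = height // stride
    feat_w = width // stride
    return feat_h * feat_w * anchors_per_location


if __name__ == "__main__":
    print(num_anchors(448, 448))  # 28 * 28 * 10 = 7840
    print(num_anchors(800, 800))  # 50 * 50 * 10 = 25000
```

Under these assumptions, 7840 matches a 448x448 input, while 25000 matches a 50x50 feature map (for example an 800x800 image, though other sizes give the same count). If that is what happened, computing the targets after resizing, with the same size the network sees, should make the two counts agree.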

Android OCR

Could you share the Android code?
Thank you!

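Debugged in VS2019 for five epochs; the loss hardly decreases after ep05_0.0278_0.0990_0.1268

I made some changes to the code. With the original code the loss barely decreases: the dimensions of the two tensors used in the loss computation do not match, so convergence stalls from the second epoch onward. I made a few changes of my own, but the loss will not go below 0.0278_0.0990_0.1268.
May I ask, did you really get it below 0.07?
ctpn_ep01_0.0411_0.1523_0.1934.pth
ctpn_ep02_0.0309_0.1167_0.1477.pth
ctpn_ep03_0.0282_0.1071_0.1353.pth
ctpn_ep04_0.0279_0.1046_0.1325.pth
ctpn_ep05_0.0278_0.0990_0.1268.pth
These are the results of the five epochs (^_^)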

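Help needed

If a document such as an ID card contains some text that I want to recognize and I do not care about the rest, do I only need to annotate the text I am interested in when preparing the training labels?
Another question: the annotation coordinates are based on the original image size, but images are resized to a uniform size before they are fed to the network. Does that cause a problem? Do I need to bring all images to the same size before annotating?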
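
On the resize question: annotations made on the original image remain usable as long as the box coordinates are scaled by the same factors as the image. A minimal sketch, assuming boxes are stored as `(xmin, ymin, xmax, ymax)` in pixels; `scale_boxes` is a hypothetical helper, not a function from this repo:

```python
def scale_boxes(boxes, orig_size, new_size):
    """Rescale boxes annotated on the original image to the resized image.

    boxes:     list of (xmin, ymin, xmax, ymax) in original pixel coordinates
    orig_size: (width, height) of the original image
    new_size:  (width, height) after resizing
    """
    sx = new_size[0] / orig_size[0]  # horizontal scale factor
    sy = new_size[1] / orig_size[1]  # vertical scale factor
    return [(xmin * sx, ymin * sy, xmax * sx, ymax * sy)
            for xmin, ymin, xmax, ymax in boxes]


if __name__ == "__main__":
    # A 1000x500 image resized to 500x250 halves every coordinate.
    print(scale_boxes([(100, 50, 300, 150)], (1000, 500), (500, 250)))
```

With this approach the images do not need to share a size at annotation time; the scaling is applied when each sample is loaded.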

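What should I do when GPU memory runs out?

I am a beginner. The links the author provided are all dead, so for the dataset I converted ICDAR2013 to VOC format. What could be causing the problem, and could you still provide the original dataset?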

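Question about the loss function

Hello, in the paper the final loss is the sum of three different loss functions, but in your code I see only two, RPN_CLS_Loss and RPN_REGR_Loss, added together. Is there something I am misunderstanding?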
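
For reference, the three-term objective of the CTPN paper (Tian et al., 2016) can be written as below, with symbols following the paper: text/non-text classification, vertical coordinate regression, and side-refinement regression. The names RPN_CLS_Loss and RPN_REGR_Loss suggest the first two terms, which would leave the side-refinement term as the one not summed here; that mapping is an inference from the names, not something confirmed in this thread.

```latex
L(s_i, v_j, o_k) =
    \frac{1}{N_s} \sum_i L_s^{cl}(s_i, s_i^{*})
  + \frac{\lambda_1}{N_v} \sum_j L_v^{re}(v_j, v_j^{*})
  + \frac{\lambda_2}{N_o} \sum_k L_o^{re}(o_k, o_k^{*})
```

Here $s_i$ is the predicted text score of anchor $i$, $v_j$ the predicted vertical coordinates of anchor $j$, $o_k$ the predicted side-refinement offset of anchor $k$, starred symbols are the ground truth, $N_s$, $N_v$, $N_o$ are the numbers of anchors used by each term, and $\lambda_1$, $\lambda_2$ are balancing weights.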

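What image size does the dataset use? Predict fails with no text boxes

I could not find the original dataset, so I used ICDAR2013. Running this code at the original image size causes OOM, so I resized the images to 640*480. After training I reached this level: ctpn_ep18_0.0710_0.1520_0.2228.pth.tar.
When I then run predict on an image, no text boxes are output; print(text) in the code outputs [ ] (empty?). Did training fail, so that no text boxes are detected? Perhaps it is related to the image size of the dataset.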

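How is the performance?

Hello, how does the PyTorch version of CTPN perform? Combined with CRNN, roughly how long does it take to process one image, and is the computation done on GPU or CPU?
When I used CRNN, I found the CPU version very slow: for a single ID card image, CRNN takes more than ten seconds (on a 32-core E5 v3 CPU), and I do not know why.
In actual use, roughly what FPS can you reach?

O(∩_∩)O Thanks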

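Has anyone run into this problem? According to what I found online, it is caused by out-of-range labels; the error is raised inside the loss function.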

`RPN_REGR_Loss Exception: copy_if failed to synchronize: device-side assert triggered
Traceback (most recent call last):
  File "./PyTorch/ctpn_/utils/loss/loss_utils.py", line 31, in forward
    reg_keep = (cls == 1).nonzero()[:, 0]
RuntimeError: copy_if failed to synchronize: device-side assert triggered

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./PyTorch/ctpn_/train.py", line 98, in <module>
    train()
  File "./PyTorch/ctpn_/train.py", line 75, in train
    loss_reg = criterion_reg(out_reg, reg_s)

loss = torch.tensor(0.0)

RuntimeError: CUDA error: device-side assert triggered`
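
A device-side assert is commonly triggered by an index or class label outside the expected range, and because CUDA kernels run asynchronously the line in the traceback is not necessarily where the fault occurred. Running the same batch on CPU, or setting the environment variable `CUDA_LAUNCH_BLOCKING=1`, usually gives a more precise error location. The sketch below checks label values before they reach the loss. The valid set (-1 ignored, 0 negative, 1 positive) is an assumption about the label convention, and `find_invalid_labels` is a hypothetical helper.

```python
def find_invalid_labels(labels, valid_values=(-1, 0, 1)):
    """Return (position, value) pairs for labels outside valid_values.

    valid_values is an assumed convention: -1 ignored, 0 negative, 1 positive.
    """
    allowed = set(valid_values)
    return [(i, v) for i, v in enumerate(labels) if v not in allowed]


if __name__ == "__main__":
    # Positions 3 and 5 hold values that a classification loss could not index.
    print(find_invalid_labels([0, 1, -1, 2, 1, 255]))
```

For a label tensor, something like `find_invalid_labels(cls.flatten().tolist())` on a CPU copy would apply the same check.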

training-data sample

@opconty
What type of labels are required for the training-data?
Can you provide a sample of the training data?

Waiting for your reply

Dataset

Hello, the dataset link no longer works. Could you please share it again?

Reason the loss does not fall

Thank you for sharing your project!
There is no problem with the code logic. The loss may fail to decrease because the positive and negative samples are imbalanced; it is unclear whether the author adjusts the ratio of positive to negative samples.
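
One common way to limit this imbalance, used for RPN-style training, is to cap the number of positive anchors per image, fill the remaining slots with randomly chosen negatives, and mark the surplus as ignored. The sketch below illustrates that idea on a plain list of labels. The function name, the `total` and `pos_fraction` defaults, and the -1/0/1 label convention are assumptions for illustration, not values taken from this repo.

```python
import random


def balance_labels(labels, total=256, pos_fraction=0.5, seed=None):
    """Cap positives and negatives; surplus anchors are set to -1 (ignored).

    labels: list with 1 (positive), 0 (negative), -1 (ignored).
    Returns a new list; the input is not modified.
    """
    rng = random.Random(seed)
    labels = list(labels)
    pos = [i for i, v in enumerate(labels) if v == 1]
    neg = [i for i, v in enumerate(labels) if v == 0]

    # Keep at most total * pos_fraction positives.
    max_pos = int(total * pos_fraction)
    if len(pos) > max_pos:
        for i in rng.sample(pos, len(pos) - max_pos):
            labels[i] = -1
    num_pos = min(len(pos), max_pos)

    # Negatives fill whatever remains of the budget.
    max_neg = total - num_pos
    if len(neg) > max_neg:
        for i in rng.sample(neg, len(neg) - max_neg):
            labels[i] = -1
    return labels


if __name__ == "__main__":
    labels = [1] * 10 + [0] * 1000 + [-1] * 5
    out = balance_labels(labels, total=64, seed=0)
    print(out.count(1), out.count(0), out.count(-1))
```

Note that when positives are scarce the result is still skewed toward negatives; the cap only bounds how many negatives contribute to the loss.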

Pretrained weights

How can I obtain the pretrained weights now? And how can I obtain the dataset?

Dataset

Hello, which dataset did you use for the Android waybill OCR project? Could you share it?

requirements.txt & environment.yml

@opconty

  • Can you upload requirements.txt, which you can create with pip freeze > requirements.txt?
  • Also environment.yml, which you can create with conda env export > environment.yml

Please push your requirements.txt

Please push requirements.txt to the repo so that others can follow your project.
Or at least show me the output of pip freeze.

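Question about GT processing

When processing the ground truth (GT), doesn't it need to be split, that is, cut into small boxes of width 16 that serve as the GT?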
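
To make the question concrete, here is a minimal sketch of what splitting a horizontal ground-truth box into 16-pixel-wide strips could look like. It assumes axis-aligned boxes in integer pixel coordinates and strips aligned to a grid that matches a feature stride of 16; `split_box` is a hypothetical helper and this is not necessarily how the repo prepares its labels.

```python
def split_box(xmin, ymin, xmax, ymax, stride=16):
    """Split a horizontal text box into strips aligned to a stride-wide grid.

    The first and last strips are clipped to the box, so they may be
    narrower than stride.
    """
    strips = []
    start = (xmin // stride) * stride  # left edge of the first grid column touched
    while start < xmax:
        left = max(start, xmin)
        right = min(start + stride, xmax)
        strips.append((left, ymin, right, ymax))
        start += stride
    return strips


if __name__ == "__main__":
    for strip in split_box(20, 10, 70, 40):
        print(strip)
```

Some implementations keep every strip at the full grid width instead of clipping the two ends; which variant is appropriate depends on how the anchors are matched to the ground truth.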
