chenjun2hao / attention_ocr.pytorch

This repository implements an encoder-decoder model with attention for OCR

Python 100.00%

attention-model ocr pytorch attentionocr

attention_ocr.pytorch's Issues

Decoder weight loading is written to load the encoder instead

if opt.decoder:
    print('loading pretrained encoder model from %s' % opt.decoder)
    encoder.load_state_dict(torch.load(opt.encoder))

The code above is supposed to load the decoder, but it actually loads the encoder, so every result in later testing comes out wrong.

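Variable-length recognition problems

Hello, I ran variable-length tests with the open-source model you provided and hit two problems:
1. Variable image width:
transformer = dataset.resizeNormalize((280, 32)) raises an error for any width other than 280. CRNN instead fixes the height at 32 and scales the width proportionally, so its input is (x, 32).
2. Variable text length:
Possibly because every training sample has 10 characters, the prediction comes out at roughly 10 characters no matter how many characters the image contains?

For example, after removing a few characters from the image and still feeding it in at 280*32, the results look like this:
predict_str:，__不愿意意意资 (9 characters) => prob:0.002346405293792486

predict_str:**通信信位主办、《 (10 characters) => prob:0.05960559844970703

predict_str:，（通信学会主主府 (9 characters) => prob:0.000349084148183465

predict_str:叶国通信学会主里”《 (10 characters) => prob:0.01799328438937664

How can this be solved? Does training need to use variable-length samples? Thanks~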
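The proportional scaling described in point 1 can be sketched as follows. This is only an illustration under stated assumptions: `scaled_size` is a hypothetical helper, not a function from this repository, and whether the released encoder accepts widths other than 280 is not confirmed here.

```python
def scaled_size(width, height, target_height=32, min_width=1):
    """Return (new_width, target_height), preserving the aspect ratio.

    Hypothetical helper: the height is fixed and the width is scaled
    by the same factor, as in the CRNN-style preprocessing described above.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    new_width = int(round(width * target_height / float(height)))
    # Guard against very narrow crops collapsing to zero width.
    return max(new_width, min_width), target_height


if __name__ == "__main__":
    # A 560x64 crop scaled down to height 32.
    print(scaled_size(560, 64))
```

The resulting size could then be passed to the resize step in place of the fixed (280, 32), provided the rest of the model tolerates a variable width.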

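Hello, a question: why is decoder_hidden = decoder.initHidden(b).cuda() reset every time during attention decoder training?

Hello, I'd like to ask a question: why is decoder_hidden = decoder.initHidden(b).cuda() reset every time during attention decoder training? My understanding is that the encoder output should provide a decoder_hidden value. Could you explain? @chenjun2hao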

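Can attention be trained on multi-line text?

1. With a fixed 2 lines, 4 characters on the first line and 7 on the second, can attention be trained without splitting the lines?

2. I tried CTC and it did not work.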

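What exactly does text_length in Class Attention() refer to?

Dictionary file char_std_5990.txt not found

Hello, I wanted to run a test, but I get a message that the dictionary file char_std_5990.txt cannot be found.

Image location and decoder_5.pth location

What exactly does text_length in Class Attention() refer to?

A question about the loss function: isn't CRNN's loss function CTC loss? Why does your code use NULLloss (presumably NLLLoss)? I'm a beginner and don't quite understand this; I'd appreciate an explanation, thanks~

As in the title.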

Error: KeyError:' '

Hello, when I run your code I get KeyError:' '. What is causing this?
if isinstance(text, str):
    text = [self.dict[item] for item in text]
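One possible cause, assuming `self.dict` is built from the alphabet file, is that a label contains a character (here a space) that the alphabet does not include. A minimal sketch for listing such characters before training; `find_unknown_chars` and the sample alphabet are hypothetical, not part of this repository:

```python
def find_unknown_chars(labels, alphabet):
    """Return the sorted characters used in labels but absent from alphabet."""
    known = set(alphabet)
    unknown = set()
    for label in labels:
        unknown.update(ch for ch in label if ch not in known)
    return sorted(unknown)


if __name__ == "__main__":
    alphabet = "abcdefghijklmnopqrstuvwxyz"  # illustrative alphabet only
    labels = ["hello world", "ocr"]
    # The space in "hello world" is not in the alphabet.
    print(find_unknown_chars(labels, alphabet))
```

Any character reported this way would need to be added to the alphabet or removed from the labels.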

Increasing batch_size of validation set throws tensor size mismatch error

GO and END_TOKEN?

Does training here add GO (START_TOKEN) and END_TOKEN? I only see them added in target_txt_decode in crnn.lang, but that function is never called.
data = val_iter.next()
cpu_images, cpu_texts = data
...
target_variable = converter.encode(cpu_texts)
target_variable = target_variable.cuda()
decoder_input = target_variable[0].cuda()
Here, during validation (no teacher forcing), decoder_input should be a GO (START_TOKEN), but it looks like it takes the first character of cpu_texts instead?
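For reference, a common way to add start and end markers is sketched below. The indices 0 and 1, and the names `GO_TOKEN`, `EOS_TOKEN`, `build_char_index` and `encode_with_markers`, are assumptions made for illustration; they are not taken from the repository's converter.

```python
GO_TOKEN = 0    # assumed index of the start-of-sequence symbol
EOS_TOKEN = 1   # assumed index of the end-of-sequence symbol
NUM_SPECIAL = 2  # real characters start after the special symbols


def build_char_index(alphabet):
    """Map each character to an index that does not collide with GO/EOS."""
    return {ch: i + NUM_SPECIAL for i, ch in enumerate(alphabet)}


def encode_with_markers(text, char_index):
    """Return [GO, c1, ..., cn, EOS] as a list of integer indices."""
    return [GO_TOKEN] + [char_index[ch] for ch in text] + [EOS_TOKEN]


if __name__ == "__main__":
    char_index = build_char_index("abc")
    encoded = encode_with_markers("cab", char_index)
    print(encoded)
    # With this layout, the first decoder input is the GO symbol,
    # not the first character of the label.
    print(encoded[0] == GO_TOKEN)
```

If the encoder of labels already prepends such a symbol, then taking element 0 of the encoded target as the first decoder input yields GO rather than the first character.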

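Hello, thank you very much for your code. I am using it as a reference to understand Attention-OCR, but there are some things I don't understand.

I'd like to know what "teacher forcing: use the target label as the next input" is doing.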
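In general, teacher forcing means that during training the decoder's next input is the ground-truth label, rather than the decoder's own previous prediction. A toy sketch of the difference; `decode` and `toy_step` are hypothetical stand-ins and not functions from this repository:

```python
def decode(step_fn, target, go_token, teacher_forcing):
    """Run one decoding pass and return (inputs, predictions)."""
    inputs, predictions = [], []
    decoder_input = go_token
    for t in range(len(target)):
        inputs.append(decoder_input)
        predicted = step_fn(decoder_input)
        predictions.append(predicted)
        # Teacher forcing feeds the ground-truth token as the next input;
        # otherwise the model's own prediction is fed back.
        decoder_input = target[t] if teacher_forcing else predicted
    return inputs, predictions


def toy_step(prev_token):
    # Stand-in for a real decoder step: a deliberately wrong "model"
    # that always predicts prev_token + 2.
    return prev_token + 2


if __name__ == "__main__":
    target = [1, 2, 3]
    print(decode(toy_step, target, go_token=0, teacher_forcing=True))
    print(decode(toy_step, target, go_token=0, teacher_forcing=False))
```

With teacher forcing the inputs stay on the ground-truth sequence even though the toy model is wrong at every step; without it, each error is fed back and compounds.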

The decoder predicts one character at a time; isn't that slow?

python3 change

Many thanks for sharing the code!

It looks like this code was developed with Python 2.7.

To run experiments with Python 3.x, I had to change some parts that deal with unicode and UTF-8.

Here is what I did.
dataset.py:
label = line_splits[1]#.decode('utf-8')
utils.py (line 53):
if isinstance(text, str):  # Python 3 strings are Unicode by default; was: unicode):

ref: https://stackoverflow.com/questions/4987327/how-do-i-check-if-a-string-is-unicode-or-ascii

Thanks again for sharing the code. It is very helpful for studying DNNs.
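If the same source needs to run on both Python 2 and Python 3, one option is a small compatibility shim instead of editing each call site. A minimal sketch; `text_type` and `to_text` are hypothetical names, not part of this repository:

```python
import sys

if sys.version_info[0] >= 3:
    text_type = str        # Python 3: str is already Unicode
else:
    text_type = unicode    # noqa: F821 (name exists on Python 2 only)


def to_text(value, encoding="utf-8"):
    """Return value as Unicode text, decoding byte strings if needed."""
    if isinstance(value, bytes) and not isinstance(value, text_type):
        return value.decode(encoding)
    return value


if __name__ == "__main__":
    print(isinstance(to_text(b"abc"), text_type))
    print(isinstance(to_text(u"abc"), text_type))
```

With this shim, a check such as `isinstance(text, text_type)` behaves the same way on both interpreters.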

load decoder_path error

Hi, thanks for your excellent work. I get this error:
RuntimeError: Error(s) in loading state_dict for decoder:
size mismatch for decoder.embedding.weight: copying a param of torch.Size([17765, 256]) from checkpoint, where the shape is torch.Size([5992, 256]) in current model.
size mismatch for decoder.out.bias: copying a param of torch.Size([17765]) from checkpoint, where the shape is torch.Size([5992]) in current model.
size mismatch for decoder.out.weight: copying a param of torch.Size([17765, 256]) from checkpoint, where the shape is torch.Size([5992, 256]) in current model.

It looks like the model you list for inference has a size mismatch. How can I fix it?
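The two vocabulary sizes in the message (17765 and 5992) suggest that the checkpoint and the current model were built with different alphabets. A minimal sketch for listing every mismatched parameter at once; `find_shape_mismatches` is a hypothetical helper, and the shapes below are copied from the error message, read literally:

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for differing shapes.

    Both arguments map parameter names to shape tuples. Names missing
    from the model are ignored here.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


if __name__ == "__main__":
    checkpoint_shapes = {
        "decoder.embedding.weight": (17765, 256),
        "decoder.out.bias": (17765,),
        "decoder.out.weight": (17765, 256),
    }
    model_shapes = {
        "decoder.embedding.weight": (5992, 256),
        "decoder.out.bias": (5992,),
        "decoder.out.weight": (5992, 256),
    }
    for name, shapes in sorted(find_shape_mismatches(checkpoint_shapes, model_shapes).items()):
        print(name, shapes)
```

With PyTorch, such shape dictionaries could be built from a state dict with `{k: tuple(v.shape) for k, v in state_dict.items()}`. The likely remedy is to construct the decoder with the same alphabet that was used to train the checkpoint.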

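Model accuracy problem

Hello, I have a question. I re-ran your code, but after 21 epochs the recognition accuracy is still very low and nowhere near the results you report. What do you think could be causing this?

Hyperparameter settings

Are the author's hyperparameter settings the program defaults, and roughly how many epochs does training take for the model to converge?

Training error AttributeError: 'str' object has no attribute 'decode'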

AttributeError Traceback (most recent call last)
~/Attention_ocr.pytorch-master/train.py in
9 import numpy as np
10 import os
---> 11 import src.utils as utils
12 import src.dataset as dataset
13 import time

~/Attention_ocr.pytorch-master/src/utils.py in
16 data = f.readlines()
17 alphabet = [x.rstrip() for x in data]
---> 18 alphabet = ''.join(alphabet).decode('utf-8') # on Python 2, the text is garbled without decode
19
20
The error is raised when decode is called.
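On Python 3, `str` has no `decode` method because it is already Unicode. A minimal sketch of one way to read the alphabet so that it works on both Python 2 and Python 3; `load_alphabet` is a hypothetical helper, not the repository's code:

```python
import io


def load_alphabet(path):
    """Read one entry per line and join them into a single Unicode string."""
    # io.open decodes the file while reading, so no .decode() call is needed.
    with io.open(path, "r", encoding="utf-8") as f:
        return u"".join(line.rstrip() for line in f)
```

Replacing the `.decode('utf-8')` call in src/utils.py with a read of this kind would avoid the AttributeError, assuming the alphabet file is UTF-8 encoded.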

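The first few losses are normal, so why are the later ones abnormal, going from 5-something to 0.1?

It feels like only the first few images are used for training: the loss drops very quickly, and then the test output is all empty.

Are two optimizers used for the encoder and decoder to get better convergence?

AssertionError: Torch not compiled with CUDA enabled. Does this program have to run on a GPU?

Hello author, my computer only has a CPU, with no GPU and no CUDA installed, running Ubuntu. Can this program run normally? I tried several times and it failed each time. The cause seems to be that CUDA is required, but I have no GPU. Is there any way to get the program to run?

Could the author share the pretrained model?

When will the pretrained model be released?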

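Variable-length test images

Hello,
I am not yet able to run your program.
I'd like to ask first: can this model recognize longer text-line images?
I looked at the demo program, which sets a maximum character count of 15. Is this value fixed?
Thanks.

Can inference be changed to batch prediction?

Currently inference can only predict one image at a time, which is rather slow.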

'unexpected key "cnn.7.num_batches_tracked" in state_dict'

I get an error when running demo.py?
loading pretrained models ......
Traceback (most recent call last):
File "demo.py", line 34, in
encoder.load_state_dict(torch.load(encoder_path))
File "/usr/lib64/python2.7/site-packages/torch/nn/modules/module.py", line 522, in load_state_dict
.format(name))
KeyError: 'unexpected key "cnn.7.num_batches_tracked" in state_dict'
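This error usually means the checkpoint was saved with a newer PyTorch than the one loading it: as far as I know, batch-norm layers gained a `num_batches_tracked` buffer in PyTorch 0.4.1, and older versions reject the extra key. Besides upgrading PyTorch, one workaround is to drop those keys before loading. A minimal sketch; `drop_unexpected_keys` is a hypothetical helper, not part of this repository:

```python
from collections import OrderedDict


def drop_unexpected_keys(state_dict, suffix="num_batches_tracked"):
    """Return a copy of state_dict without keys ending in suffix."""
    return OrderedDict(
        (key, value) for key, value in state_dict.items()
        if not key.endswith(suffix)
    )


if __name__ == "__main__":
    # Plain integers stand in for tensors in this illustration.
    state = OrderedDict([
        ("cnn.7.weight", 1),
        ("cnn.7.bias", 2),
        ("cnn.7.num_batches_tracked", 3),
    ])
    print(list(drop_unexpected_keys(state).keys()))
```

The filtered dictionary would then be passed to `encoder.load_state_dict(...)` in place of the raw result of `torch.load(encoder_path)`.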
