shawnh2 / bankcard-recognizer

Identifies the numbers on bank cards, based on deep learning with Keras

License: MIT License

Python 100.00%

keras deep-learning ocr cnn-blstm-ctc pyqt5 east crnn vgg

bankcard-recognizer's Introduction

BankCard-Recognizer

Extracts the card number from bank card images, based on deep learning with Keras.

Includes automatic and manual location of the number region, number identification, and a GUI.

Chinese blog: click the link here

Roadmap

cnn_blstm_ctc
EAST/manual locate
GUI

Requirements

Python == 3.6

pip install -r requirements.txt

Environment

Windows10 x64, Anaconda, PyCharm 2018.3, NVIDIA GTX 1050.

Usage

Download the trained models (CRNN extraction code: 6eqw; EAST extraction code: qiw5).
Then put the CRNN model into crnn/model and the EAST model into east/model.
Run python demo.py.
In the GUI, press the Load button to load a bank card image of your own or one from dataset/test/.
Press the Identify button to start locating and identifying the card number.
To locate the number manually, double-click the image view, draw the area of interest, and press Identify.

Training

Prepare

Download my dataset (CRNN extraction code: 1jax; EAST extraction code: pqba) and unzip it into ./dataset.

The structure of dataset looks like:

- dataset
  - /card          # for east
  - /card_nbr      # for crnn
  - /test
  ...

for CRNN

Run python crnn/preprocess.py.
Run python crnn/run.py to train. You can change the training parameters in crnn/cfg.py.

for EAST

My dataset was collected from the Internet (Baidu, Google), with thanks to Kesci. It has been labeled in ICDAR 2015 format; you can see the labels in dataset/card/txt/. This small dataset cannot cover every situation, so a richer one may perform better.
If you would like to add more data, make sure it has been labeled, or use dataset/tagger.py to label it.
Modify east/cfg.py; see the default values there.
Run python east/preprocess.py. If the process goes well, you'll see generated data like this:

Finally, run python east/run.py.
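To make the ICDAR 2015 label format mentioned in the steps above concrete, here is a minimal parsing sketch. In that format each line holds the four corner points of a quadrilateral followed by a transcription: x1,y1,x2,y2,x3,y3,x4,y4,text. The function name and the sample line are made up for illustration, and the files in dataset/card/txt/ may deviate from this layout, so treat it as a sketch rather than the repo's own loader.

```python
from typing import List, Tuple

Quad = List[Tuple[int, int]]


def parse_icdar2015_line(line: str) -> Tuple[Quad, str]:
    """Parse one 'x1,y1,x2,y2,x3,y3,x4,y4,text' annotation line.

    Hypothetical helper, not part of the repository.
    """
    # Annotation files sometimes start with a UTF-8 byte-order mark; drop it.
    fields = line.strip().lstrip("\ufeff").split(",")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 coordinates: {line!r}")
    # Coordinates may be written as floats by some labeling tools.
    coords = [int(float(value)) for value in fields[:8]]
    quad = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    # The transcription may itself contain commas, so re-join the remainder.
    text = ",".join(fields[8:])
    return quad, text


if __name__ == "__main__":
    # Made-up sample line, not taken from the dataset.
    print(parse_icdar2015_line("38,43,920,43,920,215,38,215,###"))
```

Checking a few of your own label files with a parser like this before running the preprocessing step can catch malformed lines early.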

About

cnn_blstm_ctc

The model I used follows CNN_RNN_CTC. The CNN part uses VGG, with a BLSTM as the RNN and CTC loss.
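For intuition about how a CTC-trained network's per-frame output becomes a digit string, here is a minimal sketch of greedy (best-path) CTC decoding: take the highest-scoring class in each frame, merge consecutive repeats, then drop the blank. It is not taken from this repo; the function name, the toy scores, and the blank index are assumptions for illustration.

```python
from typing import List, Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], blank: int) -> List[int]:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    decoded: List[int] = []
    previous = blank
    for scores in frame_scores:
        # Index of the highest-scoring class in this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the class changes and is not the blank symbol.
        if best != previous and best != blank:
            decoded.append(best)
        previous = best
    return decoded


if __name__ == "__main__":
    # Toy example with classes 0 and 1 plus a blank at index 2 (assumed).
    scores = [
        [0.1, 0.8, 0.1],    # 1
        [0.1, 0.8, 0.1],    # 1 again, merged with the previous frame
        [0.1, 0.1, 0.8],    # blank, separates the two 1s
        [0.1, 0.8, 0.1],    # 1
        [0.9, 0.05, 0.05],  # 0
    ]
    print(ctc_greedy_decode(scores, blank=2))  # [1, 1, 0]
```

The blank frame is what lets a repeated digit such as "11" survive the merge step, which is why card numbers with repeated digits need extra frames.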

The model's preview:

EAST/manual locate

Automatic location uses a well-known text detection algorithm, EAST. See more details.

In this project, I use AdvancedEAST. It is a scene-image text detection algorithm that is primarily based on EAST, with significant improvements that make long text predictions more accurate. For the original repo, see Reference 1.

The training process is also quite quick. In my experience, img_size works best at 384. The epoch_nbr setting matters little: with img_size at 384, training usually stops early at epoch 20-22. If you have a large dataset, try experimenting with these parameters.
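The early stop mentioned above can be pictured with a small patience-based sketch: training ends once the validation loss has failed to improve for a set number of consecutive epochs. This shows the general idea only; the function name, the patience value, and the loss values are hypothetical, and the actual stopping criteria used in this project are not stated here.

```python
from typing import Iterable, Optional


def early_stop_epoch(
    val_losses: Iterable[float], patience: int, min_delta: float = 0.0
) -> Optional[int]:
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # New best validation loss: reset the patience counter.
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


if __name__ == "__main__":
    # Made-up validation losses; improvement stalls after epoch 3.
    losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73]
    print(early_stop_epoch(losses, patience=3))  # 6
```

With a rule like this the configured epoch count only acts as an upper bound, which is consistent with epoch_nbr having little effect once early stopping kicks in.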

This model's preview:

Manual location is only available in the GUI. Here are some demonstrations in .gif form:

GUI

The UI is designed with QtDesigner, and PyQt5 handles the rest.

Statement

The bank card images were collected from the Internet. If any image makes you feel uncomfortable, please contact me.
If you run into any problems, post them in Issues.

Reference

Advanced EAST - https://github.com/huoyijie/AdvancedEAST
EAST model - https://www.cnblogs.com/lillylin/p/9954981.html

bankcard-recognizer's Issues

Changing the batch size

Hello! May I ask why, after changing the batch size, I get this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Not enough time for target transition sequence (required: 4, available: 3)1You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs
Thanks.
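For context on the message above: TensorFlow's CTC loss raises it when a label needs more time steps than the network output provides for that sample. CTC needs at least one frame per label symbol plus one blank frame between every pair of identical adjacent symbols. Why changing the batch size triggers it is not stated in this issue, so the sketch below only shows how one might check labels against the available time steps. The function names and example labels are hypothetical.

```python
from typing import Sequence


def ctc_min_time_steps(label: Sequence[int]) -> int:
    """Minimum number of input frames CTC needs to emit `label`."""
    # Each adjacent pair of equal symbols needs one blank frame in between.
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def fits_ctc(label: Sequence[int], time_steps: int) -> bool:
    """True if a sequence of `time_steps` frames can produce `label`."""
    return time_steps >= ctc_min_time_steps(label)


if __name__ == "__main__":
    # A 4-symbol label with no repeats needs 4 frames, so 3 are not enough.
    print(ctc_min_time_steps([1, 2, 3, 4]), fits_ctc([1, 2, 3, 4], 3))
    # A repeated digit costs one extra frame.
    print(ctc_min_time_steps([6, 2, 2, 8]))
```

Running a check like this over the training labels, using the time-step count the model actually outputs, would show which samples are too long for the current input width.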

Cannot find img_24.png,384,384

Integer conversion

res.append(int(ch))

ValueError: invalid literal for int() with base 10: 'i'
0%| | 0/548 [00:00<?, ?it/s]
This raises an error here. What should I do?
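The traceback above shows int() receiving the character 'i', so a label string contained something other than a decimal digit. Where that character comes from is not stated in the issue. As a defensive sketch (the helper name and the strict flag are hypothetical, not repo code), a label could be validated before conversion so that the offending string is reported clearly:

```python
from typing import List

DIGITS = "0123456789"


def label_to_digits(label: str, strict: bool = True) -> List[int]:
    """Convert a label string to a list of ints, rejecting or skipping non-digits."""
    bad = [ch for ch in label if ch not in DIGITS]
    if bad and strict:
        # Name the offending label so the source file can be found and fixed.
        raise ValueError(f"label {label!r} contains non-digit characters: {bad}")
    return [int(ch) for ch in label if ch in DIGITS]


if __name__ == "__main__":
    print(label_to_digits("6228"))
    # Non-strict mode silently drops anything that is not a digit.
    print(label_to_digits("img_24", strict=False))
```

Strict mode is usually the safer default for training data, since silently dropping characters can hide mislabeled or misplaced files.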

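Recognition model loss

How long did you train your recognition model? Starting from the pretrained weights, my loss dropped to about 2.6 at best and then rose again, and training finally stopped early.
Are there any tricks for training?
Thanks.

About the training loss

Hello, may I ask what the average loss of the EAST model in your project was at the end of training?

CRNN model loss does not decrease

Thanks for sharing. I see that the currently committed code already includes changes for similar problems raised by others, but when I run training, the CRNN loss still stops decreasing once it reaches about 2.7-2.8, and the resulting model differs considerably from the one you shared. I hope you can offer some guidance when you see this. Thanks.

High loss, acc is 0

Parameters: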
AUG_NBR = 100

11262/11262 [==============================] - 4133s 367ms/step - loss: 3.9718 - acc: 1.1099e-05 - val_loss: 2.8851 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4147s 368ms/step - loss: 2.8700 - acc: 1.1099e-05 - val_loss: 2.8474 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4099s 364ms/step - loss: 2.8316 - acc: 0.0000e+00 - val_loss: 2.8033 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4043s 359ms/step - loss: 2.8115 - acc: 2.2199e-05 - val_loss: 2.7982 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3998s 355ms/step - loss: 2.8044 - acc: 0.0000e+00 - val_loss: 2.7776 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3987s 354ms/step - loss: 2.7995 - acc: 0.0000e+00 - val_loss: 2.7764 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3989s 354ms/step - loss: 2.7964 - acc: 0.0000e+00 - val_loss: 2.7756 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3990s 354ms/step - loss: 2.7916 - acc: 3.3298e-05 - val_loss: 2.8062 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3991s 354ms/step - loss: 2.7950 - acc: 0.0000e+00 - val_loss: 2.7778 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3989s 354ms/step - loss: 2.7866 - acc: 1.1099e-05 - val_loss: 2.8157 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3990s 354ms/step - loss: 2.7643 - acc: 0.0000e+00 - val_loss: 2.7624 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3987s 354ms/step - loss: 2.7613 - acc: 0.0000e+00 - val_loss: 2.7621 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3986s 354ms/step - loss: 2.7620 - acc: 0.0000e+00 - val_loss: 2.7604 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3988s 354ms/step - loss: 2.7616 - acc: 0.0000e+00 - val_loss: 2.7653 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3988s 354ms/step - loss: 2.7613 - acc: 0.0000e+00 - val_loss: 2.7633 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3991s 354ms/step - loss: 2.7617 - acc: 0.0000e+00 - val_loss: 2.7618 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4154s 369ms/step - loss: 2.7606 - acc: 0.0000e+00 - val_loss: 2.7605 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4092s 363ms/step - loss: 2.7599 - acc: 0.0000e+00 - val_loss: 2.7601 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4089s 363ms/step - loss: 2.7602 - acc: 0.0000e+00 - val_loss: 2.7608 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4079s 362ms/step - loss: 2.7597 - acc: 0.0000e+00 - val_loss: 2.7604 - val_acc: 0.0000e+00

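Could you share the set of images used for training?

Does card number location require a complete bank card image? I cannot find either full-size bank card images or cropped ones.

Parameters

In the model's parameters, input(None,32,256,1): are 32 and 256 the pixel dimensions of the input image, i.e. 32*256?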