BankCard-Recognizer

Extracts card numbers from bank card images, using deep learning with Keras.

Includes automatic and manual localization, number recognition, and a GUI.

Blog (in Chinese): click this link

bankcard

Roadmap

  • cnn_blstm_ctc
  • EAST/manual locate
  • GUI

Requirements

Python == 3.6

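Install the dependencies with pip, for example pip install -r requirements.txt (assuming the requirements file uses that conventional name).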

Environment

Windows10 x64, Anaconda, PyCharm 2018.3, NVIDIA GTX 1050.

Usage

  1. Download the trained models: CRNN (extraction code: 6eqw) and EAST (extraction code: qiw5).
  2. Put the CRNN model into crnn/model and the EAST model into east/model.
  3. Run python demo.py.
  4. In the GUI, press the Load button to load a bank card image, or load one from dataset/test/.
  5. Press the Identify button to start localization and recognition.
  6. To use manual localization, double-click the image view, draw the region of interest, and press Identify.

Training

Prepare

Download my datasets (CRNN extraction code: 1jax; EAST extraction code: pqba) and unzip them into ./dataset.

The dataset directory is structured as follows:

- dataset
  - /card          # for east
  - /crad_nbr      # for crnn
  - /test
  ...
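
Before training, it can help to confirm that the unzipped layout matches the tree above. The following is a minimal sketch and not part of the repository: the helper name missing_dataset_dirs is hypothetical, and the subdirectory names are copied verbatim from the tree.

```python
import os

# Subdirectory names copied verbatim from the tree above.
EXPECTED_SUBDIRS = ("card", "crad_nbr", "test")


def missing_dataset_dirs(root="dataset", expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectories that do not exist under root."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(root, name))]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing under ./dataset:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Run it from the repository root so that the relative path dataset resolves correctly.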

for CRNN

  1. Run python crnn/preprocess.py.
  2. Run python crnn/run.py to train. Training parameters can be changed in crnn/cfg.py.

for EAST

  1. My dataset was collected from the Internet (Baidu, Google), with thanks to Kesci. It has been labeled in ICDAR 2015 format; see dataset/card/txt/. This small dataset cannot cover every situation, so a richer dataset may give better results.
  2. If you want to add more data, make sure it is labeled, or use dataset/tagger.py to label it.
  3. Modify east/cfg.py as needed; see the default values.
  4. Run python east/preprocess.py. If preprocessing succeeds, you'll see generated data like this:

act_gt

  5. Finally, run python east/run.py.
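
For reference, an ICDAR 2015 annotation line lists the four corner points of a text quadrilateral followed by a transcription: x1,y1,x2,y2,x3,y3,x4,y4,text. The sketch below shows one way to read such a line. The function name parse_icdar2015_line is hypothetical, and whether every file in dataset/card/txt/ carries a transcription field is an assumption, so the text part is treated as optional.

```python
def parse_icdar2015_line(line):
    """Parse one 'x1,y1,x2,y2,x3,y3,x4,y4,text' annotation line.

    Returns (quad, text), where quad is a list of four (x, y) tuples.
    """
    # Some ICDAR 2015 files begin with a UTF-8 byte order mark.
    # Split at most 8 times so commas inside the transcription survive.
    fields = line.strip().lstrip("\ufeff").split(",", 8)
    if len(fields) < 8:
        raise ValueError("expected at least 8 comma-separated values: %r" % line)
    coords = [float(value) for value in fields[:8]]
    # Group the flat coordinate list into four corner points.
    quad = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    # The transcription is optional in this sketch.
    text = fields[8] if len(fields) > 8 else ""
    return quad, text
```

For example, parse_icdar2015_line("10,20,110,20,110,60,10,60,6222") yields the four corners of a 100 by 40 box and the text "6222".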

About

cnn_blstm_ctc

The model follows CNN_RNN_CTC: the CNN part uses VGG, the RNN part is a BLSTM, and training uses CTC loss.
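
A network trained with CTC loss outputs one score vector per time step, and these per-frame predictions must be collapsed into a digit sequence. The sketch below shows greedy (best-path) decoding in plain Python. It is an illustration only, not the repository's decoding code: the function name is hypothetical, and the class layout (digits first, blank as the last class) is an assumption.

```python
def ctc_greedy_decode(frame_scores, blank_index):
    """Collapse per-frame predictions into a label sequence.

    frame_scores: list of per-time-step score lists, one score per class.
    blank_index: index of the CTC blank class (assumed, e.g. the last class).
    """
    # Pick the best-scoring class at every time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        # Keep a class only when it differs from the previous frame
        # and is not the blank symbol.
        if index != previous and index != blank_index:
            decoded.append(index)
        previous = index
    return decoded
```

Note that a repeated digit such as "22" is only recoverable when a blank frame separates the two occurrences; otherwise the repeats merge into one.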

The model's preview:

model

EAST/manual locate

Automatic localization uses a well-known text detection algorithm, EAST. See more details.

In this project, I use AdvancedEAST, a scene-image text detection algorithm that is primarily based on EAST, with significant improvements that make long-text predictions more accurate. For the original repo, see Reference 1.

Training is also quite quick. In practice, an img_size of 384 works best. The epoch_nbr setting matters little: with an img_size of 384, training usually stops early at epoch 20-22. If you have a large dataset, try experimenting with these parameters.
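
Early stopping of this kind is usually patience-based: training ends once the validation loss has failed to improve for a set number of consecutive epochs. The sketch below illustrates that rule in plain Python. It is not the repository's training code; the function name, the patience value, and the min_delta parameter are assumptions for illustration.

```python
def early_stop_epoch(val_losses, patience, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    val_losses: validation loss recorded after each epoch.
    patience: number of consecutive non-improving epochs tolerated.
    min_delta: minimum decrease that counts as an improvement.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # New best value: remember it and reset the counter.
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None
```

With a rule like this, the configured maximum number of epochs only acts as an upper bound, which is why epoch_nbr has little effect once early stopping triggers.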

This model's preview:

model

Manual localization is only available in the GUI. Here are some demonstrations as GIFs:

manual-locate1

manual-locate2

GUI

The UI is designed with Qt Designer, and the rest is implemented with PyQt5.

Statement

  1. The bank card images were collected from the Internet. If any image makes you feel uncomfortable, please contact me.
  2. If you run into any problems, post them in Issues.

Reference

  1. Advanced EAST - https://github.com/huoyijie/AdvancedEAST
  2. EAST model - https://www.cnblogs.com/lillylin/p/9954981.html

bankcard-recognizer's People

Contributors

shawnh2

bankcard-recognizer's Issues

Changing the batch size

Hello! May I ask why, after changing the batch size, I get the following error?
tensorflow.python.framework.errors_impl.InvalidArgumentError: Not enough time for target transition sequence (required: 4, available: 3)1You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs
Thanks.
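
As background for this error: CTC needs at least one output time step per label, plus one extra blank step between every pair of identical adjacent labels. The message reports that a sample required 4 steps while only 3 were available. The sketch below computes that minimum for a label sequence. The function name is hypothetical, and how the batch size change leads to too few available steps in this project is not established here.

```python
def ctc_min_time_steps(label):
    """Minimum number of time steps CTC needs to emit the given label."""
    # Identical neighbours must be separated by a blank frame,
    # so each adjacent repeat costs one extra time step.
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats
```

For example, the label [1, 2, 3, 4] needs at least 4 steps, while [6, 2, 2, 8] needs 5. Comparing this value against the length of the network's output sequence for each sample can help locate the offending input.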

Integer conversion

res.append(int(ch))

ValueError: invalid literal for int() with base 10: 'i'
0%| | 0/548 [00:00<?, ?it/s]
An error is raised here. What should I do?
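
The ValueError shows that int() received the character 'i', so at least one label string contains something other than a digit. The sketch below is one way to surface the offending label instead of failing mid-loop. It is an assumption-laden illustration: the function name is hypothetical, and where the label strings come from is not shown in the traceback.

```python
DIGITS = "0123456789"


def label_to_digits(label):
    """Convert a label string to a list of ints, rejecting non-digit characters."""
    bad = sorted({ch for ch in label if ch not in DIGITS})
    if bad:
        # Report every unexpected character together with the full label.
        raise ValueError("non-digit characters %r in label %r" % (bad, label))
    return [int(ch) for ch in label]
```

Calling this on each label before preprocessing makes it easier to find and fix or skip the samples whose labels are not purely numeric.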

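Recognition model loss

May I ask how long you originally trained your recognition model? Starting from the pretrained weights, my loss dropped to a minimum of about 2.6 and then rose, and training finally stopped early.
Are there any training tips?
Thanks.

About the training loss

Hello, may I ask what the average loss of the EAST model in your project was at the end of training?

CRNN model loss does not decrease

Thanks for sharing. I see that the code you have currently committed already includes changes addressing similar issues raised by others, but when I run training, the CRNN model's loss still stops decreasing once it reaches around 2.7-2.8, and the resulting model differs considerably from the one you shared. I hope you can offer some guidance when you see this. Thanks.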

High loss, acc is 0

Parameters:
AUG_NBR = 100

11262/11262 [==============================] - 4133s 367ms/step - loss: 3.9718 - acc: 1.1099e-05 - val_loss: 2.8851 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4147s 368ms/step - loss: 2.8700 - acc: 1.1099e-05 - val_loss: 2.8474 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4099s 364ms/step - loss: 2.8316 - acc: 0.0000e+00 - val_loss: 2.8033 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4043s 359ms/step - loss: 2.8115 - acc: 2.2199e-05 - val_loss: 2.7982 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3998s 355ms/step - loss: 2.8044 - acc: 0.0000e+00 - val_loss: 2.7776 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3987s 354ms/step - loss: 2.7995 - acc: 0.0000e+00 - val_loss: 2.7764 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3989s 354ms/step - loss: 2.7964 - acc: 0.0000e+00 - val_loss: 2.7756 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3990s 354ms/step - loss: 2.7916 - acc: 3.3298e-05 - val_loss: 2.8062 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3991s 354ms/step - loss: 2.7950 - acc: 0.0000e+00 - val_loss: 2.7778 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3989s 354ms/step - loss: 2.7866 - acc: 1.1099e-05 - val_loss: 2.8157 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3990s 354ms/step - loss: 2.7643 - acc: 0.0000e+00 - val_loss: 2.7624 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3987s 354ms/step - loss: 2.7613 - acc: 0.0000e+00 - val_loss: 2.7621 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3986s 354ms/step - loss: 2.7620 - acc: 0.0000e+00 - val_loss: 2.7604 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3988s 354ms/step - loss: 2.7616 - acc: 0.0000e+00 - val_loss: 2.7653 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3988s 354ms/step - loss: 2.7613 - acc: 0.0000e+00 - val_loss: 2.7633 - val_acc: 0.0000e+00
11262/11262 [==============================] - 3991s 354ms/step - loss: 2.7617 - acc: 0.0000e+00 - val_loss: 2.7618 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4154s 369ms/step - loss: 2.7606 - acc: 0.0000e+00 - val_loss: 2.7605 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4092s 363ms/step - loss: 2.7599 - acc: 0.0000e+00 - val_loss: 2.7601 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4089s 363ms/step - loss: 2.7602 - acc: 0.0000e+00 - val_loss: 2.7608 - val_acc: 0.0000e+00
11262/11262 [==============================] - 4079s 362ms/step - loss: 2.7597 - acc: 0.0000e+00 - val_loss: 2.7604 - val_acc: 0.0000e+00

Parameters

In the model, the input parameter is input(None,32,256,1). Are 32 and 256 the pixel dimensions of the input image, i.e. 32*256?
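
Under Keras's default channels-last convention, a 4-D image input shape reads as (batch, height, width, channels), where None means the batch size is not fixed. The sketch below labels the axes on that assumption. The function name is hypothetical, and the model's data format has not been confirmed against the repository's code.

```python
# Axis order assumed here: Keras channels-last image layout.
AXIS_NAMES = ("batch", "height", "width", "channels")


def describe_input_shape(shape, axis_names=AXIS_NAMES):
    """Pair each dimension of an input shape with its assumed axis name."""
    if len(shape) != len(axis_names):
        raise ValueError("expected %d dimensions, got %d"
                         % (len(axis_names), len(shape)))
    return dict(zip(axis_names, shape))
```

Read this way, (None, 32, 256, 1) describes single-channel (grayscale) images that are 32 pixels high and 256 pixels wide, with a variable batch size.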
