
java-ys / invoice_ocr


This project is forked from 384863451/invoice_ocr


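Mixed receipt recognition: VAT special invoices, VAT ordinary invoices, VAT electronic special invoices, VAT electronic ordinary invoices, VAT ordinary invoices (roll type), non-tax fiscal electronic receipts, toll invoices, train tickets, airline tickets, passenger transport tickets, taxi receipts, fixed-amount invoices, and general machine-printed invoices.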

Shell 0.77% Python 98.91% Dockerfile 0.31%

invoice_ocr's Introduction

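Mixed Reimbursement Receipt Recognition

Supported file types: images, PDF, and OFD, at any of four rotations (0, 90, 180, or 270 degrees). Supported document types: VAT special invoices, VAT ordinary invoices, VAT electronic special invoices, VAT electronic ordinary invoices, VAT ordinary invoices (roll type), toll invoices, train tickets, airline tickets, passenger transport tickets, taxi receipts, fixed-amount invoices, and general machine-printed invoices.

Environment

  1. Python 3.5 or 3.6
  2. Install dependencies: pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
  3. On a machine with a GPU, edit requirements.txt to install the matching version of tensorflow-gpu. The GPU switch is in config.py.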

Model Architecture

YOLOv5 + CRNN + CTC
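
To illustrate the CTC part of this stack, here is a minimal sketch of greedy CTC decoding, the usual way to turn per-frame CRNN scores into text. It is not taken from this project's code. The function name `ctc_greedy_decode`, the choice of index 0 as the blank class, and the digit alphabet are all assumptions made for the example.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: best class per frame, merge repeats, drop blanks.

    Class index i (for i >= 1) maps to alphabet[i - 1].
    """
    # Pick the highest-scoring class index for every time step.
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    chars: List[str] = []
    previous = BLANK
    for index in best_path:
        # Emit only when the frame is not blank and differs from the previous frame.
        if index != BLANK and index != previous:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)


def one_hot(index: int, size: int) -> List[float]:
    """Helper for the example: a score row that peaks at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]
```

A blank frame between two identical characters is what lets a doubled character, such as the "11" in an invoice number, survive the merge step.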

Models

  1. Model download link: https://pan.baidu.com/s/1E_OE9HOjjFh6GZdPWQVbMg (extraction code: voqi)
  2. Place the downloaded models folder in the project root directory.

Starting the Service

  1. From a console, run: python manage.py runserver 127.0.0.1:8080
  2. The port can be changed as needed.
  3. Service endpoints: http://...:[port]/detection_images and http://127.0.0.1:8080/detection, for example http://127.0.0.1:8080/detection_images
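
As an alternative to Postman, the endpoint could be called from a short Python script. The sketch below only shows the general shape of such a call. The README does not document the request format, so the form field name `image_base64` and the form-encoded body are hypothetical; check the project's view code for the real parameter names.

```python
import base64
import urllib.parse
import urllib.request

# Example address from the service start instructions above.
SERVICE_URL = "http://127.0.0.1:8080/detection_images"


def build_detection_request(
    image_bytes: bytes, field_name: str = "image_base64"
) -> urllib.request.Request:
    """Build a POST request carrying the image as a base64 form field.

    NOTE: "image_base64" is a hypothetical field name, not confirmed by the project.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    body = urllib.parse.urlencode({field_name: encoded}).encode("ascii")
    return urllib.request.Request(
        SERVICE_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def send_request(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request to a running service and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the service running, `send_request(build_detection_request(open("invoice.jpg", "rb").read()))` would return the raw response body, where `invoice.jpg` stands for any local test image.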

Test Demo

  1. Test tool: Postman (download and install it yourself)
  2. Sample input: four VAT invoices photographed together in one image


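How the Code Runs

  • The service is started with the Django command.
  • The input image is preprocessed first. Accepted inputs are an image file, a base64-encoded image, or an image download URL.
  • Invoices are located in the image, and the detection results are collected in a list.
  • Based on the detected invoice type, the specific regions of each invoice are then located.
  • The key regions found are passed to the CRNN to read their text.
  • Electronic invoices are specially optimized: PDF and OFD files can be recognized.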
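
The steps above can be sketched as a small pipeline. This is an illustrative outline under stated assumptions, not the project's actual code. The names `resolve_image_bytes` and `recognize_invoices`, the three stage callables, and the result layout are all hypothetical. The real stages would be the YOLOv5 detectors and the CRNN reader.

```python
import base64
import urllib.request
from typing import Callable, Dict, List, Optional, Tuple


def fetch_url(url: str) -> bytes:
    """Download an image from a URL."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def resolve_image_bytes(
    file_bytes: Optional[bytes] = None,
    image_base64: Optional[str] = None,
    image_url: Optional[str] = None,
    fetch: Callable[[str], bytes] = fetch_url,
) -> bytes:
    """Normalize the three accepted input forms into raw image bytes."""
    if file_bytes is not None:
        return file_bytes
    if image_base64 is not None:
        return base64.b64decode(image_base64)
    if image_url is not None:
        return fetch(image_url)
    raise ValueError("no image supplied")


def recognize_invoices(
    image: bytes,
    locate: Callable[[bytes], List[Tuple[str, bytes]]],
    find_parts: Callable[[str, bytes], Dict[str, bytes]],
    read_text: Callable[[bytes], str],
) -> List[dict]:
    """Run the three stages: locate invoices, locate key regions, read text."""
    results = []
    # Stage 1: find every invoice in the image, with its type and crop.
    for invoice_type, crop in locate(image):
        # Stage 2: locate the key regions for this invoice type.
        parts = find_parts(invoice_type, crop)
        # Stage 3: read each region's text (the CRNN step in the real project).
        fields = {name: read_text(part) for name, part in parts.items()}
        results.append({"type": invoice_type, "fields": fields})
    return results
```

Passing the stages in as callables keeps the outline runnable with stub functions, which is convenient for testing the control flow without loading any models.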

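Future Plans

  • For VAT invoices, only the five key elements are recognized. The plan is to combine this with invoice verification to retrieve the full invoice contents directly.
  • For the other document types, only a few regions are recognized. These will be completed when time allows.
  • The CRNN is the one bundled with the chineseocr project. Work on it is under way, but the workload is large, so updates will come when time allows.

References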

chineseocr https://github.com/chineseocr/chineseocr

Summary

This is a beginner's hobby project, and the code is messy.

invoice_ocr's People

Contributors

384863451
