opencv_ocr's Introduction

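Project Overview

Features

Applies a perspective transform to bring the page in a photo to a front-on view, then recognizes the text in the photo with pytesseract, a third-party open-source OCR library.

Screenshots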

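Installation

Python version: 3.6.4

OpenCV version: 3.4.1

pytesseract version 0.3.8: download the source from "Release v0.3.8 · madmaze/pytesseract" (github.com) and install it with cd pytesseract && pip install -U .

Install the Windows version of the Tesseract-OCR software (the installer is in the repository root).

Configure the environment variables:

  1. Add the Tesseract-OCR installation directory to PATH.

  2. Set the TESSDATA_PREFIX environment variable to the tessdata directory under the installation directory,
    e.g. D:\Program Files (x86)\Tesseract-OCR\tessdata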
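
The two environment settings above can be checked from Python before running the OCR step. This is a minimal sketch using only the standard library. The function name check_tesseract_env and its parameters are hypothetical, and it assumes the Tesseract executable is named tesseract.

```python
import os
import shutil


def check_tesseract_env(environ=None, which=shutil.which):
    """Return a list of problems found; an empty list means the setup looks fine."""
    environ = os.environ if environ is None else environ
    problems = []

    # The tesseract executable must be reachable through PATH.
    if which("tesseract") is None:
        problems.append("tesseract executable not found on PATH")

    # TESSDATA_PREFIX should point at an existing tessdata directory.
    tessdata = environ.get("TESSDATA_PREFIX")
    if not tessdata:
        problems.append("TESSDATA_PREFIX is not set")
    elif not os.path.isdir(tessdata):
        problems.append("TESSDATA_PREFIX does not point to a directory: " + tessdata)

    return problems


if __name__ == "__main__":
    for problem in check_tesseract_env():
        print(problem)
```

Running the script prints nothing when both settings are in place.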

Changelog

Version 1.0.0

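Step 1: Edge detection

  • Read the original image and scale it proportionally

  • Preprocessing: convert to grayscale

  • Gaussian blur (reduces image noise and smooths the edges)

  • Canny edge detection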

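Step 2: Contour detection

  • Find the contours in the edge-detected image

  • Sort the contours by area and keep the five largest

  • Iterate over these contours and approximate each one with a polygon (0.02 × the contour's perimeter is the maximum distance allowed between the original contour and its approximation). When the approximation is a quadrilateral, save it and break out of the loop.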

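Step 3: Perspective transform

  • Sort the corner coordinates in the order top-left, top-right, bottom-right, bottom-left

  • Compute the width w and height h with the two-point distance formula, sqrt((x1 - x2)^2 + (y1 - y2)^2)

  • Taking the top-left corner as (0,0), derive the four destination coordinates from w and h

  • Compute the transformation matrix

  • Apply the perspective transform to the original image using that matrix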

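Step 4: Call the third-party OCR library

  • Convert the image to grayscale

  • Median filter (removes noise such as salt-and-pepper noise and speckle noise)

  • Save the image locally as a temporary image file (named after the process ID)

  • Open the image with the Image library and pass it to the pytesseract API

  • Obtain the recognized text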

Contributors

codermk96
