batch9_dyslexia

OCR development

Install

First, install Tesseract.

On Mac

brew install tesseract

This installs Tesseract with English support. To add other languages (French, for instance), also run:

brew install tesseract-lang

On Windows

  1. Download the installer from https://github.com/UB-Mannheim/tesseract/wiki
  2. Run the executable to install Tesseract. By default it should install to C:\Program Files (x86)\Tesseract-OCR
  3. Make sure your TESSDATA_PREFIX environment variable is set correctly
  • Go to Control Panel -> System -> Advanced System Settings -> Advanced tab -> Environment Variables... button
  • In the System variables window, scroll down to TESSDATA_PREFIX. If its value is wrong, select it and click Edit...
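
The TESSDATA_PREFIX check above can also be done from Python on any platform. The following is a minimal sketch using only the standard library; the function name check_tesseract_setup is hypothetical and not part of the dyslexia package. It assumes language data is stored as .traineddata files somewhere under TESSDATA_PREFIX, which is how Tesseract normally ships them.

```python
import os
import shutil
from pathlib import Path


def check_tesseract_setup(env=None):
    """Return a list of detected problems; an empty list means none were found."""
    env = os.environ if env is None else env
    problems = []

    # Is the tesseract executable reachable through PATH?
    if shutil.which("tesseract", path=env.get("PATH")) is None:
        problems.append("tesseract executable not found on PATH")

    # Does TESSDATA_PREFIX exist and contain language data?
    prefix = env.get("TESSDATA_PREFIX")
    if prefix is None:
        problems.append("TESSDATA_PREFIX is not set")
    elif not Path(prefix).is_dir():
        problems.append(f"TESSDATA_PREFIX points to a missing directory: {prefix}")
    elif not any(Path(prefix).rglob("*.traineddata")):
        problems.append(f"no .traineddata files found under {prefix}")
    return problems


if __name__ == "__main__":
    for problem in check_tesseract_setup():
        print("WARNING:", problem)
```

An unset TESSDATA_PREFIX is not always an error, since many installations fall back to a built-in default location, so treat that message as a hint rather than a failure.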

On Linux

sudo apt-get update
sudo apt-get install tesseract-ocr
sudo apt-get install libtesseract-dev

Then install the Python packages:

pip install tesseract
pip install tesseract-ocr

You can now install the dyslexia package

Clone the repository, then install the package with:

cd batch9_dyslexia
pip install .

Use pip install -e . instead if you are developing the package.

dyslexia package

Submodule      Description
app            Main functions such as pipeline() or get_results()
io             Input/output functions such as load_image()
plots          Plotting functions such as plot_image()
preprocessing  Preprocessing functions such as image_to_gray()
ocr            OCR functions using the Tesseract backend

Using the package

Example using dyslexia

from dyslexia import preprocessing
from dyslexia.io import load_image
from dyslexia.ocr import extract_text_from_image

fpath = 'Exemples/SVT/IMG_20210329_123029.jpg'
image_orig = load_image(fpath)
image_no_shadow = preprocessing.remove_shadow(image_orig)
image_gray = preprocessing.image_to_gray(image_no_shadow, threshold=True)

result = extract_text_from_image(image_gray)

Using the pipeline

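from dyslexia.app import get_results
from dyslexia.io import load_image

fpath = 'Exemples/SVT/IMG_20210329_123029.jpg'
# Assumption: get_results() accepts a loaded image, like the preprocessing functions
image = load_image(fpath)
result = get_results(image)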

App

Run app

uvicorn app:app --port 5000

Access the Swagger UI at http://127.0.0.1:5000/docs#/

Endpoint

/ocr_file/

Takes a file object as input and returns the OCR results in the form

{"paragraphs" : ["....", "...."], "bboxes": [[0,0,100,50], [0,100,100,50]]}

Where paragraphs is the list of paragraphs and bboxes holds the coordinates (x1, y1, w, h) of each paragraph

/ocr_url/

Takes an image URL as input (the url query parameter) and returns the OCR results in the form

{"paragraphs" : ["....", "...."], "bboxes": [[0,0,100,50], [0,100,100,50]]}

Where paragraphs is the list of paragraphs and bboxes holds the coordinates (x1, y1, w, h) of each paragraph
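
Both endpoints return the same shape, so a client can process the result the same way. The sketch below pairs each paragraph with its box and converts (x1, y1, w, h) to corner coordinates. It assumes x1, y1 is the top-left corner in pixel coordinates, with w and h extending right and down; the helper name paragraphs_with_corners is hypothetical and not part of the dyslexia package.

```python
import json


def paragraphs_with_corners(payload):
    """Pair each paragraph with its box as (x1, y1, x2, y2) corner coordinates."""
    data = json.loads(payload) if isinstance(payload, str) else payload
    paragraphs, bboxes = data["paragraphs"], data["bboxes"]
    if len(paragraphs) != len(bboxes):
        raise ValueError("paragraphs and bboxes must have the same length")
    result = []
    for text, (x1, y1, w, h) in zip(paragraphs, bboxes):
        # Assumption: (x1, y1) is the top-left corner, so the opposite
        # corner is obtained by adding the width and the height.
        result.append({"text": text, "corners": (x1, y1, x1 + w, y1 + h)})
    return result


# Sample response copied from the endpoint description above
response = '{"paragraphs": ["....", "...."], "bboxes": [[0,0,100,50], [0,100,100,50]]}'
for item in paragraphs_with_corners(response):
    print(item["corners"], item["text"])
```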

Example query:

curl -X 'POST' \
  'http://127.0.0.1:5000/ocr_url/?url=https%3A%2F%2Fdata2.unhcr.org%2Fimages%2Fdocuments%2Fbig_4cda85d892a5c0b5dd63b510a9c83e9c9d06e739.jpg' \
  -H 'accept: application/json' \
  -d ''
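
The same query can be sent from Python. This is a rough standard-library equivalent of the curl call above, not code from the repository; the function names are hypothetical, and the host and port are the ones from the uvicorn command. The image URL is percent-encoded into the url query parameter and the POST body is empty, as with -d ''.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # host and port from the uvicorn command


def build_ocr_url_request(image_url, base_url=BASE_URL):
    """Build the /ocr_url/ request URL with the image URL percent-encoded."""
    query = urllib.parse.urlencode({"url": image_url})
    return f"{base_url}/ocr_url/?{query}"


def ocr_from_url(image_url, base_url=BASE_URL, timeout=60):
    """POST to /ocr_url/ and return the decoded JSON (needs a running server)."""
    request = urllib.request.Request(
        build_ocr_url_request(image_url, base_url),
        data=b"",  # empty body, like -d '' in the curl call
        headers={"accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the app running, ocr_from_url(image_url) should return a dictionary with the paragraphs and bboxes keys described above.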

Docker

docker-compose build

docker-compose up

Eval Scripts

dyslexia eval-txt-folder --truth_path tests/data/truth/ --hypothesis_path tests/data/hypothesis/

Output:

wer : 0.16666666666666666
mer : 0.16129032258064516
wil : 0.27311827956989243
wip : 0.7268817204301076
hits : 26.0
substitutions : 4.0
deletions : 0.0
insertions : 1.0
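
The four rates follow from the four word-alignment counts. The sketch below uses the standard definitions of word error rate (wer), match error rate (mer), word information preserved (wip) and word information lost (wil). Whether the eval script computes them exactly this way is an assumption, but these formulas reproduce the numbers shown above. The function name word_error_measures is hypothetical, and non-empty reference and hypothesis texts are assumed.

```python
def word_error_measures(hits, substitutions, deletions, insertions):
    """Derive wer, mer, wip and wil from word-alignment counts."""
    errors = substitutions + deletions + insertions
    truth_words = hits + substitutions + deletions        # words in the reference
    hypothesis_words = hits + substitutions + insertions  # words in the OCR output
    wip = (hits / truth_words) * (hits / hypothesis_words)
    return {
        "wer": errors / truth_words,
        "mer": errors / (hits + errors),
        "wip": wip,
        "wil": 1 - wip,
    }


# Counts taken from the sample output above
measures = word_error_measures(hits=26, substitutions=4, deletions=0, insertions=1)
for name, value in measures.items():
    print(f"{name} : {value}")
```

Here wer is 5 errors over 30 reference words, while mer divides the same 5 errors by all 31 aligned positions, which is why it is slightly lower.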
