
penteract-ocr's Introduction


penteract

Native Node.js bindings to the Tesseract OCR project.

  • Uses native Node.js bindings, so no tesseract command-line process is spawned.
  • Asynchronous I/O: image reading and processing run in an isolated event loop backed by libuv.
  • Supports reading image data from JavaScript buffers.

Contributions are welcome.

Install

First of all, a g++ 4.9 compiler is required.

Before installing penteract, install the following dependencies:

$ brew install pkg-config tesseract # mac os

Then install penteract with npm:

$ npm install penteract

To Use with Electron

Due to a limitation of native Node.js modules, if you want to use penteract with Electron, add a .npmrc file to the root of your Electron project before running npm install:

runtime = electron
; The version of the local electron,
; use `npm ls electron` to figure it out
target = 1.7.5
target_arch = x64
disturl = https://atom.io/download/atom-shell

Usage

Recognize an Image Buffer

import {
  recognize
} from 'penteract'

import path from 'path'
import fs from 'fs-extra'

const filepath = path.join(__dirname, 'test', 'fixtures', 'penteract.jpg')

fs.readFile(filepath).then(recognize).then(console.log) // 'penteract'

Recognize a Local Image File

import {
  fromFile
} from 'penteract'

// `filepath` is the image path defined in the previous example
fromFile(filepath, {lang: 'eng'}).then(console.log)     // 'penteract'

recognize(image [, options])

  • image Buffer the content buffer of the image file.
  • options PenteractOptions= optional

Returns Promise.<String> the recognized text on success.

fromFile(filepath [, options])

  • filepath Path the file path of the image file.
  • options PenteractOptions= optional

Returns Promise.<String>
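
Because both functions return a Promise, several images can be processed together. The following is a minimal sketch, not part of penteract: `recognizeFiles` and `stubFromFile` are hypothetical names, and the OCR function is passed in as a parameter so the example runs without the native module. In real use you would pass penteract's `fromFile` instead of the stub.

```javascript
// `ocr` is any function with the documented `fromFile(filepath, options)`
// shape: it returns a Promise that resolves to the recognized text.
// Each failure is captured per file instead of rejecting the whole batch.
function recognizeFiles (ocr, filepaths, options = {lang: 'eng'}) {
  return Promise.all(filepaths.map(filepath =>
    ocr(filepath, options).then(
      text => ({filepath, text}),
      error => ({filepath, error})
    )
  ))
}

// Stub standing in for penteract's `fromFile` (an assumption for this demo):
// it "recognizes" .jpg files and rejects everything else.
const stubFromFile = filepath => filepath.endsWith('.jpg')
  ? Promise.resolve('penteract')
  : Promise.reject(Object.assign(new Error('unreadable'), {code: 'ERR_READ_IMAGE'}))

recognizeFiles(stubFromFile, ['a.jpg', 'b.txt']).then(results => {
  for (const {filepath, text, error} of results) {
    console.log(filepath, error ? error.code : text)
  }
})
```

Collecting `{filepath, error}` entries rather than letting `Promise.all` reject keeps one unreadable image from discarding the results of the others.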

PenteractOptions Object

{
  // @type `(String|Array.<String>)=eng`,
  //
  // Specifies language(s) used for OCR.
  //   Run `tesseract --list-langs` in command line for all supported languages.
  //   Defaults to `'eng'`.
  //
  // To specify multiple languages, use an array.
  //   English and Simplified Chinese, for example:
  // ```
  // lang: ['eng', 'chi_sim']
  // ```
  lang: 'eng'
}
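
For comparison, the tesseract command line takes multiple languages as a single `+`-joined string (for example `-l eng+chi_sim`). The sketch below shows how the two accepted shapes of `lang` could be normalized into that form. `toTesseractLang` is a hypothetical helper written for illustration; whether penteract performs this conversion internally is an assumption, not something this README states.

```javascript
// Hypothetical helper, not part of penteract's API: converts the documented
// `lang` option (a string or an array of strings, default 'eng') into the
// `+`-joined form accepted by the tesseract command line.
function toTesseractLang (lang = 'eng') {
  const list = Array.isArray(lang) ? lang : [lang]
  if (list.length === 0 || list.some(l => typeof l !== 'string' || l === '')) {
    throw new TypeError('lang must be a non-empty string or an array of non-empty strings')
  }
  return list.join('+')
}

console.log(toTesseractLang())                    // eng
console.log(toTesseractLang(['eng', 'chi_sim']))  // eng+chi_sim
```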

Promise.reject(error)

  • error Error The JavaScript Error instance
    • code String Error code.
    • message String Error message.
    • other properties of Error.

code: ERR_READ_IMAGE

Rejects if it fails to read image data from file or buffer.

code: ERR_INIT_TESSER

Rejects if tesseract fails to initialize.
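
The two codes above can be told apart in a `.catch()` handler. The following is a minimal sketch: `explainOcrError` is a hypothetical helper, and the hint texts are illustrative guesses at likely causes, not messages produced by penteract.

```javascript
// Maps the documented rejection codes to a short hint.
// The wording of the hints is an assumption made for this example.
function explainOcrError (error) {
  switch (error && error.code) {
    case 'ERR_READ_IMAGE':
      return 'The image could not be read; check the file path or buffer contents.'
    case 'ERR_INIT_TESSER':
      return 'Tesseract failed to initialize; the requested language data may be missing.'
    default:
      return `Unexpected error: ${error && error.message}`
  }
}

// Usage with the documented API (requires penteract to be installed):
//   recognize(buffer).catch(error => console.error(explainOcrError(error)))

// Stand-in error, constructed here only to exercise the helper.
const fake = Object.assign(new Error('cannot read image'), {code: 'ERR_READ_IMAGE'})
console.log(explainOcrError(fake))
```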

Example of Using with Electron

// For details of `mainWindow: BrowserWindow`, see
// https://github.com/electron/electron/blob/master/docs/api/browser-window.md
mainWindow.capturePage({
  x: 10,
  y: 10,
  width: 100,
  height: 10
}, (data) => {
  recognize(data.toPNG()).then(console.log)
})

Compiling Troubles

For macOS users: if compilation fails, running the following command resolves most problems:

$ xcode-select --install

If you see this error:

xcode-select: error: tool 'xcodebuild' requires Xcode, but active developer directory '/Library/Developer/CommandLineTools' is a command line tools instance

resolve it with:

$ sudo xcode-select -s /Applications/Xcode.app/Contents/Developer

License

MIT

penteract-ocr's People

Contributors

kaelzhang


penteract-ocr's Issues

Request: documentation example on how to set options

Hello. I had performance issues using tesseract.js, so I decided to give the penteract bindings a try. However, I am quite confused about how to set options, for example using my own dictionary words.

I am also wondering whether version 4 of Tesseract, with the neural LSTM engine, is supported.

Thanks

pkg-config install failed on Windows and Ubuntu

It works well on Mac, pretty good, but I have no idea how to install the pkg-config dependencies on Windows 10 and Ubuntu 18. If this issue does not belong to penteract-ocr, please forgive me and close it, thanks.

I have two questions:

  1. Does penteract-ocr support Windows and Ubuntu?
  2. Can pkg-config be installed on Windows and Ubuntu in the way penteract-ocr requires?

The problems I am having are:

  1. I don't know how to install pkg-config on Windows. I tried https://stackoverflow.com/questions/1710922/how-to-install-pkg-config-in-windows, but it failed.
  2. I installed pkg-config with apt-get on Ubuntu, but it does not have lept, which tesseract needs.

So, could you give me some help or advice? Thanks.

How to set dpi

I keep getting "Warning. Invalid resolution 0 dpi. Using 70 instead." The DPI of the images is actually 72, and the results are bad. Is there a way to set the DPI to 72 without changing the files?

Thanks for making this!

Module not found: Error: Can't resolve '../build/Release/penteract'

I am trying to use this in my electron project. Unfortunately when I run my project (via webpack), I am getting this:

    ERROR in ./node_modules/penteract/src/index.js
    Module not found: Error: Can't resolve '../build/Release/penteract' in '/Users/steph/Documents/workspace/myElectronProject/node_modules/penteract/src'
     @ ./node_modules/penteract/src/index.js 1:0-49 25:4-12

I checked the build/Release folder, and there is a penteract.node file present.
Any ideas?

On a side note: I am using node-loader, and I am not sure whether this is related. Webpack config:

  module: {
    rules: [
      {
        test: /\.node$/,
        use: 'node-loader'
      },
   ...
  }

Container usage from node.js app

What are the setup and run instructions if I want a Node.js app to use penteract (running in a Docker container on BlueMix) as an engine to process uploaded files (on the server, or by linking a volume to the container)?

Can I run the container and access the image's filesystem via a bash shell for testing purposes?

Do I need a compiler? My test machine runs Windows.

Thank you

Dependency information is not enough

Hello, I just found this repo and am glad to test it.
But this simple code does not work:
require("penteract.js");

It looks like a dependency problem. Please add this to your README. :)

internal/modules/cjs/loader.js:730
return process.dlopen(module, path.toNamespacedPath(filename));
^

Error: dlopen(/Users/jeefo/projects/face_recognition/node_modules/penteract/build/Release/penteract.node, 1): Library not loaded: /usr/local/opt/webp/lib/libwebp.5.dylib
Referenced from: /usr/local/opt/libtiff/lib/libtiff.5.dylib
Reason: image not found
at Object.Module._extensions..node (internal/modules/cjs/loader.js:730:18)
at Module.load (internal/modules/cjs/loader.js:600:32)
at tryModuleLoad (internal/modules/cjs/loader.js:539:12)
at Function.Module._load (internal/modules/cjs/loader.js:531:3)
at Module.require (internal/modules/cjs/loader.js:637:17)
at require (internal/modules/cjs/helpers.js:22:18)
at Object.<anonymous> (/Users/jeefo/projects/face_recognition/node_modules/penteract/lib/index.js:12:18)
at Module._compile (internal/modules/cjs/loader.js:701:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:712:10)
at Module.load (internal/modules/cjs/loader.js:600:32)
