stefan-langenmaier / ocrfeeder Goto Github PK
View Code? Open in Web Editor NEWThis project forked from gnome/ocrfeeder
ocrfeeder
Home Page: http://live.gnome.org/OCRFeeder
License: GNU General Public License v3.0
This project forked from gnome/ocrfeeder
ocrfeeder
Home Page: http://live.gnome.org/OCRFeeder
License: GNU General Public License v3.0
================================= OCRFeeder - A Complete OCR Suite ================================= :Author: Joaquim Rocha :Contact: joaquimrocha1 {AT} gmail {DOT} com :Date: 2009-05-10 :Web site: https://wiki.gnome.org/Apps/OCRFeeder :Version: 0.3 :License: OCRFeeder is published under the GNU GPL v3 license. .. contents:: Introduction ============== OCRFeeder is a complete Optical Character Recognition and Document Analysis and Recognition program. This document explains what's needed to and how to install and run OCRFeeder. ODFPy is developed by OpenDocument Fellowship: http://opendocumentfellowship.com System Requirements ==================== The following list presents the system requirements for this project to run with all its functionalities. Each dependence has the minimum version that should be used. While newer versions are likely to keep working as well, older versions may or may not work as they were not tested. * Python (version 2.5); * PyGTK (version 2.13); * Python Enchant (version 1.3) * Python Imaging Sane (version 1.1.7) * Python Image Library (version 1.1); * PyGoocanvas (version 0.12); * Ghostscript (version 8.63); * Unpaper (version 0.3); The module ODFPy is included as part of this project in order to make the installation and execution of the project easier. Since this project doesn't use a particular OCR engine, no engine was listed as a dependence above. Nevertheless, the project is not usable without an OCR engine. The configuration XML file for the engine Ocrad is already included with the project so only what's needed to be installed for a first test is the Ocrad engine itself. In case the user doesn't want to use Ocrad, the configuration file that is placed in the OCRFeeder's configuration folder (~/.ocrfeeder) must be deleted. Other engines that might also be considered are Gocr and Tesseract. Installation on Ubuntu ======================= The only packages needed to be installed on Ubuntu 8.10 is PyGoocanvas and Unpaper, the rest of the dependencies are already installed in a fresh install of this version of Ubuntu. The engine Ocrad is also installed for the reasons explained in the previous section. To install PyGoocanvas, Ocrad and Unpaper, the following command should be executed as superuser: # apt-get install python-pygoocanvas ocrad unpaper After all of the packages finish the installation, OCRFeeder is ready to be installed. To install it, all that is needed is to run setup.py script as superuser: # setup.py install OCRFeeder can now be run by calling it from a desktop menu or by running the *ocrfeeder* command. When using the GNOME desktop, if the desktop menu entry is not showing the OCRFeeder's icon, the following command must be used to update the icon cache (as superuser): # gtk-update-icon-cache -f -t /usr/share/icons/hicolor Command Line Usage =================== This section explains how to use OCRFeeder from the command line. The command line interface part of OCRFeeder aims at users who want to perform quick and unattended conversions of document images to editable formats. It also makes this project usable from other applications. Two parameters are mandatory: 1) the path to each document image to be processed is given after the parameter --images; 2) the name of the document to be generated is given after the parameter --o. For example: $ ocrfeeder-cli --images ~/image1.png ~/image2.jpeg --o converted_document The pages of the generated documents honor the order of the given paths. It is also possible to specify the format of the document to be generated (HTML or ODT) with the option --format. In case no format is specified, the images will be exported to ODT. Continuing with the example above: $ ocrfeeder-cli --images ~/image1.png ~/image2.jpeg --format HTML --o converted_document OCRFeeder Studio (the graphical user interface part) can also be launched from the command line. Two options can be used to load images right after the program initiates. Those are --images which will add the images given as the option's arguments and --dir that will add all the images under a given directory path. The options can be used individually or combined, for example: $ ocrfeeder --images ~/image1.png ~/image2.jpeg --dir ~/Desktop For any usage, the options and parameters can be given in any order. Copyright ========== OCRFeeder is (c) 2008-2010 Joaquim Rocha OCRFeeder is (c) 2009-2010 Igalia, S.L. https://wiki.gnome.org/Apps/OCRFeeder The *odf* module is (c) Søren Roug, European Environment Agency http://odfpy.forge.osor.eu/ See AUTHORS file for details on authors OCRFeeder is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. OCRFeeder is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. See the COPYING file for more details.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.