Transformers OCR

https://tatsumoto.neocities.org/blog/mining-from-manga.html

An OCR tool for the GNU operating system that uses Transformers. Supports Xorg and Wayland.

Demo video: ocr.mp4

This Manga OCR application is likely the most suckless and lightweight option available. It is designed to work best with a tiling window manager. It needs few dependencies, and you probably already have all of them installed. However, it still relies on large Python libraries to work. To isolate the bloat, these libraries are installed in a dedicated folder. If your computer is rather slow, use Tesseract instead.

Installation

Arch Linux and Arch-based distros

Install from the AUR.

Other distros

If you want to package this program for your distribution and know how to do it, please create a pull request. Otherwise, read the section below.

To install manually (not recommended)

The steps below are for people who can't access the AUR.

Step 1. Install the dependencies for your environment (Xorg, Wayland, GNOME, or KDE) if they are not already installed.

Step 2. Install the program using the Makefile.

git clone 'https://github.com/Ajatt-Tools/transformers_ocr.git'
cd -- 'transformers_ocr'
sudo make install

Setup

Before you start, download manga-ocr data:

transformers_ocr download

The files will be saved to ~/.local/share/manga_ocr.

Usage

To show a help page, run transformers_ocr help.

To OCR text on a manga page, run:

transformers_ocr recognize

Bind the command to a keyboard shortcut using your WM's config. This enables you to call the OCR from anywhere, as shown in the demo video.

For example, if you use i3wm, add this line to the config file.

bindsym $mod+o  exec --no-startup-id transformers_ocr recognize

The first run takes longer than usual because additional files are downloaded and saved to ~/.cache/huggingface.

On the first run, transformers_ocr launches a listener process that runs in the background and reads any new screenshots passed to it. To speed up the first run, add the command below to autostart (using ~/.profile, ~/.xinitrc, etc.).

transformers_ocr start
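
A minimal sketch of such an autostart entry. The function name start_ocr_listener is hypothetical, and the listener is put in the background on the assumption that start may not return immediately. The guard keeps login from failing on machines where the program is not installed.

```shell
# Hypothetical snippet for ~/.profile or ~/.xinitrc.
start_ocr_listener() {
    # Do nothing if transformers_ocr is not in PATH.
    if command -v transformers_ocr >/dev/null 2>&1; then
        # Assumption: `start` may block, so run it in the background.
        transformers_ocr start >/dev/null 2>&1 &
    fi
}

start_ocr_listener
```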

Holding text

Quite often one sentence, phrase or a chunk of meaning is split between two or more speech bubbles. This is a problem because if you take a screenshot of the whole area, including the area between the speech bubbles, you will likely end up with junk in the results. Processing each bubble separately is also not ideal since you want to analyze the entire sentence in GoldenDict, add it to Anki, etc.

A solution is to have transformers_ocr hold text for you. It recognizes one speech bubble, remembers it, then waits for another, and copies the text from all bubbles together only when you're done.

To use this feature, add a new keyboard shortcut to the config file of your WM, for example Mod+Shift+o. Example for i3wm:

bindsym $mod+Shift+o  exec --no-startup-id transformers_ocr hold

Demo video: screencast.mp4

Every time you call hold, a speech bubble will be recognized and saved for later. Finally, call recognize using the usual keyboard shortcut to copy the last speech bubble and all the saved ones together. The list of saved bubbles will be emptied when calling recognize.
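
The behaviour above can be pictured with a toy model. This sketch does not call transformers_ocr or do any OCR. The functions hold and recognize here are hypothetical stand-ins that take already-recognized text as an argument, and joining the bubbles without a separator is an assumption.

```shell
# Toy model of hold/recognize; the buffer file is a hypothetical detail.
HOLD_BUFFER=$(mktemp)

hold() {
    # Save one recognized bubble for later.
    printf '%s' "$1" >> "$HOLD_BUFFER"
}

recognize() {
    # Print all saved bubbles followed by the latest one,
    # then empty the list of saved bubbles.
    printf '%s%s\n' "$(cat "$HOLD_BUFFER")" "$1"
    : > "$HOLD_BUFFER"
}
```

In this model, calling hold twice and then recognize prints the three pieces as one string, and the next recognize starts from an empty list.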

Config file

Optionally, you can create a config file.

mkdir -p ~/.config/transformers_ocr
touch ~/.config/transformers_ocr/config

Each line must have this format: key=value. Lines that start with # are ignored.
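
To illustrate the format, here is a small sketch that reads one value from such a file. config_get is a hypothetical helper, not part of transformers_ocr. It returns the first matching line; which line the program itself prefers when a key appears twice is not stated here.

```shell
# Usage: config_get KEY FILE
# Prints the value of KEY, or returns 1 if KEY is not set.
config_get() {
    key=$1
    file=$2
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '#'*)
                continue                           # comment line, ignored
                ;;
            "$key="*)
                printf '%s\n' "${line#"$key"=}"    # everything after the first =
                return 0
                ;;
        esac
    done < "$file"
    return 1
}
```

For example, config_get force_cpu ~/.config/transformers_ocr/config prints yes if the file contains the line force_cpu=yes.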

Send text to an external application

Instead of copying text to the clipboard, you may want to pass it as an argument to an external application. In the example below, clip_command is set to goldendict, which sends recognized text directly to GoldenDict and keeps the system clipboard free for other tasks.

echo 'clip_command=goldendict %TEXT%' >> ~/.config/transformers_ocr/config
transformers_ocr stop
transformers_ocr start

If %TEXT% is passed as a parameter, it will be replaced with the actual text in the speech bubble. If not, the text will be passed to stdin of the called program.
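
The rule can be sketched as a shell function. This is an illustration only, not the program's actual code. send_text is a hypothetical helper that takes the text first and the already-split command after it.

```shell
# Usage: send_text TEXT COMMAND [ARG...]
send_text() {
    text=$1
    shift
    found=no
    n=$#
    # Walk the arguments once, re-appending each (possibly rewritten) one.
    while [ "$n" -gt 0 ]; do
        arg=$1
        shift
        out=
        while :; do
            case $arg in
                *%TEXT%*)
                    found=yes
                    out=$out${arg%%"%TEXT%"*}$text   # part before the placeholder, then the text
                    arg=${arg#*"%TEXT%"}             # keep scanning after the placeholder
                    ;;
                *)
                    out=$out$arg
                    break
                    ;;
            esac
        done
        set -- "$@" "$out"
        n=$((n - 1))
    done
    if [ "$found" = yes ]; then
        "$@"                          # text was passed as an argument
    else
        printf '%s' "$text" | "$@"    # no placeholder: text goes to stdin
    fi
}
```

With this sketch, send_text 'some text' goldendict %TEXT% would run goldendict with the text as its argument, while send_text 'some text' cat pipes the text to cat on stdin.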

Force CPU

To force OCR to run on the CPU, add this line to the config file:

echo 'force_cpu=yes' >> ~/.config/transformers_ocr/config
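
Appending with >> adds a new line every time the command is run, so running it twice leaves the key in the file twice. A minimal sketch of a helper that replaces an existing line instead. config_set is a hypothetical name, and the sketch assumes keys contain only letters, digits and underscores.

```shell
# Usage: config_set KEY VALUE FILE
config_set() {
    key=$1
    value=$2
    file=$3
    mkdir -p "$(dirname "$file")"
    touch "$file"
    new_file=$(mktemp)
    # Keep every line except earlier assignments of this key.
    # grep exits with 1 when nothing is left, which is fine here.
    grep -v "^$key=" "$file" > "$new_file" || true
    printf '%s=%s\n' "$key" "$value" >> "$new_file"
    mv "$new_file" "$file"
}
```

For example, config_set force_cpu yes ~/.config/transformers_ocr/config leaves exactly one force_cpu line in the file no matter how often it is run.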

Contributors

latrolage, lunagnuisance, masakichi, nikohonu, tatsumoto-ren
