adaptech-cz / tesseract4android

Fork of tess-two rewritten from scratch to support latest version of Tesseract OCR.

License: Apache License 2.0

Java 1.06% CMake 0.58% C++ 27.59% C 57.54% Makefile 2.30% Shell 4.49% Roff 5.49% M4 0.15% Lua 0.01% PostScript 0.01% HTML 0.32% SAS 0.05% Smalltalk 0.02% WebAssembly 0.05% Assembly 0.06% Module Management System 0.06% DIGITAL Command Language 0.04% Dockerfile 0.01% Batchfile 0.01% Awk 0.15%
android ocr tesseract tesseract-ocr leptonica libpng libjpeg optical-character-recognition tesseract-android

tesseract4android's People

Contributors

fab1ano, robyer

tesseract4android's Issues

TessPdfRenderer not working with jpg files

Hi.

With the alexcohn/tess-two repository, TessPdfRenderer works with JPG files, but with the latest version of this library it does not. The generated PDF apparently contains the text but not the image. The workaround is to create a PNG from the JPG file, which in some situations adds up to 2 seconds of processing. This is quite similar to the old issue (from 2015) found in the original rmtheis repository.

Here is the code that works with 'com.rmtheis:tess-two:9.1.0' but not this library:

TessBaseAPI mTess = new TessBaseAPI();
mTess.setDebug(true);
mTess.init(DATA_PATH, lang);

String pdfOutput = Environment.getExternalStorageDirectory().toString() + "/Download/ocrOutput";
String jpegInput = Environment.getExternalStorageDirectory().toString() + "/Download/jpegInput.jpg";
TessPdfRenderer renderer = new TessPdfRenderer(mTess, pdfOutput);
    
mTess.beginDocument(renderer);

File file = new File(jpegInput);
Pix pix = ReadFile.readFile(file);

boolean addedPageOne = mTess.addPageToDocument(pix, file.getAbsolutePath(), renderer);
Log.e(TAG, "convertImageToSearchablePdf: addedPageOne: " + addedPageOne);

boolean endDocument = mTess.endDocument(renderer);
Log.e(TAG, "convertImageToSearchablePdf: endDocument: " + endDocument );

renderer.recycle();
pix.recycle();

Am I missing something in my code that makes it work with the other library but not with this one?

Crash on onProgressValues when shrinking code

I'm experiencing this crash

java.lang.NoClassDefFoundError: com.googlecode.tesseract.android.TessBaseAPI
	at com.xxx.yyy.MainActivity.a(Unknown Source:291)
	at com.xxx.yyy.a.onMethodCall(Unknown Source:2)
	at io.flutter.plugin.common.MethodChannel$IncomingMethodCallHandler.onMessage(Unknown Source:17)
	at io.flutter.embedding.engine.dart.DartMessenger.handleMessageFromDart(Unknown Source:57)
	at io.flutter.embedding.engine.FlutterJNI.handlePlatformMessage(Unknown Source:4)
	at android.os.MessageQueue.nativePollOnce(Native Method)
	at android.os.MessageQueue.next(MessageQueue.java:326)
	at android.os.Looper.loop(Looper.java:160)
	at android.app.ActivityThread.main(ActivityThread.java:6863)
	at java.lang.reflect.Method.invoke(Native Method)
	at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:537)
	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:858)
Caused by: java.lang.NoSuchMethodError: no non-static method "Lcom/googlecode/tesseract/android/TessBaseAPI;.onProgressValues(IIIIIIIII)V"
	at com.googlecode.tesseract.android.TessBaseAPI.nativeClassInit(Native Method)
	at com.googlecode.tesseract.android.TessBaseAPI.<clinit>(Unknown Source:20)

when building the APK with Flutter. See related Flutter issue here. Flutter shrinks the code, and because onProgressValues looks unused (it is only called from native code), it is removed.
We should either update the ProGuard file to keep it or use the @Keep annotation.

To be clear, this issue is not limited to Flutter. You would have the same problem when compiling a normal Android project that uses Tesseract4Android with code shrinking enabled. It is just more evident for Flutter users now, because Flutter shrinks by default starting from the latest version.

Reference https://developer.android.com/studio/build/shrink-code
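The descriptor in the error, (IIIIIIIII)V, describes a method taking nine int parameters and returning void, and the native side looks it up by name when the class is initialized. A plain-Java analogy of that lookup is sketched below using reflection instead of JNI. The class names and the parameter names are illustrative assumptions, not the library's code; the point is only that a by-name lookup fails once a shrinker has removed a method that no Java code calls.

```java
import java.util.Arrays;

final class KeepDemo {
    // Stands in for a class whose callback is only ever invoked from native code.
    static class WithCallback {
        void onProgressValues(int percent, int left, int right, int top, int bottom,
                              int textLeft, int textRight, int textTop, int textBottom) {
            // Intentionally empty: only the signature matters for the lookup.
        }
    }

    // Stands in for the same class after a shrinker removed the "unused" method.
    static class Shrunk {
    }

    // Looks the callback up by name and parameter types (nine ints).
    static boolean hasCallback(Class<?> type) {
        Class<?>[] nineInts = new Class<?>[9];
        Arrays.fill(nineInts, int.class);
        try {
            type.getDeclaredMethod("onProgressValues", nineInts);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("kept class has callback: " + hasCallback(WithCallback.class));
        System.out.println("shrunk class has callback: " + hasCallback(Shrunk.class));
    }
}
```

In a real project the remedy is the one named above: a keep rule or a @Keep annotation on the method, so the shrinker leaves it in place.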

Thread for running the OCR

Does Tesseract4Android run the OCR on a separate thread rather than the UI thread?
Or is the user responsible for running it on a separate thread?

What is the right approach to using TessBaseAPI with AsyncTask?
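The crash reports elsewhere on this page show nativeGetUTF8Text running on whichever thread called it, so it seems safest to assume the caller has to move the work off the UI thread. Below is a minimal plain-Java sketch of that idea using a single worker thread. The recognizer function is a hypothetical stand-in for the blocking setImage/getUTF8Text calls, and the class name BackgroundOcr is made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class BackgroundOcr implements AutoCloseable {
    // One worker thread: jobs are queued, so a single engine is never used concurrently.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Hypothetical stand-in for the blocking OCR call.
    private final Function<byte[], String> recognizer;

    BackgroundOcr(Function<byte[], String> recognizer) {
        this.recognizer = recognizer;
    }

    // Schedules recognition on the worker thread and returns immediately.
    CompletableFuture<String> recognizeAsync(byte[] image) {
        return CompletableFuture.supplyAsync(() -> recognizer.apply(image), worker);
    }

    @Override
    public void close() {
        // Lets queued jobs finish, then stops the worker thread.
        worker.shutdown();
    }

    public static void main(String[] args) {
        // Fake recognizer: reports which thread it ran on instead of doing OCR.
        try (BackgroundOcr ocr = new BackgroundOcr(
                image -> "recognized " + image.length + " bytes on "
                        + Thread.currentThread().getName())) {
            CompletableFuture<String> pending = ocr.recognizeAsync(new byte[1024]);
            System.out.println("caller thread: " + Thread.currentThread().getName());
            System.out.println(pending.join());
        }
    }
}
```

On Android the result would still have to be handed back to the main thread before touching any views. AsyncTask has been deprecated since API level 30, so an executor like this is one possible replacement.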

tesseract-ocr version is 4.0.0

I imported tesseract4android-2.0.0.aar, but getVersion() reports 4.0.0:

        TessBaseAPI tessBaseAPI = new TessBaseAPI();
        String version = tessBaseAPI.getVersion();
        Log.e("vvv", version); // 4.0.0

Not working with some traineddata files for tesseract 4

Hi guys, great job done! :D

I have used your library for a while and it worked well, until I recently tried it with this traineddata file:
https://github.com/Shreeshrii/tessdata_shreetest/blob/master/fas-minus-float.traineddata

I extracted the mentioned traineddata file, and its .version file says:
4.0.0-beta.1-232-g45a6:fas:minus20180518:from:4.00.00alpha:Arabic:synth20170629

while for the eng.traineddata shipped with this repo, the .version file says:
Pre-4.0.0

Is the version of my fas-minus-float.traineddata right? Can it be used with your library?

I will provide the error thrown on my Android device soon; sorry that I cannot provide it at the moment. I thought the version of my traineddata might not be compatible at all, in which case the error itself would not matter.

How to pass ocr configurations properly

I put my anpr.tessconfig configuration file into /sdcard/tesseract/tessdata. It contains:

load_system_dawg 0
load_freq_dawg 0
tessedit_char_whitelist 0123456789ABGRZSTJPMNO
user_patterns_suffix anpr.user-patterns

anpr.user-patterns is my pattern file (containing only this pattern: \c\d*), placed in the same directory. I tried to pass these settings using the readConfigFile function as follows, but it does not seem to work, and neither does setPageSegMode.

mTess = new TessBaseAPI();
mTess.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_WORD);
mTess.readConfigFile("anpr.tessconfig");
mTess.init(datapath, "eng");

Please note that I also tried to pass the settings using the setVariable() function like this:

mTess.setVariable("load_system_dawg", "false");
mTess.setVariable("load_freq_dawg", "false");
mTess.setVariable("tessedit_char_whitelist", "0123456789ABGRZSTJPMNO");
mTess.setVariable("user_patterns_suffix", "anpr.user-patterns");

but that does not work either.
Any help would be highly appreciated. Thanks!

How can I use Tess.setVariable(whitelist)?

I tried to use setVariable to set a whitelist so that I can filter characters.
I heard that Tesseract 4.0 doesn't support whitelists or blacklists but 4.1.x does, and this project uses Tesseract 4.1.1.
I built the AAR file following your guide, copied it into my project's libs folder and added it as an implementation dependency in my build.gradle. Did I do anything wrong?
If anyone has successfully applied a whitelist, please give me some guidance.
Thanks for the great work.

preserve_interword_spaces available?

Thanks for the great work and updates.
I'm now using Tesseract on Android to recognize Korean characters, and I've been using Tesseract on a PC with Python. Almost everything works as it does on the PC.
But only in Korean, the output text has a space between every character.
I know that -c preserve_interword_spaces can fix this problem. I've searched
http://localhost:63342/nm8ebwyhq7nmjt405kxwyfp053hpsn7y2puzf/imgTest/tesseract4android-2.1.0-javadoc.jar/com/googlecode/tesseract/android/package-summary.html
but there's no mention of that option.
Is it possible to control word spacing on Android? Or can I implement it on my own? (Would that be very difficult?)

Crash on nativeGetUTF8Text

The whole application crashes as soon as the getUTF8Text() method is called. In the debugger, it crashes once I reach nativeGetUTF8Text.
There is nothing obvious in the logs, I can't catch any exception, and when I execute the tess.utF8Text line in the debugger I simply get a VMDisconnectedException.

I've tried this dataset: https://github.com/tesseract-ocr/tessdata/blob/4.0.0/eng.traineddata

I tried multiple different images without success. It happens on the emulator and on a real phone as well.

I am using the code below.

val assetName = "eng.traineddata"
val fileNameOnDevice = "eng.traineddata"
val tess = TessBaseAPI()
val data: File = File(context.application.dataDir, "tessdata")
val traineddataFile = File(data, fileNameOnDevice)
if (!traineddataFile.exists()) {
    data.mkdirs()
    val src = context.getAssets().open(assetName)
    FileUtils.copyToFile(src, traineddataFile)
}

tess.init(context.application.dataDir.absolutePath, "eng")
tess.setImage(File(image.absolutePath))
try {
    // CRASHES HERE !
    val text = tess.utF8Text

    Toast.makeText(context, text, Toast.LENGTH_SHORT).show()
} catch (e: Exception) {
    val bp = 1
}

tess.recycle()

I am using tesseract4android 4.1.1.

Any ideas on how to debug this issue?

Some phones crash after a system update

Some Sony and Samsung users report that recognition crashes after recent system upgrades.
The affected devices cover nearly all phones from the last 1~2 years.
I don't know how to fix it.

TessBaseAPI api = new TessBaseAPI();
api.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_LINE);
api.init(context.getCacheDir().getAbsolutePath() + "/", "jpn");

api.setImage(bitmap);
api.setRectangle(rect);
String result = api.getUTF8Text();

tessdata: tessdata_fast

Here is the report from Google Play:

Error type 1

System version:

  • 12 (78.6%)
  • 11 (20.2%)
  • 12L (1.2%)

Devices

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 0, tid: 0 >>> net.package <<<

backtrace:
  #00  pc 0000000000051b20  /apex/com.android.runtime/lib64/bionic/libc.so (abort+168)
  #00  pc 000000000018e000  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/split_config.arm64_v8a.apk!libtesseract.so (ERRCODE::error(char const*, TessErrorLogCode, char const*, ...) const+376)
  #00  pc 0000000000119c5c  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/split_config.arm64_v8a.apk!libtesseract.so (tesseract::Tesseract::SegmentPage(STRING const*, BLOCK_LIST*, tesseract::Tesseract*, OSResults*)+136)
  #00  pc 00000000000e2318  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/split_config.arm64_v8a.apk!libtesseract.so (tesseract::TessBaseAPI::FindLines()+652)
  #00  pc 00000000000e27f4  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/split_config.arm64_v8a.apk!libtesseract.so (tesseract::TessBaseAPI::Recognize(ETEXT_DESC*)+56)
  #00  pc 00000000000e15cc  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/split_config.arm64_v8a.apk!libtesseract.so (tesseract::TessBaseAPI::GetUTF8Text()+60)
  #00  pc 00000000002a3c48  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/split_config.arm64_v8a.apk!libtesseract.so (Java_com_googlecode_tesseract_android_TessBaseAPI_nativeGetUTF8Text+64)
  #00  pc 000000000006c594  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/oat/arm64/base.odex (art_jni_trampoline+100)
  #00  pc 0000000000101518  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/oat/arm64/base.odex (com.googlecode.tesseract.android.TessBaseAPI.b+56)
  #00  pc 000000000014e490  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/oat/arm64/base.odex (net.package.b.a.c+2400)
  #00  pc 00000000001507d8  /data/app/~~EFTlK7iTvvXOP0nicI2Jvw==/net.package-SpUGjz22JpoYWPdXxVgDbA==/oat/arm64/base.odex (net.package.b.c.run+552)
  #00  pc 00000000001bf19c  /apex/com.android.art/javalib/arm64/boot.oat (java.lang.Thread.run+76)
  #00  pc 00000000002ca764  /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+548)
  #00  pc 000000000030e980  /apex/com.android.art/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+156)
  #00  pc 00000000003c1db4  /apex/com.android.art/lib64/libart.so (art::JValue art::InvokeVirtualOrInterfaceWithJValues<art::ArtMethod*>(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, art::ArtMethod*, jvalue const*)+380)
  #00  pc 00000000004578ec  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+992)
  #00  pc 00000000000b6e44  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+264)
  #00  pc 0000000000053454  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+68)

Error type 2

System version:

  • 11 (99.6%)
  • 12 (0.4%)

Devices

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 0, tid: 0 >>> net.package <<<

backtrace:
  #00  pc 000000000004e40c  /apex/com.android.runtime/lib64/bionic/libc.so (abort+164)
  #00  pc 000000000018e000  /data/app/~~-7k4zfCJeQZLILrkkybH2g==/net.package-24GS5V1rQeWBo82xzjDSqA==/split_config.arm64_v8a.apk!lib/arm64-v8a/libtesseract.so (offset 0x8d1000) (ERRCODE::error(char const*, TessErrorLogCode, char const*, ...) const+376)
  #00  pc 0000000000119c5c  /data/app/~~-7k4zfCJeQZLILrkkybH2g==/net.package-24GS5V1rQeWBo82xzjDSqA==/split_config.arm64_v8a.apk!lib/arm64-v8a/libtesseract.so (offset 0x8d1000) (tesseract::Tesseract::SegmentPage(STRING const*, BLOCK_LIST*, tesseract::Tesseract*, OSResults*)+136)
  #00  pc 00000000000e2318  /data/app/~~-7k4zfCJeQZLILrkkybH2g==/net.package-24GS5V1rQeWBo82xzjDSqA==/split_config.arm64_v8a.apk!lib/arm64-v8a/libtesseract.so (offset 0x8d1000) (tesseract::TessBaseAPI::FindLines()+652)
  #00  pc 00000000000e27f4  /data/app/~~-7k4zfCJeQZLILrkkybH2g==/net.package-24GS5V1rQeWBo82xzjDSqA==/split_config.arm64_v8a.apk!lib/arm64-v8a/libtesseract.so (offset 0x8d1000) (tesseract::TessBaseAPI::Recognize(ETEXT_DESC*)+56)
  #00  pc 00000000000e15cc  /data/app/~~-7k4zfCJeQZLILrkkybH2g==/net.package-24GS5V1rQeWBo82xzjDSqA==/split_config.arm64_v8a.apk!lib/arm64-v8a/libtesseract.so (offset 0x8d1000) (tesseract::TessBaseAPI::GetUTF8Text()+60)
  #00  pc 00000000002a3c48  /data/app/~~-7k4zfCJeQZLILrkkybH2g==/net.package-24GS5V1rQeWBo82xzjDSqA==/split_config.arm64_v8a.apk!lib/arm64-v8a/libtesseract.so (offset 0x8d1000) (Java_com_googlecode_tesseract_android_TessBaseAPI_nativeGetUTF8Text+64)
  #00  pc 000000000006d6a4  /data/app/~~-7k4zfCJeQZLILrkkybH2g==/net.package-24GS5V1rQeWBo82xzjDSqA==/oat/arm64/base.odex (art_jni_trampoline+132)
  #00  pc 00000000000710a4  /data/app/~~-7k4zfCJeQZLILrkkybH2g==/net.package-24GS5V1rQeWBo82xzjDSqA==/oat/arm64/base.odex (net.package.b.a.c+2564)
  #00  pc 0000000000072d20  /data/app/~~-7k4zfCJeQZLILrkkybH2g==/net.package-24GS5V1rQeWBo82xzjDSqA==/oat/arm64/base.odex (net.package.b.c.run+752)
  #00  pc 000000000015ab38  /apex/com.android.art/javalib/arm64/boot.oat (java.lang.Thread.run+72)
  #00  pc 0000000000133564  /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+548)
  #00  pc 00000000001a8a78  /apex/com.android.art/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+200)
  #00  pc 0000000000554cac  /apex/com.android.art/lib64/libart.so (art::JValue art::InvokeVirtualOrInterfaceWithJValues<art::ArtMethod*>(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, art::ArtMethod*, jvalue const*)+460)
  #00  pc 00000000005a4048  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+1308)
  #00  pc 00000000000b0048  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+64)
  #00  pc 00000000000503c8  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64)

Crash "nativeGetUTF8Text" in unit test after 30min of running

First, thank you for this library 👍
I am using it for a long-running unit test (extracting 100+ documents sequentially), and it crashes after about 30 minutes of running:

2019-05-07 09:02:46.794 14237-14272/de.minirechnung.devel I/Tesseract(native): Initialized Tesseract API with language=eng
    
    --------- beginning of crash
2019-05-07 09:03:00.058 14237-14272/de.minirechnung.devel A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 14272 (roidJUnitRunner)
2019-05-07 09:03:00.118 990-990/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2019-05-07 09:03:00.118 990-990/? A/DEBUG: Build fingerprint: 'samsung/dream2qltezh/dream2qltechn:7.1/N2G48H/G9550ZHU1AQEE:user/release-keys'
2019-05-07 09:03:00.118 990-990/? A/DEBUG: Revision: '12'
2019-05-07 09:03:00.118 990-990/? A/DEBUG: ABI: 'x86'
2019-05-07 09:03:00.118 990-990/? A/DEBUG: pid: 14237, tid: 14272, name: roidJUnitRunner  >>> de.minirechnung.devel <<<
2019-05-07 09:03:00.119 990-990/? A/DEBUG: signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
2019-05-07 09:03:00.119 990-990/? A/DEBUG:     eax 00000000  ebx 0000379d  ecx 000037c0  edx 00000006
2019-05-07 09:03:00.119 990-990/? A/DEBUG:     esi 999fa978  edi 999fa920
2019-05-07 09:03:00.119 990-990/? A/DEBUG:     xcs 00000073  xds 0000007b  xes 0000007b  xfs 0000003b  xss 0000007b
2019-05-07 09:03:00.119 990-990/? A/DEBUG:     eip b7662bf0  ebp 999f2028  esp 999f1fcc  flags 00000292
2019-05-07 09:03:00.121 990-990/? A/DEBUG: backtrace:
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #00 pc 00000bf0  [vdso:b7662000] (__kernel_vsyscall+16)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #01 pc 0007cadc  /system/lib/libc.so (tgkill+28)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #02 pc 000782b5  /system/lib/libc.so (pthread_kill+85)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #03 pc 00028a2a  /system/lib/libc.so (raise+42)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #04 pc 0001eed6  /system/lib/libc.so (abort+86)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #05 pc 0016b50a  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZNK7ERRCODE5errorEPKc16TessErrorLogCodeS1_z+266)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #06 pc 001bfb4e  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZNK9tesseract4Dict7case_okERK11WERD_CHOICERK10UNICHARSET+142)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #07 pc 001ceb5b  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZNK9tesseract4Dict16AcceptableResultEP8WERD_RES+443)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #08 pc 0010b2f0  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZN9tesseract9Tesseract20tess_acceptable_wordEP8WERD_RES+48)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #09 pc 000cd630  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZN9tesseract9Tesseract17match_word_pass_nEiP8WERD_RESP3ROWP5BLOCK+256)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #10 pc 000cd313  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZN9tesseract9Tesseract19classify_word_pass1ERKNS_8WordDataEPP8WERD_RESPNS_13PointerVectorIS4_EE+403)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #11 pc 000ca299  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZN9tesseract9Tesseract17RetryWithLanguageERKNS_8WordDataEMS0_FvS3_PP8WERD_RESPNS_13PointerVectorIS4_EEEbS6_S9_+233)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #12 pc 000c49eb  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZN9tesseract9Tesseract26classify_word_and_languageEiP11PAGE_RES_ITPNS_8WordDataE+491)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #13 pc 000c58b5  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZN9tesseract9Tesseract18RecogAllWordsPassNEiP10ETEXT_DESCP11PAGE_RES_ITP13GenericVectorINS_8WordDataEE+757)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #14 pc 000c6d98  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZN9tesseract9Tesseract15recog_all_wordsEP8PAGE_RESP10ETEXT_DESCPK4TBOXPKci+456)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #15 pc 000ae902  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZN9tesseract11TessBaseAPI9RecognizeEP10ETEXT_DESC+1154)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #16 pc 000acf7c  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (_ZN9tesseract11TessBaseAPI11GetUTF8TextEv+76)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #17 pc 002c2b5a  /data/app/de.minirechnung.devel-2/lib/x86/libtesseract.so (Java_com_googlecode_tesseract_android_TessBaseAPI_nativeGetUTF8Text+74)
2019-05-07 09:03:00.122 990-990/? A/DEBUG:     #18 pc 00116157  /system/lib/libart.so

TessBaseAPI init failed.

Hi.
I called TessBaseAPI.init after copying eng.traineddata, but the return value is false.
Do you know what the problem is?
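One thing that may be worth checking is the directory layout. The Kotlin snippet in the nativeGetUTF8Text issue on this page copies the file into a tessdata subfolder and passes the parent directory to init. The plain-Java sketch below assumes that same layout. The helper name TessDataInstaller is hypothetical and the bytes in main are placeholders, not a real model, so this shows only the file handling and is not a confirmed fix for the false return value.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TessDataInstaller {
    // Copies a traineddata stream to <dataDir>/tessdata/<language>.traineddata
    // and returns dataDir, the parent directory that would be passed to init.
    static Path install(Path dataDir, String language, InputStream source) {
        try {
            Path tessdata = Files.createDirectories(dataDir.resolve("tessdata"));
            Path target = tessdata.resolve(language + ".traineddata");
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            if (Files.size(target) == 0) {
                throw new IOException("copied file is empty: " + target);
            }
            return dataDir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dataDir = Files.createTempDirectory("ocr-data");
        byte[] fakeModel = {1, 2, 3}; // placeholder bytes, not a real model
        Path parent = install(dataDir, "eng", new ByteArrayInputStream(fakeModel));
        System.out.println("path for init: " + parent);
        System.out.println("model file: " + parent.resolve("tessdata").resolve("eng.traineddata"));
    }
}
```

With this layout the language argument ("eng") matches the file name prefix, and the path given to init is the directory that contains tessdata, not tessdata itself.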

confusing rootproject.libraryVersion

Dear Developer,

The rootProject.ext.libraryVersion is confusing when we use more libraries.
I suggest to rename it to tesseractLibraryVersion, or move this constant from the project level to module level.

Thanks
Gabor

local implementation very slow

Hello,
Thank you for the sample.
I have built the Tesseract project and the sample.

When I change the dependency from the first option:
// Use library from JitPack for simplicity
implementation 'cz.adaptech:tesseract4android:4.1.1'
// Or use library compiled locally
//implementation project(':tesseract4android')

to the second option:
// Use library from JitPack for simplicity
//implementation 'cz.adaptech:tesseract4android:4.1.1'
// Or use library compiled locally
implementation project(':tesseract4android')

and run the sample, it takes 30 seconds to recognize the text, compared with only about 2 seconds with the first option, on the same smartphone.

Sincerely,

Couldn't find "libleptonica.so"

Good afternoon.
I'm trying to link this library into a .jar project that runs directly without installation. I'm getting the following error on initialization:
"
java.lang.UnsatisfiedLinkError: dalvik.system.PathClassLoader[DexPathList[[zip file "/data/local/tmp/appname.jar"],nativeLibraryDirectories=[/system/lib, /product/lib, /system/lib, /product/lib]]] couldn't find "libleptonica.so"
at java.lang.Runtime.loadLibrary0(Runtime.java:1012)
at java.lang.System.loadLibrary(System.java:1672)
at com.googlecode.tesseract.android.TessBaseAPI.<clinit>(TessBaseAPI.java:57)
"
Any ideas how to fix this?

Expose Deskew from Skew

Currently only findSkew is exposed in Skew.java. Deskewing is equally important but is not exposed by this library. Please make the following functions available:

PIX * pixDeskew (PIX *pixs, l_int32 redsearch)
PIX * pixFindSkewAndDeskew (PIX *pixs, l_int32 redsearch, l_float32 *pangle, l_float32 *pconf)

SIGSEGV when calling getUTF8Text with custom traineddata

First of all, I would like to say thank you for your work and efforts on this project.

As the title says, we get a SIGSEGV signal when calling getUTF8Text(). This happens when we use our own traineddata, trained with the OCR-D repository.

When this custom traineddata is used from the command line (Tesseract version 4.0.0beta), it works fine.
Does anyone have an idea why this is the case?

simple app calling the library

Hello,
Could you provide a very simple app that calls the library (for example with an image stored in the res resources) and gets the text?
That way we would have all the compiler options and parameters in a ready-to-use app.
Sincerely,

Tesseract OCR: less memory consumption by avoiding new instances of TessBaseAPI?

Not really an issue but I thought about asking here since I don't know exactly how this API wrapper works.

Basically I make a new instance of MyOCR whenever I need to perform OCR.
This is currently what my constructor looks like:

public MyOCR(Bitmap bitmap)
	{
		this.tessBaseAPI = new TessBaseAPI();
		this.tessBaseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO_ONLY);
		try {
			this.tessBaseAPI.setDebug(true);
			this.tessBaseAPI.init("storage/emulated/0", "eng");
			this.tessBaseAPI.setImage(bitmap);
			this.text = tessBaseAPI.getUTF8Text();
			this.tessBaseAPI.end();
		} catch (Exception e) {
			e.printStackTrace();
			System.err.println(e.getMessage());
		}
	}

I was wondering whether, performance-wise, the following code would be preferable in the long run.
Here I create only one instance of MyOCR and set a new image every time I need to perform OCR.

public MyOCR()
{
	this.tessBaseAPI = new TessBaseAPI();
	this.tessBaseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO_ONLY);
	try {
		this.tessBaseAPI.setDebug(true);
		this.tessBaseAPI.init("storage/emulated/0", "eng");
	} catch (Exception e) {
		e.printStackTrace();
		System.err.println(e.getMessage());
	}
}

public void ocrTask(Bitmap bitmap)
{
	this.tessBaseAPI.setImage(bitmap);
	this.text = tessBaseAPI.getUTF8Text();
}
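The second variant can be sketched in plain Java as "initialize once, recognize many times, release once". The Engine interface below is a hypothetical stand-in for the TessBaseAPI calls used in the question (init, setImage plus getUTF8Text, and end), so the sketch says nothing about actual memory use of the library. The synchronized methods rest on the assumption that one native instance should not be used from two threads at the same time.

```java
final class ReusableOcr implements AutoCloseable {
    /** Hypothetical stand-in for the engine calls used in the question. */
    interface Engine {
        boolean init(String dataPath, String language);

        String recognize(byte[] image); // setImage + getUTF8Text in the real API

        void release(); // end() in the question's code
    }

    private final Engine engine;
    private boolean released;

    // Pays the initialization cost exactly once.
    ReusableOcr(Engine engine, String dataPath, String language) {
        if (!engine.init(dataPath, language)) {
            throw new IllegalStateException("engine init failed for language " + language);
        }
        this.engine = engine;
    }

    // Reuses the already initialized engine for each new image.
    synchronized String ocrTask(byte[] image) {
        if (released) {
            throw new IllegalStateException("engine already released");
        }
        return engine.recognize(image);
    }

    // Releases the engine once; further calls are ignored.
    @Override
    public synchronized void close() {
        if (!released) {
            released = true;
            engine.release();
        }
    }

    public static void main(String[] args) {
        Engine fake = new Engine() {
            public boolean init(String dataPath, String language) { return true; }
            public String recognize(byte[] image) { return "text from " + image.length + " bytes"; }
            public void release() { System.out.println("released"); }
        };
        try (ReusableOcr ocr = new ReusableOcr(fake, "storage/emulated/0", "eng")) {
            System.out.println(ocr.ocrTask(new byte[10]));
            System.out.println(ocr.ocrTask(new byte[20]));
        }
    }
}
```

Compared with the first constructor in the question, this keeps a single engine alive between calls, at the price of having to release it explicitly when the owner is done.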

duplicate class

The following error appears when building the app:
java.lang.RuntimeException: Duplicate class com.googlecode.leptonica.android.AdaptiveMap found in modules tesseract4android-4.1.0-runtime.jar (cz.adaptech.tesseract4android:tesseract4android:4.1.0) and tesseract4android-openmp-4.1.0-runtime.jar (cz.adaptech.tesseract4android:tesseract4android-openmp:4.1.0)

Here is the build.gradle:

dependencies {
    implementation fileTree(dir: 'libs', include: ['*.jar'])
    implementation 'androidx.appcompat:appcompat:1.0.2'
    implementation 'androidx.constraintlayout:constraintlayout:1.1.3'
    testImplementation 'junit:junit:4.12'
    androidTestImplementation 'androidx.test.ext:junit:1.1.0'
    androidTestImplementation 'androidx.test.espresso:espresso-core:3.1.1'
    implementation project(path: ':opencv')
    implementation ('cz.adaptech:tesseract4android:4.1.0')
    // implementation 'cz.adaptech:tesseract4android-openmp:4.1.0'
}

Could not initialize Tesseract API with language

I used getExternalFilesDir("/testOCR/") on Android 12 to obtain external storage, which should not require read and write permissions. I copied the language data files from assets to this directory successfully, but I still get "Could not initialize Tesseract API with language" and it does not work.

pixReadMemTiff - function not present

Hello,
after building this library and invoking it on the device I get the following error:

Error in pixReadMemTiff: function not present
Error in pixReadMem: tiff: no pix returned
Error in pixaGenerateFontFromString: pix not made
Error in bmfCreate: font pixa not made

what can be done about it?

Guide for getting it running?

Hey!

I saw the previous "installed, now what" issue, but I'm still not able to get it running. Is it possible to get a step-by-step installation walk-through?

SegFault in ReadFile.nativeReadBytes8

Hi,
I encountered a null pointer dereference in ReadFile.nativeReadBytes8 (Java_com_googlecode_leptonica_android_ReadFile_nativeReadBytes8, here).

Whenever either pixCreateNoInit or pixSetupByteProcessing returns a null pointer, a null pointer dereference occurs at memcpy (here). This can be triggered by an integer overflow in ReadFile.readBytes8, in the multiplication of the width and height parameters, which leads to pixCreateNoInit returning NULL (due to a check in pixCreateHeader, called by pixCreateNoInit).

In particular, this call gives me a SIGSEGV:

byte[] pixelData = "Some String".getBytes();
Pix p = ReadFile.readBytes8(pixelData, 0x10000, 0x10000);

To my knowledge these stubs are originally from Google. Since the Google project seems unmaintained and the tess-two project is also no longer maintained by its author, I think it makes sense to add checks in the wrapper files here.

Pull request #33 adds checks that prevent the crash from happening.
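The overflow itself is easy to reproduce in plain Java: 0x10000 * 0x10000 equals 2^32, which wraps to 0 in 32-bit int arithmetic. The sketch below shows one way a Java-side guard could look, assuming one byte per pixel for an 8 bpp buffer. The helper checkedPixelCount is hypothetical and is not the code from pull request #33.

```java
final class PixelBufferCheck {
    // Validates the dimensions against the buffer before anything is handed to native code.
    // Assumes 8 bits per pixel, that is one byte per pixel.
    static int checkedPixelCount(byte[] pixelData, int width, int height) {
        if (pixelData == null) {
            throw new IllegalArgumentException("pixelData must not be null");
        }
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("width and height must be positive");
        }
        // Widen to long before multiplying so the product cannot wrap around.
        long pixels = (long) width * height;
        if (pixels > pixelData.length) {
            throw new IllegalArgumentException(
                    "buffer holds " + pixelData.length + " bytes but " + pixels + " are needed");
        }
        return (int) pixels; // safe: bounded by the array length
    }

    public static void main(String[] args) {
        int width = 0x10000;
        int height = 0x10000;
        System.out.println("int product:  " + (width * height));
        System.out.println("long product: " + ((long) width * height));
        try {
            checkedPixelCount("Some String".getBytes(), width, height);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting the call with an exception on the Java side keeps a bad size from turning into a native crash, which a try/catch in the app cannot intercept.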

Bug in the documentation

	/**
	 * Calls End() and finalizes native data. Must be called on object destruction.
	 */
	private native void nativeRecycle(long mNativeData);

In TessBaseAPI, replace "end" with "recycle" in this comment.

Duplicate classes during build

I am getting the following duplicate class issues during compilation.

By using the latest version of Tesseract4Android and

    implementation 'androidx.appcompat:appcompat:1.3.0'
    implementation 'com.google.android.material:material:1.3.0'

I get these:

Duplicate class kotlin.internal.jdk7.JDK7PlatformImplementations found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk7-1.3.21 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.3.21)
Duplicate class kotlin.jdk7.AutoCloseableKt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk7-1.3.21 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.3.21)

By using the latest version of Tesseract4Android and

    implementation 'androidx.appcompat:appcompat:1.6.1'
    implementation 'com.google.android.material:material:1.9.0'

I get these:

Duplicate class kotlin.collections.jdk8.CollectionsJDK8Kt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.internal.jdk7.JDK7PlatformImplementations found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk7-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.6.0)
Duplicate class kotlin.internal.jdk8.JDK8PlatformImplementations found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.io.path.ExperimentalPathApi found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk7-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.6.0)
Duplicate class kotlin.io.path.PathRelativizer found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk7-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.6.0)
Duplicate class kotlin.io.path.PathsKt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk7-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.6.0)
Duplicate class kotlin.io.path.PathsKt__PathReadWriteKt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk7-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.6.0)
Duplicate class kotlin.io.path.PathsKt__PathUtilsKt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk7-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.6.0)
Duplicate class kotlin.jdk7.AutoCloseableKt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk7-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.6.0)
Duplicate class kotlin.jvm.jdk8.JvmRepeatableKt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.random.jdk8.PlatformThreadLocalRandom found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.streams.jdk8.StreamsKt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.streams.jdk8.StreamsKt$asSequence$$inlined$Sequence$1 found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.streams.jdk8.StreamsKt$asSequence$$inlined$Sequence$2 found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.streams.jdk8.StreamsKt$asSequence$$inlined$Sequence$3 found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.streams.jdk8.StreamsKt$asSequence$$inlined$Sequence$4 found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.text.jdk8.RegexExtensionsJDK8Kt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)
Duplicate class kotlin.time.jdk8.DurationConversionsJDK8Kt found in modules jetified-kotlin-stdlib-1.8.0 (org.jetbrains.kotlin:kotlin-stdlib:1.8.0) and jetified-kotlin-stdlib-jdk8-1.6.0 (org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.6.0)

Jitpack or maven repo

Thanks for the great work.
I used this library as a submodule in Trivia Hack (https://github.com/SubhamTyagi/loco-answers) and it works fine. It would be more helpful for others, though, if you could publish this library as a Maven (or other type of) dependency for Android, like tess-two.

Thanks again for this amazing work.

UnsatisfiedLinkError when trying to create new instance of TessBaseAPI

This is a weird error because it doesn't happen every time (it is actually quite rare), so it's difficult to debug.
This is the line where the error occurs:

TessBaseAPI tessBaseApi = new TessBaseAPI();

The error log:

E/linker: package com.app.myapp: library "/system/lib64/libjpeg.so" ("/system/lib64/libjpeg.so") needed or dlopened by "/system/lib64/libnativeloader.so" is not accessible for the namespace: [name="classloader-namespace", ld_library_paths="", default_library_paths="/data/app/com.app.myapp-4EcKvX8ZmvEUrqVJAF20Dg==/lib/arm64:/data/app/com.app.myapp-4EcKvX8ZmvEUrqVJAF20Dg==/base.apk!/lib/arm64-v8a", permitted_paths="/data:/mnt/expand:/mnt/asec:/data/data/com.app.myapp"]

D/AndroidRuntime: Shutting down VM
E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.app.myapp, PID: 6393
java.lang.UnsatisfiedLinkError: dlopen failed: library "/system/lib64/libjpeg.so" needed or dlopened by "/system/lib64/libnativeloader.so" is not accessible for the namespace "classloader-namespace"
at java.lang.Runtime.loadLibrary0(Runtime.java:1016)
at java.lang.System.loadLibrary(System.java:1657)
at com.googlecode.tesseract.android.TessBaseAPI.(TessBaseAPI.java:52)
at com.app.myapp.utils.UtilsOCR.getTessBaseAPI(UtilsOCR.java:257)
at com.app.myapp.ocr.OCRTextEvaluator.init(OCRTextEvaluator.java:381)
at com.app.myapp.ocr.OCRTextEvaluator.(OCRTextEvaluator.java:48)
at com.app.myapp.helper.NotebookWriter.init(NotebookWriter.java:530)
at com.app.myapp.helper.NotebookWriter.(NotebookWriter.java:89)
at com.app.myapp.exercises.writing.fillblankspen.FillBlanksPenFragment.init(FillBlanksPenFragment.java:382)
at com.app.myapp.exercises.writing.fillblankspen.FillBlanksPenFragment.onCreateView(FillBlanksPenFragment.java:77)
at androidx.fragment.app.Fragment.performCreateView(Fragment.java:2600)
at androidx.fragment.app.FragmentManagerImpl.moveToState(FragmentManagerImpl.java:881)
at androidx.fragment.app.FragmentManagerImpl.moveFragmentToExpectedState(FragmentManagerImpl.java:1238)
at androidx.fragment.app.FragmentManagerImpl.moveToState(FragmentManagerImpl.java:1303)
at androidx.fragment.app.BackStackRecord.executeOps(BackStackRecord.java:439)
at androidx.fragment.app.FragmentManagerImpl.executeOps(FragmentManagerImpl.java:2079)
at androidx.fragment.app.FragmentManagerImpl.executeOpsTogether(FragmentManagerImpl.java:1869)
at androidx.fragment.app.FragmentManagerImpl.removeRedundantOperationsAndExecute(FragmentManagerImpl.java:1824)
at androidx.fragment.app.FragmentManagerImpl.execPendingActions(FragmentManagerImpl.java:1727)
at androidx.fragment.app.FragmentManagerImpl.dispatchStateChange(FragmentManagerImpl.java:2663)
at androidx.fragment.app.FragmentManagerImpl.dispatchActivityCreated(FragmentManagerImpl.java:2613)
at androidx.fragment.app.Fragment.performActivityCreated(Fragment.java:2624)
at androidx.fragment.app.FragmentManagerImpl.moveToState(FragmentManagerImpl.java:904)
at androidx.fragment.app.FragmentManagerImpl.moveFragmentToExpectedState(FragmentManagerImpl.java:1238)
at androidx.fragment.app.FragmentManagerImpl.moveToState(FragmentManagerImpl.java:1303)
at androidx.fragment.app.BackStackRecord.executeOps(BackStackRecord.java:439)
at androidx.fragment.app.FragmentManagerImpl.executeOps(FragmentManagerImpl.java:2079)
at androidx.fragment.app.FragmentManagerImpl.executeOpsTogether(FragmentManagerImpl.java:1869)
at androidx.fragment.app.FragmentManagerImpl.removeRedundantOperationsAndExecute(FragmentManagerImpl.java:1824)
at androidx.fragment.app.FragmentManagerImpl.execPendingActions(FragmentManagerImpl.java:1727)
at androidx.fragment.app.FragmentManagerImpl$2.run(FragmentManagerImpl.java:150)
at android.os.Handler.handleCallback(Handler.java:789)
at android.os.Handler.dispatchMessage(Handler.java:98)
at android.os.Looper.loop(Looper.java:251)
at android.app.ActivityThread.main(ActivityThread.java:6572)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:240)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:767)
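
Since the failure above is intermittent, one defensive option is to catch the linkage failure so the app can fall back instead of crashing. The following is only a sketch under that assumption: it does not address why the linker resolves /system/lib64/libjpeg.so, and the class name NativeInitGuard and the method tryCreate are hypothetical helpers, not part of the library.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Sketch: degrade gracefully when a native-backed object cannot be created. */
final class NativeInitGuard {
    private NativeInitGuard() {}

    /**
     * Runs the factory and returns its result, or Optional.empty() if the
     * runtime reports a linkage problem. LinkageError is the common superclass
     * of UnsatisfiedLinkError (a native library failed to load) and
     * NoClassDefFoundError (thrown on later uses of a class whose static
     * initializer has already failed).
     */
    static <T> Optional<T> tryCreate(Supplier<T> factory) {
        try {
            return Optional.ofNullable(factory.get());
        } catch (LinkageError e) {
            // Log and let the caller decide how to continue without OCR.
            System.err.println("Native initialization failed: " + e);
            return Optional.empty();
        }
    }
}
```

In the code from the report, the call site would become something like NativeInitGuard.tryCreate(TessBaseAPI::new), with the caller disabling the OCR feature when the result is empty.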

Missing PageSegMode 13

I noticed that the TessBaseAPI class is missing PageSegMode 13 (PSM_RAW_LINE), which is used with the new LSTM engine to OCR an image of a single text line.
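
For reference, the sketch below lists the page segmentation mode values as they appear in Tesseract's PageSegMode enum, including 13. The class PageSegModes is a hypothetical standalone helper, not part of this library. It assumes that the wrapper's setPageSegMode takes a plain int, in which case passing 13 directly may work as a stopgap; that has not been verified here.

```java
/** Page segmentation mode values, mirroring Tesseract's PageSegMode enum. */
final class PageSegModes {
    private PageSegModes() {}

    static final int PSM_OSD_ONLY = 0;               // orientation and script detection only
    static final int PSM_AUTO_OSD = 1;               // automatic segmentation with OSD
    static final int PSM_AUTO_ONLY = 2;              // automatic segmentation, no OSD or OCR
    static final int PSM_AUTO = 3;                   // fully automatic segmentation, no OSD
    static final int PSM_SINGLE_COLUMN = 4;          // single column of text of variable sizes
    static final int PSM_SINGLE_BLOCK_VERT_TEXT = 5; // single block of vertically aligned text
    static final int PSM_SINGLE_BLOCK = 6;           // single uniform block of text
    static final int PSM_SINGLE_LINE = 7;            // single text line
    static final int PSM_SINGLE_WORD = 8;            // single word
    static final int PSM_CIRCLE_WORD = 9;            // single word in a circle
    static final int PSM_SINGLE_CHAR = 10;           // single character
    static final int PSM_SPARSE_TEXT = 11;           // sparse text, no particular order
    static final int PSM_SPARSE_TEXT_OSD = 12;       // sparse text with OSD
    static final int PSM_RAW_LINE = 13;              // single text line, bypassing Tesseract-specific hacks

    private static final String[] NAMES = {
        "PSM_OSD_ONLY", "PSM_AUTO_OSD", "PSM_AUTO_ONLY", "PSM_AUTO",
        "PSM_SINGLE_COLUMN", "PSM_SINGLE_BLOCK_VERT_TEXT", "PSM_SINGLE_BLOCK",
        "PSM_SINGLE_LINE", "PSM_SINGLE_WORD", "PSM_CIRCLE_WORD",
        "PSM_SINGLE_CHAR", "PSM_SPARSE_TEXT", "PSM_SPARSE_TEXT_OSD", "PSM_RAW_LINE"
    };

    /** Returns the symbolic name for a mode value, or throws if it is out of range. */
    static String nameOf(int mode) {
        if (mode < 0 || mode >= NAMES.length) {
            throw new IllegalArgumentException("Unknown page segmentation mode: " + mode);
        }
        return NAMES[mode];
    }
}
```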

PSM_SINGLE_BLOCK_VERT_TEXT doesn't work for Japanese

I'm getting some pretty nonsensical results when I try to use PSM_SINGLE_BLOCK_VERT_TEXT with Japanese text. Back when I used tess-two instead of this library, it seemed to work. I'm using jpn.traineddata and jpn_vert.traineddata from https://github.com/tesseract-ocr/tessdata_best. The way I'm initializing the APIs is here: https://github.com/0xbad1d3a5/Kaku/blob/master/app/src/main/java/ca/fuwafuwa/kaku/Ocr/OcrRunnable.kt

image

But yeah, I'm not entirely sure what's wrong here. Any hints/tips on how to debug this issue? Thanks!

Error with initialize tessdata

Hey, I am getting the error "couldn't initialize tesseract api with language". I have tried many solutions, even creating a "tessdata" subdirectory, but it still doesn't work.
Error
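
A quick way to narrow this kind of error down is to check the file layout before calling init. The sketch below assumes the tess-two convention that the data path passed to init is the parent of a "tessdata" directory containing <language>.traineddata, and that several languages are joined with "+". TessdataCheck and its methods are hypothetical helpers, not library API.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Sketch: verify the tessdata layout before initializing the OCR engine. */
final class TessdataCheck {
    private TessdataCheck() {}

    /** Where the model for one language is expected: dataPath/tessdata/<language>.traineddata. */
    static File expectedModelFile(File dataPath, String language) {
        return new File(new File(dataPath, "tessdata"), language + ".traineddata");
    }

    /**
     * Returns the expected model files that are missing or unreadable.
     * The languages argument may name several models, e.g. "eng+jpn".
     */
    static List<File> missingModels(File dataPath, String languages) {
        List<File> missing = new ArrayList<>();
        for (String language : languages.split("\\+")) {
            File model = expectedModelFile(dataPath, language);
            if (!model.isFile() || !model.canRead()) {
                missing.add(model);
            }
        }
        return missing;
    }
}
```

If missingModels returns a non-empty list, the paths it contains show exactly where the engine would look, which usually reveals a misplaced file or a data path that points at the tessdata directory itself rather than at its parent.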

Crash on initialization on some devices

Hello. I use the compiled version of the library in my application. When I run the app on my Google Pixel 6a, everything works great. But if I run it on a weaker device or on a device with an older version of Android, it crashes with the following error: A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 12321. Tested on two devices: Xiaomi Redmi Note 10 Pro (Android 11) and Xiaomi Mi 5 (Android 8). All attempts to fix the error have been to no avail.
image

Detail log:

22971-22971 libc com.example.forblitz.livestatistics A Fatal signal 6 (SIGABRT), code -6 in tid 22971 (.livestatistics)
2023-06-09 02:16:51.102 23086-23086 DEBUG pid-23086 A pid: 22971, tid: 22971, name: .livestatistics >>> com.example.forblitz.livestatistics <<<
2023-06-09 02:16:51.119 23086-23086 DEBUG pid-23086 A #2 pc 000000000021cfc4 /data/app/com.example.forblitz.livestatistics-SH0kovyRsnozQwxSJhbXWg==/lib/arm64/libtesseract.so (_ZNK9tesseract7ERRCODE5errorEPKcNS_16TessErrorLogCodeES2_z+368)
2023-06-09 02:16:51.119 23086-23086 DEBUG pid-23086 A #3 pc 0000000000232cd8 /data/app/com.example.forblitz.livestatistics-SH0kovyRsnozQwxSJhbXWg==/lib/arm64/libtesseract.so (_ZN9tesseract8Classify22InitAdaptiveClassifierEPNS_15TessdataManagerE+420)
2023-06-09 02:16:51.119 23086-23086 DEBUG pid-23086 A #4 pc 0000000000310b08 /data/app/com.example.forblitz.livestatistics-SH0kovyRsnozQwxSJhbXWg==/lib/arm64/libtesseract.so (_ZN9tesseract7Wordrec14program_editupERKNSt6__ndk112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEEPNS_15TessdataManagerESB_+84)
2023-06-09 02:16:51.120 23086-23086 DEBUG pid-23086 A #5 pc 00000000001d1c90 /data/app/com.example.forblitz.livestatistics-SH0kovyRsnozQwxSJhbXWg==/lib/arm64/libtesseract.so (_ZN9tesseract9Tesseract14init_tesseractERKNSt6__ndk112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEES9_S9_NS_13OcrEngineModeEPPciPKNS1_6vectorIS7_NS5_IS7_EEEESH_bPNS_15TessdataManagerE+688)
2023-06-09 02:16:51.120 23086-23086 DEBUG pid-23086 A #6 pc 0000000000180d78 /data/app/com.example.forblitz.livestatistics-SH0kovyRsnozQwxSJhbXWg==/lib/arm64/libtesseract.so (_ZN9tesseract11TessBaseAPI4InitEPKciS2_NS_13OcrEngineModeEPPciPKNSt6__ndk16vectorINS6_12basic_stringIcNS6_11char_traitsIcEENS6_9allocatorIcEEEENSB_ISD_EEEESH_bPFbS2_PNS7_IcSC_EEE+1040)
2023-06-09 02:16:51.120 23086-23086 DEBUG pid-23086 A #7 pc 0000000000180958 /data/app/com.example.forblitz.livestatistics-SH0kovyRsnozQwxSJhbXWg==/lib/arm64/libtesseract.so (_ZN9tesseract11TessBaseAPI4InitEPKcS2_NS_13OcrEngineModeEPPciPKNSt6__ndk16vectorINS6_12basic_stringIcNS6_11char_traitsIcEENS6_9allocatorIcEEEENSB_ISD_EEEESH_b+56)
2023-06-09 02:16:51.120 23086-23086 DEBUG pid-23086 A #8 pc 00000000003134f0 /data/app/com.example.forblitz.livestatistics-SH0kovyRsnozQwxSJhbXWg==/lib/arm64/libtesseract.so (Java_com_googlecode_tesseract_android_TessBaseAPI_nativeInitOem+136)
2023-06-09 02:16:51.121 23086-23086 DEBUG pid-23086 A #9 pc 0000000000043f34 /data/app/com.example.forblitz.livestatistics-SH0kovyRsnozQwxSJhbXWg==/oat/arm64/base.odex (offset 0x30000)

Cannot resolve symbol 'TessBaseAPI'

I am trying to switch from:

implementation 'cz.adaptech.android:tesseract4android:2.1.0'

to:

implementation 'cz.adaptech:tesseract4android:4.1.1'

But when I try to use the class TessBaseAPI I am not able to import it. With the 2.1.0 version I was able to import it with:

import com.googlecode.tesseract.android.TessBaseAPI;

What am I missing here?

Not able to execute gradlew tesseract4android:assembleRelease

Hi,
Thank you for this amazing repository. Being new to OCR, it took me a lot of research to find this repository, which works with Tesseract 4. I have cloned the project and ran assembleRelease successfully, but I am not able to execute gradlew tesseract4android:assembleRelease from the terminal.
When I try, the terminal reports:

Command 'gradlew' not found

To solve this, I am running:

gradle wrapper --gradle-version 5.1.1

which is giving the error:

A problem occurred evaluating project ':app'.

Failed to apply plugin [id 'com.android.application']
Minimum supported Gradle version is 5.1.1. Current version is 4.4.1. If using the gradle wrapper, try editing the distributionUrl in /home/akanksha/.gradle/daemon/4.4.1/gradle/wrapper/gradle-wrapper.properties to gradle-5.1.1-all.zip

I have tried updating the Gradle version and changing the distribution URL in the Gradle wrapper, but I am still not able to execute this. Can you please help me with it?

Thank you.

Cannot interrupt OCR process started by getHOCRText

I use TessBaseAPI inside an AsyncTask, then call getHOCRText using an instance asyncOcr of AsyncOcr as follows:
asyncOcr.execute(image);

In doInBackground method:

protected String doInBackground(Bitmap... bitmaps) {
    tessAPI.setImage(bitmaps[0]);
    return tessAPI.getHOCRText(0);
}

When the AsyncOcr is interrupted:
@Override
protected void onCancelled() {
    super.onCancelled();
    tessAPI.stop();
}

But the OCR is not interrupted and runs to completion.

Below is the full code of AsyncOcr:

static class AsyncOcr extends AsyncTask<Bitmap, Void, String> {
    private TessBaseAPI tessAPI;

    AsyncOcr(Reader context) {
        tessAPI = new TessBaseAPI(new TessBaseAPI.ProgressNotifier() {
            @Override
            public void onProgressValues(TessBaseAPI.ProgressValues progressValues) {
            }
        });
        tessAPI.init(dataPath, model_name, TessBaseAPI.OEM_LSTM_ONLY);
    }

    @Override
    protected void onPreExecute() {
        super.onPreExecute();
    }

    @Override
    protected String doInBackground(Bitmap... bitmaps) {
        tessAPI.setImage(bitmaps[0]);
        return tessAPI.getHOCRText(0);
    }

    @Override
    protected void onPostExecute(String text) {
    }

    @Override
    protected void onCancelled() {
        super.onCancelled();
        tessAPI.stop();
    }
}
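
One possible explanation, offered as an assumption rather than a confirmed diagnosis: AsyncTask documents onCancelled() as running on the UI thread after doInBackground has finished, so the tessAPI.stop() call above would only be reached once recognition is already complete. The sketch below shows the alternative ordering in plain Java, with the stop request issued from the cancelling thread while the work is still running. Engine and FakeEngine are hypothetical stand-ins, not the TessBaseAPI interface.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class CancellableOcrSketch {
    private CancellableOcrSketch() {}

    /** Hypothetical stand-in for an OCR engine with a cooperative stop flag. */
    interface Engine {
        String recognize() throws InterruptedException;
        void stop();
    }

    /** Fake engine: "recognizes" for about two seconds unless stop() is called. */
    static final class FakeEngine implements Engine {
        private final AtomicBoolean stopRequested = new AtomicBoolean(false);

        @Override
        public String recognize() throws InterruptedException {
            for (int step = 0; step < 200; step++) {
                if (stopRequested.get()) {
                    return null; // aborted before finishing
                }
                Thread.sleep(10);
            }
            return "full result";
        }

        @Override
        public void stop() {
            stopRequested.set(true);
        }
    }

    /**
     * Starts recognition on a worker thread, then requests a stop from the
     * calling thread while the worker is still busy. Returns whatever the
     * engine produced (null if it aborted early).
     */
    static String runAndCancel(Engine engine, long cancelAfterMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = engine::recognize;
            Future<String> result = worker.submit(task);
            Thread.sleep(cancelAfterMillis);
            engine.stop(); // issued during recognition, not after it completes
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("OCR sketch failed", e);
        } finally {
            worker.shutdownNow();
        }
    }
}
```

Applied to the AsyncTask above, this would mean calling tessAPI.stop() at the same place where asyncOcr.cancel(true) is invoked, rather than only inside onCancelled(). Whether stop() actually aborts getHOCRText in a given version of the library is not confirmed here.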

Received status code 401 from server: Unauthorized

I get an error in version 4.1.1. Maybe there are private submodules?

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':app:checkDebugAarMetadata'.
> Could not resolve all files for configuration ':app:debugRuntimeClasspath'.
   > Could not resolve cz.adaptech.android:tesseract4android:4.1.1.
     Required by:
         project :app > project :flutter_mrz_scanner
      > Could not resolve cz.adaptech.android:tesseract4android:4.1.1.
         > Could not get resource 'https://jitpack.io/cz/adaptech/android/tesseract4android/4.1.1/tesseract4android-4.1.1.pom'.
            > Could not GET 'https://jitpack.io/cz/adaptech/android/tesseract4android/4.1.1/tesseract4android-4.1.1.pom'. Received status code 401 from server: Unauthorized

mistake in build line

On the GitHub page, change:

dependencies {
// To use Standard variant:
implementation 'cz.adaptech:tesseract4android:4.1.1'

Replace the quotes: implementation "cz.adaptech:tesseract4android:4.1.1"

Move to mavenCentral?

If you remove jcenter now, the project no longer builds. jcenter is deprecated. Maybe move to mavenCentral?
