Comments (6)
AFAIK the `Compute CTC targets failed for` message is merely a problem on a single line pair and not directly indicative of a problem with the network topology. Did you inspect those samples visually?
Also, why do you say these are "terminal"? Does training not continue?
You already linked to the VGSL docs, which are pretty comprehensive. Here is the implementation (the spec parser).
In my experience, the problem with custom net specs is more with getting training to converge to low error rates at all. Usually it stays in the high nineties percentage BCER. Once you have found a workable spec, you may still need to set a large max iterations to even see the initial drop in error rate, especially if you have many CNN layers. (Note that Tesseract has no 1d or 2d dropout, so training large networks is much harder, perhaps best attempted via append/impact strategy ...)
Your configurations should be fine IMO – what kind of material are you trying?
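To make the structure of a net spec easier to inspect before starting a long training run, here is a minimal sketch that splits a VGSL spec string into layer tokens and labels each one. The regular expressions reflect my reading of the VGSL docs and cover only the layer types used in this thread; `LAYER_PATTERNS` and `parse_net_spec` are hypothetical names, not part of Tesseract or tesstrain.

```python
import re

# Assumed token shapes, covering only the layer types seen in this thread.
LAYER_PATTERNS = [
    ("input", re.compile(r"^\d+,\d+,\d+,\d+$")),      # batch,height,width,depth
    ("conv", re.compile(r"^C[stlrm]\d+,\d+,\d+$")),   # C<nonlinearity>y,x,depth
    ("maxpool", re.compile(r"^Mp\d+,\d+$")),          # Mp<y>,<x>
    ("lstm", re.compile(r"^L[frb][xy]s?\d+$")),       # L<dir><axis>[s]<outputs>
    ("output", re.compile(r"^O[012][lsc](\d+|#+)$")), # O<dims><type><classes>
]


def parse_net_spec(spec):
    """Return a list of (kind, token) pairs for a VGSL spec string."""
    # Drop Makefile-style escapes such as \# before tokenizing.
    body = spec.replace("\\", "").strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    layers = []
    for token in body.split():
        kind = next(
            (name for name, pattern in LAYER_PATTERNS if pattern.match(token)),
            "unknown",
        )
        layers.append((kind, token))
    return layers


if __name__ == "__main__":
    spec = r"[1,48,0,1 Ct3,3,16 Mp3,3 Lfys64 Lfx96 Lrx96 Lfx512 O1c\#\#\#]"
    for kind, token in parse_net_spec(spec):
        print(f"{kind:8s} {token}")
```

Any token reported as `unknown` is worth checking against the VGSL docs by hand, since this sketch does not validate the spec the way the real parser does.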
from tesstrain.
@bertsky Thank you for sharing your insights and the references! They are all very relevant to my project.
Back to the topic:
- Sorry for the confusion. By "terminal," I just meant the shell console on my MacBook. I was merely showing you the error message from tesstrain.
- Yes, I visually inspected the `.box`, `.gt.txt`, and `.png` files, which were procedurally generated for my project. The `.lstmf` files were generated during `make lists` and have not been modified since the successful run with `NET_SPEC := [1,48,0,1 Ct3,3,16 Mp3,3 Lfys64 Lfx96 Lrx96 Lfx512 O1c\#\#\#]`.
- I read in a recent update that `.lstmf` files are no longer needed for the training process. If so, after removing the `.lstmf` files from my training set directory, what do I write to `list.train` and `list.eval`? For me, these two list files used to list all the `.lstmf` files, which are based on the `.png` and `.box` files I provided. How do I tell the training process that I want to train on the `.png` and `.box` files?
- To your last question: again, I am trying to generate a more powerful OCR model for a mix of Chinese and some mathematical symbols. The goal is a model that copes better with various kinds of noise. I have therefore procedurally generated 10 million text lines (I modified the Makefile to accommodate a more complex directory structure to host this many files) with varying fonts, tilt, background gridlines, and printer/ink/camera effects. Every character in every image was precisely labeled in the `.box` file during the generation process.
I started by fine-tuning the existing `chi_sim` model, and it plateaued at about a 3% error rate for a while before I decided to try a larger model, which will hopefully be able to absorb the added complexity of my training data.
Then the net spec `NET_SPEC := [1,64,0,1 Ct5,5,32 Mp3,3 Lfys128 Lfx256 Lrx256 Lfx1024 O1c\#\#\#]` got to about 35% after a few hundred thousand iterations and entered a long plateau. That was when I wanted to try the 5 proposed net specs. However, from your feedback, the issue may not be entirely the size of the net, and the lack of a dropout mechanism may be a major factor. Should I do some hacking and implement dropout myself?
Any experience or insights in training models more powerful than the official Tesseract OCR models will be greatly appreciated.