Comments (5)
Hi,
I may need a little clarification on your question.
Right now there are different approaches to recognizing handwritten text. One of them first separates a word into individual characters using the Gap-Classifier and then recognizes these separated characters using the Char-Classifier. The Gap-Classifier and the Char-Classifier are therefore two different models. If you already have a Char-Classifier that can recognize individual characters, you still need to train a Gap-Classifier.
The Gap-Classifier takes slices from an image of a word and classifies them as either letters or gaps. Right now I have two different Gap-Classifiers: GapClassifier.ipynb and GapClassifier-BiRNN.ipynb. The first one uses only a convolutional neural network, and the second combines it with a recurrent neural network.
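To make the idea of "slices" concrete, here is a minimal sketch of cutting a word image into overlapping vertical windows. It is not taken from the notebooks: the function name `slice_word_image` and the `slice_width` and `step` values are assumptions chosen only for illustration.

```python
import numpy as np

def slice_word_image(image, slice_width=8, step=2):
    """Cut a (height, width) word image into overlapping vertical slices.

    Hypothetical helper: slice_width and step are illustrative values,
    not the ones used in GapClassifier.ipynb.
    """
    height, width = image.shape
    if width < slice_width:
        raise ValueError("image is narrower than one slice")
    # Left edge of every window that fits completely inside the image
    starts = range(0, width - slice_width + 1, step)
    slices = np.stack([image[:, s:s + slice_width] for s in starts])
    # Horizontal centre of each slice, in pixels from the left edge
    centers = np.array([s + slice_width // 2 for s in starts])
    return slices, centers

# Blank 60x40 image standing in for a real word image
word = np.zeros((60, 40), dtype=np.uint8)
slices, centers = slice_word_image(word)
print(slices.shape)  # (number of slices, height, slice_width)
```

Each slice would then be passed to the classifier, and the centres of the slices predicted as gaps give candidate borders between letters.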
The model you should train depends mainly on your dataset. GapClassifier.ipynb takes as input images separated into two classes: images of letters are in class 0 and images of gaps between letters are in class 1. GapClassifier-BiRNN.ipynb, on the other hand, takes as input images of words, each with an array of numbers corresponding to the positions of the gaps in the image.
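The two dataset formats are related: if you have gap positions for a word, you can derive per-slice class labels from them. The sketch below shows one possible conversion, assuming gap positions are x-coordinates in pixels. The helper `label_slices` and the `tolerance` parameter are hypothetical, and the notebooks may prepare their data differently.

```python
import numpy as np

def label_slices(centers, gap_positions, tolerance=2):
    """Label each slice 1 (gap) or 0 (letter) from known gap positions.

    Hypothetical helper: a slice counts as a gap when its centre lies
    within `tolerance` pixels of any annotated gap position.
    """
    centers = np.asarray(centers)
    gaps = np.asarray(gap_positions)
    if gaps.size == 0:
        # A word without annotated gaps contains only letter slices
        return np.zeros(len(centers), dtype=np.int64)
    # Distance from every slice centre to its nearest gap position
    distance = np.abs(centers[:, None] - gaps[None, :]).min(axis=1)
    return (distance <= tolerance).astype(np.int64)

# Slice centres every 2 pixels, with gaps annotated at x = 10 and x = 25
centers = np.arange(4, 37, 2)
labels = label_slices(centers, gap_positions=[10, 25])
print(labels)
```

Slices labelled 0 and 1 this way match the class convention described above for GapClassifier.ipynb.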
Can you tell me more about your dataset, so I can give you more specific advice?
from handwriting-ocr.
Thank you so much @Breta01 .
I wanted the model to predict both lowercase and uppercase characters, even from cursive text. So I made a handwritten character dataset (individual characters with corresponding labels), trained on it using CharClassifier.ipynb, and obtained a model. This is the model I used for testing a word. I found that the borders drawn to separate the characters of a word are wrong. For example, given the word 'Mind', a border is drawn through the middle of the character M, so the prediction fails as well.
I am attaching a snapshot.
So, can you please guide me with this?
I understand the problem that we are facing here, but there is a limited number of things I can do about it.
The current GapClassifier, which detects the borders between letters, was trained on my handwriting and fails on other people's handwriting. Therefore, if you want to improve the gap detection, you have to extend the current dataset and train a new GapClassifier.
There is still a lot of work to do on this project. One of the most important parts is a large dataset, so I would really appreciate it if you were willing to share yours.
Hope this explains the issue and thank you for your interest.
Thanks @Breta01. This clarifies it. I will upload the dataset soon.