
hebhtr's People

Contributors

lotemn102

hebhtr's Issues

Can I use this inside an Android app I'm building?

Hello,

I want to use this module to extract handwritten Hebrew text from images taken on an Android phone,
for an Android app I'm building.

I would love some help and/or directions on how to use it, if that is possible at all.

Thanks,

Can't use it: error below

Traceback (most recent call last):
  File "/home/yaniv/HebHTR/pic.py", line 7, in <module>
    text = img.imgToWord(iterations=5, decoder_type='word_beam')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yaniv/HebHTR/HebHTR.py", line 14, in imgToWord
    model = getModel(decoder_type=decoder_type)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yaniv/HebHTR/predictWord.py", line 26, in getModel
    model = Model(open(FilePaths.fnCharList).read(), decoderType,
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yaniv/HebHTR/Model.py", line 29, in __init__
    self.is_train = tf.placeholder(tf.bool, name='is_train')
                    ^^^^^^^^^^^^^^
AttributeError: module 'tensorflow' has no attribute 'placeholder'
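
This AttributeError is what TensorFlow 2.x raises for `tf.placeholder`, which belongs to the 1.x API. A minimal sketch of one possible workaround follows, assuming TensorFlow 2.x is installed and that the project's `import tensorflow as tf` lines can be edited. The helper name `load_tf1_api` is hypothetical and not part of HebHTR, and whether the rest of Model.py runs unchanged under the compatibility layer is untested here.

```python
def load_tf1_api():
    """Return the TF1-style API module, or None when TensorFlow is not installed."""
    try:
        # Compatibility namespace that still exposes placeholder, Session, etc.
        import tensorflow.compat.v1 as tf1
    except ImportError:
        return None
    # Switch off eager execution and other 2.x defaults so graph-mode code can run.
    tf1.disable_v2_behavior()
    return tf1


tf = load_tf1_api()
if tf is not None:
    # The call that failed in the traceback, now resolved through the v1 namespace.
    is_train = tf.placeholder(tf.bool, name='is_train')
```

The other route is to install a TensorFlow 1.x release on a Python version that still supports it.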

Dataset Access

Hi, I'm building a conditional GAN that generates Hebrew handwriting. Can I have a link to the full dataset?

Example application to OCR a Yiddish handwritten letter does not execute successfully

Hi Lotem,

I have been trying to use this project to OCR a handwritten letter in Yiddish. We are hoping that once we can read the text better, we might be able to translate it.
I cloned this project and https://github.com/githubharald/CTCWordBeamSearch. I built CTCWordBeamSearch according to their docs and that worked perfectly.
My entire set of changes is available by viewing: https://github.com/Lotemn102/HebHTR/compare/master...gedkott:yiddish-ocr?expand=1
I didn't open a PR, to avoid creating noise for you, since this is more of an application (though some portions, e.g. requirements.txt and .python-version, might be welcome upstream if you'd like).
Additionally, I am running Ubuntu 18.04.6 LTS, 64 bit OS, with 15.3 GiB RAM, Intel® UHD Graphics (CML GT2), and Intel® Core™ i7-10710U CPU @ 1.10GHz × 12.

When I attempted to run a program using your project:

from HebHTR import *

# Create new HebHTR object.
img = HebHTR('./yiddish.png')

# Infer words from image.
text = img.imgToWord(iterations=5, decoder_type='word_beam')

I ran into several issues.

  1. The version of TensorFlow required by README.md appears to be available only when running Python 2.7.18 (or lower, presumably). I added a requirements.txt file with the following:
opencv-python == 4.2.0.32
tensorflow == 1.12.0
numpy == 1.16.4

After some experimentation, I found that these were the best versions of each dependency needed. I am happy to supply the requirements.txt back to this project if you'd like.

  2. Python would crash when reading the data files because of an invalid encoding:
Traceback (most recent call last):
  File "main.py", line 7, in <module>
    text = img.imgToWord(iterations=5, decoder_type='word_beam')
  File "/home/gedalia-kott/hebrew-ocr-projects/HebHTR/HebHTR.py", line 14, in imgToWord
    model = getModel(decoder_type=decoder_type)
  File "/home/gedalia-kott/hebrew-ocr-projects/HebHTR/predictWord.py", line 27, in getModel
    mustRestore=True)
  File "/home/gedalia-kott/hebrew-ocr-projects/HebHTR/Model.py", line 38, in __init__
    self.setupCTC()
  File "/home/gedalia-kott/hebrew-ocr-projects/HebHTR/Model.py", line 155, in setupCTC
    corpus.encode('utf8'), chars.encode('utf8'),
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 0: ordinal not in range(128)

This was resolved by defaulting all encodings to utf-8 during the main.py script execution:

# coding: utf-8
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
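
`reload(sys)` and `sys.setdefaultencoding` exist only in Python 2 and change decoding for the whole process. A narrower alternative, sketched here under the assumption that the failing strings come from text files read with a bare `open(...).read()`, is to decode at the point of reading with `io.open`, which accepts an `encoding` argument on both Python 2 and 3. A later `.encode('utf8')` then operates on text rather than on raw bytes. The helper name `read_text` and the temporary file are illustrative only.

```python
import io
import os
import tempfile


def read_text(path):
    """Read a file as text, decoding UTF-8 explicitly instead of relying on defaults."""
    with io.open(path, encoding='utf-8') as f:
        return f.read()


# Demo with a throwaway file holding a Hebrew word (stand-in for the project's data files).
sample = u'\u05e9\u05dc\u05d5\u05dd'
fd, tmp_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with io.open(tmp_path, 'w', encoding='utf-8') as f:
    f.write(sample)

text = read_text(tmp_path)
os.remove(tmp_path)
```
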

  3. Next, the script failed while restoring the TensorFlow checkpoint. An excerpt of the stack trace:
InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [1,1,512,96] rhs shape= [1,1,512,69]
	 [[node save/Assign_16 (defined at /home/gedalia-kott/hebrew-ocr-projects/HebHTR/Model.py:161)  = Assign[T=DT_FLOAT, _class=["loc:@Variable_5"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Variable_5/RMSProp, save/RestoreV2:16)]]

I can provide more detail if it helps, but this was resolved by deleting the checkpoint, snapshot, and accuracy files from the model directory.
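
One possible reading of the shape mismatch, not confirmed anywhere in this thread: if the width of the final layer is derived from the length of the character list (an assumption about Model.py), then a character list read as raw bytes under Python 2 is longer than the same list read as text, because each Hebrew letter occupies two bytes in UTF-8. The sketch below only demonstrates that length difference. The sample string is made up and is not the project's character list.

```python
# Hypothetical character list: 3 ASCII characters followed by 2 Hebrew letters.
chars_text = u'ab!\u05d0\u05d1'
chars_bytes = chars_text.encode('utf-8')

n_text = len(chars_text)    # counts characters
n_bytes = len(chars_bytes)  # counts bytes; each Hebrew letter takes 2

print(n_text, n_bytes)
```

If that is what happened here, decoding the character list as UTF-8 might let the graph match the shipped checkpoint, whereas deleting the checkpoint and snapshot files discards the trained weights.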

  4. Next, the script would fail when executing:
if self.mustRestore and not latestSnapshot:
            raise Exception('No saved model found in: ' + modelDir)

This was resolved by modifying the invocation in predictWord.py from:

model = Model(open(FilePaths.fnCharList).read(), decoderType,
                  mustRestore=True)

to

model = Model(open(FilePaths.fnCharList).read(), decoderType,
                  mustRestore=False)

I believe this will force the model files to be recreated if the previous ones are not compatible with my environment.
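
The check quoted above only guards against a missing snapshot. A minimal sketch of that control flow follows, assuming (this is not shown in the thread) that when no snapshot is restored the model starts from freshly initialized, untrained weights. `setup_model` and its return values are hypothetical stand-ins, not the project's code.

```python
def setup_model(must_restore, latest_snapshot):
    """Mimic the restore-or-initialize decision around the quoted check."""
    if must_restore and not latest_snapshot:
        raise Exception('No saved model found')
    if latest_snapshot:
        return 'restored weights from ' + latest_snapshot
    # Nothing to restore and restoring is optional: start untrained.
    return 'initialized new, untrained weights'


print(setup_model(must_restore=False, latest_snapshot=None))
```

Under that assumption, passing `mustRestore=False` silences the exception but does not recreate a trained model. Predictions would come from random weights until the model is trained again.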

  5. Last, I am not able to run the script because preprocessImageForPrediction in processFunctions.py assumes the image array has two dimensions, while my PNG image loads with a three-element shape (height, width, channels). This results in the following:
Traceback (most recent call last):
  File "main.py", line 12, in <module>
    text = img.imgToWord(iterations=5, decoder_type='word_beam')
  File "/home/gedalia-kott/hebrew-ocr-projects/HebHTR/HebHTR.py", line 15, in imgToWord
    transcribed_words.extend(predictWord(self.original_img, model))
  File "/home/gedalia-kott/hebrew-ocr-projects/HebHTR/predictWord.py", line 32, in predictWord
    return infer(model, image)
  File "/home/gedalia-kott/hebrew-ocr-projects/HebHTR/predictWord.py", line 15, in infer
    img = preprocessImageForPrediction(image, Model.imgSize)
  File "/home/gedalia-kott/hebrew-ocr-projects/HebHTR/processFunctions.py", line 20, in preprocessImageForPrediction
    target[0:newSize[1], 0:newSize[0]] = img
ValueError: could not broadcast input array from shape (32,26,3) into shape (32,26)
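
The ValueError says that a 3-channel image of shape (32, 26, 3) is being copied into a 2-D target of shape (32, 26). Collapsing the color channels to one intensity value per pixel before that assignment is one way to reconcile the shapes. With OpenCV this is usually done via `cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)` or by loading with `cv2.imread(path, cv2.IMREAD_GRAYSCALE)`. Whether HebHTR expects grayscale input at this point is an assumption. The dependency-free sketch below shows the same idea on nested lists, using a hypothetical helper `to_gray` and the common BT.601 luma weights.

```python
def to_gray(img_bgr):
    """Collapse an H x W x 3 image (BGR channel order, as OpenCV loads it) to H x W."""
    gray = []
    for row in img_bgr:
        # Weighted sum of blue, green, red gives one intensity per pixel.
        gray.append([int(round(0.114 * b + 0.587 * g + 0.299 * r)) for b, g, r in row])
    return gray


# Tiny 2 x 2 demo image: white, black, pure blue, pure red pixels.
img = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (0, 0, 255)]]
gray = to_gray(img)
```

After the conversion each row holds plain numbers instead of 3-tuples, so the array has two dimensions, which is the shape the target in the traceback has.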

I am not sure how to proceed from here.

Would love to get your input on how I can update the project to work for me. I am happy to discuss compensation for your time and effort helping me as well.

Gedalia Kott

Access to data

Hi Lotem,
I know it's been a while, but I'm trying my luck.
I'm building a GAN that generates Hebrew handwriting. Could you please share a link to the full dataset?

Thank you
