
Comments (17)

Vijayabhaskar96 commented on August 15, 2024 4

@rkcosmos Good news! I finally found a way to render all Tamil characters correctly with Unicode fonts.
pyvips renders everything fine (I rendered and manually checked every character in ta_cha.txt, and they all look correct).
Remember to use a unique font name for each font you load. I kept getting the same font even after loading the next one (maybe because I was testing in a Jupyter notebook and the state was kept?).

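import pyvips

text_string = "தமிழ்"  # example text; replace with the string to render
# font names the font that is loaded from fontfile
image = pyvips.Image.text(text_string, dpi=3000, font='Catamaran',
                          fontfile="F:/tamil-fonts-master/open-sourced/Catamaran-Black.ttf")
image.write_to_file("text.png")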

Convert directly to a PIL image:

from PIL import Image
# the rendered text image is single-band 8-bit, hence mode 'L'
mem_image = image.write_to_memory()
pil_image = Image.frombytes('L', (image.width, image.height), mem_image)

from easyocr.

rkcosmos commented on August 15, 2024 3

I'll train the model and update the repo. It usually takes me at least one week for each model.


vanangamudi commented on August 15, 2024 3

from https://en.wikipedia.org/wiki/Tamil_All_Character_Encoding. Looks like I cannot use Unicode with Tamil

Though TACE16 is a well-designed encoding for Tamil, UTF-8 is the most popular encoding for Tamil and is in widespread use. There are few Unicode fonts for Tamil. We have to use complex text layout engines like Pango for eastern languages like Tamil.

Let me try this out


Vijayabhaskar96 commented on August 15, 2024 2

I used open-tamil to convert the Unicode text to the Bamini font encoding, rendered it with that same font, and it worked.

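import tamil.txt2unicode  # from the open-tamil package
from PIL import Image, ImageDraw, ImageFont

image = Image.new("RGB", (320, 320))
draw = ImageDraw.Draw(image)
a = "கௌ"
# Convert the Unicode text to the legacy encoding the Bamini font expects
a = tamil.txt2unicode.unicode2bamini(a)
font = ImageFont.truetype("F:/Bamini.ttf", 25)
draw.text((50, 50), a, font=font)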

You can download the font here: https://www.freetamilfont.com/download.php?id=735930
I don't know of a better way; I will try to get help from someone.


arulrajnet commented on August 15, 2024

Here is the PR #37


rkcosmos commented on August 15, 2024

Ok, waiting for word list.


Vijayabhaskar96 commented on August 15, 2024

I would like to know how to add a new language to the repo and get it working. For example, the characters and words for Tamil have now been added to the repo, but how do we get it working? How is the model trained?


rkcosmos commented on August 15, 2024

@Vijayabhaskar96 @arulrajnet

I'm now trying to do Tamil OCR. The problem is rendering something like 'கௌ'. This is what I currently get from Pillow:
[image]
How can I render it correctly?
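
One way to see why a plain renderer can struggle with this syllable is to look at its code points. The sketch below uses only Python's standard unicodedata module. It assumes the syllable is spelled as KA (U+0B95) followed by the AU vowel sign (U+0BCC), and describe() is a hypothetical helper name, not something from this thread. The vowel sign canonically decomposes into two marks, which suggests the renderer needs a shaping engine to position the parts around the consonant.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# Assumed spelling of the syllable from the thread: KA + VOWEL SIGN AU
syllable = "\u0b95\u0bcc"
print(describe(syllable))

# Canonical decomposition (NFD) splits the vowel sign into two marks
decomposed = unicodedata.normalize("NFD", syllable)
print(describe(decomposed))
```

If the decomposed form has more code points than the original, the glyph is built from several pieces, and drawing the code points one after another in storage order will not match the expected shape.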


rkcosmos commented on August 15, 2024

from https://en.wikipedia.org/wiki/Tamil_All_Character_Encoding. Looks like I cannot use Unicode with Tamil


rkcosmos commented on August 15, 2024

It works! Are there other fonts that work like this? I normally use more than 20 fonts for other languages. It's important to have variation. I already have a lot of Unicode fonts, but not a single one works.


rkcosmos commented on August 15, 2024

Do you guys want Tamil numerals in the model as well? https://en.wikipedia.org/wiki/Tamil_numerals
They look similar to what is in the list, but Python says they are different. Or is it just a Unicode issue?
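
A quick way to check this is to compare the Unicode names and categories of a numeral and a similar-looking letter. The sketch below uses only the standard unicodedata module. The pair chosen here, Tamil digit one (U+0BE7) and the letter KA (U+0B95), is an assumed example of a look-alike pair and may not be the exact characters that were compared.

```python
import unicodedata

# Assumed example pair: Tamil digit one and the letter KA, which look alike
digit_one = "\u0be7"
letter_ka = "\u0b95"

for ch in (digit_one, letter_ka):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# Different code points compare as different strings, however similar the glyphs
same = digit_one == letter_ka
print("same character:", same)
print("numeric value of the digit:", unicodedata.digit(digit_one))
```

The digit falls in the decimal-number category and the letter in a letter category, so they are separate characters that happen to share a similar shape, not an encoding glitch.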


Vijayabhaskar96 commented on August 15, 2024

No, I don't. Nobody uses them.


rkcosmos commented on August 15, 2024

OK, please tell me if you have any clue about rendering; I'm stuck right now. Do you guys use Unicode fonts in other software, like Word, Excel, Notepad, etc.? I found old posts complaining about rendering in Windows 7 and 10.


Vijayabhaskar96 commented on August 15, 2024

I'm trying to find answers. So far I think PIL is what's messing up the order of the characters while rendering. Is there any other way to render text as an image?


Vijayabhaskar96 commented on August 15, 2024

Did you figure out a way to render? @vanangamudi told me to try:

convert pango:"தமிழ்" output.gif
We need to install ImageMagick and Pango. I am on Ubuntu; any Linux machine should work.

I don't have a Linux machine. I tried on Colab but had installation issues. Can you give it a try if nothing else works?
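
For a training pipeline written in Python, the command above could be called through subprocess. This is a minimal sketch, assuming ImageMagick's convert is on PATH and was built with Pango support. The function names build_pango_command and render_with_pango are hypothetical helpers, not part of any library.

```python
import shutil
import subprocess

def build_pango_command(text, out_path):
    """Build the ImageMagick command from the thread as an argument list."""
    return ["convert", f"pango:{text}", out_path]

def render_with_pango(text, out_path):
    """Run the command if convert is on PATH; return True on success."""
    if shutil.which("convert") is None:
        return False  # ImageMagick is not installed
    result = subprocess.run(build_pango_command(text, out_path), check=False)
    return result.returncode == 0

# Show the argument list without running anything
print(build_pango_command("தமிழ்", "output.gif"))
```

Passing the arguments as a list avoids shell quoting problems with non-ASCII text. Calling render_with_pango("தமிழ்", "output.gif") would then write the image only when the tool is available.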


rkcosmos commented on August 15, 2024

I'm on Linux, will try. I think we have no choice but to use another programming language to render. It is surely possible, but I would have to modify my training pipeline a lot, so I'll put Tamil on hold for a while and work on other languages first. I will come back to Tamil later; everything should be done within the roadmap's timeline.


rkcosmos commented on August 15, 2024

Wonderful, will try.

