
unrealtext's People

Contributors

belval


unrealtext's Issues

Drawing images instead of text

From looking at the code, I was wondering how hard it would be to hijack the StickerTextActor to place small images instead of text, allowing for the generation of a dataset for label detection.

For a bit more context: I am working on building a hazmat label detector (the diamond-shaped sign found on explosive/corrosive/infectious material containers) for a competition, and it would be useful to have a lot of synthetic data so I can save the actual data (of which I do not have much) for fine-tuning.

From glancing at your code, unless I am mistaken, you already load the text as a PNG and then draw it with the engine, meaning that the text generation happens on the Python side. Could I simply load my own PNG (that wouldn't be text) and have it work?

Great project btw, can't wait for your presentation at CVPR :)

EDIT: Just to be clear, this is not a feature request, I just want to be sure that it is doable.
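
For what it's worth, the swap described above boils down to something like the following sketch: on the Python side, load an arbitrary RGBA image instead of rendering a word into a PNG. The file names are hypothetical and this is not the project's API, just an illustration of the idea.

    from PIL import Image

    def make_sticker(src_path, dst_path, max_size=256):
        """Load a non-text image (e.g. a hazmat label) and save it as an RGBA PNG."""
        img = Image.open(src_path).convert("RGBA")
        img.thumbnail((max_size, max_size))  # keeps aspect ratio, caps the longer side
        img.save(dst_path)

    make_sticker("hazmat_label.png", "hazmat_sticker.png")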

The final question

Thanks for your guidance earlier.
Do you know how to run the executable files in the background, without the editor's graphical interface?
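
One common approach on a headless Linux server (an assumption about the setup, not necessarily the authors' method) is to run the packaged binary under a virtual framebuffer so that no physical display or editor GUI is needed; the binary path below is hypothetical.

    import subprocess

    UE_BINARY = "./LinuxNoEditor/UnrealText.sh"  # hypothetical path to the packaged game

    # xvfb-run provides a virtual X display for the rendering process
    subprocess.run(["xvfb-run", "-a", UE_BINARY, "-windowed", "-ResX=1080", "-ResY=720"],
                   check=True)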

Can this be run on multiple terminals?

I want to make the image generation faster by running the code in multiple terminals. Is that feasible? I am stuck because only one terminal works and the rest just stop responding.
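
One way to parallelise, sketched below under the assumption that each instance can be pointed at its own output directory and engine port (the flag names are hypothetical placeholders, not the project's actual CLI), is to launch several independent generator processes:

    import subprocess

    procs = [
        subprocess.Popen([
            "python", "DataGeneratorModule.py",
            "--OutputDir", f"GeneratedData/DataFraction_{i}",  # hypothetical flag
            "--UnrealPort", str(9000 + i),                     # hypothetical flag
        ])
        for i in range(4)
    ]
    for p in procs:
        p.wait()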

What is counted as difficult text?

Thanks for open-sourcing the data generation code. May I ask what the intended usage of the is_difficult flag is?

In the source code it is always set to 0, which is a bit confusing. Any additional explanation would be greatly appreciated!

Can you share the word crop code?

In the paper: "We crop from the proposed multilingual dataset. We discard images with widths shorter than 32 pixels as they are too blurry, and obtain 4.1M word images in total."
But I ended up with more than 7 million text-line images.
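
For reference, the cropping rule quoted from the paper could look roughly like the sketch below; the label key "bbox" and the [x1, y1, ..., x4, y4] polygon layout are assumptions about the released JSON format, not taken from the official code.

    import glob
    import json
    import os

    import cv2

    MIN_WIDTH = 32  # the paper's threshold for "too blurry"
    os.makedirs("crops", exist_ok=True)
    kept, dropped = 0, 0

    for label_path in glob.glob("unrealtext/sub_*/labels/*.json"):
        with open(label_path, "r") as f:
            data = json.load(f)
        img_path = label_path.replace("/labels/", "/imgs/").replace(".json", ".jpg")
        img = cv2.imread(img_path)
        if img is None:
            continue
        h, w = img.shape[:2]
        for poly in data["bbox"]:              # assumed word-level boxes
            xs = [int(v) for v in poly[0::2]]  # assumed [x1, y1, ..., x4, y4]
            ys = [int(v) for v in poly[1::2]]
            x0, x1 = max(min(xs), 0), min(max(xs), w)
            y0, y1 = max(min(ys), 0), min(max(ys), h)
            if x1 - x0 < MIN_WIDTH or y1 - y0 <= 0:
                dropped += 1
                continue
            cv2.imwrite(os.path.join("crops", f"{kept}.jpg"), img[y0:y1, x0:x1])
            kept += 1

    print(kept, "crops kept,", dropped, "discarded")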

Some images seem to be corrupted

Hi,
Thanks for your open-source dataset. When I use the UnrealText dataset, I find that some images are corrupted. I referenced the mmocr source code to pretrain my model, but loading an image occasionally fails:

    import mmcv
    from cv2 import IMREAD_COLOR

    # img_name: path to a dataset image
    with open(img_name, 'rb') as f:
        img_buff = f.read()
    img = mmcv.imfrombytes(img_buff, IMREAD_COLOR)

Occasionally, the above code raises: cv2.error: OpenCV(4.1.2) /io/opencv/modules/imgcodecs/src/loadsave.cpp:730: error: (-215:Assertion failed) !buf.empty() in function 'imdecode_'. It seems the image file is corrupted, but I am not sure which one or ones (still debugging).
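
A small sketch for locating the corrupted files up front rather than failing mid-training; it assumes the released directory layout UnrealText/sub_*/imgs/*.jpg, so adjust the glob pattern if your copy is laid out differently.

    import glob

    import cv2
    import mmcv
    from cv2 import IMREAD_COLOR

    bad = []
    for img_name in sorted(glob.glob("UnrealText/sub_*/imgs/*.jpg")):
        with open(img_name, "rb") as f:
            img_buff = f.read()
        try:
            img = mmcv.imfrombytes(img_buff, IMREAD_COLOR)
        except cv2.error:
            img = None   # e.g. a zero-byte file triggers the !buf.empty() assertion
        if img is None:  # non-empty but undecodable data comes back as None
            bad.append(img_name)

    print(len(bad), "corrupted image(s)")
    print("\n".join(bad))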

The character boxes on hook faces seem to be offset

There seems to be an offset in the character boxes on hook faces with high probability, as shown in the following image.
This image is from the English dataset, Sub39 27.jpg.
[image]
Using only a perspective transformation to get the boxes may not work.

I find that it is difficult to run the demo.

When it runs to line 72 in TexPlacingModule.py, an error occurs: 'FileNotFoundError: [Errno 2] No such file or directory: '/home/eecbd/github_project/UnrealText_CVPR2020/GeneratedData/DataFraction_14/WordCrops/adjusted_text_box.txt''.
We can't find where 'adjusted_text_box.txt' is saved anywhere in your code. Could you please help me solve this problem?

Issue with the labels file

In some files, the number of cboxes and the number of characters in the text don't match. For example, in UnrealText/sub_5/labels/288.json there are 32 cboxes but 33 characters. Is there any way to resolve this?
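
A quick sketch for locating such mismatches across the whole dataset; the key names "text" and "cbox" mirror the wording of this issue and are assumptions about the exact JSON schema, so adjust them if the real keys differ.

    import glob
    import json

    for fname in sorted(glob.glob("UnrealText/sub_*/labels/*.json")):
        with open(fname, "r") as f:
            data = json.load(f)
        # assumed schema: data["text"] is a list of strings, data["cbox"] a list of
        # per-word character box lists of matching length
        for text, cboxes in zip(data["text"], data["cbox"]):
            if len(text) != len(cboxes):
                print(f"{fname}: {len(cboxes)} cboxes vs {len(text)} characters in {text!r}")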

Quite a few character-level bboxes have zero width or zero height

Thanks again for open-sourcing your dataset and generation code. When training a model that requires character-level labelling, I found that many of the character-level bboxes have either zero width or zero height. May I ask for a suggested way to clean the data?
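
One possible cleaning step is simply to drop the degenerate boxes together with their characters. The sketch below assumes axis-aligned [x0, y0, x1, y1] boxes; adapt the unpacking if the released labels store 4-point polygons instead.

    def is_degenerate(box, min_size=1):
        """True if the box has (near) zero width or height."""
        x0, y0, x1, y1 = box  # assumed axis-aligned box format
        return (x1 - x0) < min_size or (y1 - y0) < min_size

    def clean_char_boxes(char_boxes, text):
        """Drop zero-width/height character boxes together with their characters."""
        kept = [(b, c) for b, c in zip(char_boxes, text) if not is_degenerate(b)]
        return [b for b, _ in kept], "".join(c for _, c in kept)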

Can I extract the background scene before adding text?

I tried to extract the image before placing text in the scene by adding self.client.SaveImg before 'step 3: place text' in DataGeneratorModule.py,

but the text already exists in the extracted scene.

Is it possible to extract a scene that is the same as the generated image but has no text?

Chinese in the English dataset?

Hello,

First off, thanks for releasing this dataset, I believe it is a very useful contribution for the OCR community.

I have a question regarding the splits. The paper indicates that the English dataset is generated using English words. As a result, I expected to find mostly (only?) Latin characters.
However, from what I can tell, there are quite a few exotic UTF-8 characters, which on first inspection seem to be mostly Chinese/Arabic characters.

Is this expected? As noted in #11, this is problematic since the Chinese characters seem to be often mis-rendered.

Here is an example, taken from sub_2/imgs/4482.jpg:
[image]

Using the following snippet, run on sub_0 to sub_5, I estimate that such characters appear in ~15% of the images in the dataset, which I wouldn't call negligible:

import json
import tqdm
import glob

all_files = glob.glob("unrealtext/sub_*/labels/*.json")
count = 0
for fname in tqdm.tqdm(all_files):
    with open(fname, "r") as f:
        data = json.load(f)
        for t in data["text"]:
            # crude heuristic: any code point above 500 is treated as non-Latin
            if t and max([ord(c) for c in t]) > 500:
                count += 1
                break
print("problematic proportion", count / len(all_files))

Did I miss something? Could you perhaps clarify how the English words were mined in the first place?
Thanks in advance

Airsim?

Is AirSim required to run the project?
And does OpenGL need to be installed?

Some characters are not supported by the fonts

Some characters seem to be unsupported by the font files, and the 10-pixel rule doesn't seem to apply.

example file: english/sub_101/labels/444.json

Some unknown characters are rendered in the top-left corner (under 'Toun') in the following image.
[image: 444.jpg]

May I know if there's a good way to solve this?

Thanks!
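
A hedged sketch (not from the repo) for detecting characters that a given font cannot render, using fontTools to check the font's character map; the font path is hypothetical, so point it at the .ttf files actually used for generation.

    from fontTools.ttLib import TTFont

    def unsupported_chars(text, font_path):
        """Return the characters in `text` that have no glyph in the font's cmap."""
        cmap = TTFont(font_path)["cmap"].getBestCmap()
        return {c for c in text if ord(c) not in cmap}

    # hypothetical font path
    print(unsupported_chars("Toun 中文", "fonts/SomeFont.ttf"))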

Data annotation problem

Hi, when I used the (English/Latin) dataset you provided, I found that some JSON annotations are wrong, for example 749 ~ 753 for sub_48. And why are some coordinates negative?
Looking forward to your reply.
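
For the negative coordinates, one workaround is to clip every polygon to the image bounds before using it; a minimal sketch, assuming the 4-point [x1, y1, ..., x4, y4] layout:

    def clamp_polygon(poly, width, height):
        """Clip polygon coordinates to [0, width-1] x [0, height-1]."""
        xs = [min(max(int(x), 0), width - 1) for x in poly[0::2]]
        ys = [min(max(int(y), 0), height - 1) for y in poly[1::2]]
        return [v for pair in zip(xs, ys) for v in pair]

    print(clamp_polygon([-5, 10, 120, 10, 120, 40, -5, 40], width=100, height=100))
    # -> [0, 10, 99, 10, 99, 40, 0, 40]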
