coco-text's People

Contributors

aicentral, andreasveit, bgshih, congyao

coco-text's Issues

Problems occurred when running ct.showAnns(anns)

Hi, I ran this API on macOS and everything worked until the line ct.showAnns(anns), where problems occurred.
The warning is as follows:
[image: screenshot of the warning]

And the result of this line is:
[image: screenshot of the result]

Is there something wrong with this code or with my computer?

Question about the extracted text image

Hi

Thanks for sharing. I noticed that when I define the area range in the COCO-Text API (getAnnIds), it tends to give me very small text images. I tried [1000, 5000] and [2000, -1], but it still returns very small text images (e.g. 3 x 4 pixels). I presume that by defining the area range I am requesting text regions of a certain area. Am I using it correctly?

Is the input measured in pixels? And what is the difference between the mask and the bounding box? Thanks.
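
For reference, here is a minimal sketch of what an area-range filter would be expected to do. It assumes each annotation is a dict with an `area` value in square pixels and a `bbox` of `[x, y, width, height]`, as in the COCO annotation format. The `filter_by_area` helper and the sample data are hypothetical and are not part of the coco-text API:

```python
def filter_by_area(anns, area_rng):
    """Return annotations whose area lies strictly inside (lo, hi).

    A negative upper bound is treated as "no upper limit". That
    convention is an assumption made for this sketch only.
    """
    lo, hi = area_rng
    return [a for a in anns if a["area"] > lo and (hi < 0 or a["area"] < hi)]


# Made-up annotations: bbox is [x, y, width, height], area is in pixels^2.
anns = [
    {"id": 1, "bbox": [10.0, 20.0, 3.0, 4.0], "area": 12.0},
    {"id": 2, "bbox": [0.0, 0.0, 50.0, 40.0], "area": 2000.0},
    {"id": 3, "bbox": [5.0, 5.0, 100.0, 60.0], "area": 6000.0},
]

selected = filter_by_area(anns, [1000, 5000])
selected_ids = [a["id"] for a in selected]
print(selected_ids)
```

Under this reading, a 3 x 4 pixel box (area 12) would be excluded by the range [1000, 5000].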

MSCoco 2014 Train Images do not Match the Annotations

The website for COCO-Text (https://bgshih.github.io/cocotext/) says to download the 2014 train images:
[image: screenshot of the download instructions]

However, when I downloaded the train images from the linked website, several of the images specified in the annotation file did not exist in the folder I downloaded from MSCoco. For example, the annotation file specifies "COCO_train2014_000000540965.jpg", but the 2014 training images I downloaded did not contain this image.
[image: screenshot of the downloaded folder]

Am I downloading the wrong images, or have these images been updated since the website has been updated?

Missing annotations

This is not really a problem with the API but rather with COCO_Text.json.
I couldn't find where to report this on the website, so here I am.

While running coco_evaluation.getDetections() this is what happened:

gt_box = groundtruth.anns[gt_box_id]['bbox']
KeyError: 1218650

The same holds for coco_evaluation.evaluateEndToEnd()

This is due to non-existent annotations referenced in imgToAnns.

Am I missing something, or is this a real problem in the JSON file?
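
A minimal sketch of how such dangling references could be listed, assuming `anns` maps annotation ids to annotation dicts and `imgToAnns` maps image ids to lists of annotation ids, as the traceback suggests. The `find_dangling_ann_ids` helper and the sample data are hypothetical:

```python
def find_dangling_ann_ids(anns, img_to_anns):
    """Return {image_id: [ann_id, ...]} for ids that are missing from anns."""
    dangling = {}
    for img_id, ann_ids in img_to_anns.items():
        missing = [ann_id for ann_id in ann_ids if ann_id not in anns]
        if missing:
            dangling[img_id] = missing
    return dangling


# Made-up data: image 11 references an annotation id that does not exist.
anns = {
    1: {"bbox": [0.0, 0.0, 10.0, 10.0]},
    2: {"bbox": [5.0, 5.0, 20.0, 8.0]},
}
img_to_anns = {10: [1, 2], 11: [1218650]}

dangling = find_dangling_ann_ids(anns, img_to_anns)
print(dangling)
```

Running a check like this before evaluation would turn the KeyError into an explicit list of the affected images.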

Label quality varies too much

Hi,
Is there any plan to release better labels for the training set? The quality of the current ones varies too much. Some polygons are exactly the same as the bbox for oriented text. This is really annoying.
Thanks

Some example images with typical issues:
COCO_train2014_000000294914.jpg: two different polygons cover the same text differently
COCO_train2014_000000262184.jpg: the polygon doesn't cover the text
COCO_train2014_000000131174.jpg: it seems that a region is mislabeled

Python 3 Support

The code is not compatible with Python 3. If you are willing to accept PRs, I would love to contribute 😄

language 'na'

In the annotations, language is classified as 'English', 'Not English', or 'na'.

Could you explain the meaning of 'na'?

About the 1000 test annotations

Hi, where can I find the test annotations? The JSON files contain only the train set and the val set. The official ICDAR2017 page says '1000 val and 1000 test'. Should I split the 2000 val images into two parts and change the JSON file myself?

BTW: there are some spelling errors in the code and on the official page: 'illegible' is sometimes spelled 'illegilbe'.
https://github.com/andreasveit/coco-text/blob/master/coco_text_Demo.ipynb
and the official site https://vision.cornell.edu/se3/coco-text/

Images' filename

>>> ct = coco_text.COCO_Text('COCO_Text.json')
>>> train = [id for id in ct.imgs.keys() if 'train2014' in ct.imgs[id]['file_name']]
>>> len(train)
63686
>>> len(ct.train)
43686

Apparently every entry in imgs has a file_name field as follows:

COCO_train2014_ID.jpg

even though validation and testing images should have a different one (val2014 and test2014 instead of train2014).

Note that this is a problem in COCO_Text.json rather than the API itself.
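
A minimal sketch of how the mismatch could be tallied, assuming each entry in `imgs` carries a `file_name` like the one above plus a `set` field naming its split (the `set` field is an assumption here). The `count_name_vs_split` helper and the sample entries are hypothetical:

```python
from collections import Counter


def count_name_vs_split(imgs):
    """Count images per (split, file-name prefix) pair."""
    counts = Counter()
    for info in imgs.values():
        # "COCO_train2014_000000000001.jpg" -> "train2014"
        prefix = info["file_name"].split("_")[1]
        counts[(info["set"], prefix)] += 1
    return counts


# Made-up entries: a val image that still carries a train2014 file name.
imgs = {
    1: {"file_name": "COCO_train2014_000000000001.jpg", "set": "train"},
    2: {"file_name": "COCO_train2014_000000000002.jpg", "set": "train"},
    3: {"file_name": "COCO_train2014_000000000003.jpg", "set": "val"},
}

counts = count_name_vs_split(imgs)
print(dict(counts))
```

Any pair where the split and the prefix disagree, such as `('val', 'train2014')`, points at an entry with the naming problem described above.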

dataset download

Hello, sorry if my question seems silly, but I can't find out how to download the COCO-Text dataset images.

Regarding v1's polygon vs v2's mask

Hello,
Thank you for the great resource.
I have one question.

In COCO-Text version 1, showAnn() in coco_text.py contains this line:
tl_x, tl_y, tr_x, tr_y, br_x, br_y, bl_x, bl_y = ann['polygon']
So the polygon always has a fixed length of 8 values.

On the other hand, showAnn() in https://github.com/bgshih/coco-text/blob/master/coco_text.py contains:
verts = list(zip(*[iter(ann['mask'])] * 2)) + [(0, 0)]
which means the mask annotation can have a different length per box.

Could you explain what this difference means?
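
To make the two snippets concrete: the `zip(*[iter(...)] * 2)` idiom pairs consecutive values of a flat list into (x, y) tuples, so it accepts any even-length list, whereas the eight-name unpacking accepts exactly four points. A small sketch with made-up coordinates (the `to_vertices` name is hypothetical and not part of either API version):

```python
def to_vertices(flat):
    """Pair a flat [x1, y1, x2, y2, ...] list into [(x1, y1), (x2, y2), ...]."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    # The same iterator is passed to zip twice, so each tuple consumes
    # two consecutive values from the list.
    return list(zip(*[iter(flat)] * 2))


# Four points: this is the shape the v1-style eight-name unpacking expects.
quad = to_vertices([0, 0, 10, 0, 10, 5, 0, 5])

# Six points: fine for the pairing idiom, but unpacking into eight names
# would raise a ValueError for this input.
hexagon = to_vertices([0, 0, 4, -2, 8, 0, 8, 5, 4, 7, 0, 5])

print(quad)
print(len(hexagon))
```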

multi-oriented detection evaluation?

Hi, may I ask about multi-oriented detection evaluation? Do you provide an evaluation method based on polygon predictions, such as (x1,y1,x2,y2,x3,y3,x4,y4)?

Creating tfrecord for the dataset

Hi,

I'm currently trying to train on the COCO-Text dataset with the TensorFlow Object Detection API. I would like to discuss here how to create a script that allows us to interface with the TF Object Detection API. Please note that I'm able to parse the tfrecords generated by my own script in a graph session. With the TF Object Detection API I end up getting:

ConcatOp : Dimensions of inputs should match: shape[0] = [1,46] vs. shape[1] = [1,23]
	 [[Node: concat = ConcatV2[N=4, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ExpandDims, ExpandDims_1, ExpandDims_2, ExpandDims_3, Equal_4/y)]]

I'm wondering what could be the issue and how to solve it, since the error doesn't say much.
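
One possible reading of the error, offered as a guess rather than a diagnosis: the failing concat joins four tensors, which would match the four box coordinate lists, and one of them has 46 values while another has 23. Assuming that is the cause, a check like the following sketch could be run on each image's coordinate lists before they are serialized. The `check_box_fields` helper is hypothetical and uses plain Python lists only:

```python
def check_box_fields(xmins, ymins, xmaxs, ymaxs):
    """Raise ValueError unless all four coordinate lists have equal length.

    Returns the number of boxes when the lists are consistent.
    """
    lengths = {
        "xmin": len(xmins),
        "ymin": len(ymins),
        "xmax": len(xmaxs),
        "ymax": len(ymaxs),
    }
    if len(set(lengths.values())) != 1:
        raise ValueError(f"box coordinate lists differ in length: {lengths}")
    return lengths["xmin"]


# Made-up normalized coordinates for two boxes.
num_boxes = check_box_fields([0.1, 0.3], [0.2, 0.4], [0.5, 0.6], [0.7, 0.8])
print(num_boxes)
```

If the check fails for some image, the list that is twice as long as the others is the first place to look, for example a list that was extended twice per annotation.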
