li-xirong / coco-cn

Enriching MS-COCO with Chinese sentences and tags for cross-lingual multimedia tasks

License: MIT License

Python 0.48% Shell 0.02% CSS 0.25% HTML 0.05% Jupyter Notebook 0.78% OpenEdge ABL 98.31% Makefile 0.01% Batchfile 0.01% C++ 0.03% C 0.01% JavaScript 0.07%
chinese-image-captioning cross-lingual-image-captioning image-captioning image-tagging cross-lingual-image-retrieval

coco-cn's Introduction

COCO-CN

COCO-CN is a bilingual image description dataset enriching MS-COCO with manually written Chinese sentences and tags. The new dataset can be used for multiple tasks including image tagging, captioning and retrieval, all in a cross-lingual setting.

Chinese sentences COCO-CN train COCO-CN val COCO-CN test
human written
human translation
machine translation (baidu)

coco-cn annotation examples

Progress

  • version 201805: 20,341 images (training / validation / test: 18,341 / 1,000 / 1,000), associated with 22,218 manually written Chinese sentences and 5,000 manually translated sentences. Data is freely available upon request. Please submit your request via Google Form.
  • Precomputed image features: ResNeXt-101
  • COCO-CN-Results-Viewer: A lightweight tool to inspect the results of different image captioning systems on the COCO-CN test set, developed by Emiel van Miltenburg at Tilburg University.
  • NUS-WIDE100: An extra test set.
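
As a quick sanity check on the version 201805 figures above, the three split sizes should add up to the stated image total. The sketch below only redoes that arithmetic; the variable names are illustrative and are not part of the dataset release.

```python
# Split sizes quoted for version 201805 (dictionary keys are illustrative names).
splits = {"train": 18_341, "val": 1_000, "test": 1_000}

# Total number of images implied by the three splits.
total_images = sum(splits.values())

print(total_images)  # 20341
```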

Citation

If you find COCO-CN useful, please consider citing the following paper:

coco-cn's People

Contributors

li-xirong

coco-cn's Issues

Feature Extraction

Would you share the feature extraction code? I tried an MXNet version of ResNeXt-101, but the extracted features do not match the provided ones...

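Dataset download

Hello, the Baidu Netdisk link has expired. I have filled in the Google Form application but have not received a reply yet. I hope you can help. Thank you!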

About dataset

Hi, I'm a Ph.D. student, and I have read your paper (it is excellent work). I plan to do some research based on your COCO-CN dataset. I submitted the COCO-CN Application Form 5 days ago and have not received a response yet, so I hope you can approve the request soon. Many thanks!

About Dataset

Hi, I have submitted the form and hope you can approve the request soon. Thank you very much.

About dataset

I've applied several times and no one has replied. My email is mmmwhy#mail.ustc.edu.cn (replace # with @).

about JSON files

Where can I download the files dataset_mscoco-cn.json and captions_mscoco-cn.json?

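About the dataset

Hello, I could not find a download link for the COCO-CN dataset. I would like to use it in my research and hope to have your support. My email is [email protected]. Thank you very much!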

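About coco-cn_retrieval_data.tar.gz

Hello, and thank you for your contributions to the retrieval field.
The link you provided for the coco-cn data, "http://lixirong.net/data/coco-cn/coco-cn_retrieval_data.tar.gz", now returns page not found. I sincerely hope you can help.

Also: I found a coco-cn_retrieval_data-v0.tar.gz version under the coco-cn folder, but when running the later steps I get "IOError: [Errno 2] No such file or directory: ", so I suspect the -v0 version differs from the dataset that is required.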

about dataset

Thanks for sharing the benchmark.
I have submitted the form and hope you can approve the request soon.

about dataset

Hello, thanks for releasing the great work.
Sorry to bother you, but after checking the released JSON files, I have two questions:

  • a. There are no tags in any JSON file. Where can I find the tags presented in the paper?

  • b. There are 19165 images (fewer than 20342) in captions_mscoco-cn.json. Is this JSON file the final version?

Thanks for your reply!
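
For anyone who wants to repeat the image count from question b, here is a minimal sketch. It assumes that captions_mscoco-cn.json follows the MS-COCO caption layout (a top-level "annotations" list whose entries have "image_id" and "caption" keys), which is not confirmed in this thread. The helper names and the sample data are made up for illustration.

```python
import json
from collections import Counter


def load_captions(path: str) -> dict:
    """Load a captions JSON file as a dict (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_captioned_images(coco_json: dict) -> Counter:
    """Map each image_id to its number of captions.

    Assumption: the file uses the MS-COCO caption layout, i.e. a top-level
    "annotations" list with "image_id" and "caption" keys per entry.
    """
    return Counter(ann["image_id"] for ann in coco_json.get("annotations", []))


# Tiny in-memory stand-in for a captions file (made-up data, not from COCO-CN).
sample = {
    "annotations": [
        {"image_id": 1, "caption": "一只猫坐在沙发上"},
        {"image_id": 1, "caption": "沙发上的猫"},
        {"image_id": 2, "caption": "两个人在打网球"},
    ]
}

per_image = count_captioned_images(sample)
print(len(per_image))           # number of distinct images: 2
print(sum(per_image.values()))  # number of captions: 3
```

To run this on a real file, pass the result of `load_captions(...)` to `count_captioned_images`; if the file uses a different layout, the key names need to be adjusted.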

Request for JSON files

I have already obtained the coco-cn dataset, but the two JSON files dataset_mscoco-cn.json and captions_mscoco-cn.json, mentioned in README.md in the coco-cn_caption directory, are missing. The download link you provided in the issues is also invalid now, so I would like to request a new download link for these files. Thanks!

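The Baidu Netdisk share has expired

Hello, and thank you for your work on building multimodal resources! We would like to download the latest version of this dataset, but the Netdisk share in the earlier issues has expired. Could you please share it again?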

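About the dataset images

Hello, I have received your Chinese annotation data by email, thank you very much. However, I have not been able to find a way to download the images. Could you provide a download link?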

about dataset

Hi, I have submitted the form and hope you can approve the request soon. Thank you very much!
