
mcic-coco's Introduction

MCIC-COCO is a machine comprehension dataset that is generated based on the
publicly available COCO dataset. The technique to create such a dataset
is reported in the paper:

"Understanding Image and Text Simultaneously: a Dual Vision-Language Machine
Comprehension Task",
Nan Ding, Sebastian Goodman, Fei Sha, Radu Soricut

We generate a dataset of over half a million examples,
on which we estimate that the human-level accuracy is in the 83% range
(in a 5-way multi-choice setup; for comparison, a random-guess approach has 20%
accuracy).
A novel neural-network architecture that combines the representation power
of recursive neural networks with the discriminative power of fully-connected
multi-layered networks (see above cited paper) achieves the best result as of
the date of the dataset publication: 60.8% on the test set.

What is enclosed in this package is the MCIC-COCO dataset.

Datasets needed:
D1. COCO images (train_2014 and val_2014).
    [Image data available for download at:
     http://mscoco.org/dataset/#download].
D2. The MCIC-COCO dataset that comes with this package, see data/

How to read the MCIC-COCO dataset:
Each line of the MCIC-COCO dataset is one example, which contains the following
fields:
answer_[0-4]: 5 candidate captions (tokenized) for the image. All captions come
              from captions_train2014.json and captions_val2014.json of the COCO
              dataset.
image:        the image filename from train_2014 or val_2014 of the COCO images
example_id:   a unique string id for each example
reference:    the answer index of the true caption (0 to 4)
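
The field layout above can be read with a few lines of Python. This is a
minimal sketch, not code shipped with this package: it ASSUMES each line is a
JSON object keyed by the field names listed above, which this README does not
state, so adjust the parsing if the files in data/ use another encoding. The
helper names (parse_example, accuracy) and the sample record are hypothetical.

```python
import json

NUM_CHOICES = 5  # answer_0 .. answer_4, as listed in the field description


def parse_example(line):
    """Parse one dataset line.

    ASSUMPTION: the line is a JSON object whose keys are the field names
    from the README (answer_0..answer_4, image, example_id, reference).
    """
    record = json.loads(line)
    return {
        "example_id": record["example_id"],
        "image": record["image"],
        # Candidate captions, kept in index order so that
        # answers[reference] is the true caption.
        "answers": [record["answer_%d" % i] for i in range(NUM_CHOICES)],
        "reference": int(record["reference"]),
    }


def accuracy(examples, predictions):
    """Fraction of examples whose predicted index equals the reference."""
    if not examples:
        return 0.0
    correct = sum(
        1 for ex, pred in zip(examples, predictions) if ex["reference"] == pred
    )
    return correct / len(examples)


# Made-up record for illustration only; it is not taken from the dataset.
sample_line = json.dumps({
    "example_id": "example-0",
    "image": "hypothetical_image.jpg",
    "answer_0": "a dog runs on the beach",
    "answer_1": "a man rides a bike",
    "answer_2": "a cat sits on a couch",
    "answer_3": "two people play tennis",
    "answer_4": "a plate of food on a table",
    "reference": 2,
})

example = parse_example(sample_line)
true_caption = example["answers"][example["reference"]]
```

A model's output is one predicted index (0 to 4) per example; passing those
predictions to accuracy() gives the number to compare against the 20%
random-guess baseline quoted above.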

Note: Due to Google's open-source policy, we replaced three sensitive words in the original dataset with "j*llyfish", "f*cking", "s*upid".
