SQuAD_v2.0_Dataset

Stanford Question Answering Dataset (v2.0), fully parsed into Microsoft Excel files.

The SQuAD 2.0 dataset is available online in JSON format. Several CSV versions of SQuAD 2.0 can be downloaded from Kaggle and similar sites, but none of them include all the attributes related to a question (as given in the JSON files). Most of them also ignore the unanswerable questions of SQuAD 2.0. Parsing the JSON files can be a chore for beginners, so I have uploaded the completely parsed SQuAD 2.0 dataset as Excel files (both training and development sets).

The training data has the following columns:

  1. Title
  2. Context
  3. Question
  4. Id
  5. Answer
  6. Answer start
  7. Plausible Answer
  8. Plausible Answer Start
  9. Is_impossible

The development set has the following columns:

  1. Title
  2. Context
  3. Question
  4. Id
  5. Answer
  6. Answer start
  7. Plausible Answer
  8. Plausible Answer Start
  9. Is_impossible
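For readers who want to see how the nested JSON maps onto these nine columns, here is a minimal sketch. It is not the script used to build the Excel files. It assumes the standard SQuAD 2.0 JSON layout (`data` → `paragraphs` → `qas`), and it keeps only the first entry of `answers` or `plausible_answers` for each question, which is a simplification of mine. The helper names and the toy sample (including the ids `q1` and `q2`) are made up for illustration.

```python
# Column names as listed above.
COLUMNS = [
    "Title", "Context", "Question", "Id", "Answer", "Answer start",
    "Plausible Answer", "Plausible Answer Start", "Is_impossible",
]


def first_span(spans):
    """Return (text, answer_start) of the first span, or (None, None) if empty."""
    if spans:
        return spans[0]["text"], spans[0]["answer_start"]
    return None, None


def flatten_squad(squad):
    """Turn a SQuAD 2.0 style dict into one flat row (a dict) per question."""
    rows = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # Assumption: only the first answer / plausible answer is kept.
                answer, answer_start = first_span(qa.get("answers", []))
                plausible, plausible_start = first_span(
                    qa.get("plausible_answers", [])
                )
                values = [
                    article["title"], paragraph["context"], qa["question"],
                    qa["id"], answer, answer_start, plausible, plausible_start,
                    qa.get("is_impossible", False),
                ]
                rows.append(dict(zip(COLUMNS, values)))
    return rows


# Toy data in the same shape as the real files (not taken from the dataset).
context = "Normandy is a region in France."
sample = {
    "version": "v2.0",
    "data": [{
        "title": "Normans",
        "paragraphs": [{
            "context": context,
            "qas": [
                {
                    "id": "q1",
                    "question": "In what country is Normandy located?",
                    "is_impossible": False,
                    "answers": [{"text": "France",
                                 "answer_start": context.index("France")}],
                },
                {
                    "id": "q2",
                    "question": "In what country is Normandy the capital?",
                    "is_impossible": True,
                    "answers": [],
                    "plausible_answers": [{"text": "France",
                                           "answer_start": context.index("France")}],
                },
            ],
        }],
    }],
}

rows = flatten_squad(sample)
for row in rows:
    print(row["Id"], row["Is_impossible"], row["Answer"], row["Plausible Answer"])
```

To run this on the real data, load the JSON file with `json.load` and pass the resulting dict to `flatten_squad`. The rows can then be written out with any spreadsheet or CSV library.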

The SQuAD 2.0 dataset has about 50,000 unanswerable questions. For such questions, the training and development sets provide a plausible answer and its start position (the Plausible Answer and Plausible Answer Start columns).

Train_set details: Total: 130,319 unique questions; Unanswerable questions: 43,498; Answerable questions: 86,821.

Dev_set details: Total: 11,873 unique questions; Unanswerable questions: 5,945; Answerable questions: 5,928. 15 unanswerable questions have no plausible answers given in the dataset.
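The totals above can be rechecked against the original JSON with a short counting pass. The sketch below assumes the standard SQuAD 2.0 layout and the `is_impossible` and `plausible_answers` fields. The function name `summarize` and the three-question toy input are illustrative only, so the printed counts describe the toy data and not the real dataset.

```python
def summarize(squad):
    """Count answerable and unanswerable questions in a SQuAD 2.0 style dict."""
    counts = {
        "total": 0,
        "answerable": 0,
        "unanswerable": 0,
        "unanswerable_without_plausible": 0,
    }
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                counts["total"] += 1
                if qa.get("is_impossible", False):
                    counts["unanswerable"] += 1
                    # A missing or empty list both count as "no plausible answer".
                    if not qa.get("plausible_answers"):
                        counts["unanswerable_without_plausible"] += 1
                else:
                    counts["answerable"] += 1
    return counts


# Toy input: one answerable question and two unanswerable ones,
# one of which has no plausible answer.
toy = {
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "Normandy is a region in France.",
            "qas": [
                {"id": "a", "question": "Where is Normandy?",
                 "is_impossible": False,
                 "answers": [{"text": "France", "answer_start": 24}]},
                {"id": "b", "question": "Where is Normandy the capital?",
                 "is_impossible": True, "answers": [],
                 "plausible_answers": [{"text": "France", "answer_start": 24}]},
                {"id": "c", "question": "Who founded Normandy in 2500?",
                 "is_impossible": True, "answers": [],
                 "plausible_answers": []},
            ],
        }],
    }],
}

counts = summarize(toy)
print(counts)
```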

Feel free to use the dataset for R&D purposes! Thank you!

Important Note: This dataset is the original property of Rajpurkar et al. (2018), and I have not made any changes to it, so don't forget to cite Rajpurkar et al. (2018) when using this dataset.

Cite: Rajpurkar, P., Jia, R., & Liang, P. (2018). Know what you don't know: Unanswerable questions for SQuAD. arXiv preprint arXiv:1806.03822.

Contributors

pragyakatyayan
