open-ie-papers's Introduction

Table of Contents

  1. General
  2. Literature Reviews
  3. Papers - Neural Networks
  4. Papers - Parse-based and statistical
  5. Papers - Older papers and legacy systems
  6. Training and Testing Data

General

This README contains Open Information Extraction (OpenIE) and Open Relation Extraction (ORE) papers and resources. Summaries are by @jbecke and @TheodoreChristakis, written to the best of our abilities after reading each paper or testing the system (when available). We welcome pull requests with additional resources, papers, or data.

Literature Reviews

Papers - Neural Networks

  • Learning Open Information Extraction of Implicit Relations from Reading Comprehension Datasets: extracts more implied ("common sense") relations.

Papers - Parse-based and statistical

  • Graphene: generates n-ary extractions with semantic linking labels such as "TEMPORAL" and "CAUSE", as well as open relations.
  • Stanford Open IE: produces maximally shortened tuples. The reported confidence is often 1.0. Available under the GPL or a proprietary license as part of Stanford CoreNLP.
  • OpenIE-X (v4, v5, Allen Institute version): works well with simple statements (see examples in this dataset). Outputs context for extractions and gives good confidence predictions that can be used to balance precision and recall. Note the restrictive license (research purposes only).
  • Open Relation Extraction and Grounding: Extracts argument pairs of relation tuples and forms weighted dependency trees between two arguments. It shows promising results in determining relative importance of each argument in the tree.
  • Unsupervised Open Relation Extraction: performs unsupervised relation extraction from free text using pretrained word embeddings, with each sentence's dependency parse tree as the foundation.
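
The precision-recall balance mentioned for OpenIE-X comes down to choosing a confidence threshold. The sketch below is only an illustration: the function name `precision_recall_at` and the `toy` list of (confidence, is_correct) pairs are invented for this example and are not results from any system. Recall is computed relative to the correct extractions in the list, not to a full gold standard.

```python
def precision_recall_at(scored, threshold):
    """Precision and recall after keeping extractions with confidence >= threshold.

    scored: list of (confidence, is_correct) pairs (hypothetical labels).
    Recall is relative to the correct extractions present in `scored`.
    """
    kept = [ok for conf, ok in scored if conf >= threshold]
    total_correct = sum(ok for _, ok in scored)
    true_pos = sum(kept)
    # Convention assumed here: precision is 1.0 when nothing is kept.
    precision = true_pos / len(kept) if kept else 1.0
    recall = true_pos / total_correct if total_correct else 0.0
    return precision, recall


# Made-up toy data for illustration only.
toy = [(0.95, True), (0.92, True), (0.90, True),
       (0.80, False), (0.60, True), (0.40, False)]

for threshold in (0.9, 0.5):
    p, r = precision_recall_at(toy, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

Raising the threshold keeps fewer extractions, which tends to raise precision and lower recall; the right cut-off depends on the downstream task.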

Papers - Older papers and legacy systems

  • From University of Washington
    • TextRunner - One of the earliest papers addressing open information extraction
    • ReVerb - Improved extraction so that tuples better fit the (argument, relation, argument) form
    • OLLIE - Addressed the issue of misleading propositions and non-verb mediated relations
  • CSD-IE - Generation of nested contractions which is especially effective in sentences using subordinating clauses
  • PropS: Syntax Based Proposition Extraction
  • ClausIE - Formed a strong relation between grammatical clauses, propositions, and OIE extractions by defining seven grammatical patterns
  • ReNoun - Used predominantly for noun-mediated relations.

Training and Testing Data

  • 35M sentence-tuple pairs: from the paper Neural Open Information Extraction. The data was generated by OpenIE-4, with any tuples below 0.9 confidence removed. Because there is no sample data, I've copied a bit below. As you can see, the data is somewhat noisy. It might be useful as extra training data, but not as a gold dataset.
* moving and handling '' ' - a comprehensive course that covers safe handling and transport of casualties .
<arg1> '' ' - a comprehensive course </arg1> <rel> covers </rel> <arg2> safe handling and transport of casualties </arg2>

this word , adjectival magavan meaning `` possessing maga - '' , was once the premise that avestan maga - and median magu - were co-eval .
<arg1> - '' , was once the premise that avestan maga - and median magu - </arg1> <rel> were </rel> <arg2> co-eval </arg2>

melora walters as candy ' - a hooker who works for the motel where john person is staying , as a complimentary service to the guests .
<arg1> ' - a hooker </arg1> <rel> works </rel> <arg2> for the motel </arg2>

- - a hunter who uses bows and arrows instead of guns .
<arg1> - - a hunter </arg1> <rel> uses </rel> <arg2> bows and arrows instead of guns </arg2>
  • TupleInf Open IE Dataset: OpenIE-4 extractions of 8th grade and 4th grade questions. By inspection, these tend to be cleaner than the above dataset because of the simplicity of the language. Confidence values are retained so you can make your own tradeoff between precision and recall. Not suitable for a gold dataset.
01 April 1969 The ATM would be a manned solar observatory making measurements of the Sun by telescopes and instruments above 
0.96 (The ATM; would be; a manned solar observatory making measurements of the Sun by telescopes and instruments)
0.93 (a manned solar observatory; making; measurements of the Sun)

01 April 1969 The ATM would be a manned solar observatory making measurements of the Sun by telescopes and instruments above the Earth's atmosphere.
0.96 (The ATM; would be; a manned solar observatory making measurements of the Sun by telescopes and instruments above the Earth's atmosphere)
0.93 (a manned solar observatory; making; measurements of the Sun)

01 - Compare the physical properties of ice, liquid, water, and vapor.

01 Earthly Seasons PURPOSE: To show that the seasons are the consequence of the tilt of earth.

0.1% water can lower the melting temperature of peridotite by 100 C.
0.91 (0.1% water; can lower; the melting temperature of peridotite)

( 020 ) Celsius &#176;C The international temperature scale where water freezes at 0 (degrees) and boils at 100 (degrees).
0.89 (water; freezes; at 0 (degrees)
  • Squadie (not yet published, expect changes): this is our dataset derived from SQuAD. It uses a JSON format similar to SQuAD's and contains 50,000 tuples. Each tuple can then be matched with the corresponding sentence in the training corpus. Not suitable as a gold corpus. Squadie is useful for extracting implied relations. We have also converted Maluuba NewsQA.
                        {
                            "question": "Which film did Beyoncé star in 2001 with Mekhi Phifer?",
                            "id": "56d4831f2ccc5a1400d83155",
                            "answer": "Carmen: A Hip Hopera",
                            "tuple": "<Which film\tdid Beyoncé star with Mekhi Phifer\tCarmen: A Hip Hopera>"
                        },
                        {
                            "question": "What was the name of Destiny Child's third album?",
                            "id": "56d4831f2ccc5a1400d83156",
                            "answer": "Survivor",
                            "tuple": "<Survivor\tthe name of\tDestiny Child 's third album>"
                        },
                        {
                            "question": "Who filed a lawsuit over Survivor?",
                            "id": "56d4831f2ccc5a1400d83157",
                            "answer": "Luckett and Roberson",
                            "tuple": "<Luckett and Roberson\tfiled a lawsuit over\tSurvivor>"
                        },
                        {
                            "question": "When did Destiny's Child announce their hiatus?",
                            "id": "56d4831f2ccc5a1400d83158",
                            "answer": "October 2001",
                            "tuple": "<Destiny 's Child\tannounce their hiatus\tOctober 2001>"
                        }
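
The three sample formats above can be read with a few lines of Python. This is a minimal sketch based only on the excerpts shown here, not an official loader for any of the datasets: the function names (`parse_tagged`, `parse_scored`, `parse_squadie`) are made up for illustration, and the sketch assumes that arguments in the TupleInf lines contain no semicolons and that Squadie fields are tab-separated inside angle brackets.

```python
import json
import re

# Format 1: "<arg1> ... </arg1> <rel> ... </rel> <arg2> ... </arg2>"
TAGGED = re.compile(
    r"<arg1>\s*(.*?)\s*</arg1>\s*<rel>\s*(.*?)\s*</rel>\s*<arg2>\s*(.*?)\s*</arg2>"
)

# Format 2: "<confidence> (arg1; rel; arg2)"; other lines are source sentences.
SCORED = re.compile(r"^([01](?:\.\d+)?)\s+\((.*)$")


def parse_tagged(line):
    """Return (arg1, rel, arg2), or None if the line is not a tagged tuple."""
    match = TAGGED.search(line)
    return match.groups() if match else None


def parse_scored(line):
    """Return (confidence, [fields]), or None for sentence lines."""
    match = SCORED.match(line.strip())
    if not match:
        return None
    body = match.group(2)
    # Drop the tuple's closing parenthesis, but keep one that belongs to
    # the last argument, as in "at 0 (degrees)".
    if body.endswith(")") and body.count(")") > body.count("("):
        body = body[:-1]
    fields = [field.strip() for field in body.split(";")]
    return float(match.group(1)), fields


def parse_squadie(tuple_field):
    """Split a decoded Squadie "tuple" string into its tab-separated fields."""
    text = tuple_field.strip()
    if text.startswith("<") and text.endswith(">"):
        text = text[1:-1]
    return tuple(text.split("\t"))


if __name__ == "__main__":
    record = json.loads(
        r'''{"tuple": "<Luckett and Roberson\tfiled a lawsuit over\tSurvivor>"}'''
    )
    print(parse_squadie(record["tuple"]))
    print(parse_scored("0.91 (0.1% water; can lower; the melting temperature of peridotite)"))
```

Because `parse_scored` returns `None` for sentence lines, a loader can attach each scored tuple to the most recent line that did not parse, and then filter on the confidence value.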
