Unsupervised Relation Extraction with Sentence level Distributional Semantics

SURE is an unsupervised relation extraction system that relies on sentence-level distributional semantics (i.e., sentence encoding). For more details, please refer to the paper.

Architecture: system description


Dependencies

You need to have Python 3.6 or above and the following libraries installed:

Sentence-Transformers: https://www.sbert.net/

which you can install by issuing the following command:

pip install -r requirements.txt

Usage:

To run the relation extraction system use the following command:

python main.py corpus entity_type1 entity_type2
Config
between_length: 6             # Maximum number of tokens between two entities       
before_after_window: 3        # Maximum number of tokens before the first entity and after the second entity
similiraty: 0.25              # Cosine similarity threshold during the first iteration
top_similar: 15               # Maximum number of top similar sentences to the query term using cosine similarity
query_term: born in           # Natural language representation for relationship birthPlace
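
The similarity threshold and top_similar settings above can be pictured with a small sketch. This is not SURE's actual code: the function names `cosine` and `select_similar` are hypothetical, the vectors stand in for sentence embeddings that a sentence encoder would produce, and treating the threshold as inclusive is an assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_similar(query_vec, sentence_vecs, threshold=0.25, top_k=15):
    """Return (index, score) pairs for sentences similar to the query.

    Hypothetical helper: keeps sentences whose cosine similarity to the
    query is at least `threshold` (assumed inclusive), sorted by score,
    and truncated to the `top_k` best matches.
    """
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(sentence_vecs)]
    kept = [(i, score) for i, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

With the default values, at most 15 sentences whose similarity to the encoded query term (here, "born in") is 0.25 or higher would be retained.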
corpus

The corpus contains one sentence per line, with tags identifying the type of each named entity, e.g.:

<ORG> Consolidated Edison </ORG>, based in <LOC> New York </LOC>, generated more than $7 billion in annual revenue.
The social media platform <ORG> Facebook,Inc.</ORG> announced it was acquiring <ORG>WhatsApp</ORG>, its largest acquisition to date.
<LOC> Herzogenaurach </LOC> is the home of goods company <ORG> Adidas </ORG>.
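
As an illustration of the tagged format, the following sketch pulls the entity mentions out of one corpus line. The regular expression and the `parse_entities` name are assumptions made for this example, not SURE's own parser; it assumes tags are upper-case type names with a matching closing tag.

```python
import re

# Assumed tag shape: <TYPE> mention </TYPE>, with optional padding spaces.
TAG_RE = re.compile(r"<([A-Z]+)>\s*(.*?)\s*</\1>")


def parse_entities(sentence):
    """Return a list of (entity_type, mention_text) found in a tagged sentence."""
    return [(m.group(1), m.group(2)) for m in TAG_RE.finditer(sentence)]


line = ("<ORG> Consolidated Edison </ORG>, based in <LOC> New York </LOC>, "
        "generated more than $7 billion in annual revenue.")
print(parse_entities(line))
# [('ORG', 'Consolidated Edison'), ('LOC', 'New York')]
```

A mention whose closing tag is missing or malformed does not match the pattern and is skipped by this sketch.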
entity types

Two named-entity types are provided initially for a particular relation. For example, for the relation headquarterIn we provide ORG (Organization) and LOC (Location).
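
A rough sketch of how the two entity types and the between_length and before_after_window settings could combine to pick candidate entity pairs from a sentence. Everything here is an assumption made for illustration: the `candidate_pairs` helper is hypothetical, tokens are split on whitespace, and only pairs where the first type appears before the second in the sentence are considered. SURE's own implementation may differ on each point.

```python
import re

TAG_RE = re.compile(r"<([A-Z]+)>\s*(.*?)\s*</\1>")  # assumed tag shape
ANY_TAG_RE = re.compile(r"</?[A-Z]+>")


def _tokens(text):
    """Whitespace tokens of `text` with any entity tags stripped out."""
    return ANY_TAG_RE.sub(" ", text).split()


def candidate_pairs(sentence, type1, type2, between_length=6, window=3):
    """Hypothetical helper: list (type1, type2) pairs with their context.

    A pair is kept only if at most `between_length` tokens separate the
    two mentions; up to `window` tokens are kept before the first mention
    and after the second one.
    """
    mentions = list(TAG_RE.finditer(sentence))
    pairs = []
    for i, first in enumerate(mentions):
        for second in mentions[i + 1:]:
            if (first.group(1), second.group(1)) != (type1, type2):
                continue
            between = _tokens(sentence[first.end():second.start()])
            if len(between) > between_length:
                continue
            before = _tokens(sentence[:first.start()])
            after = _tokens(sentence[second.end():])
            pairs.append({
                "entity1": first.group(2),
                "entity2": second.group(2),
                "before": before[-window:] if window else [],
                "between": between,
                "after": after[:window],
            })
    return pairs
```

For the first sample sentence with ORG and LOC, this yields a single pair (Consolidated Edison, New York) whose between context is the three tokens ",", "based", "in".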

Dataset used

Dataset Download
NYT-FB dataset Download
Wikipedia_Wikidata dataset Download
English gigaword Download

Run with basic configuration

python main.py "text_corpus"  PER LOC 2

Authors

Contributors

manzoorali29
