

Document Modeling with External Information for Sentence Extraction

This repository contains the code necessary to reproduce the results in the paper:

Document Modeling with External Attention for Sentence Extraction, Shashi Narayan, Ronald Cardenas, Nikos Papasarantopoulos, Shay B. Cohen, Mirella Lapata, Jiangsheng Yu and Yi Chang, ACL 2018, Melbourne, Australia.

Extractive Summarization

To train XNet+ (Title + Caption), run:

python document_summarizer_gpu2.py --max_title_length 1 --max_image_length 10 --train_dir --model_to_load 8 --exp_mode train

from extractive_summ/.

Answer Selection

  1. Datasets and Resources

a) NewsQA

Download the combined dataset from: https://datasets.maluuba.com/NewsQA/dl

Download splitting scripts from NewsQA repo: https://github.com/Maluuba/newsqa

b) SQuAD: https://rajpurkar.github.io/SQuAD-explorer/

c) WikiQA: https://www.microsoft.com/en-us/download/details.aspx?id=52419

d) MS MARCO: http://www.msmarco.org/dataset.aspx

e) 1 billion words benchmark: http://www.statmt.org/lm-benchmark/

  2. Preprocessing

First, train word embeddings on the 1 billion words (1BW) benchmark using word2vec and place the files in answer_selection/datasets/word_emb.

Generate the score files (IDF, ISF, word counts) for each dataset by running

python reformat_corpus.py

from answer_selection/datasets//

The preprocessed files will be placed in the folder: answer_selection/datasets/preprocessed_data/
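The README does not show how reformat_corpus.py computes its scores. As a rough illustration only, the sketch below assumes the common definitions: IDF as log(N / n_w) over documents and ISF as the same quantity over sentences. The toy corpus and the helper name `inverse_frequency` are hypothetical and are not taken from the repository.

```python
import math
from collections import Counter


def inverse_frequency(units):
    """Map each word to log(N / n_w): N units in total, n_w units containing the word."""
    n_units = len(units)
    containing = Counter()
    for tokens in units:
        # Count each word once per unit, regardless of repetitions.
        containing.update(set(tokens))
    return {word: math.log(n_units / count) for word, count in containing.items()}


# Hypothetical toy corpus: each document is a list of tokenized sentences.
documents = [
    [["the", "cat", "sat"], ["the", "dog", "barked"]],
    [["a", "cat", "slept"]],
]

# IDF treats each whole document as one unit.
doc_units = [[tok for sent in doc for tok in sent] for doc in documents]
idf = inverse_frequency(doc_units)

# ISF treats each sentence as one unit.
sent_units = [sent for doc in documents for sent in doc]
isf = inverse_frequency(sent_units)

# Raw word counts over the whole corpus.
word_counts = Counter(tok for sent in sent_units for tok in sent)
```

Words that occur in every unit get a score of zero under this definition; the actual script may smooth or normalize differently.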

  3. Training

Run the scripts run_ in each model folder for training.

  4. Evaluation

Run the scripts eval_ in each model folder for evaluation.
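The README does not say which metrics the evaluation scripts report. Answer selection is commonly scored with ranking metrics such as mean reciprocal rank (MRR) and mean average precision (MAP), so the sketch below assumes those two. The function names and the toy input are hypothetical and are not taken from the repository.

```python
def mean_reciprocal_rank(ranked_labels):
    """Average of 1/rank of the first correct candidate for each question."""
    total = 0.0
    for labels in ranked_labels:
        for rank, is_correct in enumerate(labels, start=1):
            if is_correct:
                total += 1.0 / rank
                break
    return total / len(ranked_labels)


def mean_average_precision(ranked_labels):
    """Average over questions of the mean precision at each correct candidate."""
    total = 0.0
    for labels in ranked_labels:
        hits = 0
        precisions = []
        for rank, is_correct in enumerate(labels, start=1):
            if is_correct:
                hits += 1
                precisions.append(hits / rank)
        # Questions with no correct candidate contribute zero.
        total += sum(precisions) / len(precisions) if precisions else 0.0
    return total / len(ranked_labels)


# Hypothetical input: one list per question, with candidates sorted by model
# score (best first) and 1 marking a correct answer sentence.
ranked = [[0, 1, 0], [1, 0, 1]]
mrr = mean_reciprocal_rank(ranked)
mean_ap = mean_average_precision(ranked)
```

Evaluation setups differ on whether questions with no correct candidate are kept or filtered out, so that handling is also an assumption here.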


