abstractive-text-summarization

This repository and notebook contain code for an in-progress implementation of, and experiments on, the paper Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond.

Requirements

  1. Create conda environment

conda env create -f environment.yml        # gpu

conda env create -f environment-cpu.yml    # cpu

  2. Activate the environment

source activate abs-sum        # gpu

source activate abs-sum-cpu    # cpu

  3. Install dependencies (PyTorch, Fastai, etc.) via:

pip install -r requirements.txt

  4. Download the spaCy English model

python -m spacy download en

Dataset

The dataset used is a subset of the Gigaword dataset and can be found here.

It contains 3,803,955 parallel source & target examples for training and 189,649 examples for validation.

After downloading, we created article-title pairs, saved them in tabular dataset format (.csv), and extracted a sample subset (80,000 pairs for training & 20,000 for validation). This data preparation can be found here.

An example article-title pair looks like this:

source: the algerian cabinet chaired by president abdelaziz bouteflika on sunday adopted the #### finance bill predicated on an oil price of ## dollars a barrel and a growth rate of #.# percent , it was announced here .

target: algeria adopts #### finance bill with oil put at ## dollars a barrel

Experimenting on the complete dataset (~3.8M pairs) would take a very long time and be expensive, so to train and experiment faster we use the 80,000-example sample subset.
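For reference, here is a minimal sketch of how the article-title pairs and the sample subset could be built with pandas; the file names and random seed are assumptions for illustration, not the exact code in the data-preparation notebook.

# minimal sketch of the article-title pair preparation (file names are assumptions)
import pandas as pd

def build_pairs_csv(article_path, title_path, out_csv, n_samples=None):
    # read the parallel source/target files line by line
    with open(article_path, encoding="utf-8") as f:
        articles = [line.strip() for line in f]
    with open(title_path, encoding="utf-8") as f:
        titles = [line.strip() for line in f]
    assert len(articles) == len(titles), "source/target files must be parallel"

    df = pd.DataFrame({"source": articles, "target": titles})
    if n_samples is not None:
        df = df.sample(n=n_samples, random_state=42)  # extract the sample subset
    df.to_csv(out_csv, index=False)  # tabular dataset format (.csv)

# e.g. 80,000 training pairs and 20,000 validation pairs
build_pairs_csv("train.article.txt", "train.title.txt", "train.csv", n_samples=80000)
build_pairs_csv("valid.article.txt", "valid.title.txt", "valid.csv", n_samples=20000)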

Current Features

  • model architecture supports LSTM & GRU (biLSTM-to-uniLSTM or biGRU-to-uniGRU)
  • implements batch data processing
  • implements attention mechanisms (Bahdanau et al. & Luong et al. global dot; see the sketch after this list)
  • implements scheduled sampling (teacher forcing)
  • implements tied embeddings (see the embedding sketch after this list)
  • initializes encoder & decoder embeddings with pretrained vectors (glove.6B.200d)
  • implements custom training callbacks (TensorBoard visualization for PyTorch, save best model & log checkpoint)
  • implements attention plots
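A minimal sketch of the Luong global-dot attention step, assuming the encoder outputs have already been projected to the decoder hidden size; this illustrates the technique rather than reproducing the notebook's exact code.

import torch
import torch.nn.functional as F

def luong_dot_attention(dec_hidden, enc_outputs, src_mask=None):
    # dec_hidden:  (batch, hidden)           current decoder state
    # enc_outputs: (batch, src_len, hidden)  encoder states, same size as the decoder's
    # src_mask:    (batch, src_len)          1 for real tokens, 0 for padding
    scores = torch.bmm(enc_outputs, dec_hidden.unsqueeze(2)).squeeze(2)     # (batch, src_len)
    if src_mask is not None:
        scores = scores.masked_fill(src_mask == 0, float("-inf"))
    attn_weights = F.softmax(scores, dim=1)                                 # (batch, src_len)
    context = torch.bmm(attn_weights.unsqueeze(1), enc_outputs).squeeze(1)  # (batch, hidden)
    return context, attn_weights  # the weights are what the attention plots visualize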
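And a minimal sketch of initializing the embedding layer from pretrained GloVe vectors (glove.6B.200d) and tying the decoder embedding to the output projection; itos (index-to-word list) and glove (word-to-vector dict) are assumed to exist.

import torch
import torch.nn as nn

EMB_DIM = 200  # glove.6B.200d

def build_tied_decoder_embeddings(itos, glove):
    # random init for words missing from GloVe, pretrained vectors otherwise
    weight = torch.randn(len(itos), EMB_DIM) * 0.1
    for i, word in enumerate(itos):
        if word in glove:
            weight[i] = glove[word]
    embedding = nn.Embedding.from_pretrained(weight, freeze=False)
    out_proj = nn.Linear(EMB_DIM, len(itos), bias=False)
    out_proj.weight = embedding.weight  # tied embeddings: one shared weight matrix
    return embedding, out_proj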

To-Do

  • Implement additional linguistic features embeddings
  • Implement generator-pointer switch and replace unknown words by selecting the source token with the highest attention score
  • Implement large vocabulary trick
  • Implement sentence level attention
  • Implement beam search during inference
  • Implement ROUGE evaluation

Baseline Training & Validation Loss

[figure: baseline training & validation loss curves]
