amr2text-summ's Introduction

Work in progress

Guided Neural Language Generation for Abstractive Summarization Using AMR

This repository contains the code for our EMNLP 2018 paper "Guided Neural Language Generation for Abstractive Summarization Using AMR".

Obtaining the Dataset

We used the Abstract Meaning Representation Annotation Release 2.0, which contains manually annotated document and summary AMRs.

Preprocessing the Data

For preprocessing, clone the AMR preprocessing repository.

git clone https://github.com/sheffieldnlp/AMR-Preprocessing

Run the AMR linearization. For generation, the input is the system summary AMR from Liu's summarizer ($F) and the raw AMR dataset ($AMR); here we use the test dataset. If you want to train the model, also run the preprocessing on the training and validation datasets:

export F_TRAIN=/<path to AMR proxy train>/amr-release-2.0-amrs-training.txt
export F_TEST=/<path to AMR proxy test>/amr-release-2.0-amrs-test.txt
export F_DEV=/<path to AMR proxy dev>/amr-release-2.0-amrs-dev.txt
export OUTPUT=/<output path for the results>/
python var_free_amrs.py -f $F_TRAIN -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var
python var_free_amrs.py -f $F_TEST -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var
python var_free_amrs.py -f $F_DEV -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var

For each set (train, test, and dev), the script produces two files: the sentences (.sent) and their corresponding linearized AMRs (.tf).
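
Because the .tf and .sent files are later used as parallel source and target data, it can help to confirm that each pair has the same number of lines before training. The following is a minimal sketch in Python, assuming one linearized AMR per line in the .tf file and one sentence per line in the .sent file. The function names count_lines and check_parallel are illustrative and are not part of either repository.

```python
def count_lines(path):
    """Count the lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def check_parallel(tf_path, sent_path):
    """Return the shared line count, or raise if the two files differ in length.

    Assumption: line i of the .tf file corresponds to line i of the .sent file.
    """
    n_tf = count_lines(tf_path)
    n_sent = count_lines(sent_path)
    if n_tf != n_sent:
        raise ValueError(
            f"{tf_path} has {n_tf} lines but {sent_path} has {n_sent} lines"
        )
    return n_tf
```

A call such as check_parallel(path_to_tf, path_to_sent) could be run once per split (train, test, dev) on the files written to $OUTPUT.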

Training New Model

export SRC=/<path to the linearized AMR tf training file>/all_amr-release-2.0-amrs-training.txt.tf
export TGT=/<path to the sentence training file>/all_amr-release-2.0-amrs-training.txt.sent
export SRC_VALID=/<path to the linearized AMR tf validation file>/all_amr-release-2-0-amrs-dev-all.txt.tf
export TGT_VALID=/<path to the sentence validation file>/all_amr-release-2-0-amrs-dev-all.txt.sent
export SAVE=/<path to save directory>/

python preprocess.py -train_src $SRC -train_tgt $TGT -valid_src $SRC_VALID -valid_tgt $TGT_VALID -save_data $SAVE -src_seq_length 1000 -tgt_seq_length 1000 -shuffle 1

To linearize the test set, where the input is the system summary AMR from Liu's summarizer ($F) and the raw AMR dataset ($AMR), run the following from the AMR preprocessing repository:

export F=/<path to test summarizer output>/summ_ramp_10_passes_len_edges_exp_0
export OUTPUT=/<path to test preprocessed output>
export AMR=/<path to AMR>/amr-release-2.0-amrs-test-proxy.txt
python var_free_amrs.py -is_dir -f $F -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var --with_side -side_file $AMR

Then train the model:

python $WORK/train.py -data $PREPROCESS/van_noord/no_filter_amr_2/data -save_model $MODEL/$TYPE -rnn_size 500 -layers 2 -epochs 2000 -optim sgd -learning_rate 1 -learning_rate_decay 0.8 -encoder_type brnn -global_attention general -seed 1 -dropout 0.5 -batch_size 32
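
The -learning_rate 1 and -learning_rate_decay 0.8 options describe a geometric schedule. As a rough sketch, assuming the rate is multiplied by the decay factor each time a decay step is triggered (when that happens is decided by train.py and is not modeled here), the rate after a given number of decay steps can be computed as below. The function name decayed_rate is hypothetical.

```python
def decayed_rate(initial_rate, decay, n_decay_steps):
    """Learning rate after n_decay_steps multiplicative decays.

    Assumption: each decay step multiplies the current rate by `decay`.
    """
    if n_decay_steps < 0:
        raise ValueError("n_decay_steps must be non-negative")
    return initial_rate * decay ** n_decay_steps


# Values taken from the training command above: -learning_rate 1, -learning_rate_decay 0.8
for step in (0, 1, 5, 10):
    print(step, decayed_rate(1.0, 0.8, step))
```

Under this assumption, ten decay steps would bring the rate down to roughly a tenth of its initial value.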

Generation with New Model

python $WORK/translate.py -src $file -output $INPUT/gen/summ_rigotrio_fluent_side/$(basename $file).system -model $MODEL/rse/sprint_1/acc_53.28_ppl_46.79_e126.pt -replace_unk -side_src $INPUT/processed/rigotrio/body$(basename $file).s -side_tgt $INPUT/processed/rigotrio/body_$(basename $file).sent.s -beam_size 5 -max_length 100 -n_best 1 -batch_size 1 -verbose -psi 0.95 -theta 2.5 -k 15
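
The checkpoint name in the command above (acc_53.28_ppl_46.79_e126.pt) appears to encode validation accuracy, perplexity, and epoch. Assuming that naming pattern holds for every saved checkpoint, a small helper can pick the checkpoint to pass to -model. This is only a sketch: parse_checkpoint_name and best_checkpoint are hypothetical names, and choosing by lowest perplexity is one possible criterion, not necessarily the one used in the paper.

```python
import re

# Assumed pattern: <optional prefix>acc_<float>_ppl_<float>_e<int>.pt
_CHECKPOINT_RE = re.compile(
    r"acc_(?P<acc>\d+(?:\.\d+)?)_ppl_(?P<ppl>\d+(?:\.\d+)?)_e(?P<epoch>\d+)\.pt$"
)


def parse_checkpoint_name(name):
    """Extract accuracy, perplexity, and epoch from a checkpoint file name.

    Returns None if the name does not follow the assumed pattern.
    """
    match = _CHECKPOINT_RE.search(name)
    if match is None:
        return None
    return {
        "acc": float(match.group("acc")),
        "ppl": float(match.group("ppl")),
        "epoch": int(match.group("epoch")),
    }


def best_checkpoint(names):
    """Return the name with the lowest perplexity, ignoring unparseable names."""
    parsed = [(parse_checkpoint_name(n), n) for n in names]
    candidates = [(info["ppl"], n) for info, n in parsed if info is not None]
    if not candidates:
        return None
    return min(candidates)[1]
```

For example, best_checkpoint(os.listdir(model_dir)) would return the file name to append to the model directory, where model_dir is whatever directory was given to -save_model.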
