EditingCL

This repository contains the code for our ACL 2022 paper: An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models.

Running the code

Setup

Run bash scripts/setup.sh to install the libraries and dependencies.

Data

The data for Abstractive Summarization from Toutanova et al. (2016) can be found in data-summ, which contains 6K short input texts with up to 5 summaries each. The Newsela data can be requested from here.

Training

AR  
> bash scripts/train.sh -i 1 -j 1 -m seq2seq -u data-summ

EDITOR (From Reference) 
> bash scripts/train.sh -i 2 -j 1 -m nat -u data-summ 

Editing Roll-in 
> bash scripts/train.sh -i 3 -j 1 -m nat -u data-summ -r experiments/exp-2/checkpoints1/checkpoint_best.pt -a " --use-source 1  --noisy-expert --lr 0.0001 "

Editing CL
> bash scripts/train.sh -i 4 -j 1 -m nat -u data-summ -r experiments/exp-2/checkpoints1/checkpoint_best.pt -a " --use-source 1 --noisy-expert --pacing root --lr 0.0001 "
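The Editing Roll-in and Editing CL commands above both load experiments/exp-2/checkpoints1/checkpoint_best.pt through -r. That file is presumably written by the EDITOR (From Reference) run, since it is launched with -i 2, but this reading is an assumption. A minimal sketch of a guard that checks for the checkpoint before starting the Editing CL run could look like the following; require_checkpoint is a hypothetical helper and is not part of the repository.

```shell
# Hypothetical helper (not part of the repository): succeed only if the
# checkpoint file to initialise from already exists.
require_checkpoint() {
    if [ ! -f "$1" ]; then
        echo "Missing checkpoint: $1" >&2
        return 1
    fi
}

# Path copied verbatim from the training commands above.
CKPT="experiments/exp-2/checkpoints1/checkpoint_best.pt"

# Launch the Editing CL run only when the checkpoint is present;
# otherwise the message above is printed and nothing is started.
if require_checkpoint "$CKPT"; then
    bash scripts/train.sh -i 4 -j 1 -m nat -u data-summ -r "$CKPT" \
        -a " --use-source 1 --noisy-expert --pacing root --lr 0.0001 "
fi
```

The same guard can wrap the Editing Roll-in command by swapping in its -i value and -a arguments.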

The skip-token-refine option is used to ignore the grade tokens on the decoder side for the simplification task. Set it to 2 to ignore both the source and grade tokens.

Evaluation

Example evaluation configs for the summarization tasks can be found in scripts/evaluate_summ.sh.

Acknowledgements

We thank Eleftheria Briakou, Khanh Nguyen, Kianté Brantley, Weijia Xu, the members of the CLIP lab at UMD, and the anonymous ARR reviewers for their helpful and constructive comments.

Cite the work

If you make use of the code, models, or algorithm, please cite our paper:

@inproceedings{agrawal-carpuat-2022-imitation,
    title = "An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models",
    author = "Agrawal, Sweta  and
      Carpuat, Marine",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.520",
    doi = "10.18653/v1/2022.acl-long.520",
    pages = "7550--7563",
    abstract = "We propose a framework for training non-autoregressive sequence-to-sequence models for editing tasks, where the original input sequence is iteratively edited to produce the output. We show that the imitation learning algorithms designed to train such models for machine translation introduces mismatches between training and inference that lead to undertraining and poor generalization in editing scenarios. We address this issue with two complementary strategies: 1) a roll-in policy that exposes the model to intermediate training sequences that it is more likely to encounter during inference, 2) a curriculum that presents easy-to-learn edit operations first, gradually increasing the difficulty of training samples as the model becomes competent. We show the efficacy of these strategies on two challenging English editing tasks: controllable text simplification and abstractive summarization. Our approach significantly improves output quality on both tasks and controls output complexity better on the simplification task.",
}
