SacreROUGE

New (2021-08-04): We now have Docker versions of several evaluation metrics included in the library, which makes it even easier to run them as long as you have Docker installed. Our implementations are wrappers around the metrics included in the Repro library. See here for more information about the Dockerized metrics.

SacreROUGE is a library dedicated to the development and use of summarization evaluation metrics. It can be viewed as an AllenNLP for evaluation metrics (with an emphasis on summarization). The inspiration for the library came from SacreBLEU, a library with a standardized implementation of BLEU and dataset readers for common machine translation datasets. See our paper for more details or this Jupyter Notebook that was presented at the NLP-OSS 2020 and Eval4NLP 2020 workshops for a demo of the library.

The development of SacreROUGE was motivated by three problems:

  • The official implementations of various evaluation metrics do not use a common interface, so running many of them on a dataset is frustrating and time-consuming. SacreROUGE wraps many popular evaluation metrics in a common interface so that it is straightforward and fast to set up and run a new metric.

  • Evaluating metrics can be tricky. There are several different correlation coefficients in common use, there are different levels at which the correlation can be calculated, and comparing system summaries to human summaries requires implementing jackknifing. The evaluation code in SacreROUGE is shared among all of the metrics, so once a new metric implements the common interface, all of the details of the evaluation are taken care of for free.

  • Datasets for evaluating summarization metrics are formatted differently and can be hard to parse (e.g., DUC and TAC). SacreROUGE addresses this problem by providing dataset readers to load and reformat the data into a common schema.
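
To make the jackknifing point above concrete, here is a minimal sketch in plain Python. It does not use SacreROUGE's own code: `toy_recall` and `jackknife` are hypothetical helpers written for this illustration, and the toy metric (best unigram recall over the references) merely stands in for a real metric such as ROUGE. The idea shown is the usual one: a human summary can only be scored against the other k-1 references, so a system summary is scored against every subset of k-1 references and the results are averaged, which keeps the two kinds of scores comparable.

```python
from itertools import combinations
from statistics import mean
from typing import Callable, List, Sequence


def toy_recall(summary: str, references: Sequence[str]) -> float:
    """Toy stand-in for a real metric: best unigram recall over the references."""
    summary_words = set(summary.lower().split())
    recalls = []
    for reference in references:
        reference_words = set(reference.lower().split())
        recalls.append(len(summary_words & reference_words) / len(reference_words))
    return max(recalls)


def jackknife(summary: str,
              references: Sequence[str],
              metric: Callable[[str, Sequence[str]], float]) -> float:
    """Average the metric over every subset that leaves one reference out."""
    if len(references) < 2:
        raise ValueError("jackknifing needs at least two references")
    k = len(references) - 1
    scores: List[float] = [
        metric(summary, list(subset)) for subset in combinations(references, k)
    ]
    return mean(scores)


references = ["the cat sat", "a cat slept", "the dog sat"]

# A system summary is scored against each leave-one-out subset of the references.
system_score = jackknife("the cat sat down", references, toy_recall)

# A human summary (here the first reference) is scored against the other two only.
human_score = toy_recall(references[0], references[1:])

print(f"system (jackknifed): {system_score:.3f}")
print(f"human (held out):    {human_score:.3f}")
```

Both scores are now based on two references each, so they can be compared directly; scoring the system summary against all three references at once would give it an unfair advantage.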

The two main uses of SacreROUGE are to evaluate summarization systems and to evaluate the evaluation metrics themselves by calculating their correlations to human judgments.
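
As an illustration of what "different levels" of correlation means, the sketch below computes a system-level and a summary-level Pearson correlation between a metric and human judgments. It is a simplified stand-alone example under stated assumptions, not the library's evaluation code: the function names and the score matrices are made up for this illustration, and Pearson is only one of the coefficients that could be used. Rows are systems and columns are input documents.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def system_level(metric_scores, human_scores):
    """Average each system's scores over the inputs, then correlate across systems."""
    metric_means = [mean(row) for row in metric_scores]
    human_means = [mean(row) for row in human_scores]
    return pearson(metric_means, human_means)


def summary_level(metric_scores, human_scores):
    """Correlate across systems separately for each input, then average."""
    num_inputs = len(metric_scores[0])
    per_input = []
    for j in range(num_inputs):
        metric_column = [row[j] for row in metric_scores]
        human_column = [row[j] for row in human_scores]
        per_input.append(pearson(metric_column, human_column))
    return mean(per_input)


# Made-up scores: 3 systems (rows) x 2 input documents (columns).
metric_scores = [[0.2, 0.4], [0.4, 0.3], [0.6, 0.8]]
human_scores = [[1, 2], [2, 3], [3, 4]]

print(f"system-level:  {system_level(metric_scores, human_scores):.3f}")
print(f"summary-level: {summary_level(metric_scores, human_scores):.3f}")
```

The two numbers differ even on this tiny example, which is why a metric evaluation should always state the level at which the correlation was calculated.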

Installing

The easiest way to use SacreROUGE is to install the package from PyPI:

pip install sacrerouge

This will add a new sacrerouge bash command to your path, which serves as the primary interface for the library.

Tutorials

We provide several different tutorials for how to use SacreROUGE based on your use case:

Setting up a Dataset

SacreROUGE contains dataset readers that load some summarization datasets and save them in a common format. Run the sacrerouge setup-dataset command to see the available datasets, or check here.

Data Visualization

We have also written two data visualization tools. The first tool visualizes a Pyramid and optional Pyramid annotations on peer summaries. It accepts the pyramid.jsonl and pyramid-annotations.jsonl files which are saved by some of the dataset readers.

The second tool visualizes the n-gram matches that are used to calculate the ROUGE score. It accepts the summaries.jsonl files which are saved by some of the dataset readers.
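
The .jsonl files mentioned above follow the JSON Lines convention of one JSON object per line, so they can also be inspected directly. The following is a minimal sketch of a generic reader. `read_jsonl` is a hypothetical helper, and the field names in the sample records are illustrative assumptions rather than the documented schema; check the files produced by the dataset readers for the actual fields.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Load a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Write a small sample file so the sketch can run on its own.
# The keys below are assumptions made for this illustration only.
sample = [
    {"instance_id": "d1", "summarizer_id": "sys1", "summary": "the cat sat"},
    {"instance_id": "d1", "summarizer_id": "sys2", "summary": "a cat slept"},
]
with tempfile.TemporaryDirectory() as directory:
    sample_path = os.path.join(directory, "summaries.jsonl")
    with open(sample_path, "w", encoding="utf-8") as out:
        for record in sample:
            out.write(json.dumps(record) + "\n")
    loaded = read_jsonl(sample_path)

print(len(loaded), "records loaded")
```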

Papers

Relevant publications which are implemented in the SacreROUGE framework include:

Help

If you have any questions or suggestions, please open an issue or contact me (Dan Deutsch).

Citation

If you use SacreROUGE for your paper, please cite the following paper:

@inproceedings{deutsch-roth-2020-sacrerouge,
    title = {{SacreROUGE: An Open-Source Library for Using and Developing Summarization Evaluation Metrics}},
    author = "Deutsch, Daniel  and
      Roth, Dan",
    booktitle = "Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.nlposs-1.17",
    pages = "120--125"
}
