AxCell: Automatic Extraction of Results from Machine Learning Papers

This repository is the official implementation of AxCell: Automatic Extraction of Results from Machine Learning Papers.

[Figure: pipeline]

Requirements

To create a conda environment named axcell and install the requirements, run:

conda env create -f environment.yml

Additionally, axcell requires Docker, which must be runnable without sudo. Run scripts/pull_docker_images.sh to download the necessary images.
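Before pulling the images, it can help to confirm that Docker really is usable without sudo. The snippet below is a minimal sketch and not part of axcell: the helper name docker_runs_without_sudo is hypothetical, and it simply runs the standard `docker info` command as the current user and reports whether it succeeded.

```python
import shutil
import subprocess


def docker_runs_without_sudo(timeout: float = 30.0) -> bool:
    """Return True if `docker info` succeeds for the current user.

    Hypothetical helper for illustration; it is not part of axcell.
    """
    # No docker binary on PATH means there is nothing to check.
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # A non-zero exit code usually means the daemon is not running
    # or the current user lacks permission to talk to it.
    return result.returncode == 0


if __name__ == "__main__":
    print("docker usable without sudo:", docker_runs_without_sudo())
```

If the check reports False on Linux, the usual remedy is to add the user to the docker group as described in Docker's post-installation notes.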

Datasets

We publish the following datasets:

See the datasets notebook for an example of how to load these datasets. The extraction notebook shows how to use axcell to extract text and tables from papers.

Evaluation

See the evaluation notebook for a full example of how to evaluate AxCell on the PWCLeaderboards dataset.

Training

Pre-trained Models

You can download pretrained models here:

  • axcell: an archive containing the taxonomy, abbreviations, table type classifier and table segmentation model. See the results-extraction notebook for an example of how to load and run the models.
  • language model: ULMFiT language model pretrained on the ArxivPapers dataset.

Results

AxCell achieves the following performance:

Dataset           Macro F1  Micro F1
PWC Leaderboards  21.1      28.7
NLP-TDMS          19.7      25.8
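The Macro F1 and Micro F1 columns aggregate the same per-group counts in different ways. The sketch below uses the generic textbook definitions on made-up toy counts. It is not AxCell's evaluation code, and the grouping AxCell uses is defined in the evaluation notebook. The names f1, micro_f1, macro_f1 and the toy dictionary are illustrative assumptions.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) for one group
Counts = Tuple[int, int, int]


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(groups: Dict[str, Counts]) -> float:
    """Pool the counts over all groups, then compute a single F1."""
    tp = sum(c[0] for c in groups.values())
    fp = sum(c[1] for c in groups.values())
    fn = sum(c[2] for c in groups.values())
    return f1(tp, fp, fn)


def macro_f1(groups: Dict[str, Counts]) -> float:
    """Compute F1 per group, then take the unweighted mean."""
    if not groups:
        return 0.0
    return sum(f1(*c) for c in groups.values()) / len(groups)


# Toy counts, invented for illustration only.
toy = {"group_a": (8, 2, 2), "group_b": (1, 3, 4)}

if __name__ == "__main__":
    print("macro F1:", round(macro_f1(toy), 3))
    print("micro F1:", round(micro_f1(toy), 3))
```

In this toy case the small, poorly scored group pulls the macro average down more than it does the micro score, because the macro average weights every group equally regardless of size.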

License

AxCell is released under the Apache 2.0 license.

Citation

The pipeline is described in the following paper:

@inproceedings{axcell,
    title={AxCell: Automatic Extraction of Results from Machine Learning Papers},
    author={Marcin Kardas and Piotr Czapla and Pontus Stenetorp and Sebastian Ruder and Sebastian Riedel and Ross Taylor and Robert Stojnic},
    year={2020},
    booktitle={2004.14356}
}
