ChartSumm: A Comprehensive Benchmark for Automatic Chart Summarization of Long and Short Summaries

Authors: Raian Rahman, Rizvi Hasan, Abdullah Al Farhad, Md Tahmid Rahman Laskar, Md. Hamjajul Ashmafee, Abu Raihan Mostofa Kamal

Accepted in: The 36th Canadian Conference on Artificial Intelligence (CANAI)

Paper link: Link

Updates

  • Add Abstract
  • Add data
  • Add images

Abstract

Automatic chart-to-text summarization is an effective tool for visually impaired people, and it also gives users precise insights into tabular data in natural language. A large and well-structured dataset is always a key part of data-driven models. In this paper, we propose ChartSumm: a large-scale benchmark dataset consisting of a total of 84,363 charts along with their metadata and descriptions, covering a wide range of topics and chart types, to generate short and long summaries. Extensive experiments with strong baseline models show that even though these models generate fluent and informative summaries and achieve decent scores on various automatic evaluation metrics, they often suffer from hallucination, miss important data points, and explain complex trends in the charts incorrectly. We also investigated the potential of expanding ChartSumm to other languages using automated translation tools. These findings make our dataset a challenging benchmark for future research.

Data

Link to data: Drive Link

Data Description

The dataset is provided in JSON format, with each file containing the following fields:

  • x_label: The label for the x-axis of the data.
  • y_label: A list of lists containing the labels for the y-axis.
  • data: A dictionary in which each key is a column name and its value is the list of values for that column.
  • title: The title or description of the dataset.
  • summary: A summary of the chart data providing key information and statistics (for example, about voting intention).
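
As a rough illustration of how a record with these fields might be read, here is a minimal Python sketch. The sample record, its values, and the helper names `parse_record` and `to_rows` are all hypothetical and invented for this example; the actual files in the Drive link may nest or name things differently, so treat this as a starting point rather than the dataset's official loader.

```python
import json

# Hypothetical record that follows the field list above. The values are
# invented for illustration and are NOT taken from the ChartSumm dataset.
SAMPLE_RECORD = """
{
  "x_label": "Year",
  "y_label": [["Party A"], ["Party B"]],
  "data": {
    "Year": [2019, 2020, 2021],
    "Party A": [40, 42, 39],
    "Party B": [35, 33, 37]
  },
  "title": "Voting intention by year",
  "summary": "Party A led in every year shown."
}
"""

# Field names as listed in the data description.
REQUIRED_FIELDS = ("x_label", "y_label", "data", "title", "summary")


def parse_record(text):
    """Parse one JSON record and check that the documented fields exist."""
    record = json.loads(text)
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return record


def to_rows(record):
    """Turn the column-oriented `data` dictionary into a list of row dicts."""
    columns = list(record["data"])
    lengths = {len(record["data"][name]) for name in columns}
    if len(lengths) > 1:
        raise ValueError("columns have different lengths")
    value_lists = (record["data"][name] for name in columns)
    return [dict(zip(columns, values)) for values in zip(*value_lists)]


record = parse_record(SAMPLE_RECORD)
rows = to_rows(record)
print(record["title"])
print(rows[0])
```

To read a real file instead of the inline sample, the same `parse_record` function could be given the text of a downloaded file, assuming each file holds a single record as described above.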

chartsumm's Issues

Code?

Hi Raian,

Thank you for your impressive work. I am also working on a similar research project and would like to use your dataset. Could you upload the code needed to replicate the results with the T5 and BART baseline models? Thank you very much.

Chart download

I did not see any charts at the download link you provided. Is there another link?
