franciellevargas / factnews

FactNews is the first dataset for predicting sentence-level factuality of news reporting. Furthermore, we provide baseline results for sentence-level factuality and media bias prediction in Portuguese. FactNews is composed of 6,191 sentences annotated according to the factuality and media bias definitions proposed by AllSides.

fact-checking factuality-checking fake-news-classification news news-credibility dataset media-bias portuguese-brazilian


Sentence-Level Annotated Dataset for Predicting Factuality of News and Bias of Media Outlets


Automated fact-checking and news credibility verification at scale require accurate prediction of news factuality and media bias. Here, we introduce a large sentence-level dataset, titled FactNews, composed of 6,191 sentences expertly annotated according to the factuality and media bias definitions proposed by AllSides. We used FactNews to assess the overall reliability of news sources by formulating two text classification problems: predicting the sentence-level factuality of news reporting and predicting the bias of media outlets. Our experiments show that biased sentences contain more words than factual sentences and display a predominance of emotions. Hence, the fine-grained analysis of subjectivity and impartiality of news articles showed promising results for predicting the reliability of the entire media outlet. Finally, due to the severity of fake news and political polarization in Brazil, and the lack of research for Portuguese, both the dataset and the baseline were proposed for Brazilian Portuguese.

The following table describes in detail the FactNews labels, documents, and stories:



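| Factual | Quotes | Biased | Total sentences | Total news stories | Total news documents |
|---------|--------|--------|-----------------|--------------------|----------------------|
| 4,242   | 1,391  | 558    | 6,191           | 100                | 300                  |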
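As a quick sanity check on the label distribution, the per-label counts from the table can be summed and turned into shares. This is only a minimal sketch: the dictionary keys and variable names are illustrative and not part of the dataset's schema, and the counts are copied by hand from the table rather than read from the dataset files.

```python
# Label counts copied by hand from the FactNews table (names are illustrative).
label_counts = {"factual": 4242, "quotes": 1391, "biased": 558}
total_documents = 300  # total news documents reported in the same table

# The per-label counts should add up to the 6,191 sentences stated in the description.
total = sum(label_counts.values())

# Share of each label, useful for judging class imbalance before training a classifier.
shares = {label: count / total for label, count in label_counts.items()}

for label, count in label_counts.items():
    print(f"{label:8s} {count:5d}  {shares[label]:.1%}")
print(f"{'total':8s} {total:5d}")

# Average number of annotated sentences per news document.
print(f"sentences per document: {total / total_documents:.1f}")
```

The factual class makes up roughly two thirds of the sentences, so any F1-Score on this data is best read against that imbalance.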


| Media 1            | Media 2 | Media 3 |
|--------------------|---------|---------|
| Folha de São Paulo | Estadão | O Globo |


| Sentence-Level Media Bias Prediction    | Sentence-Level Factuality Prediction    |
|-----------------------------------------|-----------------------------------------|
| 67% (F1-Score) by Fine-tuned mBert-case | 88% (F1-Score) by Fine-tuned mBert-case |

CITING

Vargas, F., Jaidka, K., Pardo, T.A.S., Benevenuto, F. (2023). Predicting Sentence-Level Factuality of News and Bias of Media Outlets. Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pp.1197--1206. Varna, Bulgaria. https://aclanthology.org/2023.ranlp-1.127.


BIBTEX

@inproceedings{vargas-etal-2023-predicting,
    title = "Predicting Sentence-Level Factuality of News and Bias of Media Outlets",
    author = "Vargas, Francielle and Jaidka, Kokil and Pardo, Thiago and Benevenuto, Fabr{\'\i}cio",
    editor = "Mitkov, Ruslan and Angelova, Galia",
    booktitle = "Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing",
    month = sep,
    year = "2023",
    address = "Varna, Bulgaria",
    publisher = "INCOMA Ltd., Shoumen, Bulgaria",
    url = "https://aclanthology.org/2023.ranlp-1.127",
    pages = "1197--1206",
}



