The dataset released in this repository is the extended version of the HaSpeeDe 2020 dataset.
It contains Italian tweets and news headlines annotated with the following labels: hate speech, stereotype, irony, and sarcasm. This is the gold version of the dataset; the disaggregated annotations can be found in the Ontology of Dangerous Speech Messages (O-Dang!) at https://github.com/marcostranisci/o-dang (more info).
The dataset is password-protected. To access the data, please contact the main author at [email protected].
If you use this dataset, please cite the following contributions. Thanks!
@article{Frenda_Patti_Rosso_2023,
  title={Killing me softly: Creative and cognitive aspects of implicitness in abusive language online},
  author={Frenda, Simona and Patti, Viviana and Rosso, Paolo},
  journal={Natural Language Engineering},
  volume={29},
  number={6},
  pages={1516--1537},
  year={2023},
  DOI={10.1017/S1351324922000316}
}
and
@InProceedings{10.1007/978-3-031-42448-9_4,
  author="Frenda, Simona and Patti, Viviana and Rosso, Paolo",
  editor="Arampatzis, Avi and Kanoulas, Evangelos and Tsikrika, Theodora and Vrochidis, Stefanos and Giachanou, Anastasia and Li, Dan and Aliannejadi, Mohammad and Vlachos, Michalis and Faggioli, Guglielmo and Ferro, Nicola",
  title="When Sarcasm Hurts: Irony-Aware Models for Abusive Language Detection",
  booktitle="Experimental IR Meets Multilinguality, Multimodality, and Interaction",
  year="2023",
  publisher="Springer Nature Switzerland",
  address="Cham",
  pages="34--47",
  isbn="978-3-031-42448-9"
}