Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in different languages. Exploiting this, we introduce a new multilingual dataset (X-WikiRE), framing relation extraction as a multilingual machine reading problem. We show that by leveraging this resource it is possible to robustly transfer models cross-lingually and that multilingual support significantly improves (zero-shot) relation extraction, enabling the population of low-resourced KBs from their well-populated counterparts.
Read the full thesis from the MasterThesis.pdf file.
See the X-WikiRE repository for the code used to create the dataset.
Work done while visiting CoAStaL Lab @ the University of Copenhagen.
@inproceedings{abdou-etal-2019-x,
    title = "X-{W}iki{RE}: A Large, Multilingual Resource for Relation Extraction as Machine Comprehension",
    author = "Abdou, Mostafa and
      Sas, Cezar and
      Aralikatte, Rahul and
      Augenstein, Isabelle and
      S{\o}gaard, Anders",
    booktitle = "Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-6130",
    doi = "10.18653/v1/D19-6130",
    pages = "265--274",
    abstract = "Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in different languages. Exploiting this, we introduce a new multilingual dataset (X-WikiRE), framing relation extraction as a multilingual machine reading problem. We show that by leveraging this resource it is possible to robustly transfer models cross-lingually and that multilingual support significantly improves (zero-shot) relation extraction, enabling the population of low-resourced KBs from their well-populated counterparts.",
}