AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages

Muhammad, SH; Abdulmumin, I; Ayele, AA; Ousidhoum, N; Adelani, DI; Yimam, SM; Ahmad, IS; ... Arthur, S (2023) AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (pp. 13968-13981). Association for Computational Linguistics. Green open access.

2023.emnlp-main.862.pdf - Published Version

Abstract

Africa is home to over 2,000 languages from more than six language families and has the highest linguistic diversity among all continents. These include 75 languages with at least one million speakers each. Yet, there is little NLP research conducted on African languages. Crucial to enabling such research is the availability of high-quality annotated datasets. In this paper, we introduce AfriSenti, a sentiment analysis benchmark that contains more than 110,000 tweets in 14 African languages (Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and Yorùbá) from four language families. The tweets were annotated by native speakers and used in the AfriSenti-SemEval shared task. We describe the data collection methodology, annotation process, and the challenges we dealt with when curating each dataset. We further report baseline experiments conducted on the different datasets and discuss their usefulness.

Type: Proceedings paper
Title: AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
Event: Conference on Empirical Methods in Natural Language Processing
ISBN-13: 9798891760608
Open access status: An open access version is available from UCL Discovery
Publisher version: https://aclanthology.org/2023.emnlp-main.862.pdf
Language: English
Additional information: This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10188836