Quantifying the Bias of Transformer-Based Language Models for African American English in Masked Language Modeling

Salutari, F; Ramos, J; Rahmani, HA; Linguaglossa, L; Lipani, A; (2023) Quantifying the Bias of Transformer-Based Language Models for African American English in Masked Language Modeling. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). (pp. 532-543). Springer Nature. Green open access

Quantifying_the_Bias___PAKDD_2023.pdf - Accepted Version

Abstract

In recent years, groundbreaking transformer-based language models (LMs) have made tremendous advances in natural language processing (NLP) tasks. However, measuring their fairness with respect to different social groups remains an unsolved problem. In this paper, we propose and thoroughly validate an evaluation technique to assess the quality and bias of language model predictions on transcripts of both spoken African American English (AAE) and Spoken American English (SAE). Our analysis reveals the presence of a bias towards SAE encoded by state-of-the-art LMs such as BERT and DistilBERT and a lower bias in distilled LMs. We also observe a bias towards AAE in RoBERTa and BART. Additionally, we show evidence that this disparity is present across all the LMs when we only consider the grammar and the syntax specific to AAE.

Type: Proceedings paper
Title: Quantifying the Bias of Transformer-Based Language Models for African American English in Masked Language Modeling
Event: Advances in Knowledge Discovery and Data Mining: 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023
ISBN-13: 9783031333736
Open access status: An open access version is available from UCL Discovery
DOI: 10.1007/978-3-031-33374-3_42
Publisher version: https://doi.org/10.1007/978-3-031-33374-3_42
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
Keywords: Language Model, Transformers, Bias and Fairness, Evaluation
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Civil, Environ and Geomatic Eng
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10184454