Milana, Federico;
(2025)
Evaluating Interaction with Machine Learning Text Classifiers and Interpretability Techniques.
Doctoral thesis (Ph.D), UCL (University College London).
Abstract
As Machine Learning (ML) becomes increasingly integrated into society and more users interact with ML-driven systems, understanding how they perceive and engage with these technologies grows increasingly important. This Ph.D. thesis explores the user experience, usability, interpretability, and cognitive biases of ML text classifiers through two research projects based on a desktop application developed to support thematic analysis, and a separate user evaluation of interpretability techniques.

The Thematic Analysis Coding Assistant (TACA) enables users to import an initial thematic analysis, extracts labelled sentences, trains an offline gradient boosting classifier, and generates coding suggestions. Users can iteratively re-train the model after re-labelling individual sentences or batches of sentences. A user study with 20 participants without ML expertise revealed that participants critically reflected on their analysis, gained new thematic insights, and adapted their interpretative stance, while also showing misconceptions about ML concepts, positivist views, and self-blame for poor model performance. A second study reports on an autoethnography of the use of TACA, revealing different re-labelling and model inspection strategies, reflecting on potential structural changes to the analysis, and examining the positionality of the user as a developer, a researcher, and a participant. The findings provide complementary insights into how ML can support and challenge analytical processes, personal reflections, and perceptions of the model.

Building on the findings of the first two studies, a third study evaluates two popular local interpretability methods for text classification, LIME and SHAP, and a proposed global method using LLM-generated summaries based on LIME importance weights. Among 128 participants, those without explanations identified broader topics, those using LIME and SHAP focused on individual terms, and those using summaries identified more features overall. However, none of the methods significantly improved model prediction accuracy. Together, these studies contribute to research on how users perceive and interact with ML, and on the implications of system and interaction design for improving the understanding of ML concepts.
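The TACA workflow described above (import labelled sentences, train an offline gradient boosting classifier, generate coding suggestions) can be sketched with standard scikit-learn components. This is a minimal illustrative sketch only, not the thesis's actual implementation: the sentences, code labels, and pipeline choices here are invented for demonstration.

```python
# Illustrative sketch of a TACA-style coding loop (not the thesis's code):
# labelled sentences -> TF-IDF features -> gradient boosting -> suggestions.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy labelled sentences standing in for an imported thematic analysis.
sentences = [
    "The interface felt intuitive and easy to learn",
    "I was confused by the model's suggestions",
    "Retraining helped the suggestions improve",
    "The error messages were hard to understand",
]
codes = ["usability", "confusion", "model_feedback", "confusion"]

# Offline classifier trained on the user's current labels.
model = make_pipeline(
    TfidfVectorizer(),
    GradientBoostingClassifier(random_state=0),
)
model.fit(sentences, codes)

# Coding suggestion for a not-yet-coded sentence; in TACA the user could
# accept or re-label it and then re-train the model iteratively.
suggestion = model.predict(["The suggestions got better after retraining"])[0]
print(suggestion)
```

Re-labelling a sentence here would simply mean updating an entry of `codes` (or appending a new labelled sentence) and calling `fit` again, which mirrors the iterative re-train loop the abstract describes.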
Type: | Thesis (Doctoral)
---|---
Qualification: | Ph.D
Title: | Evaluating Interaction with Machine Learning Text Classifiers and Interpretability Techniques
Open access status: | An open access version is available from UCL Discovery
Language: | English
Additional information: | Copyright © The Author 2025. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author's request.
Keywords: | HCI, AI, Text Classification, Interactive Machine Learning, Interpretable Machine Learning, User Studies, Autoethnography
UCL classification: | UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10206080