Using large language models to detect outcomes in qualitative studies of adolescent depression


Xin, Alison W; Nielson, Dylan M; Krause, Karolin Rose; Fiorini, Guilherme; Midgley, Nick; Pereira, Francisco; Lossio-Ventura, Juan Antonio (2024) Using large language models to detect outcomes in qualitative studies of adolescent depression. Journal of the American Medical Informatics Association (JAMIA), Article ocae298. 10.1093/jamia/ocae298. (In press).

Midgley_outcomes_that_matter_to_depressed_adolescents (002).pdf - Accepted Version
Access restricted to UCL open access staff until 12 December 2025.

Abstract

OBJECTIVE: We aim to use large language models (LLMs) to detect mentions of more nuanced psychotherapeutic outcomes and impacts than previously considered in transcripts of interviews with adolescents with depression. Our clinical authors previously created a novel coding framework containing fine-grained therapy outcomes that go beyond binary classification (eg, depression vs control), based on qualitative analysis embedded within a clinical study of depression. Moreover, we seek to demonstrate that embeddings from LLMs are informative enough to accurately label these experiences.

MATERIALS AND METHODS: Data were drawn from interviews in which text segments were annotated with different outcome labels. Five different open-source LLMs were evaluated on classifying outcomes from the coding framework. Classification experiments were carried out on the original interview transcripts. Furthermore, we repeated those experiments on versions of the data produced by breaking those segments into conversation turns, or by keeping only non-interviewer utterances (monologues).

RESULTS: We used classification models to predict 31 outcomes and 8 derived labels, for 3 different text segmentations. Area under the ROC curve scores ranged between 0.6 and 0.9 for the original segmentation and between 0.7 and 1.0 for the monologues and turns.

DISCUSSION: LLM-based classification models could identify outcomes important to adolescents, such as friendships or academic and vocational functioning, in text transcripts of patient interviews. By using clinical data, we also aim to generalize better to clinical settings than studies based on public social media data.

CONCLUSION: Our results demonstrate that fine-grained therapy outcome coding in psychotherapeutic text is feasible, and can be used to support the quantification of important outcomes for downstream uses.
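As a rough illustration of the segmentations and the evaluation metric mentioned in the abstract, the sketch below derives conversation turns and a monologue from one annotated segment, and computes the area under the ROC curve for a single outcome label from classifier scores. It is not the authors' pipeline: the `SPEAKER: text` line format, the speaker names, the example segment, and all function names are assumptions made for illustration, and the exact definition of a "turn" in the study may differ.

```python
from typing import List, Sequence, Tuple

# Assumed (hypothetical) transcript format: one utterance per line, "SPEAKER: text".
Utterance = Tuple[str, str]


def parse_segment(segment: str) -> List[Utterance]:
    """Split an annotated segment into (speaker, text) pairs."""
    utterances = []
    for line in segment.strip().splitlines():
        speaker, _, text = line.partition(":")
        utterances.append((speaker.strip(), text.strip()))
    return utterances


def to_turns(utterances: Sequence[Utterance]) -> List[str]:
    """One text unit per utterance, treating each utterance as a turn."""
    return [text for _, text in utterances]


def to_monologue(utterances: Sequence[Utterance], interviewer: str = "Interviewer") -> str:
    """Keep only non-interviewer utterances, joined into a single text."""
    return " ".join(text for speaker, text in utterances if speaker != interviewer)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive example outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up segment, for illustration only.
segment = """Interviewer: How is school?
YP: Better, I see my friends more.
Interviewer: And at home?
YP: Calmer."""

utterances = parse_segment(segment)
turns = to_turns(utterances)
monologue = to_monologue(utterances)

# Made-up scores for one outcome label over four text units.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6])
```

In a setup like this, each text unit (original segment, turn, or monologue) would be embedded by an LLM and passed to a per-label classifier, and an AUC would be computed separately for each outcome label.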

Type:	Article
Title:	Using large language models to detect outcomes in qualitative studies of adolescent depression
Location:	England
DOI:	10.1093/jamia/ocae298
Publisher version:	https://doi.org/10.1093/jamia/ocae298
Language:	English
Additional information:	This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords:	BERT, Llama 2, Llama 3, adolescent depression, depression outcomes, large language models, mental health
UCL classification:	UCL
	UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
	UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
	UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences
	UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Clinical, Edu and Hlth Psychology
URI:	https://discovery-pp.ucl.ac.uk/id/eprint/10203098
