UCL Discovery Stage

Comparison of performance of automatic recognizers for stutters in speech trained with event or interval markers

Barrett, Liam; Tang, Kevin; Howell, Peter (2024) Comparison of performance of automatic recognizers for stutters in speech trained with event or interval markers. Frontiers in Psychology, 15, Article 1155285. 10.3389/fpsyg.2024.1155285. Green open access

Abstract

Introduction: Automatic recognition of stutters (ARS) from speech recordings can facilitate objective assessment and intervention for people who stutter. However, the performance of ARS systems may depend on how the speech data are segmented and labelled for training and testing. This study compared two segmentation methods: event-based, which delimits speech segments by their fluency status, and interval-based, which uses fixed-length segments regardless of fluency.

Methods: Machine learning models were trained and evaluated on interval-based and event-based stuttered speech corpora. The models used acoustic and linguistic features extracted from the speech signal and the transcriptions generated by a state-of-the-art automatic speech recognition system.

Results: The results showed that event-based segmentation led to better ARS performance than interval-based segmentation, as measured by the area under the curve (AUC) of the receiver operating characteristic. The results suggest differences in the quality and quantity of the data because of segmentation method. The inclusion of linguistic features improved the detection of whole-word repetitions, but not other types of stutters.

Discussion: The findings suggest that event-based segmentation is more suitable for ARS than interval-based segmentation, as it preserves the exact boundaries and types of stutters. The linguistic features provide useful information for separating supra-lexical disfluencies from fluent speech but may not capture the acoustic characteristics of stutters. Future work should explore more robust and diverse features, as well as larger and more representative datasets, for developing effective ARS systems.
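
Illustrative sketch (not the authors' pipeline): the difference between event-based and interval-based labelling, and the AUC metric named in the Results, can be shown with a few lines of Python. Everything here is an assumption made for illustration: the `Event` structure, the 3-second interval length, the rule that an interval counts as stuttered if any non-fluent event overlaps it, and the example annotations and scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One annotated stretch of speech (hypothetical structure)."""
    start: float  # seconds
    end: float    # seconds
    label: str    # e.g. "fluent", "repetition", "prolongation"


def to_intervals(events, interval_len=3.0, fluent_label="fluent"):
    """Cut a recording into fixed-length intervals and mark each one as
    stuttered (1) if any non-fluent event overlaps it, else fluent (0).
    The overlap rule and interval length are illustrative assumptions."""
    if not events:
        return []
    total = max(e.end for e in events)
    intervals = []
    n = 0
    while n * interval_len < total:
        start = n * interval_len
        end = min(start + interval_len, total)
        stuttered = any(
            e.label != fluent_label and e.start < end and e.end > start
            for e in events
        )
        intervals.append((start, end, int(stuttered)))
        n += 1
    return intervals


def roc_auc(labels, scores):
    """AUC of the ROC curve in its rank-based form: the probability that a
    random positive example outscores a random negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up event annotations: a 0.9 s repetition inside 7 s of speech.
events = [
    Event(0.0, 2.5, "fluent"),
    Event(2.5, 3.4, "repetition"),
    Event(3.4, 7.0, "fluent"),
]
intervals = to_intervals(events, interval_len=3.0)
print(intervals)

# Made-up classifier scores for the three intervals above.
labels = [lab for _, _, lab in intervals]
print(roc_auc(labels, [0.9, 0.4, 0.6]))
```

In this toy case the repetition straddles an interval boundary, so two of the three intervals are labelled stuttered even though the stutter itself lasts under a second. That is one way fixed-length segmentation can blur the exact boundaries that event-based annotation keeps.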

Type: Article
Title: Comparison of performance of automatic recognizers for stutters in speech trained with event or interval markers
Open access status: An open access version is available from UCL Discovery
DOI: 10.3389/fpsyg.2024.1155285
Publisher version: https://doi.org/10.3389/fpsyg.2024.1155285
Language: English
Additional information: © 2024 Barrett, Tang and Howell. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Keywords: Stuttering, speech pathology, automatic speech recognition, machine learning, Computational paralinguistics, Language diversity, whisper
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Experimental Psychology
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10186812