Barrett, Liam; Tang, Kevin; Howell, Peter (2024) Comparison of performance of automatic recognizers for stutters in speech trained with event or interval markers. Frontiers in Psychology, 15, Article 1155285. 10.3389/fpsyg.2024.1155285.
Abstract
Introduction: Automatic recognition of stutters (ARS) from speech recordings can facilitate objective assessment and intervention for people who stutter. However, the performance of ARS systems may depend on how the speech data are segmented and labelled for training and testing. This study compared two segmentation methods: event-based, which delimits speech segments by their fluency status, and interval-based, which uses fixed-length segments regardless of fluency.

Methods: Machine learning models were trained and evaluated on interval-based and event-based stuttered speech corpora. The models used acoustic and linguistic features extracted from the speech signal and the transcriptions generated by a state-of-the-art automatic speech recognition system.

Results: The results showed that event-based segmentation led to better ARS performance than interval-based segmentation, as measured by the area under the curve (AUC) of the receiver operating characteristic. The results suggest differences in the quality and quantity of the data because of segmentation method. The inclusion of linguistic features improved the detection of whole-word repetitions, but not other types of stutters.

Discussion: The findings suggest that event-based segmentation is more suitable for ARS than interval-based segmentation, as it preserves the exact boundaries and types of stutters. The linguistic features provide useful information for separating supra-lexical disfluencies from fluent speech but may not capture the acoustic characteristics of stutters. Future work should explore more robust and diverse features, as well as larger and more representative datasets, for developing effective ARS systems.
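
The difference between the two segmentation schemes named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the 3-second window, the "any overlap" labelling rule, the label names and the function `interval_segments` are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

# (start_seconds, end_seconds, label); label names are illustrative assumptions
Segment = Tuple[float, float, str]


def interval_segments(events: List[Segment], total_s: float,
                      window_s: float = 3.0) -> List[Segment]:
    """Cut the recording into fixed-length windows regardless of fluency.

    Assumed labelling rule: a window is "stutter" if any stutter event
    overlaps it, otherwise "fluent".
    """
    segments = []
    n_windows = math.ceil(total_s / window_s)
    for i in range(n_windows):
        start = i * window_s
        end = min((i + 1) * window_s, total_s)
        overlaps = any(
            label == "stutter" and ev_start < end and ev_end > start
            for ev_start, ev_end, label in events
        )
        segments.append((start, end, "stutter" if overlaps else "fluent"))
    return segments


# Event-based segmentation keeps the annotated boundaries as they are.
events = [(0.0, 2.0, "fluent"), (2.0, 2.5, "stutter"), (2.5, 9.0, "fluent")]

# Interval-based segmentation re-cuts the same 9 s recording into 3 s windows.
windows = interval_segments(events, total_s=9.0, window_s=3.0)
```

In this toy case the 0.5 s stutter is isolated exactly in the event-based list, whereas in the interval-based list it is absorbed into a 3 s window that is mostly fluent speech. This is one way to picture the abstract's remark that event-based segmentation preserves the exact boundaries of stutters.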
Type: | Article |
---|---|
Title: | Comparison of performance of automatic recognizers for stutters in speech trained with event or interval markers |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.3389/fpsyg.2024.1155285 |
Publisher version: | https://doi.org/10.3389/fpsyg.2024.1155285 |
Language: | English |
Additional information: | © 2024 Barrett, Tang and Howell. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
Keywords: | Stuttering, speech pathology, automatic speech recognition, machine learning, Computational paralinguistics, Language diversity, whisper |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Experimental Psychology |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10186812 |