UCL Discovery Stage

The Korean Speech Recognition Sentences: A Large Corpus for Evaluating Semantic Context and Language Experience in Speech Perception

Song, Jieun; Kim, Byungjun; Kim, Minjeong; Iverson, Paul (2023) The Korean Speech Recognition Sentences: A Large Corpus for Evaluating Semantic Context and Language Experience in Speech Perception. Journal of Speech, Language, and Hearing Research, 66 (9) pp. 3399-3412. DOI: 10.1044/2023_JSLHR-23-00137. Green open access

song-et-al-2023-the-korean-speech-recognition-sentences-a-large-corpus-for-evaluating-semantic-context-and-language.pdf - Published Version


Abstract

PURPOSE: The aim of this study was to develop and validate a large Korean sentence set with varying degrees of semantic predictability that can be used for testing speech recognition and lexical processing.

METHOD: Sentences differing in the degree of final-word predictability (predictable, neutral, and anomalous) were created with words selected to be suitable for both native and nonnative speakers of Korean. Semantic predictability was evaluated through a series of cloze tests in which native (n = 56) and nonnative (n = 19) speakers of Korean participated. This study also used a computer language model to evaluate final-word predictabilities, a novel approach adopted to reduce the human effort of validating a large number of sentences; it produced results comparable to those of the cloze tests. In a speech recognition task, the sentences were presented to native (n = 23) and nonnative (n = 21) speakers of Korean in speech-shaped noise at two levels of noise.

RESULTS: The results of the speech-in-noise experiment demonstrated that the intelligibility of the sentences was similar to that of related English corpora. That is, intelligibility was significantly different depending on the semantic condition, and the sentences had the right degree of difficulty for assessing intelligibility differences depending on noise levels and language experience.

CONCLUSIONS: This corpus (1,021 sentences in total) adds to the target languages available in speech research and will allow researchers to investigate a range of issues in speech perception in Korean.

Type: Article
Title: The Korean Speech Recognition Sentences: A Large Corpus for Evaluating Semantic Context and Language Experience in Speech Perception
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1044/2023_JSLHR-23-00137
Publisher version: http://dx.doi.org/10.1044/2023_jslhr-23-00137
Language: English
Additional information: © 2023 The Authors. This work is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences > Speech, Hearing and Phonetic Sciences
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10191026
Downloads since deposit: 532
