Xu, Anqi;
(2022)
Inside the black box of human vocal learning: A simulation approach.
Doctoral thesis (Ph.D), UCL (University College London).
Text: Inside the black box of huaman vocal learning3.0final.pdf - Submitted Version (35MB)
Abstract
Children learn to speak despite age-related anatomical differences that give rise to significant discrepancies between their vocalisations and those of adults. How children overcome these obstacles without explicit instruction remains unclear. Influential accounts suggest that, in both songbirds and humans, vocal learning is achieved by producing sounds to match auditory memory. However, observational studies alone cannot determine whether auditory-guided learning is the key mechanism. Here, I use computational modelling to test the feasibility of this hypothesis by training an articulatory synthesiser with three-dimensional vocal tract models of an adult and of children at different ages to simulate the learning of English words. The model involves two kinds of auditory guidance: 1) acoustic features to simulate universal perception of phonetic differences in all languages, and 2) a deep-learning-based automatic phoneme recogniser to simulate language-specific perception of sound contrasts in native languages. In the listening experiments, words trained with the automatic phoneme recogniser were more intelligible than those trained with acoustic features, which shows that language-specific perception can resolve the longstanding problem of anatomical differences between speakers. This demonstrates that auditory-guided learning is indeed feasible. In contrast with previous simulation attempts that were limited to vowel acquisition, the current model learned words containing consonant-vowel sequences that approach the intelligibility of natural speech. The study also found that the embodied articulatory dynamics limited the scope of vocal practice and that somatosensory feedback provided additional benefits. Yet learning was better and easier with the adult than with the child articulatory systems. The model experienced challenges in learning certain speech sounds, resembling the patterns of child speech development.
The study further suggests that it is the vocal learning process that helps forge the link between speech perception and production. The computational approach opens a new path towards examining the cognitive mechanisms behind vocal learning.
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Inside the black box of human vocal learning: A simulation approach |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2021. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Div of Psychology and Lang Sciences |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10162208 |