Improving Automatic Speech Recognition by Learning from Human Errors
Title | Improving Automatic Speech Recognition by Learning from Human Errors |
Publication Type | Conference Paper |
Year of Publication | 2011 |
Authors | Meyer, B. T. |
Volume | 14 |
Other Numbers | 3208 |
Abstract | This work presents a series of experiments that compare the performance of human speech recognition (HSR) and automatic speech recognition (ASR). The goal of this line of research is to learn from the differences between HSR and ASR, and to use this knowledge to incorporate new signal processing strategies from the human auditory system in automatic classifiers. A database with noisy nonsense utterances is used both for HSR and ASR experiments with focus on the influence of intrinsic variation (arising from changes in speaking rate, effort, and style). A standard ASR system is found to reach human performance level only when the signal-to-noise ratio is increased by 15 dB, which can be seen as the human-machine gap for speech recognition on a sub-lexical level. The sources of intrinsic variation are found to severely degrade phoneme recognition scores both in HSR and in ASR. A comparison of utterances produced at different speaking rates indicates that temporal cues are not optimally exploited in ASR, which results in a strong increase of vowel confusions. Alternative feature extraction methods that take into account temporal and spectro-temporal modulations of speech signals are discussed. |
Acknowledgment | Significant contributions to the research summarized in this study were made by Birger Kollmeier, Thomas Brand, Tim Jürgens, and Thorsten Wesker. It was supported by the DFG (SFB/TRR 31 The active auditory system; URL: http://www.uni-oldenburg.de/sfbtr31). Bernd T. Meyer has been supported by a post-doctoral fellowship of the German Academic Exchange Service (DAAD). |
URL | http://www.icsi.berkeley.edu/pubs/speech/improvingasrbylearning11.pdf |
Bibliographic Notes | Proceedings of the 162nd Meeting of the Acoustical Society of America, Vol. 14, San Diego, California |
Abbreviated Authors | B. T. Meyer |
ICSI Research Group | Speech |
ICSI Publication Type | Article in conference proceedings |