Incorporating Information From Syllable-Length Time Scales into Automatic Speech Recognition
Title | Incorporating Information From Syllable-Length Time Scales into Automatic Speech Recognition |
Publication Type | Technical Report |
Year of Publication | 1998 |
Authors | Wu, S-L. |
Other Numbers | 1135 |
Keywords | combination, human auditory perception, neural network, reverberation, speech recognition, syllabic onsets, syllable |
Abstract | Incorporating the concept of the syllable into speech recognition may improve recognition accuracy through the integration of information over syllable-length time spans. Evidence from psychoacoustics and phonology suggests that humans use the syllable as a basic perceptual unit. Nonetheless, the explicit use of such long-time-span units is comparatively unusual in automatic speech recognition systems for English. The work described in this thesis explored the utility of information collected over syllable-related time scales. The first approach involved integrating syllable segmentation information into the speech recognition process. The addition of acoustically based syllable onset estimates (Shire 1997) resulted in a 10% relative reduction in word-error rate. The second approach began with developing four speech recognition systems based on long-time-span features and units, including modulation spectrogram features (Greenberg & Kingsbury 1997). Error analysis suggested a combination strategy, which led to the implementation of methods that merged the outputs of syllable-based recognition systems with the phone-oriented baseline system at the frame level, the syllable level, and the whole-utterance level. These combined systems exhibited relative improvements of 20-40% compared to the baseline system for clean and reverberant speech test cases. |
URL | http://www.icsi.berkeley.edu/ftp/global/pub/techreports/1998/tr-98-014.pdf |
Bibliographic Notes | ICSI Technical Report TR-98-014 |
Abbreviated Authors | S.-L. Wu |
ICSI Research Group | Speech |
ICSI Publication Type | Technical Report |