Publication Details
Title: The Syllable Re-revisited
Author: A. Hauenstein
Group: ICSI Technical Reports
Date: August 1996
PDF: ftp://ftp.icsi.berkeley.edu/pub/techreports/1996/tr-96-035.pdf
Overview:
In this report, an approach to speech recognition using syllables as basic modelling units is compared to a state-of-the-art system employing phonemes. The technological framework is ICSI's hybrid HMM-ANN recognition system, applied to small- to medium-vocabulary recognition tasks. Although the number of units to be classified nearly doubles, it is shown that the syllable can outperform the phoneme slightly but significantly in terms of unit classification capability, measured as frame error rate. In terms of overall system performance (measured as word error rate), the phoneme-based system still performs clearly better on continuous speech tasks, while the syllable-based system is superior for isolated word recognition tasks in cross-database tests. This suggests the need for further work on understanding the interaction of knowledge sources at the frame, word, and sentence levels in current recognition systems.
Keywords: speech recognition, hybrid HMM-ANN classification, syllable
Bibliographic Information:
ICSI Technical Report TR-96-035
Bibliographic Reference:
A. Hauenstein. The Syllable Re-revisited. ICSI Technical Report TR-96-035, August 1996
