Pitch-based Vocal Tract Length Normalization
Title | Pitch-based Vocal Tract Length Normalization |
Publication Type | Technical Report |
Year of Publication | 2003 |
Authors | Faria, A. |
Other Numbers | 1206 |
Abstract | This paper investigates the correlation between fundamental frequency and resonant frequencies in speech, and exploits this relation for vocal tract length normalization (VTLN). By observing a speaker's average pitch, it is possible to estimate the frequency warping factor that transforms a spectral representation into one with less variation in the formants. I use a function that maps pitch to a corresponding frequency warping factor, and optimize its parameters through an exploration of speaker and vowel characteristics in the TIMIT speech corpus. The approach presented here is a potentially simpler alternative to existing VTLN algorithms, which derive the warping factor by other means. Recognizer results indicate that the pitch-based approach compares favorably with other methods; moreover, performance could be improved further by using a warping function that is not strictly linear. |
URL | http://www.icsi.berkeley.edu/ftp/global/pub/techreports/2003/tr-03-001.pdf |
Bibliographic Notes | ICSI Technical Report TR-03-001 |
Abbreviated Authors | A. Faria |
ICSI Research Group | Speech |
ICSI Publication Type | Technical Report |
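
The idea summarized in the abstract, mapping a speaker's average pitch to a frequency warping factor, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the report: the function names, the linear form of the mapping, the reference pitch, the slope, the clipping range, the convention that unvoiced frames are marked with 0, and the convention that a factor above 1 compresses the frequency axis. The report fits its own parameters on TIMIT; those values are not reproduced here.

```python
def average_pitch(f0_track_hz):
    """Mean pitch over voiced frames.

    Assumption: unvoiced frames are marked with 0 in the pitch track.
    """
    voiced = [f0 for f0 in f0_track_hz if f0 > 0]
    if not voiced:
        raise ValueError("no voiced frames in pitch track")
    return sum(voiced) / len(voiced)


def pitch_to_warp_factor(avg_pitch_hz, ref_pitch_hz=150.0, slope=0.001,
                         min_alpha=0.88, max_alpha=1.12):
    """Map average pitch (Hz) to a warping factor alpha.

    Hypothetical linear mapping, clipped to [min_alpha, max_alpha].
    All default parameter values are placeholders, not fitted values.
    """
    alpha = 1.0 + slope * (avg_pitch_hz - ref_pitch_hz)
    return max(min_alpha, min(max_alpha, alpha))


def warp_frequency(freq_hz, alpha):
    """Linear frequency warp: divide the frequency axis by alpha.

    With this convention, alpha > 1 (higher-pitched speaker) moves
    formant frequencies down toward the reference speaker.
    """
    return freq_hz / alpha


if __name__ == "__main__":
    track = [0.0, 210.0, 220.0, 0.0, 230.0]  # made-up pitch track in Hz
    alpha = pitch_to_warp_factor(average_pitch(track))
    print(alpha, warp_frequency(1100.0, alpha))
```

In a recognizer front end, a factor like this would typically be applied to the filterbank center frequencies before feature extraction. The abstract also notes that a warping function that is not strictly linear may perform better, so `warp_frequency` above should be read as the simplest possible choice.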