Pitch-based Vocal Tract Length Normalization

Title: Pitch-based Vocal Tract Length Normalization
Publication Type: Technical Report
Year of Publication: 2003
Authors: Faria, A.
Other Numbers: 1206
Abstract

This paper investigates the correlation between fundamental frequency and resonant frequencies in speech, exploiting this relation for vocal tract length normalization (VTLN). By observing a speaker's average pitch, it is possible to estimate the appropriate frequency warping factor which will transform a spectral representation into one with less variation of the formants. I use a function of pitch that maps to a corresponding frequency warping factor. An exploration of speaker and vowel characteristics in the TIMIT speech corpus is used to optimize the parameters of this function. The approach presented here is a potentially simpler alternative to existing VTLN algorithms that derive the warping factor by other means. Recognizer results indicate that the pitch-based approach compares favorably with other methods; moreover, performance could be improved further by using a warping function that is not strictly linear.
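
The idea in the abstract, mapping a speaker's average pitch to a warping factor and then warping the frequency axis, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the report: the linear form of the mapping, the constants `REF_PITCH_HZ`, `SLOPE_PER_HZ`, `ALPHA_MIN` and `ALPHA_MAX`, the function names, and the convention that a factor above 1 shifts spectral peaks toward lower frequencies.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration (not from the report).
REF_PITCH_HZ = 160.0                 # assumed pitch that maps to no warping
SLOPE_PER_HZ = 0.001                 # assumed slope of the linear pitch-to-warp map
ALPHA_MIN, ALPHA_MAX = 0.88, 1.12    # assumed clipping range for the factor


def warp_factor_from_pitch(mean_pitch_hz):
    """Map a speaker's average pitch to a warping factor (linear, clipped)."""
    alpha = 1.0 + SLOPE_PER_HZ * (mean_pitch_hz - REF_PITCH_HZ)
    return float(np.clip(alpha, ALPHA_MIN, ALPHA_MAX))


def warp_spectrum(magnitudes, alpha):
    """Linearly warp the frequency axis of a magnitude spectrum.

    Output bin k reads the input at position alpha * k, so alpha > 1
    moves spectral peaks toward lower frequencies (assumed convention).
    """
    magnitudes = np.asarray(magnitudes, dtype=float)
    bins = np.arange(len(magnitudes), dtype=float)
    # Clamp source positions so they stay inside the spectrum.
    source = np.clip(alpha * bins, 0.0, len(magnitudes) - 1)
    return np.interp(source, bins, magnitudes)


if __name__ == "__main__":
    alpha = warp_factor_from_pitch(220.0)   # a higher-pitched speaker
    spectrum = np.zeros(100)
    spectrum[50] = 1.0                      # toy "formant" peak
    warped = warp_spectrum(spectrum, alpha)
    print(alpha, int(np.argmax(warped)))
```

In this sketch a higher average pitch yields a larger factor, which compresses the spectrum so that the higher formants of a shorter vocal tract move down toward a common reference. The abstract notes that a warping function that is not strictly linear might perform better; the linear axis warp used here is only the simplest case.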

URL: http://www.icsi.berkeley.edu/ftp/global/pub/techreports/2003/tr-03-001.pdf
Bibliographic Notes

ICSI Technical Report TR-03-001

Abbreviated Authors

A. Faria

ICSI Research Group

Speech

ICSI Publication Type

Technical Report