"MULTI-RATE HIDDEN MARKOV MODELS AND THEIR APPLICATION TO ACOUSTIC
MODELING FOR SPEECH RECOGNITION"
Many natural signals, such as speech and natural language, change slowly over time and exhibit long-term dependence. In this talk, we will introduce multi-rate hidden Markov models (multi-rate HMMs) as an extension of the widely used hidden Markov modeling framework to multiple time scales, for accurate and parsimonious modeling of such signals. The multi-rate HMM decomposes process variability into scale-based components and characterizes both the temporal evolution of the process within each scale and the interactions between scales. We have previously applied multi-rate HMMs to the problem of machine tool-wear monitoring, and in this talk we will present an application of multi-rate HMMs to speech recognition, in which they are used for acoustic modeling of speech at multiple time scales. Unlike the conventional HMM-based paradigm, which models only phone-scale phenomena, we propose a two-rate HMM architecture that simultaneously models speech at short- and long-term scales, using feature sequences and modeling units corresponding to phone- and syllable-scale phenomena. Experimental results comparing HMMs and multi-rate HMMs on a conversational speech recognition task will also be presented.
Speaker Bio:
Ozgur Cetin is a PhD student in the Department of Electrical Engineering at the University of Washington, where he is a member of the Signal, Speech and Language Interpretation Laboratory. He received his BS degree from Bilkent University, Turkey, in 1998 and his MS degree from the University of Washington in 2000, both in electrical engineering.