Talks at the International Computer Science Institute

The International Computer Science Institute
is pleased to present a talk:

Non-Stationary Multi-Stream Processing Towards Robust and Adaptive
Speech Recognition

Herve Bourlard
bourlard idiap.ch

Friday, December 1, 2000
ICSI, Rm 607
11:00 am - 12:00 pm

Abstract:

Multi-stream automatic speech recognition (ASR) extends the standard hidden Markov model (HMM) based approach by assuming that the speech signal is processed by different (independent) "experts", each expert focusing on a different characteristic of the signal, and that the different stream likelihoods (or posteriors) are combined at some (temporal) stage to yield a global recognition output. The most successful approach developed so far consists in combining the stream likelihoods through integration over all possible stream combinations (i.e., over all possible values of a hidden variable representing the position of the most reliable streams). As a particular case of this approach, subband-based speech recognition will also be discussed.
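The combination over all stream subsets described above can be sketched in a few lines. This is only an illustrative sketch, not the speaker's implementation: the function name, the uniform default weights, and the simplifying assumption that the likelihood for a subset of streams is the product of the per-stream likelihoods (with the empty subset contributing 1.0) are all assumptions introduced here.

```python
from itertools import combinations
from math import prod


def full_combination_likelihood(stream_likelihoods, subset_weights=None):
    """Combine per-stream likelihoods by summing over every subset of streams.

    stream_likelihoods: one likelihood p(x_k | state) per stream k.
    subset_weights: hypothetical dict mapping a tuple of stream indices to the
        probability that exactly those streams are the reliable ones.
        Defaults to a uniform weight over all 2^K subsets (an assumption).
    """
    k = len(stream_likelihoods)
    # Enumerate all 2^K subsets, from the empty set up to the full set.
    subsets = [s for r in range(k + 1) for s in combinations(range(k), r)]
    if subset_weights is None:
        subset_weights = {s: 1.0 / len(subsets) for s in subsets}

    total = 0.0
    for s in subsets:
        # Assumption: streams are independent, so the likelihood of a subset
        # is the product of its members (the empty product is 1.0).
        subset_likelihood = prod(stream_likelihoods[i] for i in s)
        total += subset_weights[s] * subset_likelihood
    return total


if __name__ == "__main__":
    # Two streams, uniform weights over the four possible subsets.
    print(full_combination_likelihood([0.5, 0.2]))
```

With non-uniform weights, the hidden "which streams are reliable" variable can shift mass toward the cleaner streams, which is where adaptation of a small number of parameters could plausibly enter.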

In this framework, we will introduce different mathematical models and discuss some interesting relationships with psycho-acoustic evidence. As a further extension to multi-stream ASR, we will also introduce a new approach, referred to as HMM2 (actually HMM mixtures), where the HMM emission probabilities are estimated via state-specific, feature-based HMMs responsible for merging the stream information and modeling their possible correlation. For each case, recognition results achieved on non-stationary noise will be presented, and possibilities of fast adaptation (of a limited number of parameters) will be illustrated through specific examples. Automatic formant tracking based on HMM2 will also be discussed and illustrated.
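One way to picture the HMM2 idea is that the emission probability of a temporal HMM state is itself computed by a small "inner" HMM that runs along the components of a single feature vector. The sketch below makes that concrete under loud assumptions: scalar Gaussian emissions, the function names, and all parameter values are hypothetical and chosen only for illustration; they are not taken from the talk.

```python
from math import exp, pi, sqrt


def gaussian(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return exp(-((x - mean) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)


def inner_hmm_likelihood(features, init, trans, means, variances):
    """Forward algorithm over the components of one feature vector.

    features: the components of a single frame (for example, ordered by
        frequency), treated as the observation sequence of the inner HMM.
    init, trans: initial and transition probabilities of the inner HMM.
    means, variances: per-state scalar Gaussian parameters (an assumption).
    Returns the likelihood of the whole vector, which would serve as the
    emission probability of one state of the outer (temporal) HMM.
    """
    n = len(init)
    # Initialisation with the first component.
    alpha = [init[j] * gaussian(features[0], means[j], variances[j])
             for j in range(n)]
    # Recursion over the remaining components.
    for x in features[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n))
            * gaussian(x, means[j], variances[j])
            for j in range(n)
        ]
    # Termination: sum over all final inner states.
    return sum(alpha)


if __name__ == "__main__":
    # Hypothetical two-state left-to-right inner HMM.
    frame = [0.1, 0.0, 1.9, 2.1]
    print(inner_hmm_likelihood(
        frame,
        init=[1.0, 0.0],
        trans=[[0.5, 0.5], [0.0, 1.0]],
        means=[0.0, 2.0],
        variances=[1.0, 1.0],
    ))
```

In such a setup, the most likely inner state sequence segments the feature vector into regions, which hints at why a model of this kind might be usable for tracking spectral structure such as formants.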

This talk will be held in the Main Lecture Hall at ICSI.
1947 Center Street, Sixth Floor, Berkeley, CA 94704-1198
(on Center between Milvia and Martin Luther King Jr. Way)