Discriminant Training of Front-End and Acoustic Modeling Stages to Heterogeneous Acoustic Environments for Multi-Stream Automatic Speech Recognition

Title	Discriminant Training of Front-End and Acoustic Modeling Stages to Heterogeneous Acoustic Environments for Multi-Stream Automatic Speech Recognition
Publication Type	Technical Report
Year of Publication	2000
Authors	Shire, M. Lee
Other Numbers	1187
Abstract	The performance of Automatic Speech Recognition (ASR) systems degrades in the presence of adverse acoustic conditions. A possible shortcoming of the typical ASR system is its reliance on a single stream of front-end acoustic features and acoustic modeling feature probabilities. A single front-end feature extraction algorithm may not be capable of maintaining robustness to arbitrary acoustic environments. Acoustic modeling will also degrade due to distributional changes caused by the acoustic environment. This report explores the parallel use of multiple front-end and acoustic modeling elements to address this shortcoming. Each ASR acoustic modeling component is trained to estimate class posterior probabilities in a particular acoustic environment. In addition to discriminative training of the probability estimator, the temporal processing of existing feature extraction algorithms is modified to improve class discrimination in the training environment. Probability streams are generated using multiple front-end acoustic modeling stages trained to heterogeneous acoustic environments. In new sample acoustic environments, simple combinations of these probability streams give rise to word recognition rates that are superior to those of the individual streams.
URL	http://www.icsi.berkeley.edu/ftp/global/pub/techreports/2000/tr-00-012.pdf
Bibliographic Notes	ICSI Technical Report TR-00-012
Abbreviated Authors	M. L. Shire
ICSI Research Group	Speech
ICSI Publication Type	Technical Report
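
Illustrative note on the abstract: the record does not say which "simple combinations" of probability streams the report uses. The sketch below assumes two common choices, a frame-wise arithmetic mean and a normalized frame-wise geometric mean (an average in the log domain), purely to show what combining class posterior streams can look like. The function names, the `floor` parameter, and the example values are hypothetical and are not taken from the report.

```python
import math


def combine_average(streams):
    """Frame-wise arithmetic mean of posterior streams (assumed rule).

    streams: list of streams; each stream is a list of frames; each frame
    is a list of class posterior probabilities that sums to 1.
    """
    n_streams = len(streams)
    combined = []
    for frames in zip(*streams):  # the same frame index across all streams
        n_classes = len(frames[0])
        combined.append(
            [sum(f[k] for f in frames) / n_streams for k in range(n_classes)]
        )
    return combined


def combine_log_average(streams, floor=1e-10):
    """Frame-wise normalized geometric mean of posteriors (assumed rule).

    floor is a hypothetical guard that keeps log() away from zero.
    """
    n_streams = len(streams)
    combined = []
    for frames in zip(*streams):
        n_classes = len(frames[0])
        log_mean = [
            sum(math.log(max(f[k], floor)) for f in frames) / n_streams
            for k in range(n_classes)
        ]
        peak = max(log_mean)  # subtract the peak for numerical stability
        unnorm = [math.exp(v - peak) for v in log_mean]
        total = sum(unnorm)
        combined.append([u / total for u in unnorm])  # renormalize to sum to 1
    return combined


# Made-up posteriors: two streams, two frames, three classes.
stream_a = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]
stream_b = [[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]

avg = combine_average([stream_a, stream_b])
geo = combine_log_average([stream_a, stream_b])
```

Either combined stream could then be passed to a decoder in place of a single stream's posteriors. Whether these particular rules match the report's experiments is not something this record confirms.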
