Discriminant Training of Front-End and Acoustic Modeling Stages to Heterogeneous Acoustic Environments for Multi-Stream Automatic Speech Recognition

Title: Discriminant Training of Front-End and Acoustic Modeling Stages to Heterogeneous Acoustic Environments for Multi-Stream Automatic Speech Recognition
Publication Type: Technical Report
Year of Publication: 2000
Authors: Shire, M. Lee
Other Numbers: 1187
Abstract

The performance of Automatic Speech Recognition (ASR) systems degrades in adverse acoustic conditions. A possible shortcoming of the typical ASR system is its reliance on a single stream of front-end acoustic features and acoustic modeling feature probabilities. A single front-end feature extraction algorithm may not remain robust across arbitrary acoustic environments, and acoustic modeling also degrades because of distributional changes caused by the acoustic environment. This report explores the parallel use of multiple front-end and acoustic modeling elements to address this shortcoming. Each ASR acoustic modeling component is trained to estimate class posterior probabilities in a particular acoustic environment. In addition to discriminative training of the probability estimator, the temporal processing of existing feature extraction algorithms is modified so as to improve class discrimination in the training environment. Probability streams are generated using multiple front-end and acoustic modeling stages trained on heterogeneous acoustic environments. In new sample acoustic environments, simple combinations of these probability streams yield word recognition rates superior to those of the individual streams.
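
As a rough illustration of what a "simple combination" of probability streams might look like, the sketch below merges per-frame class posteriors from two streams in two common ways: an arithmetic mean, and a renormalized average of log posteriors (a geometric mean). The abstract does not say which combination rules the report uses, so both rules, the function names, the probability floor, and the toy posterior values are assumptions made only for this example.

```python
import math

def combine_average(streams):
    """Arithmetic mean of per-frame class posteriors across streams.

    `streams` is a list of streams; each stream is a list of frames,
    and each frame is a list of class posteriors summing to 1.
    """
    n_streams = len(streams)
    combined = []
    for t in range(len(streams[0])):
        n_classes = len(streams[0][t])
        frame = [sum(s[t][k] for s in streams) / n_streams
                 for k in range(n_classes)]
        combined.append(frame)
    return combined

def combine_log_average(streams, floor=1e-10):
    """Average of log posteriors (geometric mean), renormalized per frame.

    `floor` is a hypothetical guard against log(0), not a value from the report.
    """
    n_streams = len(streams)
    combined = []
    for t in range(len(streams[0])):
        n_classes = len(streams[0][t])
        log_avg = [sum(math.log(max(s[t][k], floor)) for s in streams) / n_streams
                   for k in range(n_classes)]
        peak = max(log_avg)  # subtract the peak for numerical stability
        unnorm = [math.exp(v - peak) for v in log_avg]
        total = sum(unnorm)
        combined.append([u / total for u in unnorm])
    return combined

# Toy posteriors (made up): two frames, three classes, two streams that
# stand in for estimators trained in different acoustic environments.
stream_a = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]
stream_b = [[0.5, 0.4, 0.1], [0.2, 0.3, 0.5]]

avg = combine_average([stream_a, stream_b])
geo = combine_log_average([stream_a, stream_b])
```

In a hybrid system of the kind the abstract describes, the combined posteriors would then be passed to the decoder in place of any single stream's output; that decoding step is outside the scope of this sketch.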

URL: http://www.icsi.berkeley.edu/ftp/global/pub/techreports/2000/tr-00-012.pdf
Bibliographic Notes

ICSI Technical Report TR-00-012

Abbreviated Authors

M. L. Shire

ICSI Research Group

Speech

ICSI Publication Type

Technical Report