"Discriminative Training and Adaptation of Large Vocabulary ASR Systems"
We have worked on discriminative training for HMM-based large vocabulary ASR systems at Cambridge since the mid-1990s and have developed techniques that are now widely applied in constructing the acoustic models for state-of-the-art large vocabulary recognition systems. This talk will provide a brief overview of this work and then focus on our recent work in the area. Topics covered will include the basic lattice-based framework for discriminative training using the extended Baum-Welch algorithm; the minimum phone error objective function; extensions for discriminative MAP; and supervised and unsupervised discriminative transform-based adaptation. The typical performance of a number of these techniques will be presented for a variety of large vocabulary speech recognition tasks.
Speaker Bio:
Phil Woodland is currently Professor of Information Engineering at Cambridge University Engineering Department, where he has been a member of faculty staff since 1989. His research interests are in the area of speech technology, with a focus on all aspects of large vocabulary speech recognition systems. Other work has included auditory modelling, statistical speech synthesis, named entity recognition, and spoken document retrieval. His work on acoustic model adaptation using maximum likelihood linear regression and on discriminative training is particularly well known. He was an original co-developer of the widely used HTK Hidden Markov Model Toolkit, and continues to play a leading role in its development. Since 1992 he has participated in DARPA/NIST speech recognition evaluations. He is the Principal Investigator of the Cambridge HTK Rich Audio Transcription project funded under the DARPA EARS program.