Robust CNN-based Speech Recognition With Gabor Filter Kernels

Title: Robust CNN-based Speech Recognition With Gabor Filter Kernels
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Yiin-Chang, S., & Morgan, N.
Published in: Proceedings of Interspeech 2016
Abstract

As has been extensively shown, acoustic features for speech recognition can be learned from neural networks with multiple hidden layers. However, the learned transformations may not sufficiently generalize to test sets that have a significant mismatch to the training data. Gabor features, on the other hand, are generated from spectro-temporal filters designed to model human auditory processing. In previous work, these features were used as inputs to neural networks, which improved word accuracy for speech recognition in the presence of noise. Here we propose a neural network architecture called a Gabor Convolutional Neural Network (GCNN) that incorporates Gabor functions into convolutional filter kernels. In this architecture, a variety of Gabor features serve as the multiple feature maps of the convolutional layer. The filter coefficients are further tuned by back-propagation training. Experiments used two noisy versions of the WSJ corpus: Aurora 4 and RATS re-noised WSJ. In both cases, the proposed architecture performs better than the other noise-robust features that we have tried, namely ETSI-AFE, PNCC, Gabor features without the CNN-based approach, and our best neural network features that do not incorporate Gabor functions.
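As a rough illustration of the idea in the abstract, the sketch below builds a small bank of 2D Gabor functions over a (frequency, time) patch, one per feature map, that could serve as the initial weights of a convolutional layer before back-propagation tuning. It is a minimal sketch under stated assumptions, not the paper's implementation: the Gaussian envelope, the kernel size, the modulation frequencies, and the names `gabor_kernel` and `gabor_bank` are all hypothetical choices made for illustration.

```python
import numpy as np


def gabor_kernel(height, width, omega_k, omega_n, sigma_k, sigma_n):
    """Real 2D Gabor function on a (frequency, time) grid.

    omega_k, omega_n: spectral and temporal modulation frequencies
        (radians per frequency channel / per time frame).
    sigma_k, sigma_n: widths of the Gaussian envelope (an assumed
        envelope shape; the paper may use a different one).
    """
    k = np.arange(height) - (height - 1) / 2.0  # centered channel offsets
    n = np.arange(width) - (width - 1) / 2.0    # centered frame offsets
    kk, nn = np.meshgrid(k, n, indexing="ij")
    envelope = np.exp(-0.5 * ((kk / sigma_k) ** 2 + (nn / sigma_n) ** 2))
    carrier = np.cos(omega_k * kk + omega_n * nn)
    return envelope * carrier


def gabor_bank(height, width, modulations, sigma_k=2.0, sigma_n=3.0):
    """Stack one kernel per (omega_k, omega_n) pair.

    Returns an array of shape (num_maps, height, width).
    """
    kernels = [
        gabor_kernel(height, width, wk, wn, sigma_k, sigma_n)
        for wk, wn in modulations
    ]
    return np.stack(kernels)


# Hypothetical modulation frequencies: purely temporal, purely spectral,
# and two diagonal (upward / downward) spectro-temporal patterns.
modulations = [(0.0, 0.3), (0.5, 0.0), (0.5, 0.3), (0.5, -0.3)]
initial_weights = gabor_bank(9, 15, modulations)
```

In a GCNN-style setup, an array like `initial_weights` would be copied into the kernels of a convolutional layer in whatever framework is used and then left trainable, so that back-propagation can adjust the coefficients away from the pure Gabor shapes.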
 

Acknowledgment

We thank Adam Janin and Dan Ellis for helpful discussions. We thank Hans-Günter Hirsch for help in setting up Aurora 4. We also thank Chanwoo Kim and Richard Stern for the use of PNCC. This material is based on work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. D10PC20024. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA or its Contracting Agent, the U.S. Department of the Interior, National Business Center, Acquisition & Property Management Division, Southwest Branch.
 

URL: https://pdfs.semanticscholar.org/1d34/0fe19026b0359bde23fcd7299a99a240bd15.pdf
ICSI Research Group: Speech