Comparing Different Flavors of Spectro-Temporal Features for ASR
Title | Comparing Different Flavors of Spectro-Temporal Features for ASR |
Publication Type | Conference Paper |
Year of Publication | 2011 |
Authors | Meyer, B. T., Ravuri S., Schädler M. René, & Morgan N. |
Page(s) | 1269-1272 |
Other Numbers | 3181 |
Abstract | In the last decade, several studies have shown that the robustness of ASR systems can be increased when 2D Gabor filters are used to extract specific modulation frequencies from the input pattern. This paper analyzes important design parameters for spectro-temporal features based on a Gabor filter bank: We perform experiments with filters that exhibit different phase sensitivity. Further, we analyze if non-linear weighting with a multi-layer perceptron (MLP) and a subsequent concatenation with mel-frequency cepstral coefficients (MFCCs) has beneficial effects. For the Aurora2 noisy digit recognition task, the use of phase sensitive filters improved the MFCC baseline, whereas using filters that neglect phase information did not. While MLP processing alone did not have a large effect on the overall performance, the best results were obtained for MLP-processed phase sensitive filters and added MFCCs, with relative error reductions |
Acknowledgment | This work was partially funded by the Deutscher Akademischer Austauschdienst (DAAD) through a postdoctoral fellowship granted to Bernd Meyer. Further support was provided to Suman Ravuri by the National Defense Science and Engineering Graduate Fellowship (NDSEG), and to Nelson Morgan by Cisco Systems. |
URL | http://www.icsi.berkeley.edu/pubs/speech/comparingdifferent11.pdf |
Bibliographic Notes | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 1269-1272 |
Abbreviated Authors | B. T. Meyer, S. V. Ravuri, M. R. Schaedler, and N. Morgan |
ICSI Research Group | Speech |
ICSI Publication Type | Article in conference proceedings |