Hooking Up Spectro-Temporal Filters with Auditory-Inspired Representations for Robust Automatic Speech Recognition
Title | Hooking Up Spectro-Temporal Filters with Auditory-Inspired Representations for Robust Automatic Speech Recognition |
Publication Type | Conference Paper |
Year of Publication | 2012 |
Authors | Meyer, B. T., Spille C., Kollmeier B., & Morgan N. |
Other Numbers | 3325 |
Abstract | Spectro-temporal filtering has been shown to result in features that can help to increase the robustness of automatic speech recognition (ASR) in the past. We replace the spectro-temporal representation used in previous work with spectrograms that incorporate knowledge about the signal processing of the human auditory system and which are derived from Power-Normalized Cepstral Coefficients (PNCCs). 2D-Gabor filters are applied to these spectrograms to extract features evaluated on a noisy digit recognition task. The filter bank is adapted to the new representation by optimizing the spectral modulation frequencies associated with each Gabor function. A comparison of optimized parameters and the spectral modulation of vowels shows a good match between optimized and expected range of frequencies. When processed with a non-linear neural net and combined with PNCCs, Gabor features decrease the error rate compared |
URL | http://www.icsi.berkeley.edu/pubs/speech/ICSI_hookingup12.pdf |
Bibliographic Notes | Proceedings of the 13th Annual Conference of the International Speech Communication Association (InterSpeech 2012), Portland, Oregon |
Abbreviated Authors | B. Meyer, C. Spille, B. Kollmeier, and N. Morgan |
ICSI Research Group | Speech |
ICSI Publication Type | Article in conference proceedings |