Using Fast and Slow Modulations to Model Human Hearing of Fast and Slow Speech
Title | Using Fast and Slow Modulations to Model Human Hearing of Fast and Slow Speech |
Publication Type | Technical Report |
Year of Publication | 2015 |
Authors | Chang, S.-Y., Morgan, N., Raju, A., Alwan, A., & Kreiman, J. |
Other Numbers | 3804 |
Abstract | A collaboration between the Speech Processing and Auditory Perception laboratory at UCLA and the Speech Group at ICSI focused on the refinement of the simple models used in ASR with representations that have been filtered in the modulation domain to better match human perception. To quantitatively measure the effects of this modification, UCLA collected CVC stimuli uttered quickly and more slowly, and conducted perceptual tests for clean and noisy versions of the stimuli. The ICSI team then conducted tests to determine if inclusion of Gabor-filtered spectrograms with lower or higher temporal modulations could be used to correlate better with human perception. Here we report on results that confirmed an improvement in this correlation, particularly for noisy and rapid speech, while also improving the accuracy. Overall accuracies in noise for all systems tested, though, were quite poor, suggesting that further auditory modeling might be necessary to improve the modeling of human performance on this task. |
Acknowledgment | We are indebted to Bernd Meyer and Marc Schädler for their versions of Gabor filters that we routinely use. And last but not least, we acknowledge the support of NSF Award 1248047. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors or originators and do not necessarily reflect the views of the National Science Foundation. Note: the title of the NSF grant at ICSI is "Towards Modeling Human Speech Confusions in Noise". This was a project of the Speech Group. |
URL | https://www.icsi.berkeley.edu/pubs/techreports/TR-15-003.pdf |
Bibliographic Notes | ICSI Technical Report TR-15-003 |
Abbreviated Authors | S.-Y. Chang, N. Morgan, A. Raju, A. Alwan, and J. Kreiman |
ICSI Research Group | Speech |
ICSI Publication Type | Technical Report |