Phone Recognition for Mixed Speech Signals: Comparison of Human Auditory Cortex and Machine Performance

Title: Phone Recognition for Mixed Speech Signals: Comparison of Human Auditory Cortex and Machine Performance
Publication Type: Technical Report
Year of Publication: 2015
Authors: Chang, S.-Y., Edwards, E., Morgan, N., Ellis, D. P. W., Mesgarani, N., & Chang, E.
Other Numbers: 3802
Abstract

It is well known that human beings can often attend to a single sound source within a mixed signal from multiple sources, and that unaided automatic speech recognition (ASR), without the benefit of effective blind source separation, is quite poor at this task. Here we report on an analysis of human cortical signals that demonstrates the relative robustness of these signals to the mixed-signal phenomenon, in contrast to a deep neural network-based ASR system. Confirming this difference with a carefully designed experiment is the first step towards ultimately improving blind source separation for the purpose of speech recognition; in particular, the design of features extracted from the neural signals is leading to insights about the corresponding feature extraction on the acoustic side, e.g., for computational auditory scene analysis (CASA) systems of the future.

Acknowledgment

This work was partially supported by funding provided to ICSI through National Science Foundation grant IIS: 1320260 (“Towards Modeling Source Separation from Measured Cortical Responses”). Additional funding was provided to UC San Francisco through National Science Foundation grant IIS: 1320366 (“Towards Modeling Source Separation from Measured Cortical Responses”). E.F.C. was supported by National Institutes of Health grant R01-DC012379 and the McKnight Foundation. Edward Chang is a New York Stem Cell Foundation - Robertson Investigator. This research was supported by The New York Stem Cell Foundation. We also thank Liberty Hamilton and Zack Greenberg for their assistance with the ECoG work. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors or originators and do not necessarily reflect the views of the National Science Foundation, the National Institutes of Health, or the New York Stem Cell Foundation.

URL: https://www.icsi.berkeley.edu/pubs/techreports/TR-15-002.pdf
Bibliographic Notes

ICSI Technical Report TR-15-002

Abbreviated Authors

S.-Y. Chang, E. Edwards, N. Morgan, D. Ellis, N. Mesgarani, and E. Chang

ICSI Research Group

Speech

ICSI Publication Type

Technical Report