Phone Recognition for Mixed Speech Signals: Comparison of Human Auditory Cortex and Machine Performance
Title | Phone Recognition for Mixed Speech Signals: Comparison of Human Auditory Cortex and Machine Performance |
Publication Type | Technical Report |
Year of Publication | 2015 |
Authors | Chang S-Y., Edwards E., Morgan N., Ellis D. P. W., Mesgarani N., & Chang E. |
Other Numbers | 3802 |
Abstract | It is well known that humans can often attend to a single sound source within a mixed signal from multiple sources, and that unaided automatic speech recognition (ASR), without the benefit of effective blind source separation, performs poorly at this task. Here we report on an analysis of human cortical signals that demonstrates the relative robustness of these signals to the mixed-signal phenomenon, in contrast to a deep neural network-based ASR system. Confirming this difference with a carefully designed experiment is the first step towards ultimately improving blind source separation for speech recognition; in particular, the design of features extracted from the neural signals is leading to insights about the corresponding feature extraction on the acoustic side, e.g., for future computational auditory scene analysis (CASA) systems. |
Acknowledgment | This work was partially supported by funding provided to ICSI through National Science Foundation grant IIS: 1320260 (Towards Modeling Source Separation from Measured Cortical Responses). Additional funding was provided to UC San Francisco through National Science Foundation grant IIS: 1320366 (Towards Modeling Source Separation from Measured Cortical Responses). E.F.C. was supported by National Institutes of Health grant R01-DC012379 and the McKnight Foundation. Edward Chang is a New York Stem Cell Foundation - Robertson Investigator. This research was supported by The New York Stem Cell Foundation. We also thank Liberty Hamilton and Zack Greenberg for their assistance with the ECoG work. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors or originators and do not necessarily reflect the views of the National Science Foundation, the National Institutes of Health, or the New York Stem Cell Foundation. |
URL | https://www.icsi.berkeley.edu/pubs/techreports/TR-15-002.pdf |
Bibliographic Notes | ICSI Technical Report TR-15-002 |
Abbreviated Authors | S.-Y. Chang, E. Edwards, N. Morgan, D. Ellis, N. Mesgarani, and E. Chang |
ICSI Research Group | Speech |
ICSI Publication Type | Technical Report |