Research Project Seeks to Identify Underlying Challenges to Current ASR Technology

June 12, 2012
A new research project at ICSI is exploring automatic speech recognition (ASR) to understand the limitations and challenges of current technologies. Sponsored by the Intelligence Advanced Research Projects Activity (IARPA) via the Air Force Research Laboratory (AFRL), the research aims to use its conclusions to develop new methods for improving ASR technology.

"This is a unique research project in that we are qualitatively and quantitatively exploring what is wrong with automatic speech recognition. From that we hope to gain insights into how we can improve ASR, potentially going forward in entirely new directions," said ICSI Deputy Director Nelson Morgan, who leads the project. "When you don't know specifically what is wrong with a technology, you are left with a hit-or-miss situation. This research should give us some clarity." Morgan is also the leader of the Speech Group.

The research project includes two major parts. The first is an in-depth look at the assumptions behind acoustic modeling, a key component of ASR that creates statistical representations of each of the distinctive sounds that make up words. This will enable ICSI researchers to discover technical challenges that prevent ASR from being more accurate.

The second part is a broad survey of experts and colleagues in the field, asking for perceptions on where ASR technology is effective, where it fails, and what its shortcomings are. This study will include interviews with practitioners and a review of recent literature to derive community consensus on what approaches don't work and to develop guidelines for future analysis.

Steven Wegmann and Jordan Cohen serve as co-principal investigators of the research. Wegmann oversees the in-depth acoustic modeling phase, and Cohen oversees the broad field survey phase.

This research is supported by the Intelligence Advanced Research Projects Activity (IARPA) via Air Force Research Laboratory (AFRL) contract number FA8650-12-C-7217. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, AFRL, or the U.S. Government.

