research

speaker id ...

The work I have done in speaker recognition has focused on the use of high-frequency, habitual words (e.g., discourse markers, filled pauses, and backchannels) to generate speaker models for speaker detection. The modeling is done using Hidden Markov Models (HMMs), with the expectation that such a method better utilizes the sequential nature of the speech data. For more information, please refer to the documents below.
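As a rough illustration of the keyword-HMM idea described above, the sketch below scores one keyword token (a sequence of feature frames) against a speaker-specific HMM and a speaker-independent background HMM, and reports the per-frame log-likelihood ratio. This is a minimal sketch under assumptions, not the system described in the documents below: the left-to-right topology, the single diagonal-covariance Gaussian per state, the background-model normalization, and all function names (`left_to_right`, `forward_loglik`, `keyword_llr`) are hypothetical choices made for illustration.

```python
import numpy as np

def left_to_right(n_states, stay=0.5):
    """Hypothetical helper: log initial/transition probabilities for a left-to-right HMM."""
    trans = np.zeros((n_states, n_states))
    for s in range(n_states - 1):
        trans[s, s], trans[s, s + 1] = stay, 1.0 - stay
    trans[-1, -1] = 1.0
    init = np.zeros(n_states)
    init[0] = 1.0  # always start in the first state
    with np.errstate(divide="ignore"):  # log(0) = -inf is intended here
        return np.log(init), np.log(trans)

def log_gauss(frames, means, variances):
    """Log density of each frame under each state's diagonal Gaussian.
    frames: (T, D); means, variances: (S, D); returns (T, S)."""
    diff = frames[:, None, :] - means[None, :, :]
    return -0.5 * np.sum(np.log(2 * np.pi * variances)[None] + diff**2 / variances[None], axis=2)

def forward_loglik(frames, log_init, log_trans, means, variances):
    """Log-likelihood of a frame sequence under an HMM (forward algorithm, log domain)."""
    log_b = log_gauss(frames, means, variances)
    alpha = log_init + log_b[0]
    for t in range(1, len(frames)):
        # Sum over previous states (rows) for each next state (columns).
        alpha = np.logaddexp.reduce(alpha[:, None] + log_trans, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)

def keyword_llr(frames, speaker_hmm, background_hmm):
    """Per-frame log-likelihood ratio of the speaker model against the background model."""
    return (forward_loglik(frames, **speaker_hmm)
            - forward_loglik(frames, **background_hmm)) / len(frames)
```

Each model is passed as a dictionary with the keys `log_init`, `log_trans`, `means`, and `variances`. A positive score means the token is better explained by the speaker's keyword model than by the background model; in a sketch like this, scores from all keyword tokens in a conversation side would then be combined into a single detection score.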

Publications

ICSI's 2005 Speaker Recognition System
N. Mirghafori, A.O. Hatch, S. Stafford, K. Boakye, D. Gillick and B. Peskin
Proceedings of ASRU, Puerto Rico, 2005.
paper - (pdf)
Speaker Recognition in the Text-Independent Domain Using Keyword Hidden Markov Models
K. Boakye
M.S. Thesis, University of California at Berkeley, May 2005.
paper - (pdf)
Text-Constrained Speaker Recognition on a Text-Independent Task
K. Boakye and B. Peskin
Odyssey 2004 - The Speaker and Language Recognition Workshop, Toledo, Spain, June 2004.
paper - (pdf)

Presentations

Speaker ID Smorgasbord, or How I Spent My Summer at ICSI
Speech group lunch talk, presented September 20, 2004.
presentation - (ppt)
The ICSI Speaker Recognition System
NIST 2004 Speaker Recognition Evaluation workshop, presented June 4, 2004 in Toledo, Spain.
presentation - (pdf)
modified presentation - (pdf)
Text-constrained Speaker Recognition Using Hidden Markov Models
Speech group lunch talk, presented August 12, 2003.
presentation - (ppt)

Additional Documents

ICSI 2004 Speaker Recognition Evaluation System Description - (txt)

Demos

Speaker ID Retreat Demo
Presented July 22, 2004 at the ICSI/SRE Speaker ID retreat
click here for demo

meetings ...

My present work addresses crosstalk and overlapped speech in multiparty meetings, both of which are significant sources of error for automatic speech recognition (ASR) systems applied in this domain. For crosstalk, I am examining the effectiveness of various features for an HMM-based segmenter that separates local speech from nonspeech and crosstalk on individual headset microphone (IHM) channels. The work on overlapped speech has two components. The first seeks to identify features useful for detecting overlapped speech within an HMM-based segmenter, as in the crosstalk case. The second analyzes how well speech separation techniques can process the overlapped speech to improve speech recognition accuracy.
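To make the idea of an HMM-based segmenter concrete, the sketch below runs Viterbi decoding over per-frame class scores for a single channel and merges the result into labeled segments. It is a minimal sketch under assumptions rather than the segmenter used in this work: the three-class layout, the 10 ms frame shift, and the names `viterbi` and `to_segments` are hypothetical, and the per-frame log-likelihoods are assumed to come from some acoustic model trained on whatever features are being evaluated.

```python
import numpy as np

LABELS = ("speech", "nonspeech", "crosstalk")  # assumed class inventory

def viterbi(log_obs, log_trans, log_init):
    """Most likely state sequence given per-frame log-likelihoods.
    log_obs: (T, S); log_trans: (S, S) with rows = from-state; log_init: (S,)."""
    T, S = log_obs.shape
    delta = log_init + log_obs[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans          # scores[i, j]: best path ending i -> j
        back[t] = np.argmax(scores, axis=0)          # best predecessor for each state j
        delta = scores[back[t], np.arange(S)] + log_obs[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):                    # trace back-pointers
        path[t - 1] = back[t, path[t]]
    return path

def to_segments(path, frame_shift=0.01):
    """Merge runs of identical labels into (start_sec, end_sec, label) tuples.
    The 10 ms frame shift is an assumed default."""
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((start * frame_shift, t * frame_shift, LABELS[path[start]]))
            start = t
    return segments
```

With strongly self-looping transitions, the decoder smooths over isolated frames whose scores briefly favor another class, which is one reason to prefer an HMM over independent frame-by-frame classification for this kind of segmentation.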

The details of this work can be found in my thesis proposal, listed under "Additional Documents" below.

Project Notes

This is where I intend to record specific experiments I have performed, along with general thoughts and questions I need to explore further.
view notes

Publications

Improved Speech Activity Detection Using Cross-Channel Features for Recognition of Multiparty Meetings
K. Boakye and A. Stolcke
Proc. ICSLP-Interspeech 2006, Pittsburgh, 2006.
paper - (pdf)
Further Progress in Meeting Recognition: The ICSI-SRI Spring 2005 Speech-to-Text Evaluation System
A. Stolcke, X. Anguera, K. Boakye, O. Cetin, F. Grezl, A. Janin, A. Mandal, B. Peskin, C. Wooters, and J. Zheng
Proc. NIST MLMI Meeting Recognition Workshop, Edinburgh, 2005.
paper - (pdf)

Presentations

Speech Detection, Classification, and Processing for Improved Automatic Speech Recognition in Multiparty Meetings
Qualifying exam talk, presented January 17, 2007.
presentation - (ppt)
Features for Improved Speech Activity Detection for Recognition of Multiparty Meetings
Speech group lunch talk, presented May 30, 2006.
presentation - (ppt)
Mixed Signals: Speech Activity Detection and Crosstalk in the Meetings Domain
Speech group lunch talk, presented June 14, 2005.
presentation - (ppt)

Additional Documents

Speech Detection, Classification, and Processing for Improved Automatic Speech Recognition in Multiparty Meetings
Thesis proposal
paper - (pdf)