Talks at the International Computer Science Institute

The International Computer Science Institute
is pleased to present a talk:


"Using Relative Frequencies of Phone Bigrams as Features for Speaker ID"

Andy Hatch
ICSI
ahatch at icsi.berkeley.edu

Tuesday, June 29, 2004
ICSI, Conference Room 5A
12:30 pm

Abstract:

I'll be discussing a research framework in which counts of acoustic state occupancies are used as features for speaker ID. In my current system, these "acoustic state occupancies" are simply relative frequencies of phone bigrams, obtained by running open-loop phone decoding on each input conversation side. The relative frequencies serve as features for training various types of speaker models, including models based on support vector machines (SVMs). In my talk, I'll focus on some recent experiments that have yielded large gains over the previous state of the art in phone-based modeling. I will also discuss some new work on optimizing SVMs for training phone-based speaker models.
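
As a rough illustration of the feature described in the abstract, the sketch below computes relative frequencies of phone bigrams from a single decoded phone sequence. It is not the speaker's system: the function names, the phone labels, and the choice to normalize by the total number of bigrams in the conversation side are all assumptions made for the example, and the abstract does not specify how the normalization is done.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Bigram = Tuple[str, str]


def bigram_relative_frequencies(phones: Sequence[str]) -> Dict[Bigram, float]:
    """Map each adjacent phone pair to its count divided by the total pair count.

    Assumption: normalization is over all bigrams in the sequence.
    """
    pairs = list(zip(phones, phones[1:]))
    if not pairs:
        return {}
    counts = Counter(pairs)
    total = len(pairs)
    return {bigram: count / total for bigram, count in counts.items()}


def to_feature_vector(freqs: Dict[Bigram, float],
                      vocabulary: Sequence[Bigram]) -> List[float]:
    """Lay the frequencies out in a fixed bigram order; unseen bigrams get 0.0."""
    return [freqs.get(bigram, 0.0) for bigram in vocabulary]


# Hypothetical phone decoder output for one conversation side.
phones = ["s", "ih", "t", "s", "ih", "k"]
freqs = bigram_relative_frequencies(phones)

# In practice the vocabulary would be fixed across all conversation sides;
# here it is taken from this one sequence only to keep the example short.
vocabulary = sorted(freqs)
vector = to_feature_vector(freqs, vocabulary)
```

A vector of this kind, built over a bigram vocabulary shared by all conversation sides, is the sort of input that could then be passed to an SVM or another speaker model.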