Talks at the International Computer Science Institute

The International Computer Science Institute
is pleased to present a talk:


"Using Syntactic Information in Speaker Identification"

Nikki Mirghafori
ICSI
nikki (at) icsi.berkeley.edu

Monday, November 1, 2004
ICSI, Conference Room 5A
11:30 a.m.

Abstract:

There are various clues for identifying speakers: the acoustic quality of their voice, their prosody, their pronunciation, their choice of words, and potentially, their stylistic and grammatical usage. In this work, I explored using syntactic information in the form of Super-ARV (SARV) parse tags. SARV (Wang 2003) is an almost-parsing language model based on the Constraint Dependency Grammar, and it allows a richer representation than, for example, part-of-speech tagging. I employed n-grams and SVMs to build models of (multiple levels of) SARV parse tags. The syntactic systems were evaluated in isolation and in combination with both acoustic and word-based models. The results so far show that, although the addition of syntactic information improves the performance of acoustic-based systems, this improvement is subsumed by that of a word-based system. Future directions towards more integrated modeling approaches to combining prosodic and lexical information are discussed.