Publication Details

Title: The ICSI/SRI/UW RT04 Structural Metadata Extraction System
Authors: Y. Liu, E. Shriberg, A. Stolcke, B. Peskin, and M. Harper
Group: Speech
Date: January 2004
PDF: [Not available online]

Overview:
Both human and automatic processing of speech require recognizing more than just the words. We describe the ICSI/SRI/UW metadata detection system for both broadcast news and spontaneous telephone conversations, developed as part of the DARPA EARS Rich Transcription program. System tasks include sentence boundary detection, filler word detection, and detection and correction of disfluencies. To achieve the best performance, we combine information from different types of textual knowledge sources (based on words, part-of-speech classes, and automatically induced classes) with information from a prosodic classifier. The prosodic classifier employs bagging and ensemble approaches to better estimate posterior probabilities. In addition to our previous HMM approach, we investigate maximum entropy (Maxent) and conditional random field (CRF) approaches for various tasks. Results using these techniques are presented for the 2004 NIST Rich Transcription metadata tasks.

Bibliographic Information:
RT-04 EARS Workshop

Bibliographic Reference:
Y. Liu, E. Shriberg, A. Stolcke, B. Peskin, and M. Harper. The ICSI/SRI/UW RT04 Structural Metadata Extraction System. RT-04 EARS Workshop, January 2004.