Using Conditional Random Fields For Sentence Boundary Detection in Speech

Title: Using Conditional Random Fields For Sentence Boundary Detection in Speech
Publication Type: Conference Paper
Year of Publication: 2005
Authors: Liu, Y., Stolcke, A., Shriberg, E., & Harper, M. P.
Published in: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005)
Page(s): 451-458
Other Numbers: 1497
Abstract

Sentence boundary detection in speech is important for enriching speech recognition output, making it easier for humans to read and downstream modules to process. In previous work, we have developed hidden Markov model (HMM) and maximum entropy (Maxent) classifiers that integrate textual and prosodic knowledge sources for detecting sentence boundaries. In this paper, we evaluate the use of a conditional random field (CRF) for this task and relate results with this model to our prior work. We evaluate across two corpora (conversational telephone speech and broadcast news speech) on both human transcriptions and speech recognition output. In general, our CRF model yields a lower error rate than the HMM and Maxent models on the NIST sentence boundary detection task in speech, although it is interesting to note that the best results are achieved by three-way voting among the classifiers. This probably occurs because each model has different strengths and weaknesses for modeling the knowledge sources.

URL: http://www.icsi.berkeley.edu/pubs/speech/conrandomfields05.pdf
Bibliographic Notes

Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005), Ann Arbor, Michigan, pp. 451-458

Abbreviated Authors

Y. Liu, A. Stolcke, E. Shriberg, and M. Harper

ICSI Research Group

Speech

ICSI Publication Type

Article in conference proceedings