Language Modeling in the ICSI-SRI Spring 2005 Meeting Speech Recognition Evaluation System
Title | Language Modeling in the ICSI-SRI Spring 2005 Meeting Speech Recognition Evaluation System |
Publication Type | Technical Report |
Year of Publication | 2005 |
Authors | Çetin, Ö., & Stolcke, A. |
Other Numbers | 1611 |
Abstract | In this report, we describe the language models (LMs) used in the ICSI-SRI system for the NIST Spring 2005 Meeting Rich Transcription (RT-05S) evaluation. Our LMs are linear interpolations of $n$-gram models trained on a small number of in-domain sources and a large number of out-of-domain sources, which include conference proceedings and newly collected web data, in addition to other commonly used corpora. Despite the lack of any training data for the lecture recognition task in the evaluation, we design effective LMs for this task. Compared to the LMs of the ICSI-SRI-UW system for the NIST Spring 2004 Meeting Rich Transcription (RT-04S) evaluation, we obtain significant improvements in perplexity and word error rate (WER), mainly due to the additional training data from the web and conference proceedings. |
URL | http://www.icsi.berkeley.edu/ftp/global/pub/techreports/2005/tr-05-006.pdf |
Bibliographic Notes | ICSI Technical Report TR-05-006 |
Abbreviated Authors | O. Cetin and A. Stolcke |
ICSI Research Group | Speech |
ICSI Publication Type | Technical Report |