| Multispeaker Speech Activity Detection for the ICSI Meeting Recorder | T. Pfau, D. Ellis, and A. Stolcke | Proceedings of Automatic Speech Recognition and Understanding Workshop (ASRU 2001), Madonna di Campiglio, Italy, pp. 107-110 | December 2001 | Speech | [PDF]
|
| Multiresolution Channel Normalization for ASR in Reverberant Environments | C. Avendano, S. Tibrewala, and H. Hermansky | Proceedings of the Fifth European Conference on Speech Communication and Technology (Eurospeech '97), Rhodes, Greece | September 1997 | Speech | |
| Multiple-State Context-Dependent Phonetic Modeling with MLPs | M. Cohen, H. Franco, N. Morgan, D. Rumelhart, and V. Abrash | Proceedings of the Speech Research Symposium XII, Rutgers University, Camden, New Jersey | 1992 | Speech | |
| Multiple-Pronunciation Lexical Modeling in a Speaker Independent Speech Understanding System | C. Wooters and A. Stolcke | Proceedings of the Third International Conference on Spoken Language Processing (ICSLP 94), Yokohama, Japan, pp. 1963-1966 | September 1994 | Speech | [PDF]
|
| Multimodal Speaker Diarization Using Oriented Optical Flow Histograms | M. Knox and G. Friedland | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 290-293 | September 2010 | Speech | [PDF]
|
| Multimodal Model Integration for Sentence Unit Detection | L. Chen, Y. Liu, M. Harper, and E. Shriberg | Sixth International Conference on Multimodal Interfaces, October 2004 | 2004 | Speech | |
| Multimodal Location Estimation on Flickr Videos | G. Friedland, J. Choi, H. Lei, and A. Janin | Proceedings of the ACM International Workshop on Social Media (WSM11), Scottsdale, Arizona | November 2011 | Speech | [PDF]
|
| Multimodal Location Estimation of Consumer Media – Dealing with Sparse Training Data | J. Choi, G. Friedland, V. Ekambaram, and K. Ramchandran | Proceedings of the IEEE International Conference on Multimedia and Expo, Melbourne, Australia, pp. 43-48 | July 2012 | Speech | [PDF]
|
| Multimodal Location Estimation | G. Friedland, O. Vinyals, and T. Darrell | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 1245-1251 | October 2010 | Speech | [PDF]
|
| Multimodal Interfaces for Automotive Applications (MIAA) | C. Müller and G. Friedland | Proceedings of the ACM International Conference on Intelligent User Interfaces (IUI 2009), Sanibel, Florida, pp. 493-494 | February 2009 | Speech | |
| Multimodal Indoor Localization: An Audio-Wireless-Based Approach | O. Vinyals, E. Martin, and G. Friedland | Proceedings of the Fourth IEEE International Conference on Semantic Computing (ICSC-2010), Pittsburgh, Pennsylvania, pp. 120-125 | September 2010 | Speech | [PDF]
|
| Multimodal City-Verification on Flickr Videos Using Acoustic and Textual Features | H. Lei, J. Choi, and G. Friedland | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF]
|
| Multimedia Technologies for E-Learning 2007 | G. Friedland, L. Knipping, and N. Ludwig (eds.) | Special Issue of Interactive Technology Smart Education (ITSE), Vol. 4, Issue 4 | November 2007 | Speech | |
| Multimedia Technologies for E-learning | G. Friedland and L. Knipping (editors) | Special issue of International Journal of Interactive Technology Smart Education (ITSE), Vol 4, No 1, Troubador Publishing Ltd., United Kingdom | March 2007 | Speech | |
| Multimedia Information Extraction Roadmap | G. Myers, G. Tür, L. Voss, B. Bolles, S. Kajarekar, E. Shriberg, and D. Hakkani-Tür | Proceedings of the AAAI Fall Symposium on Multimedia Information Extraction, Arlington, Virginia | November 2008 | Speech | [PDF]
|
| Multimedia Education—Can We Find Unity in Diversity? | G. Friedland, W. Hürst, and L. Knipping | Proceedings of the 16th ACM International Conference on Multimedia, Vancouver, Canada, pp. 1115-1116 | October 2008 | Speech | [PDF]
|
| Multimedia Education in Computer Science -- A Little Bit of Everything Is Not Enough | G. Friedland, L. Knipping, and W. Hürst | IEEE Multimedia Magazine, Vol. 15, Issue 2, pp. 78-82 | April 2008 | Speech | [PDF]
|
| Multimedia Data Formats and Semantic Computing: A Practical Example and its Implications for the Future | G. Friedland | IEEE International Conference on Semantic Computing, Irvine, California | September 2007 | Speech | |
| Multiband Audio Modeling for Single-Channel Acoustic Source Separation | M.J. Reyes-Gomez, D. Ellis, and N. Jojic | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '04), Montreal, Canada, Vol.5, pp. 641-644 | May 2004 | Speech | [PDF]
|
| Multi-View Semi-Supervised Learning for Dialog Act Segmentation of Speech | U. Guz, S. Cuendet, G. Tur, and D. Hakkani-Tür | IEEE Transactions on Audio, Speech and Language Processing, Vol. 18, Issue 2, pp. 320-329 | February 2010 | Speech | [PDF]
|
| Multi-Stream to Many-Stream: Using Spectro-Temporal Features for ASR | S. Y. Zhao, S. Ravuri, and N. Morgan | Proceedings of the 10th International Conference of the International Speech Communication Association (Interspeech 2009), Brighton, United Kingdom, pp. 2951-2954 | September 2009 | Speech | [PDF]
|
| Multi-stream Speech Recognition: Ready for Prime Time? | A. Janin, D. Ellis, and N. Morgan | Proceedings of the 6th European Conference on Speech Communication and Technology (Eurospeech '99), Budapest, Hungary, pp. II-591-594 | September 1999 | Speech | [PDF]
|
| Multi-Stream Spectro-Temporal Features for Robust Speech Recognition | S. Y. Zhao and N. Morgan | Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 898-901 | September 2008 | Speech | [PDF]
|
| Multi-Stream Speaker Diarization Systems for the Meetings Domain | A. Gallardo-Antolin, X. Anguera, and C. Wooters | Proceedings of the 9th International Conference on Spoken Language Processing (Interspeech 2006-ICSLP), Pittsburgh, Pennsylvania, pp. 2186-2189 | September 2006 | Speech | [PDF]
|
| Multi-Stream ASR trained with Heterogeneous Reverberant Environments | M.L. Shire | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2001), Salt Lake City, Utah | May 2001 | Speech | [PDF]
|
| Multi-Speaker Language Modeling | G. Ji and J. Bilmes | Proceedings of the Human Language Technology Conference at the North American Chapter of the Association for Computational Linguistics, Boston, Massachusetts, pp. 133-136 | May 2004 | Speech | [PDF]
|
| Multi-Rate and Variable-Rate Modeling of Speech at Phone and Syllable Time Scales | O. Cetin and M. Ostendorf | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, Pennsylvania, pp. 665-668 | March 2005 | Speech | |
| Multi-modal Speaker Diarization of Real-world Meetings Using Compressed-domain Video Features | G. Friedland, H. Hung, and C. Yeo | ICSI Technical Report TR-08-007, October 2008 | October 2008 | Speech | [PDF]
|
| Multi-Modal Speaker Diarization of Real-World Meetings Using Compressed-Domain Video Features | G. Friedland, H. Hung, and C. Yeo | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), Taipei, Taiwan, pp. 4069-4072 | April 2009 | Speech | [PDF]
|
| Multi-Microphone Signal Processing for Automatic Speech Recognition in Meeting Rooms | M. Ferras Font | M.S. Thesis, Universitat Politecnica de Catalunya, Barcelona, Spain | July 2005 | Speech | [PDF]
|
| Multi-Level Decision Trees for Static and Dynamic Pronunciation Models | E. Fosler-Lussier | Proceedings of the 6th European Conference on Speech Communication and Technology (Eurospeech '99), Budapest, Hungary, pp. I-463-466 | September 1999 | Speech | [PDF]
|
| Multi-Channel Source Separation by Factorial HMMs | M.J. Reyes-Gomez, B. Raj, and D. Ellis | Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2003), Hong Kong | April 2003 | Speech | [PDF]
|
| Morph-Based Speech Recognition and Modeling of Out-of-Vocabulary Words Across Languages | M. Creutz, T. Hirsimäki, M. Kurimo, A. Puurula, J. Pylkkönen, V. Siivola, M. Varjokallio, E. Arisoy, M. Saraclar, and A. Stolcke | ACM Transactions on Speech and Language Processing, Vol. 5, Issue 1, pp. 1-29 | December 2007 | Speech | [PDF]
|
| Modulation Spectrogram Features for Speaker Diarization | O. Vinyals and G. Friedland | Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 630-633 | September 2008 | Speech | |
| Modeling Prosodic Feature Sequences for Speaker Recognition | E. Shriberg, L. Ferrer, S. Kajarekar, A. Venkataraman, and A. Stolcke | Speech Communication, Vol. 46, Issues 3-4, pp. 455-472 | July 2005 | Speech | |
| Modeling Other Talkers for Improved Dialog Act Recognition in Meetings | K. Laskowski and E. Shriberg | Proceedings of the 10th International Conference of the International Speech Communication Association (Interspeech 2009), Brighton, United Kingdom, pp. 2783-2786 | September 2009 | Speech | [PDF]
|
| Modeling NERFs for Speaker Recognition | S. Kajarekar, L. Ferrer, K. Sonmez, J. Zheng, E. Shriberg, and A. Stolcke | Proceedings of the Speaker and Language Recognition Workshop (Odyssey 2004), Toledo, Spain, pp. 51-56 | May 2004 | Speech | [PDF]
|
| Modeling Dynamics in Connectionist Speech Recognition - the Time Index Model | Y. Konig and N. Morgan | Proceedings of the Third International Conference on Spoken Language Processing (ICSLP 94), Yokohama, Japan, pp. 1523-1526 | September 1994 | Speech | [PDF]
|
| Modeling Dynamic Prosodic Variation for Speaker Verification | K. Sonmez, E. Shriberg, L. Heck, and M. Weintraub | Proceedings of the Fifth International Conference on Spoken Language Processing (ICSLP'98), Sydney, Australia, Vol. 7, p. 3189 | November 1998 | Speech | |
| Modeling Consistency in a Speaker Independent Continuous Speech Recognition System | Y. Konig, N. Morgan, C. Wooters, V. Abrash, M. Cohen, and H. Franco | Advances in Neural Information Processing Systems, Vol. V, pp. 682-687 | 1993 | Speech | |
| Model Complexity Selection and Cross-validation EM Training for Robust Speaker Diarization | X. Anguera, T. Shinozaki, C. Wooters, and J. Hernando | Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Honolulu, Hawaii, Vol. 4 pp. 273-276 | April 2007 | Speech | [PDF]
|
| Model Adaptation for Sentence Segmentation from Speech | S. Cuendet, D. Hakkani-Tur, and G. Tur | Proceedings of the IEEE 2006 Workshop on Spoken Language Technology (SLT 2006), Palm Beach, Aruba, pp. 102-105 | December 2006 | Speech | [PDF]
|
| Model Adaptation for Dialog Act Tagging | G. Tur, U. Guz, and D. Hakkani-Tur | Proceedings of the IEEE 2006 Workshop on Spoken Language Technology (SLT 2006), Palm Beach, Aruba, pp. 94-97 | December 2006 | Speech | [PDF]
|
| MLP-Based Feature Extraction for Speech Transcription | N. Morgan, A. Faria, S. Ravuri, and S. Zhao | Handbook of Natural Language Processing and Machine Translation, J. Olive, ed., Springer, in press | 2010 | Speech | |
| MLLR Transforms as Features in Speaker Recognition | A. Stolcke, L. Ferrer, S. Kajarekar, E. Shriberg, and A. Venkataraman | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisboa, Portugal, pp. 2425-2428 | September 2005 | Speech | |
| Midlevel Representations for Computational Auditory Scene Analysis: The Weft Element | D. Ellis and D. Rosenthal | Computational Auditory Scene Analysis, D.F. Rosenthal & H.G. Okuno, eds., Lawrence Erlbaum, pp. 257-272 | 1998 | Speech | |
| Merging Multilayer Perceptrons & Hidden Markov Models: Some Experiments in Continuous Speech Recognition | H. Bourlard and N. Morgan | ICSI Technical Report TR-89-033 | 1989 | Speech | |
| Merging Multilayer Perceptrons & Hidden Markov Models: Some Experiments in Continuous Speech Recognition | H. Bourlard and N. Morgan | Artificial Neural Networks: Advances and Applications | 1990 | Speech | |
| Mel, Linear, and Antimel Frequency Cepstral Coefficients in Broad Phonetic Regions for Telephone Speaker Recognition | H. Lei and E. Lopez-Gonzalo | Proceedings of the 10th International Conference of the International Speech Communication Association (Interspeech 2009), Brighton, United Kingdom, pp. 2323-2326 | September 2009 | Speech | [PDF]
|
| Meetings About Meetings: Research at ICSI on Speech in Multiparty Conversations | N. Morgan, D. Baron, S. Bhagat, H. Carvey, R. Dhillon, J. Edwards, D. Gelbart, A. Janin, A. Krupski, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, and C. Wooters | Proceedings of ICASSP-2003, Hong Kong | April 2003 | Speech | [PDF]
|