| Word-Level Confidence Estimation for Automatic Speech Recognition | A. Hatch | M.S. Thesis, University of California at Berkeley | August 2001 | Speech | [PDF]
|
| Word-Conditioned Phone N-Grams for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Honolulu, Hawaii, pp. 253-256 | April 2007 | Speech | [PDF]
|
| Word-Conditioned HMM Supervectors for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp, Belgium, pp. 746-749 | August 2007 | Speech | [PDF]
|
| Word Fragments Identification Using Acoustic-Prosodic Features in Conversational Speech | Y. Liu | Proceedings of HLT/NAACL, Student Session, Edmonton, Alberta | 2003 | Speech | |
| Within-Class Covariance Normalization for SVM-Based Speaker Recognition | A. O. Hatch, S. Kajarekar, and A. Stolcke | Proceedings of the 9th International Conference on Spoken Language Processing (ICSLP-Interspeech 2006), Pittsburgh, Pennsylvania, pp. 1471-1474 | September 2006 | Speech | [PDF]
|
| Wide-Band Perceptual Audio Coding Based on Frequency-Domain Linear Prediction | P. Motlicek, V. Ullal, and H. Hermansky | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Honolulu, Hawaii, Vol. 1, pp. 265-268 | April 2007 | Speech | |
| Why Is ASR Harder For Fast Speech And What Can We Do About It? | N. Mirghafori, E. Fosler, and N. Morgan | IEEE Snowbird Workshop '95 | 1995 | Speech | [PDF]
|
| Why Has (Reasonably Accurate) Automatic Speech Recognition Been So Hard to Achieve? | S. Wegmann and L. Gillick | ArXiv.org under CoRR abs/1003.0206 | February 2010 | Speech | [PDF]
|
| Who, What, When, Where, Why? Comparing Multiple Approaches to the Cross-Lingual 5W Task | K. Parton, K. R. McKeown, R. Coyne, M. T. Diab, R. Grishman, D. Hakkani-Tür, M. Harper, H. Ji, W. Y. Ma, A. Meyers, S. Stolbach, A. Sun, G. Tur, W. Xu, and S. Yaman | Proceedings of the Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the Fourth International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), Singapore, pp. 423-431 | August 2009 | Speech | [PDF]
|
| Whither Speech Technology? - A Twenty-First Century Perspective | S. Greenberg | Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech 2001), Aalborg, Denmark | September 2001 | Speech | [PDF]
|
| Where did I go Wrong?: Identifying Troublesome Segments for Speaker Diarization Systems | M. T. Knox, N. Mirghafori, and G. Friedland | Proceedings of the 13th Annual Conference of the International Speech Communication Association (Interspeech 2012), Portland, Oregon | September 2012 | Speech | [PDF]
|
| When a Mismatch Can Be Good: Large Vocabulary Speech Recognition Trained with Idealized Tandem Features | A. Faria and N. Morgan | Proceedings of the ACM Symposium on Applied Computing, Fortaleza, Brazil, pp. 1574-1577 | March 2008 | Speech | [PDF]
|
| What's New in Government-Sponsored Speech Recognition Research | N. Morgan | Speech Technology Magazine, Vol. 7, No. 5 | September 2002 | Speech | |
| Web-Scale Features for Full-Scale Parsing | M. Bansal and D. Klein | Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Portland, Oregon, pp. 693-702 | June 2011 | Speech | [PDF]
|
| Vowel Height is Intimately Associated with Stress Accent in Spontaneous American English Discourse | L. Hitchcock and S. Greenberg | Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech 2001), Aalborg, Denmark | September 2001 | Speech | [PDF]
|
| Vocabulary and Language Model Adaptation Using Information Retrieval | B. Bigi, Y. Huang, and R. De Mori | Proceedings of International Conference on Spoken Language Processing, Jeju, Korea | October 2004 | Speech | [PDF]
|
| Visualizing Large-Screen Electronic Chalkboard Content on Handheld Devices | A. Lüning, G. Friedland, L. Knipping, and R. Rojas | Proceedings of the Second IEEE International Workshop on Multimedia Technologies for E-Learning at 9th IEEE Symposium on Multimedia, Taichung, Taiwan, pp. 369-375 | December 2007 | Speech | |
| Visual Speaker Localization Aided by Acoustic Models | G. Friedland, C. Yeo, and H. Hung | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2009), Beijing, China, pp. 195-202 | October 2009 | Speech | [PDF]
|
| Video2GPS: A Demo of Multimodal Location Estimation on Flickr Videos | G. Friedland, J. Choi, and A. Janin | Proceedings of the ACM Multimedia Conference (MM'11), Scottsdale, Arizona | November 2011 | Speech | [PDF]
|
| Using Symbolic Prominence to Help Design Feature Subsets for Topic Classification and Clustering of Natural Human-Human Conversations | C. Boulis and M. Ostendorf | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisbon, Portugal | September 2005 | Speech | [PDF]
|
| Using Spectro-Temporal Features to Improve AFE Feature Extraction for ASR | S. Ravuri and N. Morgan | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 1181-1184 | September 2010 | Speech | |
| Using Prosody for Automatic Sentence Segmentation of Multi-Party Meetings | J. Kolar, E. Shriberg, and Y. Liu | Proceedings of 9th International Conference on Text, Speech and Dialogue (TSD 2006), Brno, Czech Republic, pp. 629-636 | September 2006 | Speech | [PDF]
|
| Using Prosodic and Lexical Information for Speaker Identification | F. Weber, L. Manganaro, B. Peskin, and E. Shriberg | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2002), Orlando, Florida | May 2002 | Speech | [PDF]
|
| Using Prosodic and Conversational Features for High-Performance Speaker Recognition: Report From JHU WS'02 | B. Peskin, J. Navratil, J. Abramson, D. Jones, D. Klusacek, D. Reynolds, and B. Xiang | Proceedings of ICASSP-2003, Hong Kong | April 2003 | Speech | [PDF]
|
| Using Mutual Information to Design Feature Combinations | D. Ellis and J. Bilmes | Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP 2000), Beijing, China | October 2000 | Speech | [PDF]
|
| Using MLP Features in SRI's Conversational Speech Recognition System | Q. Zhu, A. Stolcke, B.Y. Chen, and N. Morgan | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisbon, Portugal, pp. 2141-2144 | September 2005 | Speech | [PDF]
|
| Using Machine Learning to Cope with Imbalanced Classes in Natural Speech: Evidence from Sentence Boundary and Disfluency Detection | Y. Liu, E. Shriberg, A. Stolcke, and M. Harper | Proceedings of International Conference on Spoken Language Processing, Jeju, Korea, October 2004. | 2004 | Speech | [PDF]
|
| Using Knowledge to Organize Sound: The Prediction-driven Approach to Computational Auditory Scene Analysis and Its Application to Speech/Nonspeech Mixtures | D. Ellis | Speech Communication, Vol. 27, Issue 3-4, pp. 281-298 | 1999 | Speech | |
| Using Corpus and Knowledge-Based Similarity Measure in Maximum Marginal Relevance for Meeting Summarization | S. Xie and Y. Liu | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), Las Vegas, Nevada, pp. 4985-4988 | March 2008 | Speech | [PDF]
|
| Using Conditional Random Fields For Sentence Boundary Detection in Speech | Y. Liu, A. Stolcke, E. Shriberg, and M. Harper | Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005), Ann Arbor, Michigan, pp. 451-458 | June 2005 | Speech | [PDF]
|
| Using Boosting to Improve a Hybrid HMM/Neural Network Speech Recognizer | H. Schwenk | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), Phoenix, Arizona, pp. II-1009-1012 | March 1999 | Speech | [PDF]
|
| Using Audio and Video Features to Classify the Most Dominant Person in Meetings | H. Hung, D. Jayagopi, C. Yeo, G. Friedland, S. Ba, J-M. Odobez, K. Ramchandran, N. Mirghafori, and D. Gatica-Perez | Proceedings of ACM Multimedia 2007, Augsburg, Germany, pp. 835-838 | September 2007 | Speech | |
| Using Artistic Markers and Speaker Identification for Narrative-Theme Navigation of Seinfeld Episodes | G. Friedland, L. Gottlieb, and A. Janin | Proceedings of the 11th IEEE International Symposium on Multimedia (ISM2009), San Diego, California, pp. 511-516 | December 2009 | Speech | [PDF]
|
| Using Acoustic Condition Clustering to Improve Acoustic Change Detection on Broadcast News | J.F. Lopez and D. Ellis | Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP 2000), Beijing, China, Vol. 4, pp. 568-571 | October 2000 | Speech | [PDF]
|
| Using A Stochastic Context-Free Grammar as a Language Model for Speech Recognition | D. Jurafsky, C. Wooters, J. Segal, A. Stolcke, E. Fosler, G. Tajchman, and N. Morgan | Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 95), Detroit, Michigan | May 1995 | Speech | [PDF]
|
| Using A Million Connections for Continuous Speech Recognition | N. Morgan | Invited paper for the International Conference on Neural Information Processing (ICONIP' 94), Seoul, South Korea, pp. 1439-1444 | October 1994 | Speech | |
| User Verification: Matching the Uploaders of Videos Across Accounts | H. Lei, J. Choi, A. Janin, and G. Friedland | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 2404-2407 | May 2011 | Speech | [PDF]
|
| Updated MINDS Report on Speech Recognition and Understanding, Part 2 | J. Baker, L. Deng, S. Khudanpur, C.-H. Lee, J. Glass, N. Morgan, and D. O'Shaughnessy | IEEE Signal Processing Magazine, Vol. 26, No. 4, pp. 78-85 | July 2009 | Speech | [PDF]
|
| Unsupervised Learning of Edit Parameters for Matching Name Variants | D. Gillick, D. Hakkani-Tur, and M. Levit | Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 467-470 | September 2008 | Speech | [PDF]
|
| Unknown-Multiple Speaker Clustering Using HMM | J. Ajmera, H. Bourlard, I. Lapidot, and I. McCowan | Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002), Denver, Colorado | May 2002 | Speech | |
| Understanding Speech Understanding | S. Greenberg | Proceedings of the ESCA Workshop on the "Auditory Basis of Speech Perception," Keele University, Staffordshire, UK, pp. 1-8 | 1996 | Speech | [PDF]
|
| Two's a Crowd: Improving Speaker Diarization by Automatically Identifying and Excluding Overlapped Speech | K. Boakye, O. Vinyals, and G. Friedland | Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 32-35 | September 2008 | Speech | |
| Tuning-Robust Initialization Methods for Speaker Diarization | D. Imseng and G. Friedland | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, Issue 8, pp. 2028-2037 | November 2010 | Speech | [PDF]
|
| TRAPping Conversational Speech: Extending TRAP/Tandem Approaches to Conversational Telephone Speech Recognition | N. Morgan, B. Y. Chen, Q. Zhu, and A. Stolcke | Proceedings of IEEE ICASSP, Montreal | May 2004 | Speech | [PDF]
|
| Transmissions and Transitions: A Study of Two Common Assumptions in Multi-Band ASR | N. Mirghafori and N. Morgan | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1998), Seattle, Washington, pp. 713-716 | 1998 | Speech | [PDF]
|
| Transition-Based Statistical Training for ASR | N. Morgan, Y. Konig, S.L. Wu, and H. Bourlard | IEEE Snowbird Workshop '95 | 1995 | Speech | [PDF]
|
| Training Neural Networks with SPERT-II | K. Asanovic, J. Beck, D. Johnson, B. Kingsbury, N. Morgan, and J. Wawrzynek | Chapter in Parallel Architectures for Artificial Neural Networks - Paradigms and Implementations, eds. N. Sundararajan and P. Saratchandran, IEEE Computer Society Press, pp. 345-364 | 1998 | Speech | |
| Towards Subband-Based Speech Recognition | H. Bourlard, S. Dupont, H. Hermansky, and N. Morgan | Proceedings of the VIII European Signal Processing Conference (EUSIPCO '96), Trieste, Italy, pp. 1579-1582 | 1996 | Speech | |
| Towards Structured Approaches to Arbitrary Data Selection and Performance Prediction for Speaker Recognition | H. Lei | Proceedings of the Third IAPR/IEEE International Conference on Biometrics (ICB 2009), Alghero, Italy | June 2009 | Speech | [PDF]
|
| Towards Semantic Analysis of Conversations: A System for the Live Identification of Speakers in Meetings | O. Vinyals and G. Friedland | Proceedings of IEEE International Conference on Semantic Computing, Santa Clara, pp. 426-431 | August 2008 | Speech | [PDF]
|