| Friends and Enemies: A Novel Initialization for Speaker Diarization | X. Anguera, C. Wooters, and J. Hernando | Proceedings of the 9th International Conference on Spoken Language Processing (ICSLP-Interspeech 2006), Pittsburgh, Pennsylvania, pp. 689-692 | September 2006 | Speech | [PDF]
|
| Multi-Stream Speaker Diarization Systems for the Meetings Domain | A. Gallardo-Antolin, X. Anguera, and C. Wooters | Proceedings of the 9th International Conference on Spoken Language Processing (Interspeech 2006 - ICSLP), Pittsburgh, Pennsylvania, pp. 2186-2189 | September 2006 | Speech | [PDF]
|
| Multimedia Information Extraction Roadmap | G. Myers, G. Tür, L. Voss, B. Bolles, S. Kajarekar, E. Shriberg, and D. Hakkani-Tür | Proceedings of the AAAI Fall Symposium on Multimedia Information Extraction, Arlington, Virginia | November 2008 | Speech | [PDF]
|
| Mutaphrase: Paraphrasing with FrameNet | M. Ellsworth and A. Janin | Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing (TextEntail), Prague, Czech Republic, pp. 143-150 | June 2007 | Speech | [PDF]
|
| Multimodal Interfaces for Automotive Applications (MIAA) | C. Müller and G. Friedland | Proceedings of the ACM International Conference on Intelligent User Interfaces (IUI 2009), Sanibel, Florida, pp. 493-494 | February 2009 | Speech | |
| Joke-o-Mat: Browsing Sitcoms Punchline by Punchline | G. Friedland, L. Gottlieb, and A. Janin | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2009), Beijing, China, pp. 1115-1116 | October 2009 | Speech | [PDF]
|
| Visual Speaker Localization Aided by Acoustic Models | G. Friedland, C. Yeo, and H. Hung | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2009), Beijing, China, pp. 195-202 | October 2009 | Speech | [PDF]
|
| Multimodal Location Estimation | G. Friedland, O. Vinyals, and T. Darrell | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 1245-1251 | October 2010 | Speech | [PDF]
|
| Joke-O-Mat HD: Browsing Sitcoms with Human Derived Transcripts | A. Janin, L. Gottlieb, and G. Friedland | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 1591-1594 | October 2010 | Speech | [PDF]
|
| Precise Indoor Localization Using Smart Phones | E. Martin, O. Vinyals, G. Friedland, and R. Bajcsy | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 787-790 | October 2010 | Speech | [PDF]
|
| Automatic Tagging and Geo-Tagging in Video Collections and Communities | M. Larson, M. Soleymani, P. Serdyukov, S. Rudinac, C. Wartena, V. Murdock, G. Friedland, R. Ordelman, and G. J. F. Jones | Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR 2011), Trento, Italy | April 2011 | Speech | [PDF]
|
| There is No Data Like Less Data: Percepts for Video Concept Detection on Consumer-Produced Media | B. Elizalde, G. Friedland, H. Lei, and A. Divakaran | Proceedings of the ACM International Workshop on Audio and Multimedia Methods for Large-Scale Video Analysis (AMVA) at ACM Multimedia 2012 (MM'12), Nara, Japan, pp. 27-32 | October 2012 | Speech | [PDF]
|
| Acoustic Super Models for Large Scale Video Event Detection | R. Mertens, H. Lei, L. Gottlieb, G. Friedland, and A. Divakaran | Proceedings of the ACM International Workshop on Events in Multimedia (EiMM11), Scottsdale, Arizona | November 2011 | Speech | [PDF]
|
| Multimodal Location Estimation on Flickr Videos | G. Friedland, J. Choi, H. Lei, and A. Janin | Proceedings of the ACM International Workshop on Social Media (WSM11), Scottsdale, Arizona | November 2011 | Speech | [PDF]
|
| Video2GPS: A Demo of Multimodal Location Estimation on Flickr Videos | G. Friedland, J. Choi, and A. Janin | Proceedings of the ACM Multimedia Conference (MM'11), Scottsdale, Arizona | November 2011 | Speech | [PDF]
|
| When a Mismatch Can Be Good: Large Vocabulary Speech Recognition Trained with Idealized Tandem Features | A. Faria and N. Morgan | Proceedings of the ACM Symposium on Applied Computing, Fortaleza, Brazil, pp. 1574-1577 | March 2008 | Speech | [PDF]
|
| Pushing the Limits of Mechanical Turk: Qualifying the Crowd for Video Geo-Location | L. Gottlieb, J. Choi, P. Kelm, T. Sikora, and G. Friedland | Proceedings of the ACM Workshop on Crowdsourcing for Multimedia (CrowdMM 2012), held in conjunction with ACM Multimedia 2012, Nara, Japan, pp. 23-28 | October 2012 | Speech | [PDF]
|
| Educational Multimedia Systems: The Past, the Present, and a Glimpse into the Future | G. Friedland, W. Huerst, and L. Knipping | Proceedings of the ACM Workshop on Educational Multimedia and Multimedia Education at ACM Multimedia 2007, Augsburg, Germany, pp. 1-4 | September 2007 | Speech | |
| A Low-Cost Mobile Pointing and Drawing Device | K. Jantz, G. Friedland, L. Knipping, and R. Rojas | Proceedings of the ACM Workshop on Educational Multimedia and Multimedia Education at ACM Multimedia 2007, Augsburg, Germany, pp. 121-122 | September 2007 | Speech | |
| REMAP: Recursive Estimation and Maximization of A Posteriori Probabilities - Application to Transition-Based Connectionist Speech Recognition | Y. Konig, H. Bourlard, and N. Morgan | Proceedings of the Advances in Neural Information Processing Systems 8 Conference (NIPS 8), Denver, Colorado, pp. 388-394 | November 1995 | Speech | |
| SPERT-II: A Vector Microprocessor System and Its Application to Large Problems in Backpropagation Training | J. Wawrzynek, K. Asanovic, B. Kingsbury, J. Beck, D. Johnson, and N. Morgan | Proceedings of the Advances in Neural Information Processing Systems 8 Conference (NIPS 8), Denver, Colorado, pp. 619-625. Also in IEEE Computer, Vol. 29, No. 3, pp. 79-86, March 1996. | November 1995 | Speech | |
| Development of the SRI/Nightingale Arabic ASR System | D. Vergyri, A. Mandal, W. Wang, A. Stolcke, J. Zheng, M. Graciarena, D. Rybach, C. Gollan, R. Schlüter, K. Kirchhoff, A. Faria, and N. Morgan | Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 1437-1440 | September 2008 | Speech | |
| Packing the Meeting Summarization Knapsack | K. Riedhammer, D. Gillick, B. Favre, and D. Hakkani-Tür | Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 2434-2437 | September 2008 | Speech | [PDF]
|
| Speech-Overlapped Acoustic Event Detection for Automotive Applications | C. Müller, J. I. Biel, E. Kim, and D. Rosario | Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 2590-2593 | September 2008 | Speech | [PDF]
|
| Two's a Crowd: Improving Speaker Diarization by Automatically Identifying and Excluding Overlapped Speech | K. Boakye, O. Vinyals, and G. Friedland | Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 32-35 | September 2008 | Speech | |
| Getting the Last Laugh: Automatic Laughter Segmentation in Meetings | M. Knox, N. Morgan, and N. Mirghafori | Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 797-800 | September 2008 | Speech | [PDF]
|
| Multi-Stream Spectro-Temporal Features for Robust Speech Recognition | S. Y. Zhao and N. Morgan | Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 898-901 | September 2008 | Speech | [PDF]
|
| Reducing the Effect of Room Acoustics on Human-Computer Interaction | D. Gelbart | Proceedings of the Applied Voice Input/Output Society (AVIOS 2002), San Jose, California | May 2002 | Speech | [PDF]
|
| Improved Recognition by Combining Different Features and Different Systems | D.P.W. Ellis | Proceedings of the Applied Voice Input/Output Society (AVIOS-2000), San Jose, California | May 2000 | Speech | [PDF]
|
| Meeting Recorder | A. Janin | Proceedings of the Applied Voice Input/Output Society, San Jose, California | April 2001 | Speech | [PDF]
|
| Evaluating Long-term Spectral Subtraction for Reverberant ASR | D. Gelbart and N. Morgan | Proceedings of the Automatic Speech Recognition and Understanding Workshop (ASRU 2001), Madonna di Campiglio, Italy | December 2001 | Speech | [PDF]
|
| Don't Multiply Lightly: Quantifying Problems with the Acoustic Model Assumptions in Speech Recognition | D. Gillick, L. Gillick, and S. Wegmann | Proceedings of the Automatic Speech Recognition and Understanding Workshop (ASRU), Big Island, Hawaii | December 2011 | Speech | [PDF]
|
| The Uninvited Guest: Information's Role in Guiding the Production of Spontaneous Speech | S. Greenberg and E. Fosler-Lussier | Proceedings of the CREST Workshop on Models of Speech Production: Motor Planning and Articulatory Modelling, Kloster Seeon, Germany | May 2000 | Speech | [PDF]
|
| Not Just What, But Also When: Guided Automatic Pronunciation Modeling for Broadcast News | E. Fosler-Lussier and G. Williams | Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Herndon, Virginia | February 1999 | Speech | [PDF]
|
| Reducing Errors by Increasing the Error Rate: MLP Acoustic Modeling for Broadcast News Transcription | N. Morgan, D. Ellis, E. Fosler-Lussier, A. Janin, and B. Kingsbury | Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Herndon, Virginia | February 1999 | Speech | [PDF]
|
| An Overview of the SPRACH System for the Transcription of Broadcast News | G. Cook, J. Christie, D. Ellis, E. Fosler-Lussier, Y. Gotoh, B. Kingsbury, N. Morgan, S. Renals, T. Robinson, and G. Williams | Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Herndon, Virginia | February 1999 | Speech | [PDF]
|
| Incorporating Tandem/HATs MLP Features into SRI's Conversational Speech Recognition System | Q. Zhu, A. Stolcke, B. Y. Chen, and N. Morgan | Proceedings of the EARS RT-04F Workshop, Palisades, New York | November 2004 | Speech | [PDF]
|
| Spotting "Hot Spots" in Meetings: Human Judgments and Prosodic Cues | B. Wrede and E. Shriberg | Proceedings of the Eighth European Conference on Speech Communication and Technology (EUROSPEECH 2003), Geneva, Switzerland, pp. 2805-2808 | September 2003 | Speech | [PDF]
|
| On the Origins of Speech Intelligibility in the Real World | S. Greenberg | Proceedings of the ESCA Workshop on Robust Speech Recognition, Pont-à-Mousson, France, pp. 23-32 | 1997 | Speech | [PDF]
|
| Robust Features and Environmental Compensation: A Few Comments | N. Morgan | Proceedings of the ESCA Workshop on Robust Speech Recognition, Pont-à-Mousson, France, pp. 43-44 | 1997 | Speech | [PDF]
|
| Improving ASR Performance for Reverberant Speech | B. Kingsbury, N. Morgan, and S. Greenberg | Proceedings of the ESCA Workshop on Robust Speech Recognition, Pont-à-Mousson, France, pp. 87-90 | 1997 | Speech | [PDF]
|
| Speaking in Shorthand - A Syllable-Centric Perspective for Understanding Pronunciation Variation | S. Greenberg | Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition, Kerkrade, Netherlands, pp. 47-56 | 1998 | Speech | [PDF]
|
| Effects of Speaking Rate and Word Predictability on Conversational Pronunciations | E. Fosler-Lussier and N. Morgan | Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition, Kerkrade, Netherlands | May 1998 | Speech | [PDF]
|
| Prediction-driven Computational Auditory Scene Analysis for Dense Sound Mixtures | D. Ellis | Proceedings of the ESCA Workshop on the "Auditory Basis of Speech Perception," Keele University, Staffordshire, UK | 1996 | Speech | [PDF]
|
| Understanding Speech Understanding | S. Greenberg | Proceedings of the ESCA Workshop on the "Auditory Basis of Speech Perception," Keele University, Staffordshire, UK, pp. 1-8 | 1996 | Speech | [PDF]
|
| A New Algorithm for High Speed Speech and Audio Coding | U. Guz, H. Gurkan, and B.S. Yarman | Proceedings of the European Conference on Circuit Theory and Design, IEEE Circuits and Systems Society and the European Circuit Society, Seville, Spain | August 2007 | Speech | |
| EEG Signal Compression Based on Classified Signature and Envelope Vector Sets | H. Gurkan, U. Guz, and B.S. Yarman | Proceedings of the European Conference on Circuit Theory and Design, IEEE Circuits and Systems Society and the European Circuit Society, Seville, Spain, pp. 420-423 | August 2007 | Speech | |
| Multiresolution Channel Normalization for ASR in Reverberant Environments | C. Avendano, S. Tibrewala, and H. Hermansky | Proceedings of the Fifth European Conference on Speech Communication and Technology (Eurospeech '97), Rhodes, Greece | September 1997 | Speech | |
| Data-Driven Design of RASTA-like Filters | S. van Vuuren and H. Hermansky | Proceedings of the Fifth European Conference on Speech Communication and Technology (Eurospeech '97), Rhodes, Greece | September 1997 | Speech | |
| Estimation of Global Posteriors and Forward-Backward Training of Hybrid HMM/ANN Systems | J. Hennebert, C. Ris, H. Bourlard, S. Renals, and N. Morgan | Proceedings of the Fifth European Conference on Speech Communication and Technology (Eurospeech '97), Rhodes, Greece, pp. 1951-1954 | September 1997 | Speech | |