| Cover Song Detection: From High Scores to General Classification | S. Ravuri and D. Ellis | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, pp. 65-68 | March 2010 | Speech | [PDF]
|
| Cross-Domain and Cross-Language Portability of Acoustic Features Estimated by Multilayer Perceptrons | A. Stolcke, F. Grezl, M.-Y. Hwang, X. Lei, N. Morgan, and D. Vergyri | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2006), Toulouse, France, pp. 321-324 | May 2006 | Speech | [PDF]
|
| Cross-Genre Feature Comparisons for Spoken Sentence Segmentation | S. Cuendet, D. Hakkani-Tur, E. Shriberg, J. Fung, and B. Favre | Proceedings of International Conference on Semantic Computing, IEEE Computer Society, pp. 265-274, Irvine, California. Also published in International Journal of Semantic Computing, Volume 1, Issue 3, World Scientific, USA, pp. 335-346 | September 2007 | Speech | [PDF]
|
| Cross-Lingual Sentence Extraction for Information Distillation | A. Singla and D. Hakkani-Tur | Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 2707-2710 | September 2008 | Speech | [PDF]
|
| CUDA-Level Performance with Python-Level Productivity for Gaussian Mixture Model Applications | H. Cook, E. Gonina, S. Kamil, G. Friedland, D. Patterson, and A. Fox | Proceedings of the Third USENIX Workshop on Hot Topics in Parallelism (HotPar ’11), Berkeley, California | May 2011 | Speech | [PDF]
|
| Current Research in Acoustically Robust Speech Recognition | N. Morgan | Proceedings of American Voice Input/Output Society (AVIOS), pp. 207-214 | September 1994 | Speech | |
| Cybercasing the Joint: Language Technologies, Multimedia Retrieval, and Online Privacy | G. Friedland | Presented at the Language Technologies Institute Colloquium, Carnegie Mellon University, Pittsburgh, Pennsylvania | April 13 2012 | Speech | [PDF]
|
| Data Selection with Kurtosis and Nasality features for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 2753-2756 | August 2011 | Speech | [PDF]
|
| Data-Driven Design of RASTA-like Filters | S. van Vuuren and H. Hermansky | Proceedings of the Fifth European Conference on Speech Communication and Technology (Eurospeech '97), Rhodes, Greece | September 1997 | Speech | |
| Data-Driven Extensions to HMM Statistical Dependencies | J. Bilmes | Proceedings of the Fifth International Conference on Spoken Language Processing (ICSLP '98), Sydney, Australia, pp. 69-72 | November 1998 | Speech | [PDF]
|
| Data-Driven Modulation Filter Design Under Adverse Acoustic Conditions and Using Phonetic and Syllabic Units | M.L. Shire | Proceedings of the 6th European Conference on Speech Communication and Technology (Eurospeech '99), Budapest, Hungary, pp. III-1123-1126 | September 1999 | Speech | [PDF]
|
| Data-driven RASTA Filters in Reverberation | M. Shire and B. Chen | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2000), Istanbul, Turkey, pp. III-1627-1630 | June 2000 | Speech | [PDF]
|
| Data-Driven Speaker and Subword Unit Clustering in Speech Processing | M. Hersch | EPFL Diploma Thesis, ICSI | March 2003 | Speech | [PDF]
|
| Data-Driven vs. Semantic-Technology-Driven Tag-Based Video Location Estimation | J. Choi and G. Friedland | Proceedings of the IEEE International Conference on Semantic Computing (ICSC 2011), Palo Alto, California, pp. 243-246 | September 2011 | Speech | [PDF]
|
| Data-Driven vs. Semantic-Technology-Driven Tag-Based Video Location Estimation | J. Choi and G. Friedland | Proceedings of the Fifth IEEE International Conference on Semantic Computing (ICSC 2011), Palo Alto, California, pp. 243-246 | September 2011 | Speech | [PDF]
|
| Decoding Speech in the Presence of Other Sound Sources | J. Barker, M. Cooke, and D. Ellis | Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP 2000), Beijing, China | October 2000 | Speech | [PDF]
|
| Deep and Wide: Multiple Layers in Automatic Speech Recognition | N. Morgan | IEEE Transactions on Audio, Speech, and Language Processing, Special Issue on Deep Learning | 2011 | Speech | [PDF]
|
| Deep and Wide: Multiple Layers in Automatic Speech Recognition | N. Morgan | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 1, pp. 7-13 | January 2012 | Speech | [PDF]
|
| Desperately Seeking Impostors: Data-Mining for Competitive Impostor Testing in a Text-Dependent Speaker Verification System | M. Hebert and N. Mirghafori | Proceedings of IEEE ICASSP, Montreal | May 2004 | Speech | [PDF]
|
| Detecting Categories in News Video Using Acoustic, Speech, and Image Features | S. Petrov, A. Faria, P. Michaillat, A. Berg, A. Stolcke, D. Klein, and J. Malik | Presented at the NIST TREC Video Retrieval Workshop, Gaithersburg, Maryland | November 2006 | Speech | [PDF]
|
| Detecting Deception Using Critical Segments | F. Enos, E. Shriberg, M. Graciarena, J. Hirschberg, and A. Stolcke | Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp, Belgium, pp. 2281-2284 | August 2007 | Speech | [PDF]
|
| Detecting Local Semantic Concepts in Environmental Sounds Using Markov Model Based Clustering | K. Lee, D. Ellis, and A. Loui | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, March 2010 | March 2010 | Speech | [PDF]
|
| Detecting Music in Ambient Audio by Long-Window Autocorrelation | K. Lee and D. Ellis | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), Las Vegas, Nevada, pp. 9-12 | April 2008 | Speech | [PDF]
|
| Detection and Compensation of Sensor Malfunction in Time Delay Based Direction of Arrival Estimation | T. Pirinen, J. Yli-Hietanen, P. Pertilä, and A. Visa | Proceedings of IEEE ISCAS, Vancouver | May 2004 | Speech | [PDF]
|
| Detection of Agreement vs. Disagreement In Meetings: Training With Unlabeled Data | D. Hillard, M. Ostendorf, and E. Shriberg | Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), Edmonton, Canada | May 2003 | Speech | [PDF]
|
| Development of the SRI/Nightingale Arabic ASR system | D. Vergyri, A. Mandal, W. Wang, A. Stolcke, J. Zheng, M. Graciarena, D. Rybach, C. Gollan, R. Schlater, K. Kirchoff, A. Faria, and N. Morgan | Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 1437-1440 | September 2008 | Speech | |
| Dialocalizaton: Acoustic Speaker Diarization and Visual Localization as Joint Optimization Problem | G. Friedland, C. Yeo, and H. Hung | ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 6, No. 4, Article 27 | November 2010 | Speech | [PDF]
|
| Dialog Act Tagging Using Graphical Models | G. Ji and J. Bilmes | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, Pennsylvania, Vol. 1, pp. 33-36 | March 2005 | Speech | [PDF]
|
| Digit Recognition with Stochastic Perceptual Models | N. Morgan, S.L. Wu, and H. Bourlard | Proceedings of the Fourth European Conference on Speech Communication and Technology (Eurospeech '95), Madrid, Spain | September 1995 | Speech | [PDF]
|
| Direct Modeling of Prosody: An Overview of Applications in Automatic Speech Processing | E. Shriberg and A. Stolcke | Proceedings of the International Conference on Speech Prosody, Nara, Japan, March 2004. | March 2004 | Speech | [PDF]
|
| Discourse Segmentation of Multi-party Conversation | M. Galley, K. McKeown, E. Fosler-Lussier, and H. Jing | Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL-03), Sapporo, Japan | July 2003 | Speech | [PDF]
|
| Discriminant Training of Front-End and Acoustic Modeling Stages to Heterogeneous Acoustic Environments for Multi-stream Automatic Speech Recognition | M. Shire | Ph.D Dissertation, University of California at Berkeley, Fall 2000 | 2000 | Speech | [PDF]
|
| Discriminative Pronunciation Learning Using Phonetic Decoder and Minimum-Classification-Error Criterion | O. Vinyals, L. Deng, D. Yu, and A. Acero | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), Taipei, Taiwan, pp. 4445-4448 | April 2009 | Speech | [PDF]
|
| Discriminative Training for Hierarchical Clustering in Speaker Diarization | O. Vinyals, G. Friedland, and N. Morgan | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2326-2329 | September 2010 | Speech | [PDF]
|
| Discriminative Training for Speech Recognition is Compensating for Statistical Dependence on the HMM Framework | D. Gillick and S. Wegmann, L. Gillick | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF]
|
| Does Active Learning Help Automatic Dialog Act Tagging in Meeting Data? | A. Venkataraman, Y. Liu, E. Shriberg, and A. Stolcke | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisboa, Portugal, pp. 2777-2780 | September 2005 | Speech | [PDF]
|
| Does Session Variability Compensation in Speaker Recognition Model Intrinsic Variation Under Mismatched Conditions? | E. Shriberg, S. Kajarekar, and N. Scheffer | Proceedings of the 10th International Conference of the International Speech Communication Association (Interspeech 2009), Brighton, United Kingdom, pp. 1551-1554 | September 2009 | Speech | [PDF]
|
| Don't Multiply Lightly: Quantifying Problems with the Acoustic Model Assumptions in Speech Recognition | D. Gillick, L. Gillick, and S. Wegmann | Proceedings of the Automatic Speech Recognition and Understanding Workshop (ASRU), Big Island, Hawaii | December 2011 | Speech | [PDF]
|
| Double the Trouble: Handling Noise and Reverberation in Far-Field Automatic Speech Recognition | D. Gelbart and N. Morgan | Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002), Denver, Colorado | September 2002 | Speech | [PDF]
|
| Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification | G. Tur, E. Shriberg, A. Stolcke, and S. Kajarekar | Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech--Eurospeech 2008), Antwerp, Belgium, pp. 2049-2052 | August 2007 | Speech | [PDF]
|
| Dynamic Classifier Combinations in Hybrid Speech Recognition Systems Using Utterance-Level Confidence Values | K. Kirchhoff and J. Bilmes | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), Phoenix, Arizona, pp. II-693-696 | March 1999 | Speech | [PDF]
|
| Dynamic Pronunciation Models for Autmoatic Speech Recognition | E. Fosler-Lussier | Ph.D. Thesis, UC Berkeley, Fall 1999, ICSI Technical Report TR-99-015 | September 1999 | Speech | [PDF]
|
| Dynamic Pronunciation Models for Automatic Speech Recognition | E. Fosler-Lussier | Ph.D Dissertation, University of California at Berkeley | August 1999 | Speech | [PDF]
|
| Easy Does It: Robust Spectro-Temporal Many-Stream ASR Without Fine Tuning Streams | S. Ravuri and N. Morgan | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | |
| Educational Multimedia | G. Friedland, L. Knipping, and W. Huerst (guest editors) | Special Section in IEEE Multimedia Magazine, pp. 54-74, July-Sept. 2008 | July 2008 | Speech | [PDF]
|
| Educational Multimedia Systems: The Past, the Present, and a Glimpse into the Future | G. Friedland, W. Huerst, and L. Knipping | Proceedings of the ACM Workshop on Educational Multimedia and Multimedia Education at ACM Multimedia 2007, Augsburg, Germany, pp. 1-4 | September 2007 | Speech | |
| EEG Signal Compression Based on Classified Signature and Envelope Vector Sets | H. Gurkan, U. Guz, and B.S. Yarman | Proceedings of the European Conference on Circuit Theory and Design, IEEE Circuits and Systems Society and the European Circuit Society, Seville, Spain, pp. 420-423 | August 2007 | Speech | |
| Effective Arabic Dialect Classification Using Diverse Phonotactic Models | M. Akbacak, D. Vergyri, A. Stolcke, N. Scheffer, and A. Mandal | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 737-740 | August 2011 | Speech | [PDF]
|
| Effects of Speaking Rate and Word Frequency on Conversational Pronunciations | E. Fosler-Lussier and N. Morgan | Speech Communication Vol. 29, No. 2-4, pp. 137-158 | November 1999 | Speech | [PDF]
|
| Effects of Speaking Rate and Word Predictability on Conversational Pronunciations | E. Fosler-Lussier and N. Morgan | Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition, Kerkrade, Netherlands | May 1998 | Speech | [PDF]
|