| Mel, Linear, and Antimel Frequency Cepstral Coefficients in Broad Phonetic Regions for Telephone Speaker Recognition | H. Lei and E. Lopez-Gonzalo | Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009), Brighton, United Kingdom, pp. 2323-2326 | September 2009 | Speech | [PDF]
|
| Word-Conditioned Phone N-Grams for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Honolulu, Hawaii, pp. 253-256 | April 2007 | Speech | [PDF]
|
| Word-Conditioned HMM Supervectors for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp, Belgium, pp. 746-749 | August 2007 | Speech | [PDF]
|
| Comparisons of Recent Speaker Recognition Approaches Based on Word Conditioning | H. Lei and N. Mirghafori | Proceedings of Odyssey 2008, Stellenbosch, South Africa | January 2008 | Speech | [PDF]
|
| Data Selection with Kurtosis and Nasality Features for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 2753-2756 | August 2011 | Speech | [PDF]
|
| Spectro-Temporal Gabor Features for Speaker Recognition | H. Lei, B. T. Meyer, and N. Mirghafori | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF]
|
| User Verification: Matching the Uploaders of Videos Across Accounts | H. Lei, J. Choi, A. Janin, and G. Friedland | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 2404-2407 | May 2011 | Speech | [PDF]
|
| Multimodal City-Verification on Flickr Videos Using Acoustic and Textual Features | H. Lei, J. Choi, and G. Friedland | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF]
|
| Using Boosting to Improve a Hybrid HMM/Neural Network Speech Recognizer | H. Schwenk | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), Phoenix, Arizona, pp. II-1009-1012 | March 1999 | Speech | [PDF]
|
| The Value of Auditory Offset Adaptation and Appropriate Acoustic Modeling | H. Wang, D. Gelbart, H.G. Hirsch, and W. Hemmert | Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 902-905 | September 2008 | Speech | [PDF]
|
| Relevance of Time-Frequency Features for Phonetic and Speaker-Channel Classification | H.H. Yang, S. Sharma, S. van Vuuren, and H. Hermansky | Speech Communication, Vol. 31, No. 1, pp. 35-50 | May 2000 | Speech | [PDF]
|
| Search for Information Bearing Components in Speech | H.H. Yang and H. Hermansky | Advances in Neural Information Processing Systems, Vol. 12, S.A. Solla, T.K. Leen and K.-R. Muller, eds., MIT Press | 2000 | Speech | |
| Relevancy of Time Frequency Features for Phonetic Classification Measured by Mutual Information | H.H. Yang, S. van Vuuren, and H. Hermansky | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), Phoenix, Arizona | March 1999 | Speech | |
| Getting more mileage from web text sources for conversational speech language modeling using class-dependent mixtures | I. Bulyko, M. Ostendorf, and A. Stolcke | Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), Edmonton, Canada, Vol. 2, pp. 7-9 | May 2003 | Speech | [PDF]
|
| The ICSI Meeting Corpus: Close-Talking and Far-Field, Multi-Channel Transcriptions for Speech and Language Researchers | J. A. Edwards | Proceedings of the Workshop on Compiling and Processing Spoken Language Corpora at the Fourth International Conference on Language Resources and Evaluation (LREC 2004), pp. 8-11 | May 2004 | Speech | [PDF]
|
| A Robust Speaker Clustering Algorithm | J. Ajmera and C. Wooters | Proceedings of IEEE Speech Recognition and Understanding Workshop, St. Thomas, U.S. Virgin Islands | December 2003 | Speech | [PDF]
|
| Unknown-Multiple Speaker Clustering Using HMM | J. Ajmera, H. Bourlard, I. Lapidot, and I. McCowan | Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002), Denver, Colorado | September 2002 | Speech | |
| Prosody-Based Automatic Detection of Annoyance and Frustration in Human-Computer Dialog | J. Ang, R. Dhillon, A. Krupski, E. Shriberg, and A. Stolcke | Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002), Denver, Colorado | September 2002 | Speech | |
| Automatic Dialog Act Segmentation and Classification in Multiparty Meetings | J. Ang, Y. Liu, and E. Shriberg | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, Pennsylvania, pp. 1061-1064 | March 2005 | Speech | [PDF]
|
| Research Developments and Directions in Speech Recognition and Understanding, Part 1 | J. Baker, L. Deng, J. Glass, S. Khudanpur, C.-H. Lee, N. Morgan, and D. O'Shaughnessy | IEEE Signal Processing Magazine, Vol. 26, No. 3, pp. 75-80 | May 2009 | Speech | |
| Updated MINDS Report on Speech Recognition and Understanding, Part 2 | J. Baker, L. Deng, S. Khudanpur, C.-H. Lee, J. Glass, N. Morgan, and D. O'Shaughnessy | IEEE Signal Processing Magazine, Vol. 26, No. 4, pp. 78-85 | July 2009 | Speech | [PDF]
|
| Combining Bottom-Up and Top-Down Constraints for Robust ASR: The Multiscore Decoder | J. Barker, M. Cooke, and D. Ellis | Proceedings of the Workshop on Consistent and Reliable Acoustic Cues (CRAC-2001), Aalborg, Denmark | September 2001 | Speech | |
| Decoding Speech in the Presence of Other Sound Sources | J. Barker, M. Cooke, and D. Ellis | Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP 2000), Beijing, China | October 2000 | Speech | [PDF]
|
| A Multi-DSP Ring Array for Connectionist Simulations | J. Beck, N. Morgan, A. Allman, and J. Beer | Proceedings of 23rd Asilomar Conference on Signals, Systems & Computers | 1989 | Speech | |
| Natural Statistical Models for Automatic Speech Recognition | J. Bilmes | Ph.D. Thesis, University of California at Berkeley, Fall 1999. Also ICSI Technical Report TR-99-016 | October 1999 | Speech | [PDF]
|
| Buried Markov Models for Speech Recognition | J. Bilmes | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), Phoenix, Arizona, pp. II-713-716 | March 1999 | Speech | [PDF]
|
| Data-Driven Extensions to HMM Statistical Dependencies | J. Bilmes | Proceedings of the Fifth International Conference on Spoken Language Processing (ICSLP '98), Sydney, Australia, pp. 69-72 | November 1998 | Speech | [PDF]
|
| Maximum Mutual Information Based Reduction Strategies for Cross-Correlation Based Joint Distributional Modeling | J. Bilmes | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1998), Seattle, Washington, pp. 469-472 | May 1998 | Speech | [PDF]
|
| Joint Distributional Modeling with Cross-Correlation Based Features | J. Bilmes | Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU-97), Santa Barbara, California, pp. 148-155 | 1997 | Speech | [PDF]
|
| Factored Language Models and Generalized Parallel Backoff | J. Bilmes and K. Kirchhoff | Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), Edmonton, Canada, p. 1 | May 2003 | Speech | [PDF]
|
| Stochastic Perceptual Speech Models with Durational Dependence | J. Bilmes, N. Morgan, S.L. Wu, and H. Bourlard | Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP-96), Philadelphia, Pennsylvania | 1996 | Speech | [PDF]
|
| Data-Driven vs. Semantic-Technology-Driven Tag-Based Video Location Estimation | J. Choi and G. Friedland | Proceedings of the IEEE International Conference on Semantic Computing (ICSC 2011), Palo Alto, California, pp. 243-246 | September 2011 | Speech | [PDF]
|
| The 2010 ICSI Video Location Estimation System | J. Choi, A. Janin, and G. Friedland | Proceedings of the MediaEval 2010 Workshop, Pisa, Italy | October 2010 | Speech | [PDF]
|
| Multimodal Location Estimation of Consumer Media – Dealing with Sparse Training Data | J. Choi, G. Friedland, V. Ekambaram, and K. Ramchandran | Proceedings of the IEEE International Conference on Multimedia and Expo, Melbourne, Australia, pp. 43-48 | July 2012 | Speech | [PDF]
|
| The 2011 ICSI Video Location Estimation System | J. Choi, H. Lei, and G. Friedland | Proceedings of the MediaEval 2011 Workshop, Pisa, Italy | September 2011 | Speech | [PDF]
|
| The 2012 ICSI/Berkeley Video Location Estimation System | J. Choi, V. Ekambaram, G. Friedland, and K. Ramchandran | Presented at the MediaEval 2012 Workshop, Pisa, Italy | October 2012 | Speech | [PDF]
|
| Opportunities and Challenges of Parallelizing Speech Recognition | J. Chong, G. Friedland, A. Janin, and N. Morgan | Proceedings of the Second USENIX Workshop on Hot Topics in Parallelism (HotPar '10), Berkeley, California | June 2010 | Speech | [PDF]
|
| Sampling Alignment Structure Under a Bayesian Translation Model | J. DeNero, A. Bouchard-Côté, and D. Klein | Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), Waikiki, Honolulu, Hawaii, pp. 314-323 | October 2008 | Speech | [PDF]
|
| Asynchronous Binarization for Synchronous Grammars | J. DeNero, A. Pauls, and D. Klein | Proceedings of the Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the Fourth International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), Singapore | August 2009 | Speech | [PDF]
|
| Fast Consensus Decoding over Translation Forests | J. DeNero, D. Chiang, and K. Knight | Proceedings of the Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the Fourth International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), Singapore | August 2009 | Speech | [PDF]
|
| Efficient Parsing for Transducer Grammars | J. DeNero, M. Bansal, A. Pauls, and D. Klein | Proceedings of North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL HLT 2009), Boulder, Colorado, pp. 227-235 | May 2009 | Speech | [PDF]
|
| Chapter 17: The Transcription of Discourse | J. Edwards | The Handbook of Discourse Analysis, D. Schiffrin, D. Tannen, and H. Hamilton, eds., Oxford: Blackwell, pp. 321-348 | 2001 | Speech | |
| Prosodic Features and Feature Selection for Multi-lingual Sentence Segmentation | J. Fung, D. Hakkani-Tur, M. Magimai-Doss, E. Shriberg, S. Cuendet, and N. Mirghafori | Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp, Belgium, pp. 2585-2588 | August 2007 | Speech | [PDF]
|
| How Good Is the Crowd at "Real" WSD? | J. Hong and C. F. Baker | Proceedings of the Fifth Linguistic Annotation Workshop (LAW-V), Portland, Oregon | June 2011 | Speech | [PDF]
|
| Integrating RASTA-PLP into Speech Recognition | J. Koehler, N. Morgan, H. Hermansky, H.G. Hirsch, and G. Tong | Proceedings of IEEE International Conference on Acoustics, Speech & Signal Processing, pp. I-421-424 | 1994 | Speech | |
| Using Prosody for Automatic Sentence Segmentation of Multi-Party Meetings | J. Kolar, E. Shriberg, and Y. Liu | Proceedings of 9th International Conference on Text, Speech and Dialogue (TSD 2006), Brno, Czech Republic, pp. 629-636 | September 2006 | Speech | [PDF]
|
| On Speaker-Specific Prosodic Models for Automatic Dialog Act Segmentation of Multi-Party Meetings | J. Kolar, E. Shriberg, and Y. Liu | Proceedings of the 9th International Conference on Spoken Language Processing (ICSLP-Interspeech 2006), Pittsburgh, Pennsylvania, pp. 2014-2017 | September 2006 | Speech | [PDF]
|
| Speaker Adaptation of Language Models for Automatic Dialog Act Segmentation of Meetings | J. Kolar, Y. Liu, and E. Shriberg | Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp, Belgium, pp. 1621-1624 | August 2007 | Speech | [PDF]
|
| Genre Effects on Automatic Sentence Segmentation of Speech: A Comparison of Broadcast News and Broadcast Conversations | J. Kolar, Y. Liu, and E. Shriberg | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), Taipei, Taiwan, pp. 4701-4704 | April 2009 | Speech | [PDF]
|