| Data-Driven vs. Semantic-Technology-Driven Tag-Based Video Location Estimation | J. Choi and G. Friedland | Proceedings of the Fifth IEEE International Conference on Semantic Computing (ICSC 2011), Palo Alto, California, pp. 243-246 | September 2011 | Speech | [PDF]
|
| Stochastic Perceptual Speech Models with Durational Dependence | J. Bilmes, N. Morgan, S.L. Wu, and H. Bourlard | Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP-96), Philadelphia, Pennsylvania | 1996 | Speech | [PDF]
|
| Factored Language Models and Generalized Parallel Backoff | J. Bilmes and K. Kirchhoff | Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), Edmonton, Canada, p. 1 | May 2003 | Speech | [PDF]
|
| Natural Statistical Models for Automatic Speech Recognition | J. Bilmes | Ph.D. Thesis, University of California at Berkeley, Fall 1999. Also ICSI Technical Report TR-99-016 | October 1999 | Speech | [PDF]
|
| Buried Markov Models for Speech Recognition | J. Bilmes | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), Phoenix, Arizona, pp. II-713-716 | March 1999 | Speech | [PDF]
|
| Data-Driven Extensions to HMM Statistical Dependencies | J. Bilmes | Proceedings of the Fifth International Conference on Spoken Language Processing (ICSLP '98), Sydney, Australia, pp. 69-72 | November 1998 | Speech | [PDF]
|
| Maximum Mutual Information Based Reduction Strategies for Cross-Correlation Based Joint Distributional Modeling | J. Bilmes | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1998), Seattle, Washington, pp. 469-472 | May 1998 | Speech | [PDF]
|
| Joint Distributional Modeling with Cross-Correlation Based Features | J. Bilmes | Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU-97), Santa Barbara, California, pp. 148-155 | 1997 | Speech | [PDF]
|
| A Multi-DSP Ring Array for Connectionist Simulations | J. Beck, N. Morgan, A. Allman, and J. Beer | Proceedings of 23rd Asilomar Conference on Signals, Systems & Computers | 1989 | Speech | |
| Combining Bottom-Up and Top-Down Constraints for Robust ASR: The Multiscore Decoder | J. Barker, M. Cooke, and D. Ellis | Proceedings of the Workshop on Consistent and Reliable Acoustic Cues (CRAC-2001), Aalborg, Denmark | September 2001 | Speech | |
| Decoding Speech in the Presence of Other Sound Sources | J. Barker, M. Cooke, and D. Ellis | Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP 2000), Beijing, China | October 2000 | Speech | [PDF]
|
| Updated MINDS Report on Speech Recognition and Understanding, Part 2 | J. Baker, L. Deng, S. Khudanpur, C.-H. Lee, J. Glass, N. Morgan, and D. O'Shaughnessy | IEEE Signal Processing Magazine, Vol. 26, No. 4, pp. 78-85 | July 2009 | Speech | [PDF]
|
| Research Developments and Directions in Speech Recognition and Understanding, Part 1 | J. Baker, L. Deng, J. Glass, S. Khudanpur, C.-H. Lee, N. Morgan, and D. O'Shaughnessy | IEEE Signal Processing Magazine, Vol. 26, No. 3, pp. 75-80 | May 2009 | Speech | |
| Automatic Dialog Act Segmentation and Classification in Multiparty Meetings | J. Ang, Y. Liu, and E. Shriberg | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, Pennsylvania, pp. 1061-1064 | March 2005 | Speech | [PDF]
|
| Prosody-Based Automatic Detection of Annoyance and Frustration in Human-Computer Dialog | J. Ang, R. Dhillon, A. Krupski, E. Shriberg, and A. Stolcke | Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002), Denver, Colorado | September 2002 | Speech | |
| Unknown-Multiple Speaker Clustering Using HMM | J. Ajmera, H. Bourlard, I. Lapidot, and I. McCowan | Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002), Denver, Colorado | May 2002 | Speech | |
| A Robust Speaker Clustering Algorithm | J. Ajmera and C. Wooters | Proceedings of IEEE Speech Recognition and Understanding Workshop, St. Thomas, U.S. Virgin Islands | December 2003 | Speech | [PDF]
|
| The ICSI Meeting Corpus: Close-Talking and Far-Field, Multi-Channel Transcriptions for Speech and Language Researchers | J. A. Edwards | Proceedings of the Workshop on Compiling and Processing Spoken Language Corpora at the Fourth International Conference on Language Resources and Evaluation (LREC 2004), pp. 8-11 | May 2004 | Speech | [PDF]
|
| Getting more mileage from web text sources for conversational speech language modeling using class-dependent mixtures | I. Bulyko, M. Ostendorf, and A. Stolcke | Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), Edmonton, Canada, Vol. 2, pp. 7-9 | May 2003 | Speech | [PDF]
|
| Relevancy of Time Frequency Features for Phonetic Classification Measured by Mutual Information | H.H. Yang, S. van Vuuren, and H. Hermansky | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), Phoenix, Arizona | March 1999 | Speech | |
| Search for Information Bearing Components in Speech | H.H. Yang and H. Hermansky | Advances in Neural Information Processing Systems, Vol. 12, S.A. Solla, T.K. Leen and K.-R. Muller, eds., MIT Press | 2000 | Speech | |
| Relevance of Time-Frequency Features for Phonetic and Speaker-Channel Classification | H.H. Yang, S. Sharma, S. van Vuuren, and H. Hermansky | Speech Communication, Vol. 1, No. 31, pp. 35-50 | May 2000 | Speech | [PDF]
|
| The Value of Auditory Offset Adaptation and Appropriate Acoustic Modeling | H. Wang, D. Gelbart, H.G. Hirsch, and W. Hemmert | Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia, pp. 902-905 | September 2008 | Speech | [PDF]
|
| Using Boosting to Improve a Hybrid HMM/Neural Network Speech Recognizer | H. Schwenk | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), Phoenix, Arizona, pp. II-1009-1012 | March 1999 | Speech | [PDF]
|
| Multimodal City-Verification on Flickr Videos Using Acoustic and Textual Features | H. Lei, J. Choi, and G. Friedland | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF]
|
| User Verification: Matching the Uploaders of Videos Across Accounts | H. Lei, J. Choi, A. Janin, and G. Friedland | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 2404-2407 | May 2011 | Speech | [PDF]
|
| Spectro-Temporal Gabor Features for Speaker Recognition | H. Lei, B. T. Meyer, and N. Mirghafori | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF]
|
| Word-Conditioned Phone N-Grams for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Honolulu, Hawaii, pp. 253-256 | April 2007 | Speech | [PDF]
|
| Word-Conditioned HMM Supervectors for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp, Belgium, pp. 746-749 | August 2007 | Speech | [PDF]
|
| Comparisons of Recent Speaker Recognition Approaches Based on Word Conditioning | H. Lei and N. Mirghafori | Proceedings of Odyssey 2008, Stellenbosch, South Africa | January 2008 | Speech | [PDF]
|
| Data Selection with Kurtosis and Nasality features for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 2753-2756 | August 2011 | Speech | [PDF]
|
| Importance of Nasality Measures for Speaker Recognition Data Selection and Performance Prediction | H. Lei and E. Lopez-Gonzalo | Proceedings of the 10th International Conference of the International Speech Communication Association (Interspeech 2009), Brighton, United Kingdom, pp. 888-891 | September 2009 | Speech | [PDF]
|
| Mel, Linear, and Antimel Frequency Cepstral Coefficients in Broad Phonetic Regions for Telephone Speaker Recognition | H. Lei and E. Lopez-Gonzalo | Proceedings of the 10th International Conference of the International Speech Communication Association (Interspeech 2009), Brighton, United Kingdom, pp. 2323-2326 | September 2009 | Speech | [PDF]
|
| ICSI System Description for SRE2008 Submission | H. Lei and D.V. Leeuwen | Speaker Recognition Evaluation 2008, National Institute of Standards and Technology | 2008 | Speech | [PDF]
|
| Applications of Keyword-Constraining in Speaker Recognition | H. Lei | MS Thesis, University of California-Berkeley | July 2007 | Speech | [PDF]
|
| Towards Structured Approaches to Arbitrary Data Selection and Performance Prediction for Speaker Recognition | H. Lei | Proceedings of the Third IAPR/IEEE International Conference on Biometrics (ICB 2009), Alghero, Italy | June 2009 | Speech | [PDF]
|
| Structured Approaches to Data Selection for Speaker Recognition | H. Lei | UC Berkeley dissertation | December 2010 | Speech | [PDF]
|
| Estimating the Dominant Person in Multi-Party Conversations Using Speaker Diarization Strategies | H. Hung, Y. Huang, G. Friedland, and D. Gatica-Perez | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, Nevada, pp. 2197-2200 | April 2008 | Speech | [PDF]
|
| Estimating Dominance in Multi-Party Meetings Using Speaker Diarization from a Single Microphone | H. Hung, Y. Huang, G. Friedland, and D. Gatica-Perez | IEEE Transactions on Audio, Speech and Language Processing, Vol. 19, No. 4, pp. 847-860 | May 2011 | Speech | |
| Computationally Efficient Clustering of Audio-Visual Meeting Data | H. Hung, G. Friedland, and C. Yeo | In Multimedia Interaction and Intelligent User Interfaces: Principles, Methods, and Applications, M. Etoh, J. Luo, and L. Shao, eds., pp. 25-59 | 2010 | Speech | |
| Using Audio and Video Features to Classify the Most Dominant Person in Meetings | H. Hung, D. Jayagopi, C. Yeo, G. Friedland, S. Ba, J-M. Odobez, K. Ramchandran, N. Mirghafori, and D. Gatica-Perez | Proceedings of ACM Multimedia 2007, Augsburg, Germany, pp. 835-838 | September 2007 | Speech | |
| Towards Audio-Visual On-Line Diarization of Participants in Group Meetings | H. Hung and G. Friedland | Proceedings of European Conference on Computer Vision (ECCV), Marseille, France | October 2008 | Speech | [PDF]
|
| Recognition of Speech in Additive and Convolutional Noise Based on RASTA Spectral Processing | H. Hermansky, N. Morgan, and H.G. Hirsch | Proceedings of the IEEE Conference on Acoustics, Speech & Signal Processing, Minneapolis, Minnesota, pp. II-83-86 | 1993 | Speech | |
| The Challenge of Inverse-E: The RASTA-PLP Method | H. Hermansky, N. Morgan, A. Bayya, and P. Kohn | Proceedings of the 25th Asilomar Conference on Signals, Systems, & Computers, Pacific Grove, California, pp. 800-804 | November 1991 | Speech | |
| RASTA-PLP Speech Analysis Technique | H. Hermansky, N. Morgan, A. Bayya, and P. Kohn | Proceedings of IEEE International Conference on Acoustics, Speech & Signal Processing, San Francisco, California, pp. I-121-124 | 1992 | Speech | |
| Tandem Connectionist Feature Stream Extraction for Conventional HMM Systems | H. Hermansky, D. Ellis, and S. Sharma | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2000), Istanbul, Turkey, pp. III-1635-1638 | June 2000 | Speech | [PDF]
|
| Automatic Speech Recognition | H. Hermansky and N. Morgan | Encyclopedia of Cognitive Science, Nature Publishing Group, London | 2003 | Speech | |
| Compensation for the effect of the communication channel in Perceptual Linear Predictive (PLP) analysis of speech | H. Hermansky, A. Bayya, N. Morgan, and P. Kohn | Proceedings of the Second European Conference on Speech Communication and Technology (Eurospeech '91), Genova, Italy, pp. 1367-1370 | 1991 | Speech | |
| Temporal Patterns (TRAPS) in ASR of Noisy Speech | H. Hermansky and S. Sharma | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1999), Phoenix, Arizona | March 1999 | Speech | |
| Show What You Know: Musings on the Reporting of Negative Results in Speech Recognition Research | H. Hermansky and N. Morgan | Journal of Negative Results in Speech and Audio Sciences, Vol. 1, Issue 1 | 2004 | Speech | [PDF]
|