| Improved Phonetic Speaker Recognition Using Lattice Decoding | A. O. Hatch, B. Peskin, and A. Stolcke | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, Pennsylvania, pp. 169-172 | March 2005 | Speech | [PDF]
|
| Automatic Dialog Act Segmentation and Classification in Multiparty Meetings | J. Ang, Y. Liu, and E. Shriberg | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, Pennsylvania, pp. 1061-1064 | March 2005 | Speech | [PDF]
|
| Continuous Speech Recognition Using PLP Analysis with Multilayer Perceptrons | N. Morgan, H. Hermansky, H. Bourlard, P. Kohn, and C. Wooters | Proceedings of IEEE International Conference on Acoustics, Speech & Signal Processing, Toronto, Canada, pp. 49-52 | 1991 | Speech | |
| CDNN: A Context Dependent Neural Network for Continuous Speech Recognition | H. Bourlard, N. Morgan, C. Wooters, and S. Renals | Proceedings of IEEE International Conference on Acoustics, Speech & Signal Processing, San Francisco, California, pp. II-349-352 | 1992 | Speech | |
| RASTA-PLP Speech Analysis Technique | H. Hermansky, N. Morgan, A. Bayya, and P. Kohn | Proceedings of IEEE International Conference on Acoustics, Speech & Signal Processing, San Francisco, California, pp. I-121-124 | 1992 | Speech | |
| Integrating RASTA-PLP into Speech Recognition | J. Koehler, N. Morgan, H. Hermansky, H.G. Hirsch, and G. Tong | Proceedings of IEEE International Conference on Acoustics, Speech & Signal Processing, pp. I-421-424 | 1994 | Speech | |
| Desperately Seeking Impostors: Data-Mining for Competitive Impostor Testing in a Text-Dependent Speaker Verification System | M. Hebert and N. Mirghafori | Proceedings of IEEE ICASSP, Montreal | May 2004 | Speech | [PDF]
|
| Parameterization of the Score Threshold for a Text-Dependent Adaptive Speaker Verification System | N. Mirghafori and M. Hebert | Proceedings of IEEE ICASSP, Montreal | May 2004 | Speech | [PDF]
|
| TRAPping Conversational Speech: Extending TRAP/Tandem Approaches to Conversational Telephone Speech Recognition | N. Morgan, B. Y. Chen, Q. Zhu, and A. Stolcke | Proceedings of IEEE ICASSP, Montreal | May 2004 | Speech | [PDF]
|
| The ICSI Meeting Corpus | A. Janin, D. Baron, J. Edwards, D. Ellis, D. Gelbart, N. Morgan, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, and C. Wooters | Proceedings of ICASSP-2003, Hong Kong | April 2003 | Speech | [PDF]
|
| Meetings About Meetings: Research at ICSI on Speech in Multiparty Conversations | N. Morgan, D. Baron, S. Bhagat, H. Carvey, R. Dhillon, J. Edwards, D. Gelbart, A. Janin, A. Krupski, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, and C. Wooters | Proceedings of ICASSP-2003, Hong Kong | April 2003 | Speech | [PDF]
|
| Using Prosodic and Conversational Features for High-Performance Speaker Recognition: Report From JHU WS'02 | B. Peskin, J. Navratil, J. Abramson, D. Jones, D. Klusacek, D. Reynolds, and B. Xiang | Proceedings of ICASSP-2003, Hong Kong | April 2003 | Speech | [PDF]
|
| The SuperSID Project: Exploiting High-Level Information for High-Accuracy Speaker Recognition | D. Reynolds, W. Andrews, J. Campbell, J. Navratil, B. Peskin, A. Adami, Q. Jin, D. Klusacek, J. Abramson, R. Mihaescu, J. Godfrey, D. Jones, and B. Xiang | Proceedings of ICASSP-2003, Hong Kong | April 2003 | Speech | [PDF]
|
| Experiments with Linear and Nonlinear Feature Transformations in HMM Based Phone Recognition | P. Somervuo | Proceedings of ICASSP-2003, Hong Kong | April 2003 | Speech | [PDF]
|
| Word Fragments Identification Using Acoustic-Prosodic Features in Conversational Speech | Y. Liu | Proceedings of HLT/NAACL, Student Session, Edmonton, Alberta | 2003 | Speech | |
| Improving Automatic Sentence Boundary Detection with Confusion Networks | D. Hillard, M. Ostendorf, A. Stolcke, Y. Liu, and E. Shriberg | Proceedings of HLT-NAACL Conference, Boston | April 2004 | Speech | [PDF]
|
| Automatic Labeling Inconsistencies Detection And Correction For Sentence Unit Segmentation In Conversational Speech | S. Cuendet, D. Hakkani-Tur, and E. Shriberg | Proceedings of Fourth International Conference on Machine Learning and Multimodal Interaction, Brno, Czech Republic, pp. 144-155 | June 2007 | Speech | [PDF]
|
| Towards Robust Speaker Segmentation: The ICSI-SRI Fall 2004 Diarization System | C. Wooters, J. Fung, B. Peskin, and X. Anguera | Proceedings of Fall 2004 Rich Transcription Workshop (RT-04F), Nov. 2004 | November 2004 | Speech | [PDF]
|
| Learning Discriminative Temporal Patterns in Speech: Development of Novel TRAPS-Like Classifiers | B. Chen, S. Chang, and S. Sivadas | Proceedings of EUROSPEECH 2003, Geneva | September 2003 | Speech | [PDF]
|
| Automatic Disfluency Identification in Conversational Speech Using Multiple Knowledge Sources | Y. Liu, E. Shriberg, and A. Stolcke | Proceedings of EUROSPEECH 2003, Geneva | September 2003 | Speech | [PDF]
|
| Feature Transformations and Combinations for Improving ASR Performance | P. Somervuo, B. Chen, and Q. Zhu | Proceedings of EUROSPEECH 2003, Geneva | September 2003 | Speech | [PDF]
|
| Towards Audio-Visual On-Line Diarization of Participants in Group Meetings | H. Hung and G. Friedland | Proceedings of European Conference on Computer Vision (ECCV), Marseille, France | October 2008 | Speech | [PDF]
|
| Should Recognizers Have Ears? | H. Hermansky | Proceedings of ESCA Tutorial and Research Workshop on Robust Speech Recognition for Unknown Communication Channels, Pont-a-Mousson, France, pp. 1-10 | April 1997 | Speech | |
| Far-Field ASR on Inexpensive Microphones | L. Docio, D. Gelbart, and N. Morgan | Proceedings of Eighth European Conference on Speech Communication and Technology (EUROSPEECH 2003), Geneva, Switzerland, pp. 2141-2144 | September 2003 | Speech | [PDF]
|
| Comparing and Combining Generative and Posterior Probability Models: Some Advances in Sentence Boundary Detection in Speech | Y. Liu, A. Stolcke, E. Shriberg, and M. Harper | Proceedings of Conference on Empirical Methods in Natural Language Processing, Barcelona | July 2004 | Speech | [PDF]
|
| Sampling Alignment Structure Under a Bayesian Translation Model | J. DeNero, A. Bouchard-Côté, and D. Klein | Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), Waikiki, Honolulu, Hawaii, pp. 314-323 | October 2008 | Speech | [PDF]
|
| Multispeaker Speech Activity Detection for the ICSI Meeting Recorder | T. Pfau, D. Ellis, and A. Stolcke | Proceedings of Automatic Speech Recognition and Understanding Workshop (ASRU 2001), Madonna di Campiglio, Italy, pp. 107-110 | December 2001 | Speech | [PDF]
|
| Current Research in Acoustically Robust Speech Recognition | N. Morgan | Proceedings of American Voice Input/Output Society (AVIOS), pp. 207-214 | September 1994 | Speech | |
| Using Audio and Video Features to Classify the Most Dominant Person in Meetings | H. Hung, D. Jayagopi, C. Yeo, G. Friedland, S. Ba, J-M. Odobez, K. Ramchandran, N. Mirghafori, and D. Gatica-Perez | Proceedings of ACM Multimedia 2007, Augsburg, Germany, pp. 835-838 | September 2007 | Speech | |
| Using Prosody for Automatic Sentence Segmentation of Multi-Party Meetings | J. Kolar, E. Shriberg, and Y. Liu | Proceedings of 9th International Conference on Text, Speech and Dialogue (TSD 2006), Brno, Czech Republic, pp. 629-636 | September 2006 | Speech | [PDF]
|
| Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections | M. Huijbregts, C. Wooters, and R. Ordelman | Proceedings of 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp, Belgium, pp. 2925-2928 | August 2007 | Speech | |
| The Blame Game: Performance Analysis of Speaker Diarization System Components | M. Huijbregts and C. Wooters | Proceedings of 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp, Belgium, pp. 1857-1860 | August 2007 | Speech | |
| A Multi-DSP Ring Array for Connectionist Simulations | J. Beck, N. Morgan, A. Allman, and J. Beer | Proceedings of 23rd Asilomar Conference on Signals, Systems & Computers | 1989 | Speech | |
| Role Recognition for Meeting Participants: An Approach Based on Lexical Information and Social Network Analysis | N. Garg, S. Favre, H. Salamin, D. Hakkani-Tur, and A. Vinciarelli | Proceedings of 16th ACM International Conference on Multimedia, Vancouver, Canada, pp. 693-696 | October 2008 | Speech | [PDF]
|
| Autoregressive Modeling of Hilbert Envelopes for Wide-Band Audio Coding | S. Ganapathy, P. Motlicek, H. Hermansky, and H. Garudadri | Proceedings of 124th Convention of Audio Engineering Society (AES), Amsterdam, the Netherlands, paper 7481 | May 2008 | Speech | |
| Perceptually Motivated Sub-Band Decomposition for FDLP Audio Coding | P. Motlicek, S. Ganapathy, H. Hermansky, H. Garudadri, and M. Athineos | Proceedings of 11th International Conference on Text, Speech, and Dialogue (TSD 2008), Brno, Czech Republic, pp. 435-442 | September 2008 | Speech | [PDF]
|
| Comparison of Grammar Based and Statistical Language Models Trained on the Same Data | B.A. Hockey and M. Rayner | Presented at the Workshop on Spoken Language Understanding at the 20th AAAI National Conference on Artificial Intelligence, Pittsburgh, Pennsylvania | July 2005 | Speech | |
| How to Build a Spoken Dialog System with Limited (or No) Resources | M. Plauché, O. Cetin, and N. Udhaykumar | Presented at the Workshop on AI in ICT for Development at the 20th International Joint Conference on AI (IJCAI07), Hyderabad, India | January 2007 | Speech | |
| Detecting Categories in News Video Using Acoustic, Speech, and Image Features | S. Petrov, A. Faria, P. Michaillat, A. Berg, A. Stolcke, D. Klein, and J. Malik | Presented at the NIST TREC Video Retrieval Workshop, Gaithersburg, Maryland | November 2006 | Speech | [PDF]
|
| From AUDREY to Siri: Is Speech Recognition A Solved Problem? | R. Pieraccini | Presented at the Mobile Voice Conference, San Francisco, California | March 2012 | Speech | [PDF]
|
| The 2012 ICSI/Berkeley Video Location Estimation System | J. Choi, V. Ekambaram, G. Friedland, and K. Ramchandran | Presented at the MediaEval 2012 Workshop, Pisa, Italy | October 2012 | Speech | [PDF]
|
| Cybercasing the Joint: Language Technologies, Multimedia Retrieval, and Online Privacy | G. Friedland | Presented at the Language Technologies Institute Colloquium, Carnegie Mellon University, Pittsburgh, Pennsylvania | April 13, 2012 | Speech | [PDF]
|
| Efficient Parsing of Syntactic and Semantic Dependency Structures | B. Bohnet | Presented at the 13th Conference on Computational Natural Language Learning (CoNLL-2009), Boulder, Colorado | June 2009 | Speech | [PDF]
|
| A Syllable, Articulatory-Feature, and Stress-Accent Model of Speech Recognition | S. Chang | Ph.D. Thesis, University of California at Berkeley. Also ICSI Technical Report TR-02-007 | September 2002 | Speech | [PDF]
|
| Incorporating Information from Syllable-length Time Scales into Automatic Speech Recognition | S.L. Wu | Ph.D. Thesis, University of California at Berkeley, Spring 1998. Also ICSI Technical Report TR-98-014 | 1998 | Speech | [PDF]
|
| Natural Statistical Models for Automatic Speech Recognition | J. Bilmes | Ph.D. Thesis, University of California at Berkeley, Fall 1999. Also ICSI Technical Report TR-99-016 | October 1999 | Speech | [PDF]
|
| Learning Discriminant Narrow-Band Temporal Patterns for Automatic Recognition of Conversational Telephone Speech | B.Y. Chen | Ph.D. Thesis, University of California at Berkeley | May 2005 | Speech | [PDF]
|
| Speech Recognition on Vector Architectures | A. Janin | Ph.D. Thesis, University of California at Berkeley | 2004 | Speech | [PDF]
|
| Dynamic Pronunciation Models for Automatic Speech Recognition | E. Fosler-Lussier | Ph.D. Thesis, UC Berkeley, Fall 1999, ICSI Technical Report TR-99-015 | September 1999 | Speech | [PDF]
|
| Global Posterior Probability Estimates as Decision Confidence Measures in an Automatic Speech Recognition System | W. Warren | Ph.D. Dissertation, University of California at Berkeley | December 2000 | Speech | |