| The Modulation Spectrogram: In Pursuit of an Invariant Representation of Speech | S. Greenberg and B. Kingsbury | The 22nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1997), Munich, Germany, Vol. 3, pp. 1647-1650 | April 1997 | Speech | [PDF]
|
| From Here to Utility - Melding Phonetic Insight with Speech Technology | S. Greenberg | Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech 2001), Aalborg, Denmark | September 2001 | Speech | [PDF]
|
| Whither Speech Technology? - A Twenty-First Century Perspective | S. Greenberg | Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech 2001), Aalborg, Denmark | September 2001 | Speech | [PDF]
|
| Recognition in a New Key - Towards a Science of Spoken Language | S. Greenberg | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 1998), Seattle, Washington, pp. 1041-1045 | May 1998 | Speech | [PDF]
|
| Speaking in Shorthand - A Syllable-Centric Perspective for Understanding Pronunciation Variation | S. Greenberg | Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition, Kerkrade, Netherlands, pp. 47-56 | 1998 | Speech | [PDF]
|
| On the Origins of Speech Intelligibility in the Real World | S. Greenberg | Proceedings of the ESCA Workshop on Robust Speech Recognition, Pont-à-Mousson, France, pp. 23-32 | 1997 | Speech | [PDF]
|
| Understanding Speech Understanding | S. Greenberg | Proceedings of the ESCA Workshop on the "Auditory Basis of Speech Perception," Keele University, Staffordshire, UK, pp. 1-8 | 1996 | Speech | [PDF]
|
| Temporal Masking for Bit-Rate Reduction in Audio Codec based on Frequency Domain Linear Prediction | S. Ganapathy, P. Motlicek, H. Hermansky, and H. Garudadri | Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), Las Vegas, Nevada, pp. 4781-4784 | April 2008 | Speech | [PDF]
|
| Autoregressive Modeling of Hilbert Envelopes for Wide-Band Audio Coding | S. Ganapathy, P. Motlicek, H. Hermansky, and H. Garudadri | Proceedings of 124th Convention of Audio Engineering Society (AES), Amsterdam, the Netherlands, paper 7481 | May 2008 | Speech | |
| Spectral Noise Shaping: Improvements in Speech/Audio Codec Based on Linear Prediction in Spectral Domain | S. Ganapathy, P. Motlicek, H. Hermansky, and H. Garudadri | Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech 2008), Brisbane, Australia | September 2008 | Speech | |
| An Analysis of Sentence Segmentation Features for Broadcast News, Broadcast Conversations, and Meetings | S. Cuendet, E. Shriberg, B. Favre, J. Fung, and D. Hakkani-Tür | Proceedings of the SIGIR Workshop on Searching Conversational Spontaneous Speech, Amsterdam, Netherlands, pp. 43-59 | July 2007 | Speech | |
| Model Adaptation for Sentence Segmentation from Speech | S. Cuendet, D. Hakkani-Tür, and G. Tur | Proceedings of the IEEE 2006 Workshop on Spoken Language Technology (SLT 2006), Palm Beach, Aruba, pp. 102-105 | December 2006 | Speech | [PDF]
|
| Automatic Labeling Inconsistencies Detection And Correction For Sentence Unit Segmentation In Conversational Speech | S. Cuendet, D. Hakkani-Tür, and E. Shriberg | Proceedings of Fourth International Conference on Machine Learning and Multimodal Interaction, Brno, Czech Republic, pp. 144-155 | June 2007 | Speech | [PDF]
|
| Cross-Genre Feature Comparisons for Spoken Sentence Segmentation | S. Cuendet, D. Hakkani-Tür, E. Shriberg, J. Fung, and B. Favre | Proceedings of International Conference on Semantic Computing, IEEE Computer Society, pp. 265-274, Irvine, California. Also published in International Journal of Semantic Computing, Volume 1, Issue 3, World Scientific, USA, pp. 335-346 | September 2007 | Speech | [PDF]
|
| An Elitist Approach to Articulatory-Acoustic Feature Classification | S. Chang, S. Greenberg, and M. Wester | Proceedings of the 7th European Conference on Speech Communication and Technology (Eurospeech 2001), Aalborg, Denmark | September 2001 | Speech | [PDF]
|
| Automatic Phonetic Transcription of Spontaneous Speech (American English) | S. Chang, L. Shastri, and S. Greenberg | Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP 2000), Beijing, China | October 2000 | Speech | [PDF]
|
| A Syllable, Articulatory-Feature, and Stress-Accent Model of Speech Recognition | S. Chang | Ph.D. Thesis, University of California at Berkeley. Also ICSI Technical Report TR-02-007 | September 2002 | Speech | [PDF]
|
| System Output Combination for Improved Speaker Diarization | S. Bozonnet, N. Evans, X. Anguera, O. Vinyals, G. Friedland, and C. Fredouille | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2642-2645 | September 2010 | Speech | [PDF]
|
| Automatically Generated Prosodic Cues to Lexically Ambiguous Dialog Acts in Multiparty Meetings | S. Bhagat, H. Carvey, and E. Shriberg | Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003), Barcelona, Spain | August 2003 | Speech | [PDF]
|
| Source Separation Based on Binaural Cues and Source Model Constraints | R. Weiss, M. Mandel, and D. Ellis | Proceedings of the 9th International Conference of the ISCA (Interspeech 2008), Brisbane, Australia, pp. 419-422 | September 2008 | Speech | [PDF]
|
| Temporal Constraints on Speech Intelligibility as Deduced From Exceedingly Sparse Spectral Representations | R. Silipo, S. Greenberg, and T. Arai | Proceedings of the 6th European Conference on Speech Communication and Technology (Eurospeech '99), Budapest, Hungary, pp. VI-2687-2690 | September 1999 | Speech | [PDF]
|
| Prosodic Stress Revisited: Reassessing the Role of Fundamental Frequency | R. Silipo and S. Greenberg | Proceedings of the National Institute of Standards and Technology Speech Transcription Workshop, College Park, Maryland | May 2000 | Speech | [PDF]
|
| Automatic Transcription of Prosodic Stress for Spontaneous English Discourse | R. Silipo and S. Greenberg | Proceedings of the International Congress of Phonetic Sciences, San Francisco, California, Vol. 3, pp. 2351-2354 | August 1999 | Speech | [PDF]
|
| Introduction to the Special Issue on Processing Morphologically Rich Languages | R. Sarikaya, K. Kirchhoff, T. Schultz, and D. Hakkani-Tür | IEEE Transactions on Audio, Speech and Language Processing, Special Issue on Processing Morphologically Rich Languages, Vol. 17, No. 5, pp. 861-862 | July 2009 | Speech | [PDF]
|
| From AUDREY to Siri: Is Speech Recognition A Solved Problem? | R. Pieraccini | Presented at the Mobile Voice Conference, San Francisco, California | March 2012 | Speech | [PDF]
|
| A Human Benchmark for Language Recognition | R. Orr and D. A. Van Leeuwen | Proceedings of the 10th International Conference of the International Speech Communication Association (Interspeech 2009), Brighton, United Kingdom, pp. 2175-2178 | September 2009 | Speech | |
| On the Applicability of Speaker Diarization to Audio Concept Detection for Multimedia Retrieval | R. Mertens, P.-S. Huang, L. Gottlieb, G. Friedland, and A. Divakaran | Proceedings of the IEEE International Symposium on Multimedia, Dana Point, California, pp. 446-451 | December 2011 | Speech | [PDF]
|
| Acoustic Super Models for Large Scale Video Event Detection | R. Mertens, H. Lei, L. Gottlieb, G. Friedland, and A. Divakaran | Proceedings of the ACM International Workshop on Events in Multimedia (EiMM11), Scottsdale, Arizona | November 2011 | Speech | [PDF]
|
| Features Based on Auditory Physiology and Perception | R. M. Stern and N. Morgan | In Techniques for Noise Robustness in Automatic Speech Recognition, T. Virtanen, B. Raj, and R. Singh, Wiley Publishing | 2012 | Speech | |
| Hearing is Believing: Biologically-Inspired Feature Extraction for Robust Automatic Speech Recognition | R. M. Stern and N. Morgan | Signal Processing Magazine, Vol. 29, No. 6, pp. 34-43 | November 2012 | Speech | [PDF]
|
| An Improved Approximation Algorithm for Vertex Cover with Hard Capacities | R. Gandhi, E. Halperin, S. Khuller, G. Kortsarz, and A. Srinivasan | Proceedings of the 30th International Colloquium on Automata, Languages and Programming (ICALP 2003), Eindhoven, The Netherlands, pp. 164-175 | June 2003 | Speech | [PDF]
|
| Meeting Recorder Project: Dialog Act Labeling Guide | R. Dhillon, S. Bhagat, H. Carvey, and E. Shriberg | ICSI Technical Report TR-04-002 | February 2004 | Speech | [PDF]
|
| Automated Information Extraction in Production | R. Desutter, J.P. Evain, G. Friedland, A. Messina, and M. Sano | Special issue in Multimedia Tools and Applications, Springer | 2011 | Speech | |
| The Challenge of Spoken Language Systems: Research Directions for the Nineties | R. Cole, L. Hirschman, L. Atlas, M. Beckman, A. Biermann, M. Bush, M. Clements, J. Cohen, O. Garcia, B. Hanson, H. Hermansky, S. Levinson, K. McKeown, N. Morgan, D. Novick, M. Ostendorf, S. Oviatt, P. Price, H. Silverman, J. Spitz, A. Waibel, C. Weinstein, S. Zahorian, and V. Zue | IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, pp. 1-21 | January 1995 | Speech | |
| Meeting Acts: A Labeling System for Group Interaction in Meetings | R. Bates, P. Menning, E. Willingham, and C. Kuyper | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisbon, Portugal | September 2005 | Speech | [PDF]
|
| On Using MLP Features in LVCSR | Q. Zhu, B. Chen, N. Morgan, and A. Stolcke | Proceedings of International Conference on Spoken Language Processing, Jeju, Korea | October 2004 | Speech | [PDF]
|
| Tandem Connectionist Feature Extraction for Conversational Speech Recognition | Q. Zhu, B. Chen, N. Morgan, and A. Stolcke | Proceedings of the First International Workshop on Machine Learning for Multimodal Interaction (MLMI 2004), Martigny, Switzerland | June 2004 | Speech | |
| Improved MLP Structures for Data-Driven Feature Extraction for ASR | Q. Zhu, B. Chen, F. Grezl, and N. Morgan | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisboa, Portugal, pp. 2129-2132 | September 2005 | Speech | [PDF]
|
| Using MLP Features in SRI's Conversational Speech Recognition System | Q. Zhu, A. Stolcke, B.Y. Chen, and N. Morgan | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisboa, Portugal, pp. 2141-2144 | September 2005 | Speech | [PDF]
|
| Incorporating Tandem/HATs MLP Features into SRI's Conversational Speech Recognition System | Q. Zhu, A. Stolcke, B. Y. Chen, and N. Morgan | Proceedings of the EARS RT-04F Workshop, Palisades, New York, November 2004. | November 2004 | Speech | [PDF]
|
| How to Put It Into Words - Using Random Forests to Extract Symbol Level Descriptions from Audio Content for Concept Detection | P.-S. Huang, R. Mertens, A. Divakaran, G. Friedland, and M. Hasegawa-Johnson | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF]
|
| Feature Transformations and Combinations for Improving ASR Performance | P. Somervuo, B. Chen, and Q. Zhu | Proceedings of EUROSPEECH 2003, Geneva | September 2003 | Speech | [PDF]
|
| Experiments with Linear and Nonlinear Feature Transformations in HMM Based Phone Recognition | P. Somervuo | Proceedings of ICASSP-2003, Hong Kong | April 2003 | Speech | [PDF]
|
| Speech Modeling Using Variational Bayesian Mixture of Gaussians | P. Somervuo | Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP 2002), Denver, Colorado | September 2002 | Speech | [PDF]
|
| Wide-Band Perceptual Audio Coding Based on Frequency-Domain Linear Prediction | P. Motlicek, V. Ullal, and H. Hermansky | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Honolulu, Hawaii, Vol. 1, pp. 265-268 | April 2007 | Speech | |
| Perceptually Motivated Sub-Band Decomposition for FDLP Audio Coding | P. Motlicek, S. Ganapathy, H. Hermansky, H. Garudadri, and M. Athineos | Proceedings of 11th International Conference on Text, Speech, and Dialogue (TSD 2008), Brno, Czech Republic, pp. 435-442 | September 2008 | Speech | [PDF]
|
| A Methodology for Comparing Grammar-Based and Robust Approaches to Speech Understanding | P. Bouillon, N. Chatzichrisafis, B.A. Hockey, M. Rayner, M. Santaholma, M. Starlander, H. Isahara, K. Kanzaki, and Y. Nakao | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisboa, Portugal, pp. 1877-1880 | September 2005 | Speech | |
| A Generic Multi-Lingual Open Source Platform for Limited-Domain Medical Speech Translation | P. Bouillon, M. Rayner, N. Chatzichrisafis, B.A. Hockey, M. Santaholma, M. Starlander, H. Isahara, K. Kanzaki, and Y. Nakao | Proceedings of the 10th Annual Conference of the European Association of Machine Translation (EAMT 2005), Budapest, Hungary, pp. 5-58 | May 2005 | Speech | |
| A Multilingual Shared Grammar for Recognition and Generation (in French) | P. Bouillon, M. Rayner, B. Novellas, Y. Nakao, M. Santaholma, M. Starlander, and N. Chatzichrisafis | Proceedings of the 13th Conference on Natural Language Processing (TALN 2006), Leuven, Belgium, pp. 93-102 | April 2006 | Speech | |