| Multimedia Data Formats and Semantic Computing: A Practical Example and its Implications for the Future | G. Friedland | IEEE International Conference on Semantic Computing, Irvine, California | September 2007 | Speech | |
| Multimedia Education in Computer Science -- A Little Bit of Everything Is Not Enough | G. Friedland, L. Knipping, and W. Huerst | IEEE Multimedia Magazine, Vol. 15, Issue 2, pp. 78-82 | April 2008 | Speech | [PDF] |
| The challenges of IT research in developing regions | E. Brewer, M. Demmer, M. Ho, R.J. Honicky, J. Pal, M. Plauché, and S. Surana | IEEE Pervasive Computing, Vol. 5, No. 2, pp. 15-23 | April 2006 | Speech | |
| A Training Algorithm for Statistical Sequence Recognition with Applications to Transition-Based Speech Recognition | H. Bourlard, Y. Konig, and N. Morgan | IEEE Signal Processing Letters, pp. 203-205 | July 1996 | Speech | |
| An Introduction to Hybrid HMM/Connectionist Continuous Speech Recognition | N. Morgan and H. Bourlard | IEEE Signal Processing Magazine, pp. 25-42 | May 1995 | Speech | [PDF] |
| Pushing the Envelope - Aside | N. Morgan, Q. Zhu, A. Stolcke, K. Sonmez, S. Sivadas, T. Shinozaki, M. Ostendorf, P. Jain, H. Hermansky, D. Ellis, G. Doddington, B. Chen, O. Cetin, H. Bourlard, and M. Athineos | IEEE Signal Processing Magazine, Vol. 22, No. 5, pp. 81-88 | September 2005 | Speech | |
| Speech Segmentation and Spoken Document Processing | M. Ostendorf, B. Favre, R. Grishman, D. Hakkani-Tur, M. Harper, D. Hillard, J. Hirschberg, J. Heng, J. G. Kahn, Y. Liu, S. Maskey, E. Matusov, H. Ney, A. Rosenberg, E. Shriberg, W. Wang, and C. Wooters | IEEE Signal Processing Magazine, Vol. 25, Issue 3, pp. 59-69 | May 2008 | Speech | [PDF] |
| Research Developments and Directions in Speech Recognition and Understanding, Part 1 | J. Baker, L. Deng, J. Glass, S. Khudanpur, C.-H. Lee, N. Morgan, and D. O'Shaughnessy | IEEE Signal Processing Magazine, Vol. 26, No. 3, pp. 75-80 | May 2009 | Speech | |
| Updated MINDS Report on Speech Recognition and Understanding, Part 2 | J. Baker, L. Deng, S. Khudanpur, C.-H. Lee, J. Glass, N. Morgan, and D. O'Shaughnessy | IEEE Signal Processing Magazine, Vol. 26, No. 4, pp. 78-85 | July 2009 | Speech | [PDF] |
| Why Is ASR Harder For Fast Speech And What Can We Do About It? | N. Mirghafori, E. Fosler, and N. Morgan | IEEE Snowbird Workshop '95 | 1995 | Speech | [PDF] |
| Transition-Based Statistical Training for ASR | N. Morgan, Y. Konig, S.L. Wu, and H. Bourlard | IEEE Snowbird Workshop '95 | 1995 | Speech | [PDF] |
| Special Section on New Frontiers in Rich Transcription | G. Friedland, J. Fiscus, T. Hain, and S. Furui (eds) | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, No. 2 | February 2012 | Speech | |
| Introduction to the Special Issue on Processing Morphologically Rich Languages | R. Sarikaya, K. Kirchhoff, T. Schultz, and D. Hakkani-Tür | IEEE Transactions on Audio, Speech and Language Processing, Special Issue on Processing Morphologically Rich Languages, Vol. 17, No. 5, pp. 861-862 | July 2009 | Speech | [PDF] |
| Enriching Speech Recognition with Automatic Detection of Sentence Boundaries and Disfluencies | Y. Liu, E. Shriberg, A. Stolcke, D. Hillard, M. Ostendorf, and M. Harper | IEEE Transactions on Audio, Speech and Language Processing, Vol. 14, Issue 5, pp. 1526-1540 | September 2006 | Speech | [PDF] |
| Recent Innovations in Speech-to-Text Transcription at SRI-ICSI-UW | A. Stolcke, B. Chen, H. Franco, V.R.R. Gadde, M. Graciarena, M.-Y. Hwang, K. Kirchhoff, N. Morgan, X. Lin, T. Ng, M. Ostendorf, K. Sönmez, A. Venkataraman, D. Vergyri, W. Wang, J. Zheng, and Q. Zhu | IEEE Transactions on Audio, Speech and Language Processing, Vol. 14, Issue 5, pp. 1729-1744 | September 2006 | Speech | [PDF] |
| Acoustic Beamforming for Speaker Diarization of Meetings | X. Anguera, C. Wooters, and J. Hernando | IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, Issue 7, IEEE Computer Society, California, pp. 2011-2022 | September 2007 | Speech | |
| Multi-View Semi-Supervised Learning for Dialog Act Segmentation of Speech | U. Guz, S. Cuendet, G. Tur, and D. Hakkani-Tür | IEEE Transactions on Audio, Speech and Language Processing, Vol. 18, Issue 2, pp. 320-329 | February 2010 | Speech | [PDF] |
| Estimating Dominance in Multi-Party Meetings Using Speaker Diarization from a Single Microphone | H. Hung, Y. Huang, G. Friedland, and D. Gatica-Perez | IEEE Transactions on Audio, Speech and Language Processing, Vol. 19, No. 4, pp. 847-860 | May 2011 | Speech | |
| Deep and Wide: Multiple Layers in Automatic Speech Recognition | N. Morgan | IEEE Transactions on Audio, Speech, and Language Processing, Special Issue on Deep Learning | 2011 | Speech | [PDF] |
| Prosodic and Other Long-Term Features for Speaker Diarization | G. Friedland, O. Vinyals, Y. Huang, and C. Müller | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 17, No. 5, pp. 985-993 | July 2009 | Speech | [PDF] |
| Audio-Based Semantic Concept Classification for Consumer Video | K. Lee and D. Ellis | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, Issue 6, pp. 1406-1416 | August 2010 | Speech | [PDF] |
| The CALO Meeting Assistant System | G. Tur, A. Stolcke, L. Voss, S. Peters, D. Hakkani-Tür, J. Dowding, B. Favre, R. Fernandez, M. Frampton, M. Frandsen, C. Frederickson, M. Graciarena, D. Kintzing, K. Leveque, S. Mason, J. Niekrasz, M. Purver, K. Riedhammer, E. Shriberg, J. Tien, D. Vergyri, and F. Yang | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, Issue 6, pp. 1601-1611 | August 2010 | Speech | [PDF] |
| Tuning-Robust Initialization Methods for Speaker Diarization | D. Imseng and G. Friedland | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, Issue 8, pp. 2028-2037 | November 2010 | Speech | [PDF] |
| Introduction to the Special Section on Deep Learning for Speech and Language Processing | D. Yu, G. Hinton, N. Morgan, J.-T. Chien, and S. Sagayama | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 1, pp. 4-6 | January 2012 | Speech | [PDF] |
| Deep and Wide: Multiple Layers in Automatic Speech Recognition | N. Morgan | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 1, pp. 7-13 | January 2012 | Speech | [PDF] |
| Speaker Diarization: A Review of Recent Research | X. Anguera, S. Bozonnet, N. Evans, C. Fredouille, G. Friedland, and O. Vinyals | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 2, pp. 356-370 | February 2012 | Speech | [PDF] |
| The ICSI RT-09 Speaker Diarization System | G. Friedland, A. Janin, D. Imseng, X. Anguera, L. Gottlieb, M. Huijbregts, M. Knox, and O. Vinyals | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 2, pp. 371-381 | February 2012 | Speech | [PDF] |
| Speaker Recognition with Session Variability Normalization Based on MLLR Adaptation Transforms | A. Stolcke, S. Kajarekar, L. Ferrer, and E. Shriberg | IEEE Transactions on Audio, Speech, and Language Processing, Special Issue on Speaker and Language Recognition, Vol. 15, Issue 7, IEEE Computer Society, California, pp. 1987-1998 | September 2007 | Speech | [PDF] |
| Speaker Diarization For Multiple-distant-microphone Meetings Using Several Sources of Information | J. M. Pardo, X. Anguera, and C. Wooters | IEEE Transactions on Computers, Vol. 56, Issue 9, IEEE Computer Society, California, pp. 1212-1224 | September 2007 | Speech | [PDF] |
| Continuous Speech Recognition by Connectionist Statistical Methods | H. Bourlard and N. Morgan | IEEE Transactions on Neural Networks, Vol. 4, No. 6, pp. 893-909 | November 1993 | Speech | |
| Connectionist Probability Estimators in HMM Speech Recognition | S. Renals, N. Morgan, H. Bourlard, M. Cohen, and H. Franco | IEEE Transactions on Speech and Audio Processing, pp. II-161-174 | January 1993 | Speech | |
| RASTA Processing of Speech | H. Hermansky and N. Morgan | IEEE Transactions on Speech and Audio Processing, special issue on Robust Speech Recognition, Vol. 2, No. 4, pp. 578-589 | October 1994 | Speech | |
| Adaptive Language Modeling with Varied Sources to Cover New Vocabulary Items | S. Schwarm, I. Bulyko, and M. Ostendorf | IEEE Transactions on Speech and Audio Processing, Vol. 12, No. 3, pp. 334-342 | May 2004 | Speech | [PDF] |
| Automatic Speech Recognition with an Adaptation Model Motivated by Auditory Processing | M. Holmberg, D. Gelbart, and W. Hemmert | IEEE Transactions on Speech and Audio Processing, Vol. 14, Issue 1, pp. 44-49 | January 2006 | Speech | [PDF] |
| The Challenge of Spoken Language Systems: Research Directions for the Nineties | R. Cole, L. Hirschman, L. Atlas, M. Beckman, A. Biermann, M. Bush, M. Clements, J. Cohen, O. Garcia, B. Hanson, H. Hermansky, S. Levinson, K. McKeown, N. Morgan, D. Novick, M. Ostendorf, S. Oviatt, P. Price, H. Silverman, J. Spitz, A. Waibel, C. Weinstein, S. Zahorian, and V. Zue | IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, pp. 1-21 | January 1995 | Speech | |
| Generative and Discriminative Methods Using Morphological Information for Sentence Segmentation of Turkish | U. Guz, B. Favre, D. Hakkani-Tur, and G. Tur | IEEE Transactions on Audio, Speech and Language Processing, Special Issue on Processing Morphologically Rich Languages, Vol. 17, No. 5, pp. 895-903 | July 2009 | Speech | [PDF] |
| Interpretation of Spatial Language in a Map Navigation Task | M. Levit and D. Roy | IEEE Transactions on Systems, Man and Cybernetics, Part B, Vol. 37, No. 3, IEEE Systems, Man, and Cybernetics Society, pp. 667-679 | June 2007 | Speech | |
| Term-Weighting for Summarization of Multi-Party Spoken Dialogues | G. Murray and S. Renals | In Machine Learning for Multimodal Interaction IV (Lecture Notes in Computer Science, Vol. 4892), pp. 155-166, Springer | 2007 | Speech | |
| Computationally Efficient Clustering of Audio-Visual Meeting Data | H. Hung, G. Friedland, and C. Yeo | In Multimedia Interaction and Intelligent User Interfaces: Principles, Methods, and Applications, M. Etoh, J. Luo, and L. Shao, eds., pp. 25-59 | 2010 | Speech | |
| Speaker Diarization | G. Friedland and F. Valente | In Multimodal Signal Processing: Human Interactions in Meetings, S. Renals, H. Bourlard, J. Carletta, and A. Popescu-Belis, eds., Cambridge University Press | June 2012 | Speech | |
| The Grammar of Hitting and Breaking | C. J. Fillmore | In Readings in English Transformational Grammar, R. Jacobs and P. Rosenbaum, eds., pp. 120-133, Georgetown University Press | June 1970 | Speech | [PDF] |
| The ICSI-SRI Spring 2006 Meeting Recognition System | A. Janin, A. Stolcke, X. Anguera, K. Boakye, O. Cetin, J. Frankel, and J. Zheng | In S. Renals and S. Bengio, editors, Machine Learning for Multimodal Interaction: Third International Workshop (MLMI 2006), Lecture Notes in Computer Science. Springer | 2006 | Speech | [PDF] |
| Speaker Recognition and Diarization | G. Friedland and D. van Leeuwen | In Semantic Computing, P. Sheu, H. Yu, C. V. Ramamoorthy, A. K. Joshi, and L. A. Zadeh, eds., pp. 115-130, IEEE Press/Wiley | 2010 | Speech | |
| SmartKom English: From Robust Recognition to Felicitous Interaction | D. Gelbart, J. Bryant, A. Stolcke, R. Porzel, M. Baudis, and N. Morgan | In SmartKom--Foundations of Multimodal Dialogue Systems, W. Wahlster, ed., pp. 453-470, Springer | November 2004 | Speech | [PDF] |
| Speaker Diarization | G. Friedland | In Speech and Audio Signal Processing, 2nd edition, B. Gold, N. Morgan, D. Ellis, eds., Wiley | 2011 | Speech | |
| Syllable Models for Mandarin Speech Recognition: Exploiting Character Language Models | X. Liu, J. L. Hieronymus, M. J. F. Gales, and P. C. Woodland | In submission | 2012 | Speech | |
| Features Based on Auditory Physiology and Perception | R. M. Stern and N. Morgan | In Techniques for Noise Robustness in Automatic Speech Recognition, T. Virtanen, B. Raj, and R. Singh, Wiley Publishing | 2012 | Speech | |
| Semantic Computing and Privacy: A Case Study Using Inferred Geo-Tagging | G. Friedland and J. Choi | International Journal of Semantic Computing, Vol. 5, No. 1, pp. 79-93. Also Best Poster in the Electrical and Computer Science and Engineering Track at the Korean Student Technical and Leadership Conference, Chicago, Illinois, March 2012. DOI: 10.1142/S1793351X11001171 | March 2011 | Speech | [PDF] |
| Object Cut and Paste in Images and Videos | G. Friedland, K. Jantz, T. Lenz, F. Wiesel, and R. Rojas | International Journal of Semantic Computing, World Scientific, Vol. 1, Issue 2, pp. 221-247, USA | July 2007 | Speech | |
| Best Papers from the Second IEEE International Conference on Semantic Computing (IJSC) | G. Friedland and C. Martell, eds. | International Journal of Semantic Computing (IJSC), Vol. 2, Issue 3 | September 2008 | Speech | |