| Hierarchical Processing of the Modulation Spectrum for GALE Mandarin LVCSR System | F. Valente, M. Magimai-Doss, C. Plahl, and S. Ravuri | Proceedings of the 10th International Conference of the International Speech Communication Association (Interspeech 2009), Brighton, United Kingdom, pp. 2963-2966 | September 2009 | Speech | [PDF]
|
| An Anticorrelation Kernel for Subsystem Training in Multiple Classifier Systems | L. Ferrer, K. Sönmez, and E. Shriberg | Journal of Machine Learning Research, Vol. 10, pp. 2079-2114 | September 2009 | Speech | [PDF]
|
| Exploiting Chinese Character Models to Improve Speech Recognition Performance | J. L. Hieronymus, X. Liu, M. J. F. Gales, and P. C. Woodland | Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009), Brighton, UK | September 2009 | Speech | |
| A View of the Parallel Computing Landscape | K. Asanović, R. Bodik, J. Demmel, T. Keaveny, K. Keutzer, J. D. Kubiatowicz, N. Morgan, D. A. Patterson, K. Sen, J. Wawrzynek, D. Wessel, and K. A. Yelick | Communications of the ACM, Vol. 52, No. 10, pp. 56-67 | October 2009 | Speech | [PDF]
|
| IXIR: A Statistical Information Distillation System | M. Levit, D. Hakkani-Tür, G. Tür, and D. Gillick | Journal of Computer Speech and Language, Vol. 23, Issue 4, pp. 527-542 | October 2009 | Speech | [PDF]
|
| Visual Speaker Localization Aided by Acoustic Models | G. Friedland, C. Yeo, and H. Hung | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2009), Beijing, China, pp. 195-202 | October 2009 | Speech | [PDF]
|
| Joke-o-Mat: Browsing Sitcoms Punchline by Punchline | G. Friedland, L. Gottlieb, and A. Janin | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2009), Beijing, China, pp. 1115-1116 | October 2009 | Speech | [PDF]
|
| Review of Cattelan et al., "Watch-and-Comment as a Paradigm Toward Ubiquitous Interactive Video Editing" | G. Friedland | ACM Computing Reviews, CR136487 | October 2009 | Speech | |
| Robust Speaker Diarization for Short Speech Recordings | D. Imseng and G. Friedland | Proceedings of the 11th Biennial IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2009), Merano, Italy, pp. 432-437 | December 2009 | Speech | [PDF]
|
| Using Artistic Markers and Speaker Identification for Narrative-Theme Navigation of Seinfeld Episodes | G. Friedland, L. Gottlieb, and A. Janin | Proceedings of the 11th IEEE International Symposium on Multimedia (ISM2009), San Diego, California, pp. 511-516 | December 2009 | Speech | [PDF]
|
| Any Questions? Automatic Question Detection in Meetings | K. Boakye, B. Favre, and D. Hakkani-Tür | Proceedings of the 11th Biennial IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2009), Merano, Italy, pp. 485-489 | December 2009 | Speech | [PDF]
|
| Integrating Prosodic Features in Extractive Meeting Summarization | S. Xie, D. Hakkani-Tür, B. Favre, and Y. Liu | Proceedings of the 11th Biennial IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2009), Merano, Italy, pp. 387-391 | December 2009 | Speech | [PDF]
|
| Selected Papers from the Third IEEE International Conference on Semantic Computing (ICSC2009) | G. Friedland and S. C. Shen, eds. | International Journal on Semantic Computing, Vol. 3, Issue 4 | December 2009 | Speech | |
| Speaker Recognition and Diarization | G. Friedland and D. van Leeuwen | In Semantic Computing, P. Sheu, H. Yu, C. V. Ramamoorthy, A. K. Joshi, and L. A. Zadeh, eds., pp. 115-130, IEEE Press/Wiley | 2010 | Speech | |
| MLP-Based Feature Extraction for Speech Transcription | N. Morgan, A. Faria, S. Ravuri, and S. Zhao | Handbook of Natural Language Processing and Machine Translation, J. Olive, ed., Springer, in press | 2010 | Speech | |
| Computationally Efficient Clustering of Audio-Visual Meeting Data | H. Hung, G. Friedland, and C. Yeo | In Multimedia Interaction and Intelligent User Interfaces: Principles, Methods, and Applications, M. Etoh, J. Luo, and L. Shao, eds., pp. 25-59 | 2010 | Speech | |
| Multi-View Semi-Supervised Learning for Dialog Act Segmentation of Speech | U. Guz, S. Cuendet, G. Tur, and D. Hakkani-Tür | IEEE Transactions on Audio, Speech and Language Processing, Vol. 18, Issue 2, pp. 320-329 | February 2010 | Speech | [PDF]
|
| Why Has (Reasonably Accurate) Automatic Speech Recognition Been So Hard to Achieve? | S. Wegmann and L. Gillick | ArXiv.org under CoRR abs/1003.0206 | February 2010 | Speech | [PDF]
|
| Speaker Adaptation of Language and Prosodic Models for Automatic Dialog Act Segmentation of Speech | J. Kolar, Y. Liu, and E. Shriberg | Speech Communication, Vol. 52, Issue 3, pp. 236-245 | March 2010 | Speech | |
| An Adaptive Initialization Method for Speaker Diarization Based on Prosodic Features | D. Imseng and G. Friedland | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, pp. 4946-4949 | March 2010 | Speech | [PDF]
|
| Summarization- and Learning-Based Approaches to Information Distillation | B. Toth, D. Hakkani-Tür, and S. Yaman | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, pp. 5306-5309 | March 2010 | Speech | [PDF]
|
| Comparing the Contributions of Context and Prosody in Text-Independent Dialog Act Recognition | K. Laskowski and E. Shriberg | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, pp. 5374-5377 | March 2010 | Speech | [PDF]
|
| A Comparison of Approaches for Modeling Prosodic Features in Speaker Recognition | L. Ferrer, N. Scheffer, and E. Shriberg | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, pp. 4414-4417 | March 2010 | Speech | [PDF]
|
| Acoustic Front-End Optimization for Bird Species Recognition | M. Graciarena, M. Delplanche, E. Shriberg, A. Stolcke, and L. Ferrer | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, pp. 293-296 | March 2010 | Speech | [PDF]
|
| Leveraging Speaker Diarization for Meeting Recognition from Distant Microphones | A. Stolcke, G. Friedland, and D. Imseng | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, pp. 4390-4393 | March 2010 | Speech | [PDF]
|
| Cover Song Detection: From High Scores to General Classification | S. Ravuri and D. Ellis | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, pp. 65-68 | March 2010 | Speech | [PDF]
|
| Evaluation of Semantic Role Labeling and Dependency Parsing of Automatic Speech Recognition Output | B. Favre, B. Bohnet, and D. Hakkani-Tür | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas, pp. 5342-5345 | March 2010 | Speech | [PDF]
|
| Detecting Local Semantic Concepts in Environmental Sounds Using Markov Model Based Clustering | K. Lee, D. Ellis, and A. Loui | Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), Dallas, Texas | March 2010 | Speech | [PDF]
|
| Review of J. Nichols and B. Myers, "Creating a Lightweight User Interface Description Language: An Overview and Analysis of the Personal Universal Controller Project" | G. Friedland | ACM Computing Reviews, CR137773 | March 2010 | Speech | |
| Language Model Combination and Adaptation Using Weighted Finite State Transducers | X. Liu, M. J. F. Gales, J. L. Hieronymus, and P. C. Woodland | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, Texas | March 2010 | Speech | |
| Cascaded Model Adaptation for Dialog Act Segmentation and Tagging | U. Guz, G. Tur, D. Hakkani-Tür, and S. Cuendet | Journal of Computer Speech and Language, Vol. 24, Issue 2, pp. 289-306 | April 2010 | Speech | |
| Hunting for Wolves in Speaker Recognition | L. Stoll and G. Doddington | Proceedings of the Speaker and Language Recognition Workshop (Odyssey 2010), Brno, Czech Republic, pp. 159-164 | June 2010 | Speech | [PDF]
|
| LDA Based Similarity Modeling for Question Answering | A. Celikyilmaz, D. Hakkani-Tür, and G. Tur | Proceedings of the Workshop on Semantic Search at the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL HLT 2010), Los Angeles, California, pp. 1-9 | June 2010 | Speech | [PDF]
|
| A Graph-Based Semi-Supervised Learning for Question Semantic Labeling | A. Celikyilmaz and D. Hakkani-Tür | Proceedings of the Workshop on Semantic Search at the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL HLT 2010), Los Angeles, California, pp. 27-35 | June 2010 | Speech | [PDF]
|
| Opportunities and Challenges of Parallelizing Speech Recognition | J. Chong, G. Friedland, A. Janin, and N. Morgan | Proceedings of the Second USENIX Workshop on Hot Topics in Parallelism (HotPar '10), Berkeley, California | June 2010 | Speech | [PDF]
|
| Improving Language Recognition with Multilingual Phone Recognition and Speaker Adaptation Transforms | A. Stolcke, M. Akbacak, L. Ferrer, S. Kajarekar, C. Richey, N. Scheffer, and E. Shriberg | Proceedings of the Odyssey Speaker and Language Recognition Workshop, Brno, Czech Republic, pp. 256-262 | June 2010 | Speech | [PDF]
|
| A Hybrid Hierarchical Model for Multi-Document Summarization | A. Celikyilmaz and D. Hakkani-Tür | Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden, pp. 1149-1154 | July 2010 | Speech | [PDF]
|
| Review of E. Aguilar, "Animation and Performance Capture Using Digitized Models" | G. Friedland | ACM Computing Reviews, CR138181 | July 2010 | Speech | |
| Simple, Accurate Parsing with an All-Fragments Grammar | M. Bansal and D. Klein | Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden, pp. 1098-1107 | July 2010 | Speech | [PDF]
|
| Audio-Based Semantic Concept Classification for Consumer Video | K. Lee and D. Ellis | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, Issue 6, pp. 1406-1416 | August 2010 | Speech | [PDF]
|
| The CALO Meeting Assistant System | G. Tur, A. Stolcke, L. Voss, S. Peters, D. Hakkani-Tür, J. Dowding, B. Favre, R. Fernandez, M. Frampton, M. Frandsen, C. Frederickson, M. Graciarena, D. Kintzing, K. Leveque, S. Mason, J. Niekrasz, M. Purver, K. Riedhammer, E. Shriberg, J. Tien, D. Vergyri, and F. Yang | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, Issue 6, pp. 1601-1611 | August 2010 | Speech | [PDF]
|
| Multimodal Indoor Localization: An Audio-Wireless-Based Approach | O. Vinyals, E. Martin, and G. Friedland | Proceedings of the Fourth IEEE International Conference on Semantic Computing (ICSC-2010), Pittsburgh, Pennsylvania, pp. 120-125 | September 2010 | Speech | [PDF]
|
| A Hybrid Approach to Online Speaker Diarization | C. Vaquero, O. Vinyals, and G. Friedland | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2642-2645 | September 2010 | Speech | [PDF]
|
| System Output Combination for Improved Speaker Diarization | S. Bozonnet, N. Evans, X. Anguera, O. Vinyals, G. Friedland, and C. Fredouille | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2642-2645 | September 2010 | Speech | [PDF]
|
| Multimodal Speaker Diarization Using Oriented Optical Flow Histograms | M. Knox and G. Friedland | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 290-293 | September 2010 | Speech | [PDF]
|
| Discriminative Training for Hierarchical Clustering in Speaker Diarization | O. Vinyals, G. Friedland, and N. Morgan | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2326-2329 | September 2010 | Speech | [PDF]
|
| Using Spectro-Temporal Features to Improve AFE Feature Extraction for ASR | S. Ravuri and N. Morgan | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 1181-1184 | September 2010 | Speech | |
| Can Conversational Word Usage Be Used to Predict Speaker Demographics? | D. Gillick | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan | September 2010 | Speech | [PDF]
|
| A Comparative Large Scale Study of MLP Features for Mandarin ASR | F. Valente, M. Magimai-Doss, C. Plahl, S. Ravuri, and W. Wang | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2630-2633 | September 2010 | Speech | [PDF]
|
| Multimodal Location Estimation | G. Friedland, O. Vinyals, and T. Darrell | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 1245-1251 | October 2010 | Speech | [PDF]
|