| Multimodal Indoor Localization: An Audio-Wireless-Based Approach | O. Vinyals, E. Martin, and G. Friedland | Proceedings of the Fourth IEEE International Conference on Semantic Computing (ICSC-2010), Pittsburgh, Pennsylvania, pp. 120-125 | September 2010 | Speech | [PDF]
|
| A Hybrid Approach to Online Speaker Diarization | C. Vaquero, O. Vinyals, and G. Friedland | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2642-2645 | September 2010 | Speech | [PDF]
|
| System Output Combination for Improved Speaker Diarization | S. Bozonnet, N. Evans, X. Anguera, O. Vinyals, G. Friedland, and C. Fredouille | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2642-2645 | September 2010 | Speech | [PDF]
|
| Multimodal Speaker Diarization Using Oriented Optical Flow Histograms | M. Knox and G. Friedland | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 290-293 | September 2010 | Speech | [PDF]
|
| Discriminative Training for Hierarchical Clustering in Speaker Diarization | O. Vinyals, G. Friedland, and N. Morgan | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2326-2329 | September 2010 | Speech | [PDF]
|
| Using Spectro-Temporal Features to Improve AFE Feature Extraction for ASR | S. Ravuri and N. Morgan | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 1181-1184 | September 2010 | Speech | |
| MLP-Based Feature Extraction for Speech Transcription | N. Morgan, A. Faria, S. Ravuri, and S. Zhao | Handbook of Natural Language Processing and Machine Translation, J. Olive, ed., Springer, in press | 2010 | Speech | |
| Multimodal Location Estimation | G. Friedland, O. Vinyals, and T. Darrell | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 1245-1251 | October 2010 | Speech | [PDF]
|
| Precise Indoor Localization Using Smart Phones | E. Martin, O. Vinyals, G. Friedland, and R. Bajcsy | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 787-790 | October 2010 | Speech | [PDF]
|
| Joke-O-Mat HD: Browsing Sitcoms with Human Derived Transcripts | A. Janin, L. Gottlieb, and G. Friedland | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 1591-1594 | October 2010 | Speech | [PDF]
|
| Narrative-Theme Navigation for Sitcoms Supported by Fan-Generated Scripts | G. Friedland, L. Gottlieb, and A. Janin | Proceedings of the Third International Workshop on Automated Information Extraction in Media Production (AIEMPro '10) at the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 3-8 | October 2010 | Speech | [PDF]
|
| Speaker Recognition Using Prosodic and Lexical Features | S. Kajarekar, L. Ferrer, A. Venkataraman, K. Sonmez, E. Shriberg, A. Stolcke, H. Bratt, and R. R. Gadde | Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2003), St. Thomas, Virgin Islands, pp. 19-24 | November 2003 | Speech | [PDF]
|
| Modeling NERFs for Speaker Recognition | S. Kajarekar, L. Ferrer, K. Sonmez, J. Zheng, E. Shriberg, and A. Stolcke | Proceedings of the Speaker and Language Recognition Workshop (Odyssey 2004), Toledo, Spain, pp. 51-56 | May 2004 | Speech | [PDF]
|
| Long Story Short - Global Unsupervised Models for Keyphrase Based Meeting Summarization | K. Riedhammer, B. Favre, and D. Hakkani-Tur | Speech Communication, Vol. 52, Issue 10, pp. 801-815. DOI:10.1016/j.specom.2010.06.002 | October 2010 | Speech | |
| Can Conversational Word Usage Be Used to Predict Speaker Demographics? | D. Gillick | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan | September 2010 | Speech | [PDF]
|
| Review of E. Aguilar, "Animation and Performance Capture Using Digitized Models" | G. Friedland | ACM Computing Reviews, CR138181 | July 2010 | Speech | |
| Review of J. Nichols and B. Myers, "Creating a Lightweight User Interface Description Language: An Overview and Analysis of the Personal Universal Controller Project" | G. Friedland | ACM Computing Reviews, CR137773 | March 2010 | Speech | |
| Simple, Accurate Parsing with an All-Fragments Grammar | M. Bansal and D. Klein | Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden, pp. 1098-1107 | July 2010 | Speech | [PDF]
|
| Discriminative Pronunciation Learning Using Phonetic Decoder and Minimum-Classification-Error Criterion | O. Vinyals, L. Deng, D. Yu, and A. Acero | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), Taipei, Taiwan, pp. 4445-4448 | April 2009 | Speech | [PDF]
|
| Syllable Intelligibility for Temporally-Filtered LPC Cepstral Trajectories | T. Arai, M. Pavel, H. Hermansky, and C. Avendano | Journal of the Acoustical Society of America, Vol. 105, No. 5, pp. 2783-2791 | May 1999 | Speech | [PDF]
|
| Dynamic Pronunciation Models for Automatic Speech Recognition | E. Fosler-Lussier | Ph.D. Thesis, UC Berkeley, Fall 1999, ICSI Technical Report TR-99-015 | September 1999 | Speech | [PDF]
|
| Reduction of English Function Words in Switchboard | D. Jurafsky, A. Bell, E. Fosler-Lussier, C. Girand, and W. Raymond | Proceedings of the 5th International Conference on Spoken Language Processing (ICSLP 98), Sydney, Australia, Vol. 7, p. 3111 | December 1998 | Speech | [PDF]
|
| Multi-Channel Source Separation by Factorial HMMs | M.J. Reyes-Gomez, B. Raj, and D. Ellis | Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2003), Hong Kong | April 2003 | Speech | [PDF]
|
| Discourse Segmentation of Multi-party Conversation | M. Galley, K. McKeown, E. Fosler-Lussier, and H. Jing | Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL-03), Sapporo, Japan | July 2003 | Speech | [PDF]
|
| Factored Language Models and Generalized Parallel Backoff | J. Bilmes and K. Kirchhoff | Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), Edmonton, Canada, p. 1 | May 2003 | Speech | [PDF]
|
| Getting more mileage from web text sources for conversational speech language modeling using class-dependent mixtures | I. Bulyko, M. Ostendorf, and A. Stolcke | Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), Edmonton, Canada, Vol. 2, pp. 7-9 | May 2003 | Speech | [PDF]
|
| Multi-Speaker Language Modeling | G. Ji and J. Bilmes | Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Boston, Massachusetts, pp. 133-136 | May 2004 | Speech | [PDF]
|
| Pitch-Based Emphasis Detection for Characterization of Meeting Recordings | L. Kennedy and D. Ellis | Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2003), St. Thomas, Virgin Islands | November 2003 | Speech | [PDF]
|
| Adaptive Language Modeling with Varied Sources to Cover New Vocabulary Items | S. Schwarm, I. Bulyko, and M. Ostendorf | IEEE Transactions on Speech and Audio Processing, Vol. 12, No. 3, pp. 334-342 | May 2004 | Speech | [PDF]
|
| Meeting Acts: A Labeling System for Group Interaction in Meetings | R. Bates, P. Menning, E. Willingham, and C. Kuyper | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisbon, Portugal | September 2005 | Speech | [PDF]
|
| Dialog Act Tagging Using Graphical Models | G. Ji and J. Bilmes | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, Pennsylvania, Vol. 1, pp. 33-36 | March 2005 | Speech | [PDF]
|
| Backoff Model Training Using Partially Observed Data: Application to Dialog Act Tagging | G. Ji and J. Bilmes | Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2006), New York City, New York, pp. 280-287 | June 2006 | Speech | [PDF]
|
| Using Symbolic Prominence to Help Design Feature Subsets for Topic Classification and Clustering of Natural Human-Human Conversations | C. Boulis and M. Ostendorf | Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech 2005-Eurospeech 2005), Lisbon, Portugal | September 2005 | Speech | [PDF]
|
| A Quantitative Analysis of Lexical Differences Between Genders in Telephone Conversations | C. Boulis and M. Ostendorf | Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005), Ann Arbor, Michigan, pp. 435-442 | June 2005 | Speech | [PDF]
|
| Improving Word Sense Disambiguation in Lexical Chaining | M. Galley and K. McKeown | Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI 03), Acapulco, Mexico, pp. 1486-1488 | August 2003 | Speech | [PDF]
|
| Clap Detection and Discrimination for Rhythm Therapy | N. Lesser and D.P.W. Ellis | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, Pennsylvania, pp. 37-40 | March 2005 | Speech | [PDF]
|
| Multiband Audio Modeling for Single-Channel Acoustic Source Separation | M.J. Reyes-Gomez, D. Ellis, and N. Jojic | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '04), Montreal, Canada, Vol. 5, pp. 641-644 | May 2004 | Speech | [PDF]
|
| The Role of Disfluencies on Topic Classification of Human-Human Conversations | C. Boulis, J. G. Kahn, and M. Ostendorf | Proceedings of the Spoken Language Understanding Workshop Program at the 20th National Conference on Artificial Intelligence (AAAI-05), Pittsburgh, Pennsylvania | July 2005 | Speech | [PDF]
|
| Text Classification by Augmenting the Bag-of-Words Representation with Redundancy-Compensated Bigrams | C. Boulis and M. Ostendorf | Proceedings of the SIAM International Conference on Data Mining at the Workshop on Feature Selection in Data Mining (SIAM-FSDM 2005), Newport Beach, California | April 2005 | Speech | [PDF]
|
| Combining Multiple Clustering Systems | C. Boulis and M. Ostendorf | Proceedings of the 15th European Conference on Machine Learning (ECML/PKDD 2004), Pisa, Italy | September 2004 | Speech | [PDF]
|
| A Parallel Meeting Diarist | G. Friedland, J. Chong, and A. Janin | Proceedings of the Workshop on Searching Spontaneous Conversational Speech (SSCS) at the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 57-60 | October 2010 | Speech | [PDF]
|
| A Comparative Large Scale Study of MLP Features for Mandarin ASR | F. Valente, M. Magimai Doss, C. Plahl, S. Ravuri, and W. Wang | Proceedings of the 11th International Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan, pp. 2630-2633 | September 2010 | Speech | [PDF]
|
| Parallelizing Speaker-Attributed Speech Recognition for Meeting Browsing | G. Friedland, J. Chong, and A. Janin | Proceedings of the 2010 IEEE International Symposium on Multimedia (ISM2010), Taiwan, pp. 121-128 | December 2010 | Speech | [PDF]
|
| Dialocalization: Acoustic Speaker Diarization and Visual Localization as Joint Optimization Problem | G. Friedland, C. Yeo, and H. Hung | ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 6, No. 4, Article 27 | November 2010 | Speech | [PDF]
|
| Review of C. Mueller-Tomfelder, "Tabletops - Horizontal Interactive Displays" | G. Friedland | ACM Computing Reviews, CR138453 | October 2010 | Speech | [PDF]
|
| The 2010 ICSI Video Location Estimation System | J. Choi, A. Janin, and G. Friedland | Proceedings of the MediaEval 2010 Workshop, Pisa, Italy | October 2010 | Speech | [PDF]
|
| Structured Approaches to Data Selection for Speaker Recognition | H. Lei | UC Berkeley dissertation | December 2010 | Speech | [PDF]
|
| Introduction to Multimedia Computing | G. Friedland and R. Jain | Cambridge University Press | 2011 | Speech | |
| Special Section on New Frontiers in Rich Transcription | G. Friedland, J. Fiscus, T. Hain, and S. Furui (eds.) | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, No. 2 | February 2012 | Speech | |
| Speaker Diarization: A Review of Recent Research | X. Anguera, S. Bozonnet, N. Evans, C. Fredouille, G. Friedland, and O. Vinyals | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 2, pp. 356-370 | February 2012 | Speech | [PDF]
|