| Precise Indoor Localization Using Smart Phones | E. Martin, O. Vinyals, G. Friedland, and R. Bajcsy | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 787-790 | October 2010 | Speech | [PDF]
|
| Joke-O-Mat HD: Browsing Sitcoms with Human Derived Transcripts | A. Janin, L. Gottlieb, and G. Friedland | Proceedings of the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 1591-1594 | October 2010 | Speech | [PDF]
|
| Narrative-Theme Navigation for Sitcoms Supported by Fan-Generated Scripts | G. Friedland, L. Gottlieb, and A. Janin | Proceedings of the Third International Workshop on Automated Information Extraction in Media Production (AIEMPro '10) at the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 3-8 | October 2010 | Speech | [PDF]
|
| Long Story Short - Global Unsupervised Models for Keyphrase Based Meeting Summarization | K. Riedhammer, B. Favre, and D. Hakkani-Tur | Speech Communication, Vol. 52, Issue 10, pp. 801-815. DOI:10.1016/j.specom.2010.06.002 | October 2010 | Speech | |
| A Parallel Meeting Diarist | G. Friedland, J. Chong, and A. Janin | Proceedings of the Workshop on Searching Spontaneous Conversational Speech (SSCS) at the ACM International Conference on Multimedia (ACM Multimedia 2010), Florence, Italy, pp. 57-60 | October 2010 | Speech | [PDF]
|
| Review of C. Mueller-Tomfelder, "Tabletops - Horizontal Interactive Displays" | G. Friedland | ACM Computing Reviews, CR138453 | October 2010 | Speech | [PDF]
|
| The 2010 ICSI Video Location Estimation System | J. Choi, A. Janin, and G. Friedland | Proceedings of the MediaEval 2010 Workshop, Pisa, Italy | October 2010 | Speech | [PDF]
|
| Tuning-Robust Initialization Methods for Speaker Diarization | D. Imseng and G. Friedland | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, Issue 8, pp. 2028-2037 | November 2010 | Speech | [PDF]
|
| Selected Papers from the 11th IEEE International Symposium on Multimedia (ISM2009) | G. Friedland and M.-L. Shyu, eds. | International Journal of Semantic Computing, Vol. 4, No. 2 | November 2010 | Speech | |
| Dialocalization: Acoustic Speaker Diarization and Visual Localization as Joint Optimization Problem | G. Friedland, C. Yeo, and H. Hung | ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 6, No. 4, Article 27 | November 2010 | Speech | [PDF]
|
| Parallelizing Speaker-Attributed Speech Recognition for Meeting Browsing | G. Friedland, J. Chong, and A. Janin | Proceedings of the 2010 IEEE International Symposium on Multimedia (ISM2010), Taiwan, pp. 121-128 | December 2010 | Speech | [PDF]
|
| Structured Approaches to Data Selection for Speaker Recognition | H. Lei | UC Berkeley dissertation | December 2010 | Speech | [PDF]
|
| Introduction to Multimedia Computing | G. Friedland and R. Jain | Cambridge University Press | 2011 | Speech | |
| Automated Information Extraction in Production | R. Desutter, J.P. Evain, G. Friedland, A. Messina, and M. Sano | Special issue in Multimedia Tools and Applications, Springer | 2011 | Speech | |
| Deep and Wide: Multiple Layers in Automatic Speech Recognition | N. Morgan | IEEE Transactions on Audio, Speech, and Language Processing, Special Issue on Deep Learning | 2011 | Speech | [PDF]
|
| The Automatic Recognition of Emotions in Speech | A. Batliner, B. Schuller, D. Seppi, S. Steidl, L. Devillers, L. Vidrascu, T. Vogt, V. Aharonson, and N. Amir | In P. Petta, C. Pelachaud, and R. Cowie, eds., Emotion-Oriented Systems: The Humaine Handbook, Cognitive Technologies, pp. 71-99, Springer | 2011 | Speech | |
| On the Use of Spectro-Temporal Features in Noise-Additive Speech | S. Ravuri | UC Berkeley Master's thesis, Spring 2011 | 2011 | Speech | [PDF]
|
| Speaker Diarization | G. Friedland | In Speech and Audio Signal Processing, 2nd edition, B. Gold, N. Morgan, D. Ellis, eds., Wiley | 2011 | Speech | |
| Review of C. Simon, et al., "Visual Event Recognition Using Decision Trees" | G. Friedland | ACM Computing Reviews, CR138638 | January 2011 | Speech | |
| Semantic Computing and Privacy: A Case Study Using Inferred Geo-Tagging | G. Friedland and J. Choi | International Journal of Semantic Computing, Vol. 5, No. 1, pp. 79-93. Also Best Poster in the Electrical and Computer Science and Engineering Track at the Korean Student Technical and Leadership Conference, Chicago, Illinois, March 2012. DOI: 10.1142/S1793351X11001171 | March 2011 | Speech | [PDF]
|
| Automatic Tagging and Geo-Tagging in Video Collections and Communities | M. Larson, M. Soleymani, P. Serdyukov, S. Rudinac, C. Wartena, V. Murdock, G. Friedland, R. Ordelman, and G. J. F. Jones | Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR 2011), Trento, Italy, April 2011 | April 2011 | Speech | [PDF]
|
| CUDA-Level Performance with Python-Level Productivity for Gaussian Mixture Model Applications | H. Cook, E. Gonina, S. Kamil, G. Friedland, D. Patterson, and A. Fox | Proceedings of the Third USENIX Workshop on Hot Topics in Parallelism (HotPar ’11), Berkeley, California | May 2011 | Speech | [PDF]
|
| User Verification: Matching the Uploaders of Videos Across Accounts | H. Lei, J. Choi, A. Janin, and G. Friedland | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 2404-2407 | May 2011 | Speech | [PDF]
|
| The SRI NIST 2010 Speaker Recognition Evaluation System | N. Scheffer, L. Ferrer, M. Graciarena, S. Kajarekar, E. Shriberg, and A. Stolcke | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 5292-5295 | May 2011 | Speech | [PDF]
|
| The IBM 2009 GALE Arabic Speech Transcription System | B. Kingsbury, H. Soltau, G. Saon, S. Chu, H.-K. Kuo, L. Mangu, S. Ravuri, A. Janin, and N. Morgan | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 4672-4675 | May 2011 | Speech | [PDF]
|
| Language-Independent Constrained Cepstral Features for Speaker Recognition | E. Shriberg and A. Stolcke | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 5296-5299 | May 2011 | Speech | [PDF]
|
| Bird Species Recognition Combining Acoustic and Sequence Modeling | M. Graciarena, M. Delplanche, E. Shriberg, and A. Stolcke | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 341-344 | May 2011 | Speech | [PDF]
|
| Making the Most from Multiple Microphones in Meeting Recognition | A. Stolcke | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 4992-4995 | May 2011 | Speech | [PDF]
|
| Associating Children’s Non-Verbal and Verbal Behaviour: Body Movements, Emotions, and Laughter in a Human-Robot Interaction | A. Batliner, S. Steidl, and E. Nöth | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague, Czech Republic, pp. 22-27 | May 2011 | Speech | [PDF]
|
| Comparing Multilayer Perceptron to Deep Belief Network Tandem Features for Robust ASR | O. Vinyals and S. Ravuri | Proceedings of the 36th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '11), Prague, Czech Republic | May 2011 | Speech | [PDF]
|
| Estimating Dominance in Multi-Party Meetings Using Speaker Diarization from a Single Microphone | H. Hung, Y. Huang, G. Friedland, and D. Gatica-Perez | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, No. 4, pp. 847-860 | May 2011 | Speech | |
| How Good Is the Crowd at "Real" WSD? | J. Hong and C. F. Baker | Proceedings of the Fifth Linguistic Annotation Workshop (LAW-V), Portland, Oregon | June 2011 | Speech | [PDF]
|
| Review of J. Ajmera, et al., "Two-Stream Indexing for Spoken Web Search" | G. Friedland | ACM Computing Reviews, CR139192 | June 2011 | Speech | |
| Gappy Phrasal Alignment by Agreement | M. Bansal, C. Quirk, and R. C. Moore | Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pp. 1308-1317, Portland, Oregon | June 2011 | Speech | [PDF]
|
| The Surprising Variance in Shortest-Derivation Parsing | M. Bansal and D. Klein | Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Portland, Oregon | June 2011 | Speech | [PDF]
|
| Web-Scale Features for Full-Scale Parsing | M. Bansal and D. Klein | Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pp. 693-702, Portland, Oregon | June 2011 | Speech | [PDF]
|
| Review of A. Rahman, et al., "Spatial-Geometric Approach to Physical Mobile Interaction Based on Accelerometer and IR Sensory Data Fusion" | G. Friedland | ACM Computing Reviews, CR139264 | July 2011 | Speech | |
| Improved Overlapped Speech Handling for Speaker Diarization | K. Boakye, O. Vinyals, and G. Friedland | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 941-944 | August 2011 | Speech | |
| Data Selection with Kurtosis and Nasality Features for Speaker Recognition | H. Lei and N. Mirghafori | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 2753-2756 | August 2011 | Speech | [PDF]
|
| Improved Classification of Speaking Styles for Mental Health Monitoring Using Phoneme Dynamics | K. Chang, H. Lei, and J. Canny | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 85-88 | August 2011 | Speech | [PDF]
|
| Effective Arabic Dialect Classification Using Diverse Phonotactic Models | M. Akbacak, D. Vergyri, A. Stolcke, N. Scheffer, and A. Mandal | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 737-740 | August 2011 | Speech | [PDF]
|
| Constrained Cepstral Speaker Recognition Using Matched UBM and JFA Training | M. H. Sanchez, L. Ferrer, E. Shriberg, and A. Stolcke | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 141-144 | August 2011 | Speech | [PDF]
|
| Java Visual Speech Components for Rapid Application Development of GUI based Speech Processing Applications | S. Steidl, K. Riedhammer, T. Bocklet, F. Hoenig, and E. Noeth | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 3257-3260 | August 2011 | Speech | |
| Comparing Different Flavors of Spectro-Temporal Features for ASR | B. T. Meyer, S. V. Ravuri, M. R. Schaedler, and N. Morgan | Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 1269-1272 | August 2011 | Speech | [PDF]
|
| The 2011 ICSI Video Location Estimation System | J. Choi, H. Lei, and G. Friedland | Proceedings of the MediaEval 2011 Workshop, Pisa, Italy | September 2011 | Speech | [PDF]
|
| Data-Driven vs. Semantic-Technology-Driven Tag-Based Video Location Estimation | J. Choi and G. Friedland | Proceedings of the Fifth IEEE International Conference on Semantic Computing (ICSC 2011), Palo Alto, California, pp. 243-246 | September 2011 | Speech | [PDF]
|
| Improving Automatic Speech Recognition by Learning from Human Errors | B. T. Meyer | Proceedings of the 162nd Meeting of the Acoustical Society of America, San Diego, California | October 2011 | Speech | |
| Speech and Audio Signal Processing: Processing and Perception of Speech and Music, 2nd Edition | B. Gold, N. Morgan, and D. Ellis | Wiley | November 2011 | Speech | |
| Video2GPS: A Demo of Multimodal Location Estimation on Flickr Videos | G. Friedland, J. Choi, and A. Janin | Proceedings of the ACM Multimedia Conference (MM'11), Scottsdale, Arizona | November 2011 | Speech | [PDF]
|