Publication Search Results

| Title | Author | Bibliographic | Date | Group | Links |
|---|---|---|---|---|---|
| Acoustic Super Models for Large Scale Video Event Detection | R. Mertens, H. Lei, L. Gottlieb, G. Friedland, and A. Divakaran | Proceedings of the ACM International Workshop on Events in Multimedia (EiMM11), Scottsdale, Arizona | November 2011 | Speech | [PDF] |
| Multimodal Location Estimation on Flickr Videos | G. Friedland, J. Choi, H. Lei, and A. Janin | Proceedings of the ACM International Workshop on Social Media (WSM11), Scottsdale, Arizona | November 2011 | Speech | [PDF] |
| Fast Speaker Diarization Using a High-Level Scripting Language | E. Gonina, G. Friedland, H. Cook, and K. Keutzer | Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2011), Big Island, Hawaii | December 2011 | Speech | [PDF] |
| On the Applicability of Speaker Diarization to Audio Concept Detection for Multimedia Retrieval | R. Mertens, P.-S. Huang, L. Gottlieb, G. Friedland, and A. Divakaran | Proceedings of the IEEE International Symposium on Multimedia, Dana Point, California, pp. 446-451 | December 2011 | Speech | [PDF] |
| Don't Multiply Lightly: Quantifying Problems with the Acoustic Model Assumptions in Speech Recognition | D. Gillick, L. Gillick, and S. Wegmann | Proceedings of the Automatic Speech Recognition and Understanding Workshop (ASRU), Big Island, Hawaii | December 2011 | Speech | [PDF] |
| Finding Difficult Speakers in Automatic Speaker Recognition | L. Stoll | UC Berkeley PhD thesis, Berkeley, California | December 2011 | Speech | [PDF] |
| Narrative Theme Navigation for Sitcoms Supported by Fan-Generated Scripts | G. Friedland, A. Janin, and L. Gottlieb | To appear in Multimedia Tools and Applications, Springer | 2012 | Speech | [PDF] |
| Syllable Models for Mandarin Speech Recognition: Exploiting Character Language Models | X. Liu, J. L. Hieronymus, M. J. F. Gales, and P. C. Woodland | In submission | 2012 | Speech | |
| Features Based on Auditory Physiology and Perception | R. M. Stern and N. Morgan | In Techniques for Noise Robustness in Automatic Speech Recognition, T. Virtanen, B. Raj, and R. Singh, eds., Wiley Publishing | 2012 | Speech | |
| Introduction to the Special Section on Deep Learning for Speech and Language Processing | D. Yu, G. Hinton, N. Morgan, J.-T. Chien, and S. Sagayama | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 1, pp. 4-6 | January 2012 | Speech | [PDF] |
| Deep and Wide: Multiple Layers in Automatic Speech Recognition | N. Morgan | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 1, pp. 7-13 | January 2012 | Speech | [PDF] |
| Special Section on New Frontiers in Rich Transcription | G. Friedland, J. Fiscus, T. Hain, and S. Furui (eds.) | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, No. 2 | February 2012 | Speech | |
| Speaker Diarization: A Review of Recent Research | X. Anguera, S. Bozonnet, N. Evans, C. Fredouille, G. Friedland, and O. Vinyals | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 2, pp. 356-370 | February 2012 | Speech | [PDF] |
| The ICSI RT-09 Speaker Diarization System | G. Friedland, A. Janin, D. Imseng, X. Anguera, L. Gottlieb, M. Huijbregts, M. Knox, and O. Vinyals | IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, Issue 2, pp. 371-381 | February 2012 | Speech | [PDF] |
| Multimodal City-Verification on Flickr Videos Using Acoustic and Textual Features | H. Lei, J. Choi, and G. Friedland | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF] |
| Spectro-Temporal Gabor Features for Speaker Recognition | H. Lei, B. T. Meyer, and N. Mirghafori | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF] |
| Discriminative Training for Speech Recognition is Compensating for Statistical Dependence on the HMM Framework | D. Gillick, S. Wegmann, and L. Gillick | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF] |
| How to Put It Into Words - Using Random Forests to Extract Symbol Level Descriptions from Audio Content for Concept Detection | P.-S. Huang, R. Mertens, A. Divakaran, G. Friedland, and M. Hasegawa-Johnson | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF] |
| Easy Does It: Robust Spectro-Temporal Many-Stream ASR Without Fine Tuning Streams | S. Ravuri and N. Morgan | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | |
| Articulatory Features for Expressive Speech Synthesis | A. Black, H. T. Bunnell, Y. Dou, P. Kumar, F. Metze, D. Perry, T. Polzehl, K. Prahallad, S. Steidl, and C. Vaug | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan | March 2012 | Speech | [PDF] |
| From AUDREY to Siri: Is Speech Recognition A Solved Problem? | R. Pieraccini | Presented at the Mobile Voice Conference, San Francisco, California | March 2012 | Speech | [PDF] |
| Cybercasing the Joint: Language Technologies, Multimedia Retrieval, and Online Privacy | G. Friedland | Presented at the Language Technologies Institute Colloquium, Carnegie Mellon University, Pittsburgh, Pennsylvania | April 13, 2012 | Speech | [PDF] |
| Speaker Diarization | G. Friedland and F. Valente | In Multimodal Signal Processing: Human Interactions in Meetings, S. Renals, H. Bourlard, J. Carletta, and A. Popescu-Belis, eds., Cambridge University Press | June 2012 | Speech | |
| Semi-Autonomous Car Control Using Brain Computer Interfaces | D. Goehring, D. Latotzky, M. Wang, and R. Rojas | Proceedings of the 12th International Conference of Intelligent Autonomous Systems (IAS), Jeju Island, Korea | June 2012 | Speech | |
| Multimodal Location Estimation of Consumer Media – Dealing with Sparse Training Data | J. Choi, G. Friedland, V. Ekambaram, and K. Ramchandran | Proceedings of the IEEE International Conference on Multimedia and Expo, Melbourne, Australia, pp. 43-48 | July 2012 | Speech | [PDF] |
| Where did I go Wrong?: Identifying Troublesome Segments for Speaker Diarization Systems | M. T. Knox, N. Mirghafori, and G. Friedland | Proceedings of the 13th Annual Conference of the International Speech Communication Association (InterSpeech 2012), Portland, Oregon | September 2012 | Speech | [PDF] |
| Hooking Up Spectro-Temporal Filters with Auditory-Inspired Representations for Robust Automatic Speech Recognition | B. Meyer, C. Spille, B. Kollmeier, and N. Morgan | Proceedings of the 13th Annual Conference of the International Speech Communication Association (InterSpeech 2012), Portland, Oregon | September 2012 | Speech | [PDF] |
| Longer Features: They Do a Speech Detector Good | TJ Tsai and N. Morgan | Proceedings of the 13th Annual Conference of the International Speech Communication Association (InterSpeech 2012), Portland, Oregon | September 2012 | Speech | |
| There is No Data Like Less Data: Percepts for Video Concept Detection on Consumer-Produced Media | B. Elizalde, G. Friedland, H. Lei, and A. Divakaran | Proceedings of the ACM International Workshop on Audio and Multimedia Methods for Large-Scale Video Analysis (AMVA) at ACM Multimedia 2012 (MM'12), Nara, Japan, pp. 27-32 | October 2012 | Speech | [PDF] |
| Pushing the Limits of Mechanical Turk: Qualifying the Crowd for Video Geo-Location | L. Gottlieb, J. Choi, P. Kelm, T. Sikora, and G. Friedland | Proceedings of the ACM Workshop on Crowdsourcing for Multimedia (CrowdMM 2012), held in conjunction with ACM Multimedia 2012, pp. 23-28, Nara, Japan | October 2012 | Speech | [PDF] |
| The 2012 ICSI/Berkeley Video Location Estimation System | J. Choi, V. Ekambaram, G. Friedland, and K. Ramchandran | Presented at the MediaEval 2012 Workshop, Pisa, Italy | October 2012 | Speech | [PDF] |
| Hearing is Believing: Biologically-Inspired Feature Extraction for Robust Automatic Speech Recognition | R. M. Stern and N. Morgan | Signal Processing Magazine, Vol. 29, No. 6, pp. 34-43 | November 2012 | Speech | [PDF] |