Human Vs Machine: Establishing a Human Baseline for Multimodal Location Estimation

Title: Human Vs Machine: Establishing a Human Baseline for Multimodal Location Estimation
Publication Type: Conference Paper
Year of Publication: 2013
Authors: Choi, J., Lei, H., Ekambaram, V., Kelm, P., Gottlieb, L., Sikora, T., Ramchandran, K., & Friedland, G.
Other Numbers: 3649

In recent years, the problem of video location estimation (i.e., estimating the longitude/latitude coordinates of a video without GPS information) has been approached with diverse methods and ideas in the research community, and significant improvements have been made. So far, however, systems have only been compared against each other, and no systematic study of human performance has been conducted. Based on a human-subject study with 11,900 experiments, this article presents a human baseline for location estimation for different combinations of modalities (audio, audio/video, audio/video/text). Furthermore, this article compares state-of-the-art location estimation systems with the human baseline. Although the overall performance of humans in multimodal video location estimation is better than that of current machine learning approaches, the difference is quite small: for 41% of the test set, the machine's accuracy was superior to the humans'. We present case studies and discuss why machines did better for some videos and not for others. Our analysis suggests new directions and priorities for future work on the improvement of location inference algorithms.


This work was partially supported by funding provided through National Science Foundation EAGER grant IIS-1128599 as well as a KFAS Doctoral Study Abroad Fellowship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors or originators and do not necessarily reflect the views of the National Science Foundation.

Bibliographic Notes

Proceedings of the 21st ACM International Conference on Multimedia (Multimedia 2013), Barcelona, Spain

Abbreviated Authors

J. Choi, H. Lei, V. Ekambaram, P. Kelm, L. Gottlieb, T. Sikora, K. Ramchandran, and G. Friedland

ICSI Research Group

Audio and Multimedia

ICSI Publication Type

Article in conference proceedings