LINDA: Distributed Web-of-Data-Scale Entity Matching

Title: LINDA: Distributed Web-of-Data-Scale Entity Matching
Publication Type: Conference Paper
Year of Publication: 2012
Authors: Böhm, C., de Melo, G., Naumann, F., & Weikum, G.
Page(s): 2104-2108
Other Numbers: 3393
Abstract

Linked Data has emerged as a powerful way of interconnecting structured data on the Web. However, the cross-linkage between Linked Data sources is not as extensive as one would hope for. In this paper, we formalize the task of automatically creating "sameAs" links across data sources in a globally consistent manner. Our algorithm, presented in a multi-core as well as a distributed version, achieves this link generation by accounting for joint evidence of a match. Experiments confirm that our system scales beyond 100 million entities and delivers highly accurate results despite the vast heterogeneity and daunting scale.

Acknowledgment

This work was partially funded by the Deutscher Akademischer Austausch Dienst (DAAD) through a postdoctoral fellowship.

URL: http://www.icsi.berkeley.edu/pubs/ai/linda12.pdf
Bibliographic Notes

Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012), pp. 2104-2108, Maui, Hawaii

Abbreviated Authors

C. Boehm, G. de Melo, F. Naumann, and G. Weikum

ICSI Research Group

AI

ICSI Publication Type

Article in conference proceedings