LINDA: Distributed Web-of-Data-Scale Entity Matching
Title | LINDA: Distributed Web-of-Data-Scale Entity Matching |
Publication Type | Conference Paper |
Year of Publication | 2012 |
Authors | Böhm, C., de Melo, G., Naumann, F., & Weikum, G. |
Page(s) | 2104-2108 |
Other Numbers | 3393 |
Abstract | Linked Data has emerged as a powerful way of interconnecting structured data on the Web. However, the cross-linkage between Linked Data sources is not as extensive as one would hope for. In this paper, we formalize the task of automatically creating "sameAs" links across data sources in a globally consistent manner. Our algorithm, presented in a multi-core as well as a distributed version, achieves this link generation by accounting for joint evidence of a match. Experiments confirm that our system scales beyond 100 million entities and delivers highly accurate results despite the vast heterogeneity and daunting scale. |
Acknowledgment | This work was partially funded by the Deutscher Akademischer Austausch Dienst (DAAD) through a postdoctoral fellowship. |
URL | http://www.icsi.berkeley.edu/pubs/ai/linda12.pdf |
Bibliographic Notes | Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012), pp. 2104-2108, Maui, Hawaii |
Abbreviated Authors | C. Boehm, G. de Melo, F. Naumann, and G. Weikum |
ICSI Research Group | AI |
ICSI Publication Type | Article in conference proceedings |