LINDA: Distributed web-of-data-scale entity matching

Christoph Böhm, Gerard De Melo, Felix Naumann, Gerhard Weikum

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

43 Scopus citations

Abstract

Linked Data has emerged as a powerful way of interconnecting structured data on the Web. However, the cross-linkage between Linked Data sources is not as extensive as one would hope for. In this paper, we formalize the task of automatically creating "sameAs" links across data sources in a globally consistent manner. Our algorithm, presented in a multi-core as well as a distributed version, achieves this link generation by accounting for joint evidence of a match. Experiments confirm that our system scales beyond 100 million entities and delivers highly accurate results despite the vast heterogeneity and daunting scale.
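
To make the notion of globally consistent "sameAs" links concrete, the following is a minimal sketch and not the LINDA algorithm itself. It assumes, purely for illustration, that candidate matches arrive as scored pairs, that "sameAs" is treated as transitive, and that two entities from the same source are never merged. The names `consistent_same_as` and `source_of`, the `source:id` naming scheme, and the scores are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def source_of(entity: str) -> str:
    """Assumed naming scheme: 'source:local_id'."""
    return entity.split(":", 1)[0]


def consistent_same_as(
    candidates: List[Tuple[str, str, float]],
) -> List[Set[str]]:
    """Greedily accept candidate pairs, best score first, while keeping
    every cluster free of two entities from the same source."""
    parent: Dict[str, str] = {}
    sources: Dict[str, Set[str]] = {}  # cluster root -> sources it covers

    def find(x: str) -> str:
        if x not in parent:
            parent[x] = x
            sources[x] = {source_of(x)}
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, _score in sorted(candidates, key=lambda c: -c[2]):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # already linked through transitivity
        if sources[ra] & sources[rb]:
            continue  # merging would equate two entities of one source
        parent[rb] = ra
        sources[ra] |= sources.pop(rb)

    clusters: Dict[str, Set[str]] = {}
    for entity in list(parent):
        clusters.setdefault(find(entity), set()).add(entity)
    # Only clusters with at least two members imply sameAs links.
    return [c for c in clusters.values() if len(c) > 1]
```

For example, given the candidates `("a:1", "b:1", 0.9)`, `("b:1", "c:1", 0.8)`, and `("a:2", "c:1", 0.7)`, the sketch links `a:1`, `b:1`, and `c:1` and rejects the last pair, because accepting it would place `a:1` and `a:2` from the same source in one cluster. Deciding on links jointly in this way, rather than pair by pair, is what keeps the result consistent.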

Original language: English (US)
Title of host publication: CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management
Pages: 2104-2108
Number of pages: 5
DOIs: https://doi.org/10.1145/2396761.2398582
State: Published - Dec 19 2012
Externally published: Yes
Event: 21st ACM International Conference on Information and Knowledge Management, CIKM 2012 - Maui, HI, United States
Duration: Oct 29 2012 to Nov 2 2012

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 21st ACM International Conference on Information and Knowledge Management, CIKM 2012
Country: United States
City: Maui, HI
Period: 10/29/12 to 11/2/12

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Keywords

  • distributed entity matching
  • entity matching
  • linked data
  • mapreduce

Cite this

Böhm, C., De Melo, G., Naumann, F., & Weikum, G. (2012). LINDA: Distributed web-of-data-scale entity matching. In CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management (pp. 2104-2108). (ACM International Conference Proceeding Series). https://doi.org/10.1145/2396761.2398582