SWEO Community Project: Linking Open Data on the Semantic Web

Equivalence Mining and Matching Frameworks

This page collects software tools and papers on techniques for automatically generating links between data items in different data sources.

The page is part of the community project wiki:SweoIG/TaskForces/CommunityProjects/LinkingOpenData

An example of an equivalence link is <http://dbpedia.org/resource/Berlin> owl:sameAs <http://sws.geonames.org/2950159>, which claims that a data item in the DBpedia dataset is the same as a data item in the Geonames dataset.
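As a rough illustration, such a link could be written out as a single N-Triples line. The helper name same_as_triple is made up for this sketch; the only fixed part is the standard owl:sameAs property URI.

```python
# Full URI of the owl:sameAs property from the OWL vocabulary.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(subject_uri: str, object_uri: str) -> str:
    """Return one N-Triples line stating that subject owl:sameAs object.

    Hypothetical helper for illustration; it does no URI validation.
    """
    return f"<{subject_uri}> <{OWL_SAME_AS}> <{object_uri}> ."


# The two URIs are the ones used in the example on this page.
link = same_as_triple(
    "http://dbpedia.org/resource/Berlin",
    "http://sws.geonames.org/2950159",
)
print(link)
```

A matching tool would typically emit one such line per discovered equivalence, so the output can be loaded into any RDF store.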

A simple alternative that avoids the need for equivalence mining is to use commonly accepted identifiers within URIs. For example, the RDF book mashup uses ISBN numbers in its URIs. This allows other data sources about books to set links to the data items of the book mashup using a simple URI pattern that includes the ISBN number.
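A minimal sketch of this URI-pattern idea follows. The base URI below is a placeholder, not the actual address of the RDF book mashup, and the function names are made up for illustration. The point is only that two sources which normalize an ISBN the same way arrive at the same URI without any matching step.

```python
# Placeholder base URI (an assumption for this sketch, not the real
# namespace of the RDF book mashup).
BOOK_URI_BASE = "http://example.org/bookmashup/books/"


def normalize_isbn(raw: str) -> str:
    """Strip hyphens and spaces and upper-case a trailing 'x' check digit."""
    isbn = raw.replace("-", "").replace(" ", "").upper()
    if len(isbn) not in (10, 13):
        raise ValueError(f"not an ISBN-10 or ISBN-13: {raw!r}")
    return isbn


def book_uri(raw_isbn: str) -> str:
    """Build a book URI from an ISBN using the fixed URI pattern."""
    return BOOK_URI_BASE + normalize_isbn(raw_isbn)


# Two differently formatted spellings of the same ISBN give the same URI.
uri_a = book_uri("0-262-03384-4")
uri_b = book_uri("0262033844")
print(uri_a)
print(uri_a == uri_b)
```

The sketch only checks the length of the identifier; a real data source would probably also want to verify the ISBN check digit before minting a link.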

Software Tools

People Interested in the Area

Papers and Web Resources on the Topic

This problem has been studied extensively in the database community, where it is often called duplicate recognition or record linkage. If you know of good overview papers on the area, please add them to this page so that people don't have to reinvent the wheel.
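To give a flavour of what record linkage involves, here is a deliberately naive sketch that compares the labels of data items from two sources and proposes a link when the labels are nearly identical. It is not taken from any tool or paper listed on this page. The example URIs, the function names, and the 0.9 threshold are all assumptions, and real systems use blocking, several attributes, and better similarity measures.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] of two labels, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def candidate_links(source, target, threshold=0.9):
    """Return (source_uri, target_uri) pairs whose labels are similar enough.

    Compares every source item with every target item, so this only
    scales to small datasets.
    """
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            if similarity(s_label, t_label) >= threshold:
                links.append((s_uri, t_uri))
    return links


# Made-up toy data: URI -> label.
source = {
    "http://example.org/a/Berlin": "Berlin",
    "http://example.org/a/Paris": "Paris",
}
target = {
    "http://example.org/b/1": "berlin ",
    "http://example.org/b/2": "London",
}

links = candidate_links(source, target)
print(links)
```

Each returned pair is only a candidate; whether it is safe to publish it as an owl:sameAs statement depends on how much evidence beyond the label supports the match.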

There was a workshop on Ontology Matching at ISWC 2006. The approaches proposed there might also be useful for equivalence mining at the data item/instance level.

TaskForces/CommunityProjects/LinkingOpenData/EquivalenceMining (last edited 2009-06-14 19:54:33 by SoerenAuer)