TaskForces/CommunityProjects/LinkingOpenData/DataSets

Revision as of 16:21, 1 February 2007 by w3cpedia>ChrisBizer

SWEO Community Project: Linking Open Data on the Semantic Web

Datasets

This page collects open datasets.

It is a first draft and I will add more datasets in the next few days. ChrisB

The page is part of the community project [[SweoIG/TaskForces/CommunityProjects/LinkingOpenData|]]

Datasets available with dereferenceable URIs

  • WordNet: WordNet is a large lexical database of English. Currently being RDFized by a Best Practices Task Force. Detail ...
  • DBLP Bibliography: Provides bibliographic information about scientific papers. Size of the dataset: 800,000 articles and 400,000 authors, approx. 15 million triples.
  • RDF Book Mashup: Provides bibliographic information, reviews and sales offers for most books that have an ISBN. Maps data from Amazon and Google Base to RDF. Size of the dataset: Unknown, billions of triples.
  • dbpedia: Dataset containing data extracted from Wikipedia. About 19 million triples. Please don't use it for linking yet, as the URIs will change in the next few weeks.
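
For readers unfamiliar with the term, a dereferenceable URI is one that returns a description of the resource when fetched over HTTP. The following is only a minimal sketch of how a client might ask for RDF using content negotiation. It assumes the server honours an Accept header of application/rdf+xml, which is not confirmed for any dataset listed above. The function names and the example.org URI are hypothetical.

```python
from urllib.request import Request, urlopen

# Assumption: the server offers RDF/XML via HTTP content negotiation.
RDF_ACCEPT = "application/rdf+xml"


def build_rdf_request(uri: str) -> Request:
    """Build a GET request that asks for RDF/XML instead of HTML."""
    return Request(uri, headers={"Accept": RDF_ACCEPT})


def fetch_rdf(uri: str, timeout: float = 10.0) -> bytes:
    """Dereference the URI and return the raw response body.

    Needs network access; the server may still answer with another format.
    """
    with urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical URI, used only to show the request that would be sent.
    request = build_rdf_request("http://example.org/resource/Thing")
    print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Building the request separately from sending it keeps the negotiation step easy to inspect without touching the network.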

Datasets available as RDF Dumps

  • Lots. Please feel free to add plenty :-)

Datasets available via SPARQL Endpoints

See [[SparqlEndpoints]]

Datasets currently being RDFized

  • MusicBrainz. Please ask Frederick Giasson for details.
  • US Census Data. Please ask Josh Tauberer for details.
  • GEMET. GEMET is the GEneral Multilingual Environmental Thesaurus of the European Environment Agency. Please ask Bernard Vatant for details.

Datasets that would be nice to have on the Semantic Web

  • Lots. Please feel free to add plenty :-)