SPASQL

From W3C Wiki


Much of the world's data are found in relational databases and spreadsheets as well as unstructured files such as Word documents. Integration of these data sources would provide individuals with a more comprehensive view of their data and enable better decision making.

The emergence of the Semantic Web is enabling new approaches to federated data queries. RDF promotes universally grounded identifiers for data, allowing the SPARQL query language for RDF to perform joins across different data sources.
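
The role of shared identifiers in a cross-source join can be illustrated with a minimal sketch. It uses plain Python tuples as stand-ins for RDF triples; the example.org IRIs, the worksOn term, and the join_on_subject helper are assumptions made up for illustration, and only the foaf:name IRI is a real vocabulary term.

```python
# Two independent sources describe the same person with the same IRI.
# All example.org IRIs and values are hypothetical illustrations.
BOB = "http://example.org/people#bob"
ALICE = "http://example.org/people#alice"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
WORKS_ON = "http://example.org/terms#worksOn"  # made-up predicate

hr_source = [
    (BOB, FOAF_NAME, "Bob"),
    (ALICE, FOAF_NAME, "Alice"),
]
project_source = [
    (BOB, WORKS_ON, "http://example.org/projects#p1"),
]


def join_on_subject(left, left_pred, right, right_pred):
    """Pair objects from two triple lists whose subjects are the same IRI."""
    # Index the right-hand source by subject for the predicate of interest.
    right_index = {}
    for s, p, o in right:
        if p == right_pred:
            right_index.setdefault(s, []).append(o)
    # Emit one row per matching (left, right) pair sharing a subject IRI.
    return [
        (s, o, other)
        for s, p, o in left
        if p == left_pred
        for other in right_index.get(s, [])
    ]


rows = join_on_subject(hr_source, FOAF_NAME, project_source, WORKS_ON)
```

Because both sources name Bob with the same IRI, the join needs no source-specific key translation; Alice has no project triple and so does not appear in rows.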

SPASQL is an extension of the SQL standard that allows SPARQL queries to be executed within SQL statements, typically by treating them as subqueries or function clauses. Implementations of this SQL extension may be internal to the DBMS engine, as in OpenLink Virtuoso (see docs), or delivered through an extension architecture, as in the SPASQL module for MySQL. With native SPARQL support in the database, SPARQL queries can deliver the same performance as well-tailored SQL queries.
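
As a rough illustration of the subquery form, the sketch below builds a SQL statement that embeds a SPARQL query as a derived table. The embed_sparql helper is hypothetical, and the statement shape (a SPARQL keyword in front of the embedded query) is an assumption loosely modeled on how some engines mark inline SPARQL; the exact syntax varies between implementations and should be checked against each product's documentation.

```python
def embed_sparql(sparql_query: str, alias: str = "sparql_result") -> str:
    """Wrap a SPARQL SELECT query as a derived table in a SQL statement.

    Assumption: the target engine accepts a SPARQL keyword in front of an
    embedded query placed in subquery position. This only builds the
    statement text; it does not execute anything.
    """
    return f"SELECT * FROM (SPARQL {sparql_query.strip()}) AS {alias}"


# A SPARQL query asking for resources whose foaf:name is "Bob".
sparql = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person WHERE { ?person foaf:name "Bob" }
"""

statement = embed_sparql(sparql, alias="people")
```

The resulting string treats the SPARQL result set like any other table expression, so the outer SQL could in principle join it against ordinary relational tables.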

Several gateways between RDF and conventional relational stores have also been developed to take advantage of federated query capabilities. Implementations that can rewrite SPARQL queries to SQL include OpenLink Virtuoso (see docs), D2RQ, and SquirrelRDF.

Mapping

Semantic mapping, in this context, is a mapping from SPARQL queries expressed in terms of RDF graphs to relational (SQL) queries expressed in terms of tables and attributes. The advantages of mapping include:

  • portability: a query about a person with foaf:name "Bob" can work on multiple relational databases
  • intuitiveness: using common terms allows multiple databases to express their data in the shape of a single, well-thought-out schema developed and understood by the community
  • migration: relational databases need not be converted into RDF for storage in a triple store, a step that can add latency and increase storage requirements
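
The portability point can be sketched in a few lines. The example rewrites the single pattern ?s foaf:name "Bob" into SQL for two databases with different schemas, using in-memory SQLite as a stand-in. The table names, column names, MAPPINGS structure, and rewrite helper are all assumptions for illustration and do not reflect how any particular mapping tool represents its rules.

```python
import sqlite3

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical mappings: predicate -> (table, key column, value column).
MAPPINGS = {
    "hr_db": {FOAF_NAME: ("employees", "emp_id", "full_name")},
    "crm_db": {FOAF_NAME: ("contacts", "contact_id", "display_name")},
}


def rewrite(predicate, literal, mapping):
    """Rewrite the pattern `?s <predicate> "literal"` into (sql, params)."""
    table, key_col, value_col = mapping[predicate]
    # Identifiers come from the trusted mapping; the literal is parameterized.
    sql = f"SELECT {key_col} FROM {table} WHERE {value_col} = ?"
    return sql, (literal,)


def make_db(ddl, insert, rows):
    """Create an in-memory SQLite database holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert, rows)
    return conn


databases = {
    "hr_db": make_db(
        "CREATE TABLE employees (emp_id INTEGER, full_name TEXT)",
        "INSERT INTO employees VALUES (?, ?)",
        [(1, "Bob"), (2, "Alice")],
    ),
    "crm_db": make_db(
        "CREATE TABLE contacts (contact_id INTEGER, display_name TEXT)",
        "INSERT INTO contacts VALUES (?, ?)",
        [(10, "Carol"), (11, "Bob")],
    ),
}

# The same RDF-level pattern runs against both schemas.
results = {}
for name, conn in databases.items():
    sql, params = rewrite(FOAF_NAME, "Bob", MAPPINGS[name])
    results[name] = [row[0] for row in conn.execute(sql, params)]
```

The query author only refers to foaf:name; the per-database mapping decides which table and column that term resolves to, and the data stays in its relational store.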

Federation

The objective of federation is to perform efficient queries over data in disparate databases. These databases may be within one organization, span multiple organizations, or be provided by custodians of data. For an example, see Case Study: FeDeRate for Drug Research.
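
One common way to keep such queries efficient is to evaluate the selective part in one database and send only the matching identifiers to the next. The sketch below assumes two in-memory SQLite databases with made-up schemas and data, linked by a shared IRI column; it is an illustration of the idea, not the method used in the case study.

```python
import sqlite3

# Stand-in database 1: people, keyed by a shared IRI (hypothetical schema).
people_db = sqlite3.connect(":memory:")
people_db.execute("CREATE TABLE person (iri TEXT, name TEXT)")
people_db.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [
        ("http://example.org/people#bob", "Bob"),
        ("http://example.org/people#alice", "Alice"),
    ],
)

# Stand-in database 2: project assignments referring to the same IRIs.
projects_db = sqlite3.connect(":memory:")
projects_db.execute("CREATE TABLE assignment (person_iri TEXT, project TEXT)")
projects_db.executemany(
    "INSERT INTO assignment VALUES (?, ?)",
    [
        ("http://example.org/people#bob", "P1"),
        ("http://example.org/people#alice", "P2"),
        ("http://example.org/people#bob", "P3"),
    ],
)


def federated_projects_for(name):
    """Return the projects of a named person, using both databases."""
    # Step 1: run the selective filter where the name is stored.
    iris = [
        row[0]
        for row in people_db.execute(
            "SELECT iri FROM person WHERE name = ?", (name,)
        )
    ]
    if not iris:
        return []
    # Step 2: pass only the matching IRIs to the second database.
    placeholders = ", ".join("?" for _ in iris)
    sql = (
        "SELECT project FROM assignment "
        f"WHERE person_iri IN ({placeholders}) ORDER BY project"
    )
    return [row[0] for row in projects_db.execute(sql, iris)]


projects = federated_projects_for("Bob")
```

Only the identifiers that survive the first filter cross the database boundary, which keeps the second query small regardless of how large the assignment table is.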

Technology Adoption

A number of commercial enterprises are interested in using mapping approaches such as SPASQL to migrate effectively to the Semantic Web. Eli Lilly is one example of a company interested in this technology.