SparqlUpdateLanguage

From W3C Wiki

The first version of SPARQL is designed only to query RDF data; update features have been postponed to a future version of the standard. In the meantime, let's share experimental implementation experience here.

Related Work

In addition to nearby topics such as DeltaView, DiffAndPatch, BeesAndAnts, RDFAccessProtocol, UpdatingRelationalDataViaSPARUL, a survey of related work includes the following (listed newest first):

hmm... dbin has some diff/sync stuff, no?

and where is the IBM/Boca stuff? LinkMe

Update queries to RDF stores using sparql4j

I implemented a rudimentary update mechanism using CONSTRUCT queries and sparql4j. I developed a small driver that inserts the results of a CONSTRUCT query into an RDF store by using rdf2go. Maybe I could implement a rewrite mechanism for the queries described in "SPARQL/Update: A language for updating RDF graphs" to get a small update query prototype.

Strawman Proposal

This is a strawman proposal for a SPARQL-based RDF update language. It is based on discussions between MaxVölkel and RichardCyganiak.

Big missing pieces: this proposal deals only with updates to the default graph. There is no account of how to handle blank nodes when removing triples, nor of what to do in the presence of inferred triples.

The WHERE keyword still introduces a full query pattern, which can include OPTIONAL, FILTER, etc., as in traditional SPARQL. The ADD and REMOVE keywords take a graph template similar to that of CONSTRUCT. The template is optional in cases where a graph is provided as an input or generated by a DESCRIBE clause in the query.

In operations that both add and delete statements, deletion always happens first, no matter which order the ADD and REMOVE keywords appear in. Maybe we should require the REMOVE to be written first.
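The delete-first rule can be sketched on an in-memory model. This is only a minimal Python sketch, assuming a graph is a plain set of (subject, predicate, object) tuples; the function name `apply_update` is hypothetical and not part of the proposal.

```python
def apply_update(graph, add=(), remove=()):
    """Return a new graph with `remove` applied first, then `add`."""
    result = set(graph) - set(remove)  # deletion always happens first
    result |= set(add)                 # additions are applied afterwards
    return result


triple = (":person123", "foaf:mbox", "<mailto:max@xam.de>")
graph = {triple}

# The same triple is named in both ADD and REMOVE.
# Because removal runs first, the triple is present in the result.
updated = apply_update(graph, add=[triple], remove=[triple])
```

One consequence under this model: a triple that appears in both templates ends up in the graph, regardless of the order in which the keywords were written.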

No-input operations

Add a fixed graph:


ADD { [] a foaf:Person; foaf:name "Max"; foaf:mbox <mailto:max@xam.de> }


Delete a fixed graph:


REMOVE { :person123 foaf:mbox <mailto:max@xam.de> }


Atomic add and delete:


ADD { :person123 foaf:mbox_sha1sum "9dec5f368c386776b648332359a4b9ec0156471d" }
REMOVE { :person123 foaf:mbox <mailto:max@xam.de> }


Find bindings matching a query pattern, then delete based on a graph template:


REMOVE { ?s ?p ?o } WHERE { ?s foaf:name "Max" }


Do atomic find/add/delete based on a query pattern and two graph templates:


ADD { ?x foaf:name "Max Völkel" } REMOVE { ?x foaf:name [] } WHERE { ?x foaf:mbox <mailto:max@xam.de> }
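The find/add/delete operation above could be evaluated roughly as follows. This is a sketch under stated assumptions, not a reference implementation: the graph is a set of string tuples, WHERE is limited to a list of triple patterns (no OPTIONAL or FILTER), and `[]` in a REMOVE template is treated as a wildcard, which is a guess because the proposal leaves blank-node handling open. All function names are hypothetical.

```python
WILDCARD = None  # assumption: stands in for [] in a REMOVE template


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; None on failure."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def solve(patterns, graph):
    """All bindings that satisfy every triple pattern in `patterns`."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in graph
                    if (b2 := match(pattern, t, b)) is not None]
    return bindings


def substitute(template, binding):
    return tuple(binding.get(x, x) if is_var(x) else x for x in template)


def update(graph, where, add=(), remove=()):
    """Bindings are computed once, then removal and addition are applied."""
    to_remove, to_add = set(), set()
    for b in solve(where, graph):
        for tmpl in remove:
            wanted = substitute(tmpl, b)
            to_remove |= {t for t in graph
                          if all(w is WILDCARD or w == x
                                 for w, x in zip(wanted, t))}
        for tmpl in add:
            to_add.add(substitute(tmpl, b))
    return (set(graph) - to_remove) | to_add


graph = {
    (":p1", "foaf:mbox", "<mailto:max@xam.de>"),
    (":p1", "foaf:name", '"Max"'),
    (":p2", "foaf:name", '"Richard"'),
}
result = update(
    graph,
    where=[("?x", "foaf:mbox", "<mailto:max@xam.de>")],
    add=[("?x", "foaf:name", '"Max Völkel"')],
    remove=[("?x", "foaf:name", WILDCARD)],
)
```

The bindings are computed against the graph as it was before any change, so the removal cannot affect which resources the WHERE pattern finds.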


Delete based on DESCRIBE. It's up to the server to decide which triples “belong” to the resource.


REMOVE DESCRIBE :person123
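Since the choice of triples is left to the server, any concrete policy is an assumption. The sketch below shows one hypothetical policy in Python, where a resource "owns" exactly the triples in which it is the subject; a real server might also follow blank nodes or include incoming links. The function names are illustrative only.

```python
def describe_by_subject(graph, resource):
    """Hypothetical policy: every triple with `resource` as its subject."""
    return {t for t in graph if t[0] == resource}


def remove_describe(graph, resource):
    """Return the graph without the triples the policy assigns to `resource`."""
    return set(graph) - describe_by_subject(graph, resource)


graph = {
    (":person123", "foaf:name", '"Max"'),
    (":person123", "foaf:mbox", "<mailto:max@xam.de>"),
    (":person456", "foaf:knows", ":person123"),
}
remaining = remove_describe(graph, ":person123")
```

Under this particular policy the incoming `foaf:knows` triple survives, which illustrates why the server's definition of "belong" matters.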


Delete based on DESCRIBE combined with a query pattern. For each resource bound to ?x, it's up to the server to decide which triples “belong” to it.


REMOVE DESCRIBE ?x WHERE { ?x foaf:name "Max" }


Graph input operations

These operations take one or two input graphs. In the HTTP bindings, the input graph would be sent as the body of a POST request. For two-graph operations, a MIME multipart body would carry both graphs.

Adds an input graph:


ADD


Deletes an input graph:


REMOVE


Atomic delete and add of two input graphs:


UPDATE


Result set operations

These operations take a result set as an input. I'm not sure if any of these are useful. Probably not.

Create several persons based on data from the input bindings:


ADD { [] foaf:name ?name; a foaf:Person; foaf:mbox ?email }


Delete several statements based on data from input bindings:


REMOVE { ?x foaf:nick [] }


Delete several persons based on data from input bindings. The server decides which triples “belong” to each person:


REMOVE DESCRIBE ?x


Update the names of several persons based on input bindings. The input bindings contain the variables ?x, ?oldName, ?newName:


ADD { ?x foaf:name ?newName } REMOVE { ?x foaf:name ?oldName }
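To make the result-set variant concrete, here is a small Python sketch, assuming the input bindings arrive as a list of dictionaries keyed by variable name and the graph is a set of string tuples. The helper names are hypothetical.

```python
def instantiate(template, row):
    """Replace ?variables in a triple template with values from one row."""
    return tuple(row[x] if x.startswith("?") else x for x in template)


def apply_bindings(graph, rows, add=(), remove=()):
    """Instantiate both templates for every row; remove first, then add."""
    to_remove = {instantiate(t, row) for row in rows for t in remove}
    to_add = {instantiate(t, row) for row in rows for t in add}
    return (set(graph) - to_remove) | to_add


graph = {
    (":p1", "foaf:name", '"Max"'),
    (":p2", "foaf:name", '"Rich"'),
}
rows = [
    {"?x": ":p1", "?oldName": '"Max"', "?newName": '"Max Völkel"'},
    {"?x": ":p2", "?oldName": '"Rich"', "?newName": '"Richard Cyganiak"'},
]
renamed = apply_bindings(
    graph,
    rows,
    add=[("?x", "foaf:name", "?newName")],
    remove=[("?x", "foaf:name", "?oldName")],
)
```

In this sketch a row that lacks a variable used by a template raises a `KeyError`; how a server should treat such rows is not covered by the proposal.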


Q & A

Q: Why not just use HTTP?

There are many things that HTTP alone can't do, like atomic updates to large graphs (see the ADD {...} REMOVE {...} example), or updates based on a query result (see the REMOVE {...} WHERE example).

That said, SPARQL has HTTP bindings, and the update language must of course work over HTTP too. The graph-input operations in particular would work well with that interface: POST to ...?query=ADD to add a subgraph; POST to ...?query=REMOVE to remove a subgraph.
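A client-side request for the graph-input operations might be built like this. The sketch uses Python's standard library and only constructs the request without sending it; the endpoint URL, the function name, and the choice of RDF/XML as the payload format are assumptions, since the proposal does not fix any of them.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL


def graph_input_request(operation, graph_bytes,
                        content_type="application/rdf+xml"):
    """Build (but do not send) a POST carrying one input graph."""
    if operation not in ("ADD", "REMOVE"):
        raise ValueError("single-graph operations are ADD and REMOVE")
    url = ENDPOINT + "?" + urlencode({"query": operation})
    return Request(url, data=graph_bytes,
                   headers={"Content-Type": content_type}, method="POST")


payload = b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'
req = graph_input_request("ADD", payload)
```

Passing `req` to `urllib.request.urlopen` would send it. The two-graph UPDATE case is left out here because it needs a multipart body whose layout the proposal does not specify.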

REST-style HTTP operations could play a bigger role in operations for adding, updating, and removing entire named graphs.

Criticism

Don't do protocol "bindings"

The current notion of SPARQL protocol bindings is broken. It just happens to (sort of) work out because SPARQL is read-only and so can be mapped into URIs (and therefore GET) without breaking too many principles of Web architecture. That's not the case for updates. Please try to avoid making the same mistakes as Web services; application protocols were not made to be "bound onto", because doing so requires masking most of their value.

See also: Mark Baker: The trouble with “binding” and discussion

You should use other HTTP verbs, not PUT and POST

While PUT and POST are generally useful, neither REST nor the Web architecture precludes the use of other methods which might better facilitate "atomic updates to large graphs". I expect that HTTP PATCH would be quite useful for those kinds of updates.

However, PATCH was never widely implemented and was dropped from HTTP/1.1 when RFC 2616 replaced RFC 2068. Proposals that rely on new HTTP verbs are usually unpopular; see URIQA.

Atomicity doesn't require a single message

This approach to atomicity may be overloading the HTTP verbs unnecessarily; a reliable ADD+REMOVE should be doable without combining them in a single message. See also: HTTPLR