In RDF, blank nodes are treated as existential variables: they indicate the existence of a thing without saying anything about the name of that thing. So it is reasonable to express a query as a graph with bNodes used as if they were wildcards and to define a query operation as something like "find all instances of the query graph which are entailed by the data". Depending on the application, your operation might instead return the union of that set of matching subgraphs rather than the separate matches.
This can work but it is quite restrictive.
First, bNodes can only be used in place of nodes, not in place of properties. This is a big limitation since many queries require matching over properties. Second, you can't express constraints such as string pattern matches or range constraints on the literals to be matched. To get around this, attempts at this "query by example" approach often use metalevel annotations to allow such things to be expressed. For example, see our own experiments in this area, RDF-QBE. Once you start doing this, you can use the annotations to identify the query nodes in the first place and not bother using bNodes at all. This is essentially what the simplest of the Edutella query languages, RDF-QEL-1, does.
Other limitations are the inability to express disjunctive queries this way (RDF is purely conjunctive) and the awkwardness of expressing constraints between variables.
Despite these limitations, the symmetry of expressing queries, and indeed the resulting matches, directly in RDF rather than indirectly encoded in RDF is appealing and could be appropriate in some applications.
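The bNode-as-wildcard idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RDF-QBE or RDF-QEL-1 implementation: triples are plain tuples of prefixed-name strings, any term starting with `_:` is taken to be a bNode, and the names `match` and `unify` are hypothetical. It finds instances of the query graph in the asserted data only and does no RDFS inference.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


def is_bnode(term: str) -> bool:
    # Assumption for this sketch: bNodes are written with a "_:" prefix.
    return term.startswith("_:")


def unify(query_term: str, data_term: str, binding: Dict[str, str]) -> bool:
    """Bind a query bNode to a data term, or require an exact match otherwise."""
    if is_bnode(query_term):
        if query_term in binding:
            return binding[query_term] == data_term
        binding[query_term] = data_term
        return True
    return query_term == data_term


def match(query: List[Triple], data: Set[Triple],
          binding: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Yield every assignment of query bNodes that maps the query into the data."""
    binding = binding or {}
    if not query:
        yield dict(binding)
        return
    (qs, qp, qo), rest = query[0], query[1:]
    for s, p, o in data:
        # Properties must match exactly: a bNode cannot stand in for a property.
        if qp != p:
            continue
        candidate = dict(binding)
        if unify(qs, s, candidate) and unify(qo, o, candidate):
            yield from match(rest, data, candidate)


DATA = {
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:alice", "foaf:name", '"Alice"'),
}

# "Find x, y, z such that x knows y and y knows z."
QUERY = [("_:x", "foaf:knows", "_:y"), ("_:y", "foaf:knows", "_:z")]

results = list(match(QUERY, DATA))
```

The restrictions listed above show up directly in the sketch: the property position is compared literally, there is nowhere to attach a string pattern or range test to a literal, and the query triples are always combined conjunctively.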
[N.B. This is an early version of a FAQ entry responding to one of the items on the FAQ ideas list. I'm sure others will be able to add more information on this topic and over time the proto-entry might turn into a real entry.]
As we blogged last month, we have been putting together a portal interface tool that allows us to take a collection of RDF, in our case descriptions of environmental organizations, and render it in a faceted browser. This is working well and enabled us to demonstrate a prototype successfully to Anthony Perret of the environment council at a recent meeting.
The dimensions to use to drive the browsing are described in the form of either RDFS class hierarchies or SKOS thesauri. It proved to be quite easy to use Jena's rule processing engine to add rules to propagate the transitive closure of the SKOS term lattice along with basic RDFS processing and a little OWL support (we needed inverse properties). In the portal description (in RDF of course) you can just specify a set of data sources and ontologies, together with which rule file you want to use for processing. Surprisingly simple rules have been enough to implement the functionality needed for the demo so far.
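To give a feel for how little such rules need to do, here is a rough Python sketch of the two kinds of inference mentioned above: transitive closure over the term lattice and an inverse property. It is not our rule file and it does not use Jena's rule syntax. The function name `apply_rules` is hypothetical, triples are plain tuples of prefixed-name strings, and `skos:broader` is assumed to be transitive with `skos:narrower` as its inverse purely for illustration (the later SKOS Reference keeps transitivity on a separate `skos:broaderTransitive` property).

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

BROADER = "skos:broader"    # assumed transitive for this sketch
NARROWER = "skos:narrower"  # assumed inverse of skos:broader


def apply_rules(triples: Set[Triple]) -> Set[Triple]:
    """Forward-chain two simple rules until no new triples are derived."""
    closed = set(triples)
    while True:
        derived: Set[Triple] = set()
        for a, p, b in closed:
            if p != BROADER:
                continue
            # Inverse rule: (a broader b) -> (b narrower a)
            derived.add((b, NARROWER, a))
            # Transitivity rule: (a broader b), (b broader c) -> (a broader c)
            for b2, p2, c in closed:
                if p2 == BROADER and b2 == b:
                    derived.add((a, BROADER, c))
        if derived <= closed:
            return closed
        closed |= derived


thesaurus = {
    ("ex:oak", BROADER, "ex:tree"),
    ("ex:tree", BROADER, "ex:plant"),
}
inferred = apply_rules(thesaurus)
```

With the closure in place, a faceted browser can list an item filed under `ex:oak` when the user selects `ex:plant` without any special-case code in the browser itself.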
We've also been able to connect the two tools together. For an internal demonstration we were able to capture and classify some information snippets in a semblog and view them in the appropriate categories in a portal along with some preclassified documents. What makes it really fun is that the classification scheme itself, since it's expressed in RDF, is just another object you can browse and manipulate. So you can link in another data source, which uses a different classification scheme, and can see that scheme as another dimension available for use in browsing.
On Friday of last week both Libby Miller from ILRT and I (Alistair Miles from CCLRC) attended the JISC Terminology Services Workshop in London. The workshop was being held to explore all issues surrounding the need for making terminologies (thesauri, taxonomies, classification systems etc.) available via services on the web to a wider community, and the potential role of JISC in that effort. The Thesaurus Activity of the SWAD-Europe project is concerned with exactly this problem, and although our work on a thesaurus web-service API is still in progress, we've already done some interesting pre-prototype implementations of modular services and applications. This workshop was a chance for us to show off our prototypes, and discuss future directions with a well-informed and experienced group of people.
There were some very interesting presentations, with some clear issues of importance emerging. Lorcan Dempsey from OCLC emphasised the need for a modular approach to distributed service architectures, borrowing the phrase 'unplug and play'. This issue was revisited several times throughout the day, although it appeared that the perspective of the majority was still rooted in an older approach which favoured bespoke, monolithic components incorporating a lot of functionality into a single module. There was also some confusion about whether JISC was talking about terminology services in general (i.e. setting up a common and agreed interface to terminology services, so many communities and organisations could publish their own data and interoperate) or a 'national terminology service' which would be a single source point for a group of public domain terminologies.
One very encouraging sign was, however, that 'the semantic web', 'RDF' and 'OWL' are no longer dirty words, but are more and more being considered as viable and realistic approaches to solving these technological and architectural problems. It is also clear that if this community is going to start moving towards semantic web style solutions, then there is a bridge to be built between traditional approaches to structured vocabularies and the Web Ontology Language. I believe the SKOS schemas can play a significant role in building that bridge, and will provide an opportunity for the large communities of library and information scientists to enrich the framework of the semantic web.
There was also some very positive feedback on the recent SKOS work, including the reports on representing monolingual thesauri, multilingual thesauri and inter-thesaurus mappings. Another issue raised by Nicholas Gibbins of Southampton University was the potential value of a common meta-model for Knowledge Organisation Systems (KOS), to facilitate the interoperability of different KOS styles, and support the coexistent use of these different forms. Although there was limited discussion of the model itself, I was encouraged by the fact that the meta-model re-iterated by several of the most experienced participants was also the concept-based meta-model inherent in the SKOS schemas. This reinforced my hope that, although SKOS was primarily designed to support the use of thesauri on the web, it can provide a framework for many other types of KOS to be used side by side, both with each other, and with more formal web ontologies.
Thanks again to Helen Hockx-Yu, Natasha Bishop, Sarah Smith and all the folk at JISC for looking after us so well.