March 31, 2004

FAQ: How do I parse RDF?

Application developers often ask how they can get RDF data from the semantic web into their application from the recommended syntax, RDF/XML. This usually ends up being a question about parsers and APIs for particular languages. Mature, standards-compliant open source parsing libraries are widely available for most of the high-level programming languages that application developers might need. This article summarises which ones are good choices and up to date.

The simple answer is: use one of the readily available parsers that open source developers in the community have provided. There is no need to create a new parser for the most commonly used application languages, just as there is no need to write a new XML parser when XML parsers and APIs are widely available.

The more detailed answer depends on the application programming language chosen (or, for a new project, the available parsers might influence that choice), as well as the licensing of the project. Most of the items listed below are in easily reusable form and are used in commercial applications. Finally, there is the question of syntax details: does the system support the latest RDF/XML Syntax Specification (Revised) W3C Recommendation of February 2004?

The list below covers what is available, and what I recommend when people ask, for the languages commonly used for web sites.

ARP2 Parser by Jeremy Carroll
A Java parser developed 2001-present (ARP1) and part of the Jena Java semantic web toolkit. Passes all the RDF/XML tests, provides a lot of validation and internationalisation support, and is mature. BSD License with advertising.
Drive by Rahul Singh
A C# parser for the ECMA CLR platform developed over 2002-2003. Passed all of the positive RDF/XML tests in 2003, but it is not clear whether it still does. There are some third-party reports of problems. GPL license.
RAP RDF API for PHP by Chris Bizer
A PHP library, including an RDF/XML parser, developed 2002-present. Passes all the RDF/XML tests and is mature. LGPL License.
Raptor by Dave Beckett
A C library developed 2001-present, with APIs via Redland in several other web languages: Java, Perl, PHP, Python, Ruby and Tcl. Passes all the RDF/XML tests and is mature. GPL/LGPL/MPL License.
RDF::Simple by Jo Walsh
A Perl parser in CPAN, a translation of rdfxml.py (below), developed recently and not yet complete. Does not return all the information about literals (language, datatypes) or some details of blank nodes.
RDFLib by Daniel Krech
A Python RDF toolkit, including parsers for several RDF syntaxes, developed 2002-present and mature. Passes all the RDF/XML tests. BSD license with advertising.
rdfxml.py by Sean B. Palmer
A Python parser in under 10K of source, designed to be small and as complete as possible. It passes most of the RDF/XML test suite but has not been updated to handle the later revisions. GPL License / W3C License (alternate version).
Rio RDF Parser by Aduna
A Java RDF/XML parser, part of the Sesame Java toolkit, developed in 2003 as a small and fast parser requiring only SAX2. Passes all the RDF/XML tests. LGPL License.

For the state of the tools that have been run against the RDF/XML tests, see the RDF Core Test Results.

Several of the parsers above also support other RDF syntaxes, such as N-Triples (as used by the RDF test cases), Notation 3 (N3), and subsets of N3 and experiments such as Turtle.

There is also other software that is older, unmaintained, or of unknown state against the tests, and of which I have no detailed personal knowledge: Injectilo (XSLT), Profium (Perl and Java, commercial), libwww (C, old), Snail (XSLT, old, slow), RDF Filter (Java, old), Repat (C, old), SWI-Prolog (Prolog), XWMF (Tcl, old), W3C Perllib (Perl).


Categories Frequently Asked Questions (FAQs)
Posted by dbeckett2 at March 31, 2004 01:44 PM