10 June 2007

SPARQL Papers at ESWC 2007

Three papers particularly caught my SPARQL-driven attention at ESWC 2007.

SPARQLeR: Extended Sparql for Semantic Association Discovery
Krys J. Kochut, Maciej Janik

This paper describes an extension to SPARQL for path variables. A path is specified as a regular expression over properties; in addition, the paper describes the need for reverse properties and for constraints on paths (such as length).
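To make the idea concrete, here is a minimal Python sketch of matching a regular expression over the properties along a path, with reverse steps and a length bound. It is not SPARQLeR syntax or the paper's algorithm: the function name `matching_paths`, the `^` prefix for a reverse step, and the `/`-terminated encoding of a path are all assumptions made for illustration, and property names are assumed not to contain `/`.

```python
import re
from collections import defaultdict


def matching_paths(triples, start, end, pattern, max_len):
    """Return property sequences of simple paths from start to end whose
    encoded form fully matches the regular expression `pattern`.

    A path is encoded as its property names, each followed by "/";
    a step that follows a triple backwards is written "^name".
    """
    edges = defaultdict(list)
    for s, p, o in triples:
        edges[s].append((p, o))          # forward step
        edges[o].append(("^" + p, s))    # reverse step

    def walk(node, props, visited):
        if node == end and props:
            yield tuple(props)
        if len(props) == max_len:        # length constraint on the path
            return
        for prop, nxt in edges[node]:
            if nxt not in visited:       # simple paths only, so no cycles
                yield from walk(nxt, props + [prop], visited | {nxt})

    regex = re.compile(pattern)
    return [
        path
        for path in walk(start, [], {start})
        if regex.fullmatch("".join(step + "/" for step in path))
    ]


# Hypothetical data: who knows whom, and who is employed where.
demo_triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("acme", "employs", "carol"),
]
demo_result = matching_paths(
    demo_triples, "alice", "acme", r"(knows/)+\^employs/", max_len=3
)
```

The pattern reads as "one or more knows steps, then employs followed backwards". Enumerating paths and filtering them afterwards is the simplest thing that works for a sketch; a real engine would presumably interleave the matching with the traversal.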

See also: PSPARQL: http://psparql.inrialpes.fr/

Minimal Deductive Systems for RDF
Sergio Muñoz, Jorge Pérez and Claudio Gutierrez.

This is a proposal for reduced RDFS with just rdfs:domain, rdfs:range, rdfs:subClassOf, rdfs:subPropertyOf and rdf:type.

The result (on page 8) is a small set of rules that have to be applied to the data, but there is no core vocabulary. The rules can be applied to a data stream, if the RDFS schema is known in advance, because each rule refers to at most one data triple.
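The streaming property can be sketched in a few lines of Python. This is a simplified illustration, not the paper's exact rule set: the schema is closed under subClassOf and subPropertyOf once, up front, and then every data triple is expanded on its own without looking at any other data triple. The names `Schema`, `entail_one` and `entail_stream` are made up for this sketch, and terms are plain strings.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def closure(pairs):
    """Map each term to all terms reachable through (child, parent) pairs."""
    direct = defaultdict(set)
    for child, parent in pairs:
        direct[child].add(parent)
    reach = {}
    for start in list(direct):
        seen, stack = set(), list(direct[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(direct.get(node, ()))
        reach[start] = seen
    return reach


def index(pairs):
    """Group (property, class) pairs by property."""
    table = defaultdict(set)
    for prop, cls in pairs:
        table[prop].add(cls)
    return table


class Schema:
    """The schema, preprocessed once before any data is streamed."""

    def __init__(self, sub_class=(), sub_property=(), domain=(), range_=()):
        self.super_classes = closure(sub_class)
        self.super_props = closure(sub_property)
        self.domain = index(domain)
        self.range = index(range_)


def entail_one(schema, triple):
    """Return the triple plus everything it entails with the schema alone."""
    s, p, o = triple
    out = {triple}
    props = {p} | schema.super_props.get(p, set())
    typed = []                                  # (node, class) pairs
    for q in props:
        out.add((s, q, o))                      # subPropertyOf
        typed += [(s, c) for c in schema.domain.get(q, ())]
        typed += [(o, c) for c in schema.range.get(q, ())]
    if RDF_TYPE in props:
        typed.append((s, o))                    # the triple is itself a typing
    for node, cls in typed:
        out.add((node, RDF_TYPE, cls))
        for sup in schema.super_classes.get(cls, ()):
            out.add((node, RDF_TYPE, sup))      # subClassOf
    return out


def entail_stream(schema, triples):
    """Expand a stream of data triples one at a time, keeping no data state."""
    for triple in triples:
        yield from sorted(entail_one(schema, triple))


# Hypothetical schema and one data triple.
demo_schema = Schema(
    sub_class=[("ex:Dog", "ex:Animal")],
    sub_property=[("ex:hasDog", "ex:hasPet")],
    domain=[("ex:hasPet", "ex:Person")],
    range_=[("ex:hasDog", "ex:Dog")],
)
demo_out = entail_one(demo_schema, ("ex:alice", "ex:hasDog", "ex:rex"))
```

Because `entail_one` never consults previously seen data, memory use depends on the size of the schema and not on the size of the data. The same inferred triple can be emitted more than once across the stream; removing duplicates would need state or a later sort.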

There are no containers, which may be inconvenient, but containers might be more usefully covered without typing, by having a distinct property just to match these syntactic constructs. That keeps the container vocabulary from interacting with the application vocabulary.

A colleague here, Nipun Bhatia, has been working on streaming checking and rule application based on extending Eyeball. Nipun even adds cardinality validation by preprocessing the data to get the triples in subject order. Unix sort(1) is quite capable of sorting very large N-Triples files in sensible amounts of time.
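Why subject order helps can be shown with a small Python sketch. This is not Nipun's code or anything from Eyeball, just an illustration under stated assumptions: the input is N-Triples lines already sorted so that all triples for one subject are adjacent, the first two terms of each line contain no whitespace (true for IRIs and blank node labels), and the hypothetical `max_card` table maps a predicate to the maximum number of values allowed per subject.

```python
from collections import Counter
from itertools import groupby


def subject_and_predicate(line):
    """Return the first two whitespace-separated terms of an N-Triples line."""
    subject, predicate, _rest = line.split(None, 2)
    return subject, predicate


def cardinality_violations(sorted_lines, max_card):
    """Yield (subject, predicate, count) where a subject exceeds a limit.

    Only the triples of the current subject are held in memory, which is
    what makes the check workable on very large sorted files.
    """
    rows = (
        subject_and_predicate(line)
        for line in sorted_lines
        if line.strip() and not line.lstrip().startswith("#")
    )
    for subject, group in groupby(rows, key=lambda row: row[0]):
        counts = Counter(predicate for _subject, predicate in group)
        for predicate, count in counts.items():
            limit = max_card.get(predicate)
            if limit is not None and count > limit:
                yield subject, predicate, count


# Hypothetical data, already in subject order.
demo_lines = [
    '<http://example/s1> <http://example/name> "a" .',
    '<http://example/s1> <http://example/name> "b" .',
    '<http://example/s1> <http://example/age> "3" .',
    '<http://example/s2> <http://example/name> "c" .',
]
demo_violations = list(
    cardinality_violations(demo_lines, {"<http://example/name>": 1})
)
```

In practice the lines would come from a file object reading the output of sort(1) instead of a list. Note that a triple repeated in the input is counted once per occurrence here, so duplicates should be removed first if that matters.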

Semantic Process Retrieval with iSPARQL
Christoph Kiefer, Abraham Bernstein, Hong Joo Lee, Mark Klein and Markus Stocker.

((Non) interest declaration: Markus is now spending a few months working with us in Bristol - this work was done before that.)

The core of this paper is an example where statistical techniques beat logic. There is a strong message to us all here: don't think logic and perfect organization are necessarily the best solution to actual problems.

As part of this work, but not the main argument of the paper, they created iSPARQL (i=imprecise), which embeds access to similarity metrics inside standard SPARQL without syntax changes. They use property functions (they are using ARQ but the principle is quite general) to access the similarity engine.

The idea of embedding some index or other functionality that can provide bindings of variables for some expression seems like a general extension technique for SPARQL. LARQ provides free-text matching, using Lucene to do the matching, and can include all the Lucene loose matching.
