07 May 2006

Parameterized Queries

Sometimes an application needs to make a SPARQL query that uses the results of a previous query, or some RDF term found through the other Jena APIs.

SQL has prepared statements - they allow an SQL statement to take a number of parameters. The application fills in the parameters and executes the statement.

One way to do this in SPARQL is to build a complete, new query string, parse it and execute it. But it takes a little care to handle all cases, such as quoting special characters; you can at least use some of the many utilities in ARQ for producing strings, such as FmtUtils.stringForResource (it's currently in the util package, not in the application API).
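To give a feel for the care needed, here is a minimal sketch of the string-building approach for one case only: a plain string literal used as the title. The class SparqlLiteralQuoting and its methods quote and titleQuery are hypothetical names made up for this illustration; they are not part of ARQ, and they do not cover language tags, datatypes, URIs or blank nodes.

```java
// Hypothetical helper for illustration only; not part of ARQ.
final class SparqlLiteralQuoting {

    /** Wrap a Java string in double quotes, escaping the characters
     *  that would otherwise end or corrupt a SPARQL string literal. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;  // backslash
                case '"':  sb.append("\\\""); break;  // double quote
                case '\n': sb.append("\\n");  break;  // newline
                case '\r': sb.append("\\r");  break;  // carriage return
                case '\t': sb.append("\\t");  break;  // tab
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Build the whole query text with the title spliced in as a literal. */
    static String titleQuery(String title) {
        return "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
             + "SELECT ?doc { ?doc dc:title " + quote(title) + " }";
    }

    public static void main(String[] args) {
        // Without quote(), the embedded double quotes would break the query.
        System.out.println(titleQuery("The \"Semantic\" Web"));
    }
}
```

Even this small case needs an escaping table, and every other kind of RDF term needs its own handling, which is why splicing strings together gets fragile quickly.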

Queries in ARQ can be built programmatically but it is tedious, especially when the documentation hasn't been written yet.

Another way is to use query variables and bind them to initial values that apply to all query solutions. Consider the query:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?doc { ?doc dc:title ?title }

It finds documents that have a title, although only ?doc is selected.

Executing a query in a program might look like:

import com.hp.hpl.jena.query.* ;
import com.hp.hpl.jena.rdf.model.* ;

Model model = ... ;
String queryString = StringUtils.join("\n",
         new String[]{
     "PREFIX dc: <http://purl.org/dc/elements/1.1/>",
     "SELECT ?doc { ?doc dc:title ?title }"
         }) ;
Query query = QueryFactory.create(queryString) ;
QueryExecution qexec =
    QueryExecutionFactory.create(query, model) ;
try {
    ResultSet results = qexec.execSelect() ;
    while ( results.hasNext() ) {
       QuerySolution soln = results.nextSolution() ;
       // ?doc is the subject of the pattern, so it is a resource, not a literal
       Resource doc = soln.getResource("doc") ;
    }
} finally { qexec.close() ; }   // always release the query execution

Suppose the application knows the title it's interested in - can it use this to get the document?

The value of ?title can be made a parameter to the query and fixed by an initial binding. All query solutions will be restricted to pattern matches where ?title is that RDF term.

QuerySolutionMap initialSettings = new QuerySolutionMap() ;
initialSettings.add("title", node) ;

and this is passed to the factory that creates QueryExecutions:

QueryExecution qexec =
    QueryExecutionFactory.create(query, model,
                                 initialSettings) ;

It doesn't matter whether the node is a literal, a resource with a URI or a blank node. It becomes a fixed value in the query. This works even for a blank node, because the value never passes through the SPARQL syntax; it is a fixed part of every solution.
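The effect of an initial binding can be pictured with a toy model. The sketch below is not how ARQ evaluates queries and uses no Jena classes: plain strings stand in for RDF terms, and the class InitialBindingSketch with its match and bind methods is a hypothetical illustration. It matches the single pattern { ?doc dc:title ?title }, starting every solution from the initial binding, so a triple only yields a solution if it agrees with the values already fixed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; strings stand in for RDF terms.
final class InitialBindingSketch {

    /** Match { ?doc <predicate> ?title } over the triples,
     *  seeding every candidate solution with the initial binding. */
    static List<Map<String, String>> match(List<String[]> triples,
                                           String predicate,
                                           Map<String, String> initial) {
        List<Map<String, String>> solutions = new ArrayList<>();
        for (String[] t : triples) {
            if (!t[1].equals(predicate)) continue;
            // Each solution starts as a copy of the initial binding.
            Map<String, String> soln = new HashMap<>(initial);
            if (bind(soln, "doc", t[0]) && bind(soln, "title", t[2]))
                solutions.add(soln);
        }
        return solutions;
    }

    /** Bind a variable, or check the value agrees with an existing binding. */
    static boolean bind(Map<String, String> soln, String var, String value) {
        String existing = soln.putIfAbsent(var, value);
        return existing == null || existing.equals(value);
    }

    public static void main(String[] args) {
        List<String[]> data = List.of(
            new String[]{"doc1", "dc:title", "SPARQL Basics"},
            new String[]{"doc2", "dc:title", "Jena Tutorial"});

        // No initial binding: both documents match.
        System.out.println(match(data, "dc:title", Map.of()).size());

        // ?title fixed in advance: only the matching document remains.
        Map<String, String> initial = Map.of("title", "Jena Tutorial");
        System.out.println(match(data, "dc:title", initial));
    }
}
```

The fixed value is compared as a value and is never written into query text, which is why in the real API no quoting is needed and a blank node works as well as any other term.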

This gives named parameters to queries, enabling something like SQL prepared statements, except with named parameters rather than positional ones.

This can make a complex application easier to structure and clearer to read. It's better than bashing strings together, which is error-prone, inflexible, and does not lead to clear code.


AndyS said...

See also leigh's blog

Anonymous said...

A try/catch/finally without catch-clause is very stupid... :-D See you...

AndyS said...

The code block after finally gets executed for any runtime exceptions and also if the code exits normally. Using finally guarantees that the cleanup is done even if an unexpected exception (from another subsystem maybe) happens.