28 August 2010

Migrating from the SPARQL Update submission language to the emerging SPARQL 1.1 Update standard

SPARQL 1.1 Update is a work in progress by the SPARQL Working Group, but the general design and language are reasonably stable. There is also the W3C submission SPARQL Update from July 2008. The languages are similar in style but the details of the grammars differ. So how do you migrate from the syntax used in the submission to the upcoming SPARQL recommendation for a SPARQL Update language?

One way is to provide both languages behind a common API, with the application indicating which language to use. This maximises compatibility because, if the submission is the chosen language, the parser for the submission language will be used. But the application has to be changed to move between the languages, and every update script has to be converted, so it is probably a "big bang" changeover. The two languages are very close - is it possible to have a single language that covers both? Then the application can mix usages, and when an update request is printed it can be printed in the soon-to-be standard language, helping people see how the language has changed.

It turns out that most, but not all, of the submission language can be incorporated into the grammar for the emerging standard. The cases not covered don't seem likely to be widely used, although it would be good to know if they are.

  • CREATE, CLEAR, LOAD, DROP are covered.
  • INSERT DATA and DELETE DATA are covered when working on the default graph or on one named graph, but not on more than one graph at once.
  • An extra grammar rule for MODIFY is supported, again working on the default graph or one named graph, so with only a single, optional GRAPH <uri>.
  • The old style INSERT { :s :p :o }, DELETE { :s :p :o }, that is, insert or delete some data using just the INSERT or DELETE keyword, without DATA, leads to ambiguity in the combined grammar. These forms are not supported in the combined language. In fact, these forms pre-date the DATA forms in the submission language.

The ability to work on only one named graph needs a little explanation. In the combined grammar, the INTO or FROM is used to set the WITH part of an update. There can be at most one WITH. In the submission,

INSERT INTO <g1> <g2> <g3> { ... } WHERE { ... }

is legal. In terms of language, this could be incorporated into the extended language but it introduces a capability not present in the upcoming working group language and it can't be written out again without repeating the operation, once for each named graph. Operating on a single named graph, or the default graph, is covered by the standard.
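To make "repeating the operation" concrete, here is a minimal Python sketch that writes out one WITH-based operation per named graph. The function name repeat_per_graph and its arguments are made up for illustration and are not part of ARQ. It also assumes the repetitions do not interfere with each other; if an earlier repetition changes what a later WHERE matches, the sequence would not mean the same as the single multi-graph operation.

```python
def repeat_per_graph(verb, graphs, template, where):
    """Return one WITH-based operation per graph IRI, in the order given.

    Hypothetical helper: 'template' and 'where' are the text inside the braces.
    """
    if verb not in ("INSERT", "DELETE"):
        raise ValueError("verb must be INSERT or DELETE")
    return [
        f"WITH {g} {verb} {{ {template} }} WHERE {{ {where} }}"
        for g in graphs
    ]


# Three graphs in the submission form become three separate operations.
ops = repeat_per_graph("INSERT", ["<g1>", "<g2>", "<g3>"], ":s :p ?o", "?s :q ?o")
print(" ;\n".join(ops))
```

The operations are joined with a semicolon, which the grammar allows after each update operation.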

For old-style INSERT or DELETE of data, conversion can be done by adding the word DATA to the operation or by adding WHERE {} to the update operation. Both conversions yield something that is legal, and means the same, under the submission language, so the conversion can be done while still using old software.
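As a rough illustration of these two conversions, here is a small Python sketch that works on a single operation held in a string. It is not how ARQ does it: the helper names are invented, and the pattern only recognises the simplest case of INSERT or DELETE followed directly by one flat block with no nested braces. Anything else is returned unchanged.

```python
import re

# Hypothetical pattern, not part of ARQ: an old-style operation is INSERT or
# DELETE followed directly by a single flat { ... } block and nothing else.
_OLD_STYLE = re.compile(r"^\s*(INSERT|DELETE)\s*(\{[^{}]*\})\s*$", re.IGNORECASE)


def add_data_keyword(op):
    """Rewrite 'INSERT { ... }' as 'INSERT DATA { ... }' (same for DELETE)."""
    m = _OLD_STYLE.match(op)
    if m is None:
        return op  # not an old-style data operation; leave it untouched
    return f"{m.group(1).upper()} DATA {m.group(2)}"


def add_empty_where(op):
    """Rewrite 'INSERT { ... }' as 'INSERT { ... } WHERE {}' (same for DELETE)."""
    m = _OLD_STYLE.match(op)
    if m is None:
        return op
    return f"{m.group(1).upper()} {m.group(2)} WHERE {{}}"


print(add_data_keyword("INSERT { :s :p :o }"))
print(add_empty_where("DELETE { :s :p :o }"))
```

Operations that already carry DATA or a WHERE clause do not match the pattern, so running the conversion twice is harmless.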

In summary, the accepted forms of the submission language are:

  INSERT [INTO <uri>] {...} WHERE {...}
  DELETE [FROM <uri>] {...} WHERE {...}
  INSERT DATA [INTO <uri>] {...}
  DELETE DATA [FROM <uri>] {...}
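To show what a translation of these four forms might look like, here is a hedged Python sketch. It is not ARQ's parser and may not match what ARQ prints. The name to_sparql11 is invented, the template must be a flat block with no nested braces, and the choice of rendering is an assumption: the WHERE forms put the graph into a WITH, and the DATA forms, which have no WITH, wrap the data in GRAPH.

```python
import re

# Hypothetical recogniser for the four accepted forms (not ARQ's grammar).
_FORM = re.compile(
    r"^\s*(?P<verb>INSERT|DELETE)"
    r"(?P<data>\s+DATA)?"
    r"(?:\s+(?:INTO|FROM)\s*(?P<graph><[^<>\s]*>))?"
    r"\s*\{(?P<template>[^{}]*)\}"
    r"(?P<rest>.*)$",
    re.IGNORECASE | re.DOTALL,
)


def to_sparql11(op):
    """Render one accepted submission-style operation in SPARQL 1.1 style."""
    m = _FORM.match(op)
    if m is None:
        raise ValueError("not one of the accepted submission forms")
    verb = m.group("verb").upper()
    graph = m.group("graph")
    template = m.group("template").strip()
    rest = m.group("rest").strip()
    if m.group("data"):
        if rest:
            raise ValueError("DATA forms take no WHERE clause")
        # Assumption: the graph target becomes a GRAPH block around the data.
        body = f"GRAPH {graph} {{ {template} }}" if graph else template
        return f"{verb} DATA {{ {body} }}"
    if not rest.upper().startswith("WHERE"):
        raise ValueError("old-style INSERT/DELETE without WHERE is not accepted")
    # Assumption: the graph target becomes the WITH part of the update.
    prefix = f"WITH {graph} " if graph else ""
    return f"{prefix}{verb} {{ {template} }} {rest}"


print(to_sparql11("INSERT INTO <http://example/g> { :s :p ?o } WHERE { ?s :q ?o }"))
print(to_sparql11("DELETE DATA FROM <http://example/g> { :s :p :o }"))
```

The old-style forms without DATA or WHERE are rejected rather than guessed at, matching the decision not to support them in the combined language.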

By using an extended grammar, the application can even mix the syntax of the SPARQL Update submission and SPARQL 1.1 Update in a single request or, indeed, a single operation. When printed, the output can be in the equivalent SPARQL 1.1 syntax.

ARQ (currently, the development snapshot) includes a command-line parser for the extended SPARQL 1.1 Update language, "arq.uparse". arq.uparse reads the extended syntax and prints the equivalent strict SPARQL 1.1 Update form. It can be used to translate from the submission language to the W3C standard language. More on practical details: jena-dev/message/45040.

Key points from the extended grammar follow. The working group is not planning on including this in the published SPARQL 1.1 Update grammar.

UpdateUnit  :=  Prologue Update <EOF>

Update  :=  ( Update1 )+

# As for SPARQL 1.1 Update with addition of "ModifyOld"
Update1 :=  ( Load | Clear | Drop | Create |
              InsertData | DeleteData | DeleteWhere |
              Modify | ModifyOld )
            ( <SEMICOLON> )?

Load    :=  <LOAD> IRIref ( <INTO> ( <GRAPH> )? IRIref )?

Clear   :=  <CLEAR> ( <SILENT> )? GraphRefAll

Drop    :=  <DROP> ( <SILENT> )? GraphRefAll

Create  :=  <CREATE> ( <SILENT> )? GraphRef

InsertData  :=  <INSERT_DATA> OptionalIntoTarget QuadPattern

DeleteData  :=  <DELETE_DATA> OptionalFromTarget QuadData

DeleteWhere :=  <DELETE_WHERE> QuadPattern

Modify  :=  ( <WITH> IRIref )?
            ( DeleteClause ( InsertClause )? | InsertClause )
            ( UsingClause )*
            <WHERE> GroupGraphPattern

# The MODIFY form from the submission
ModifyOld   :=  <MODIFY> ( IRIref )?
                ( DeleteClause )?
                ( InsertClause )?
                <WHERE> GroupGraphPattern

DeleteClause    :=  <DELETE> OptionalFromTarget QuadPattern

InsertClause    :=  <INSERT> OptionalIntoTarget QuadPattern

# Optional INTO: wraps the QuadPattern with a GRAPH
OptionalIntoTarget  :=  ( ( <INTO> )? IRIref )?

# Optional FROM: wraps the QuadPattern with a GRAPH
OptionalFromTarget  :=  ( ( <FROM> )? IRIref )?

UsingClause :=  <USING> ( IRIref | <NAMED> IRIref )