I've implemented two new extensions for ARQ:
- Assignment
- Sub-queries
Both of these expose facilities that are already in the query algebra.
Sub-queries are done by simply allowing query algebra operators to appear anywhere in the query, rather than requiring solution modifiers to be only at the outer level of the query. This allows extensions like counting to be inside the query and available to the rest of the pattern matching. An assignment operator already existed as an algebra extension, for optimization and to support ARQ SELECT expressions.
Both are syntactic extensions and are available if the query is parsed with language Syntax.syntaxARQ.
Currently available in ARQ SVN.
Assignment
This assigns a computed value to a variable in the middle of a pattern.
LET (?x := ?y + 5 )
The assignment operator is ":=". A single "=" is already the test for equals in SPARQL.
This means that a computed value can be used in other pattern matching:
SELECT ?y ?area
{
  ?x rdf:type :Rectangle ;
     :height ?h ;
     :width ?w .
  LET (?area := ?h*?w )
  GRAPH <otherShapes>
  {
    ?y :area ?area .    # Shapes with the same area
  }
}
Application writers can provide their own functions, maybe to do a little data munging to map between different formats:
?x foaf:name ?name .    # "John Smith"
# Convert to a different style: "Smith, John" for example.
LET (?vcardName := my:convertName(?name) )
?y vCard:FN ?vcardName .
There are some rules for the assignment:
- if the expression does not evaluate (e.g. unbound variable in the expression), no assignment occurs and the query continues.
- if the variable is unbound, and the expression evaluates, the variable is bound to the value.
- if the variable is bound to the same value as the expression evaluates to, nothing happens and the query continues.
- if the variable is bound to a different value from the one the expression evaluates to, an error occurs and the current solution is excluded from the results.
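The four rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated here, not ARQ's implementation: the function name apply_let is made up, a solution is modelled as a plain dict from variable name to value, an expression that "does not evaluate" is modelled only as a KeyError from an unbound variable, and "same value" is plain Python equality rather than RDF term equality.

```python
from typing import Callable, Dict, Optional

Solution = Dict[str, object]


def apply_let(solution: Solution, var: str,
              expr: Callable[[Solution], object]) -> Optional[Solution]:
    """Apply LET (?var := expr) to one solution.

    Returns the (possibly extended) solution, or None when the
    solution is excluded from the results.
    """
    try:
        value = expr(solution)
    except KeyError:
        # Rule 1: the expression does not evaluate (unbound variable),
        # so no assignment occurs and the query continues.
        return solution
    if var not in solution:
        # Rule 2: the variable is unbound, so bind it to the value.
        extended = dict(solution)
        extended[var] = value
        return extended
    if solution[var] == value:
        # Rule 3: already bound to the same value, nothing happens.
        return solution
    # Rule 4: bound to a different value, the solution is excluded.
    return None


def area(s: Solution) -> object:
    # Stands in for the expression ?h*?w from the rectangle example.
    return s["h"] * s["w"]
```

With this sketch, apply_let({"h": 2, "w": 3}, "area", area) adds an "area" binding, while a solution that already carries a different "area" is dropped.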
ARQ already has SELECT expressions, so a combination of sub-query and expression can achieve the same effect, but it's unnatural and verbose, and sometimes requires parts of the pattern matching to be written twice, inside and outside the sub-query.
One place where LET might be useful is in a CONSTRUCT query. In strict SPARQL, only terms found in the original data can be used for variables in the construct template, but with LET-assignment computed values can be used as well:
CONSTRUCT { ?x :lengthInInches ?inch }
WHERE
{
  ?x :lengthInCM ?cm
  LET (?inch := ?cm/2.54 )
}
This isn't a new idea - see for example: "A SPARQL Semantics based on Datalog" - although the syntax in ARQ is designed to group the terms better.
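To make the CONSTRUCT case concrete, here is a rough Python sketch of what the query above produces. It is not how ARQ evaluates the query: the helper name construct_inches and the representation of triples as Python tuples are assumptions made for illustration. The point is that the object of each output triple is a computed value that never appears in the input data.

```python
from typing import List, Tuple

Triple = Tuple[str, str, object]


def construct_inches(data: List[Triple]) -> List[Triple]:
    """Mimic the CONSTRUCT query: match ?x :lengthInCM ?cm, then
    instantiate the template ?x :lengthInInches ?inch."""
    output = []
    for subject, predicate, obj in data:
        if predicate == ":lengthInCM":
            # LET (?inch := ?cm/2.54 )
            inch = obj / 2.54
            output.append((subject, ":lengthInInches", inch))
    return output


# Made-up sample data: one length and one unrelated triple.
data = [
    (":ruler", ":lengthInCM", 2.54),
    (":ruler", ":colour", "red"),
]
triples = construct_inches(data)
```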
Sub-queries
A sub-query can be used to apply some solution modifier to a sub-pattern. Useful examples include aggregation, especially grouping and counting, and LIMIT with ORDER BY to get only some of the results of a pattern match.
{ SELECT (COUNT(*) AS ?c) { ?s ?p ?o } }
A sub-query is enclosed by {} and must be the only thing inside those braces, the same style as Virtuoso Subqueries.
The sub-query is combined, by SPARQL join, with the other patterns in the same group. For example, to find how many people each person with two or more phones foaf:knows:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?knowsCount
{
  # ?person who have 2 or more phones
  { SELECT ?person
    WHERE { ?person foaf:phone ?phone }
    GROUP BY ?person
    HAVING (COUNT(?phone) >= 2)
  }
  # Join on ?person with how many people they foaf:knows
  { SELECT ?person (COUNT(?x) AS ?knowsCount)
    WHERE { ?person foaf:knows ?x . }
    GROUP BY ?person
  }
}
Queries with sub-queries can become complicated quite quickly, so I usually write each of the parts separately and then combine them.
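The same "write each part, then combine" approach can be sketched in Python to show what the join of the two sub-queries in the phones example does. This is a simplified model, not ARQ code: the helper names group_count and join and the sample data are invented for illustration, and solutions are plain dicts keyed by variable name.

```python
from collections import defaultdict
from typing import Dict, List

Row = Dict[str, object]


def group_count(rows: List[Row], key: str, counted: str) -> Dict[object, int]:
    """GROUP BY ?key with COUNT(?counted): count rows per key
    where the counted variable is bound."""
    counts = defaultdict(int)
    for row in rows:
        if counted in row:
            counts[row[key]] += 1
    return dict(counts)


def join(left: List[Row], right: List[Row]) -> List[Row]:
    """Join two solution sequences: keep pairs that agree on
    every variable they share, and merge them."""
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.append({**l, **r})
    return out


# Made-up matches for ?person foaf:phone ?phone
phones = [
    {"person": "alice", "phone": "tel:1"},
    {"person": "alice", "phone": "tel:2"},
    {"person": "bob", "phone": "tel:3"},
]
# Made-up matches for ?person foaf:knows ?x
knows = [
    {"person": "alice", "x": "bob"},
    {"person": "alice", "x": "carol"},
    {"person": "alice", "x": "dave"},
    {"person": "bob", "x": "alice"},
]

# First sub-query: GROUP BY ?person HAVING (COUNT(?phone) >= 2)
with_two_phones = [
    {"person": p}
    for p, n in group_count(phones, "person", "phone").items()
    if n >= 2
]
# Second sub-query: SELECT ?person (COUNT(?x) AS ?knowsCount)
knows_counts = [
    {"person": p, "knowsCount": n}
    for p, n in group_count(knows, "person", "x").items()
]

results = join(with_two_phones, knows_counts)
```

Each part can be checked on its own before joining. In this model, a person with two phones who knows nobody has no row in the second part and so drops out of the joined result.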