[Swift-devel] Re: Swift Provenance

Ben Clifford benc at hawaga.org.uk
Thu Mar 25 10:59:30 CDT 2010


(cc swift devel in case others are interested)

> We're going to work on making provenance more friendly to scientists, this
> will involve writing stored procedures (or something equivalent) on top of the
> provenance database to answer common questions that would be hard for a
> scientist to express in SQL.

ok.

VDS1 had a provenance query language that had roughly the same goals (I 
think).

I have some criticisms of that:

The language tried to support multiple database backend formats (XML and 
SQL) and in doing so failed to support either well (for example, sometimes 
it was desirable to make a join, but this was not supported, meaning that 
you had to drop out to the SQL database directly for such queries). Pick 
one model and support it (I think SQL is my preferred one, but you may 
differ).

Data was made available in a mishmash of models - for example, the 
abstract provenance query language would sometimes return an entire XML 
blob of data, which could not be further queried inside the provenance 
query language. This meant that having used the provenance query language 
to partly answer a question, in the end tools such as ad-hoc bad XML 
parsers written in Perl ended up being used too (see the Provenance 
Challenge 1 queries for VDS for example). Make sure that whatever you 
produce can actually completely address the queries you are trying to 
address - if I have to hack some bodge on to the end of what you produce 
to answer my question, then I might as well have hit raw SQL in the first 
place.
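
To illustrate what I mean by answering a question completely in one
model, here is a minimal sketch using Python's sqlite3 module. The schema
(tables "executions" and "dataset_usage"), the sample rows and the
function names are all hypothetical and are not the real Swift provenance
schema. The helper plays the role of a stored procedure: the scientist
asks "which applications contributed to this dataset?" and the join and
the recursion both stay inside SQL, with nothing bolted on afterwards.

```python
import sqlite3

# Hypothetical lineage query: start from the execution that wrote the
# dataset, then repeatedly follow its inputs back to their producers.
LINEAGE_QUERY = """
WITH RECURSIVE upstream(execution_id) AS (
    SELECT execution_id
      FROM dataset_usage
     WHERE dataset = ? AND direction = 'out'
    UNION
    SELECT produced.execution_id
      FROM upstream
      JOIN dataset_usage AS consumed
        ON consumed.execution_id = upstream.execution_id
       AND consumed.direction = 'in'
      JOIN dataset_usage AS produced
        ON produced.dataset = consumed.dataset
       AND produced.direction = 'out'
)
SELECT e.app
  FROM upstream AS u
  JOIN executions AS e ON e.id = u.execution_id
 ORDER BY e.id
"""


def demo_db():
    """Build a tiny in-memory database with an assumed, made-up schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE executions (id INTEGER PRIMARY KEY, app TEXT);
        CREATE TABLE dataset_usage (
            execution_id INTEGER REFERENCES executions(id),
            dataset TEXT,
            direction TEXT CHECK (direction IN ('in', 'out'))
        );
        INSERT INTO executions VALUES (1, 'align'), (2, 'slice');
        INSERT INTO dataset_usage VALUES
            (1, 'raw.img', 'in'), (1, 'aligned.img', 'out'),
            (2, 'aligned.img', 'in'), (2, 'atlas.gif', 'out');
    """)
    return conn


def contributing_apps(conn, dataset):
    """Return the apps that directly or indirectly produced `dataset`."""
    rows = conn.execute(LINEAGE_QUERY, (dataset,)).fetchall()
    return [app for (app,) in rows]
```

With the sample rows above, contributing_apps(demo_db(), "atlas.gif")
walks back from "slice" to "align" without leaving the database. Whether
this lives in a Python wrapper, a view or a real stored procedure is a
design choice; the point is only that the whole question is answered in
one model.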

I'm not sure what the other people involved in OPM are up to now, but 
maybe they have interesting scientist-friendly approaches. OPM feels like 
a very good fit for Swift's provenance, and so there is some potential 
there if anyone else is actually working in the "easy query" space (not 
sure if anyone is).
