[Swift-devel] Re: Analysis of wrapper.sh

Michael Wilde wilde at mcs.anl.gov
Tue Jul 29 23:52:54 CDT 2008


This is a good basis for a model, Ben. We should capture and refine this 
in the same doc as the one mentioned above.

On 7/29/08 12:37 PM, Ben Clifford wrote:
> in a similar context, plenty of people involved in swift development have 
> had thoughts about how data management might change in the future.
> 
> it would be useful to define what the application 'contract' is, so as to 
> separate out the accidental features of the wrapper as distinct from that 
> contract.

yes

> So (in my mind, though others perhaps see it differently): an application 
> can expect to be started up in a working directory that is not shared with 
> any other application start up; and that mapped input files will be 
> available for read only access within that directory (at top level or in 
> subdirs, depending on mapping); and mapped output files should be left in 
> that directory (at top level or in subdirs, depending on mapping)

yes.

may want to define the requirements that make hard and soft links 
possible in the work dir.
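
To make the contract above concrete, here is a minimal sketch in Python (not the actual wrapper.sh). The helper names `stage_job_dir` and `missing_outputs` and their parameters are hypothetical and not part of Swift. The sketch assumes inputs are exposed by hard link where the file system allows it, falling back to a soft link and then to a plain copy; it does not enforce read-only access, which is left as a convention here.

```python
import os
import shutil
import tempfile


def stage_job_dir(inputs, base=None):
    """Create a private working directory and place mapped inputs in it.

    `inputs` maps a path relative to the job directory to the path of the
    source file. Hypothetical helper for illustration, not Swift code.
    """
    # Each call gets a fresh directory, so no two jobs share a working dir.
    jobdir = tempfile.mkdtemp(prefix="job-", dir=base)
    for rel, src in inputs.items():
        src = os.path.abspath(src)
        dest = os.path.join(jobdir, rel)
        # The mapping may place inputs at top level or in subdirectories.
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        try:
            os.link(src, dest)            # hard link: same file system only
        except OSError:
            try:
                os.symlink(src, dest)     # soft link: source must stay visible
            except OSError:
                shutil.copy2(src, dest)   # last resort: plain copy
    return jobdir


def missing_outputs(jobdir, outputs):
    """Return the mapped output paths the application did not leave behind."""
    return [rel for rel in outputs
            if not os.path.isfile(os.path.join(jobdir, rel))]
```

The fallback chain is where the link requirements show up: a hard link needs source and work dir on the same file system, while a soft link needs the source path to remain visible from wherever the application runs.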

> Applications should not make assumptions about the nature of the file 
> system (which is how the wrapper can have an option to switch between 
> working dirs on the worker node or on shared fs). Nor should they 
> necessarily assume that the wrapper.sh script is the way in which things 
> get there, or that there is a shared directory at all; for example, if 
> Falkon's real or future data management features were wired in, Falkon 
> might handle the movement of files from some submit-side location to 
> individual application working directories...

all sound good at the moment, certainly on the right track.

> Separate from the above is our implementation of that interface, which is 
> both the wrapper.sh on the worker side and behaviour in the submit-side 
> Swift code to manage the site-side shared directories.

I've felt that some similar contract is needed between the swift 
interpreter and the mappers it calls. There should be an abstract data 
model defined by the swift language - that of scalars, files, structs 
and arrays; and separately, the mapping of that to files/data-objects on 
various storage systems (including perhaps the "shared dirs" above).

I'm in favor of nailing down contracts for the app exec side and the 
swift side, and then having families of mappers that implement different 
data management strategies.
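
One way to picture the split between the abstract data model and a family of mappers, as a rough Python sketch. Every name here (`FileRef`, `Mapper`, `SharedDirMapper`, `ReplicaCatalogMapper`, `resolve`, `locations`) is hypothetical and is not Swift's actual mapper interface; the URLs in the usage note are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Union

# Abstract data model: scalars, files, structs (dicts) and arrays (lists).
Scalar = Union[int, float, str, bool]


@dataclass(frozen=True)
class FileRef:
    """A file-valued datum, known to the language only by a logical name."""
    logical_name: str


Value = Union[Scalar, FileRef, Dict[str, "Value"], List["Value"]]


class Mapper(Protocol):
    """Contract between the interpreter and a data management strategy."""

    def resolve(self, ref: FileRef) -> str:
        """Return a storage-specific location for a logical file."""
        ...


@dataclass
class SharedDirMapper:
    """Maps logical files onto paths under one shared directory."""
    root: str

    def resolve(self, ref: FileRef) -> str:
        return f"{self.root}/{ref.logical_name}"


@dataclass
class ReplicaCatalogMapper:
    """Maps logical files onto the first replica listed in a catalog."""
    catalog: Dict[str, List[str]]

    def resolve(self, ref: FileRef) -> str:
        return self.catalog[ref.logical_name][0]


def locations(value: Value, mapper: Mapper) -> List[str]:
    """Walk a value of the abstract model and resolve every file in it."""
    if isinstance(value, FileRef):
        return [mapper.resolve(value)]
    if isinstance(value, dict):
        return [loc for v in value.values() for loc in locations(v, mapper)]
    if isinstance(value, list):
        return [loc for v in value for loc in locations(v, mapper)]
    return []  # scalars carry no files
```

For a value such as `{"img": FileRef("a.dat"), "parts": [FileRef("b.dat")]}`, the same `locations` call works with either mapper, so the interpreter side never needs to know which storage strategy is in use.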

I suspect we may need to generalize the interaction between the swift 
"vm" that interprets primitives and the storage providers that the 
mappers provide references to. Not sure how all these fit together but 
I think we should address this as we get closer to implementing 
VDS-style data caching with an RLS-like catalog and replica model.

- Mike



