[Swift-devel] Re: Meet friday to discuss CNARI status?

Michael Wilde wilde at mcs.anl.gov
Thu Jan 31 16:08:18 CST 2008


Sarah, all - I had a chance to dig deeper into Sarah's email describing 
problems in expressing her workflow in swift.

I've commented on her observations and am sending to the list for 
discussion.


On 1/31/08 2:05 PM, skenny at uchicago.edu wrote:

...

> regarding freesurfer implementation in swift...so, my previous
> implementation in vds (which has been pretty stable) worked in
> single-site execution. the reason for this was that freesurfer
> produces LOTS of files (including logs and sigfiles that are
> sometimes created implicitly but will cause a failure later in
> the workflow if they're not there) which vary somewhat
> depending on your run and are extremely difficult to map
> individually.

Can you send us a sample of their naming pattern and a note on how many 
and how large?

> so what i've done is build the directory tree on
> the remote site on the shared filesystem and then let jobs go
> out, crunch away on that tree and then tar the whole thing up
> and send it back at the end.

Sounds reasonable for the moment.  Are you already doing this as a 
wrapper script that can build a directory anywhere and does not require 
pre-setup of each site you want to use?

> the freesurfer workflow (aka reconall) comes in 3
> stages--recon1, recon2 and recon3. i finished recon1 in swift
> using single-site execution in a similar fashion as i'd done
> in the past. but with many sites failing  me (including uc/anl
> a lot now)

Can you describe these failures?  We need to improve swift's ability to 
deal with site problems and in some cases report site errors and/or 
suggest site improvements.

> and due to the prompting of ben and mihael i've
> decided to try and rework it for multi-site execution...this
> means rewriting recon1 instead of moving on to recon2 which
> was my intent. so that's where i'm at...drudging my way thru a
> multi-site version of recon1.

It would be ideal if there were no difference between a 1-site and 
multi-site version (i.e., what you call the multi-site version is really a 
general version with no site dependencies and fully-explicit file 
declarations).

One way to do this is with a wrapper script that tars up the intermediate 
files and declares the tarball as an output of the swift function (as you 
suggest below).

> incidentally, this relates back
> to our discussion of being able to pass a dir tree as
> input/output in swift. if that were possible we could have
> multi-site execution in swift more easily; but because it's
> not i need to rework every stage of the (hefty) workflow to
> make sure it's mapping all potential input and output files.

This is the part that I'd like to work through with you, because I 
thought that in general passing a dir tree was straightforward. So I'd 
like to work with you to understand why in this case it's not.

> so, admittedly, it's a little frustrating...it's not
> impossible it's just giving me flashbacks of starting
> freesurfer in vds from the very beginning and how trying it
> was, heh. 

That's what I was hoping to avoid. I think we'll need to work on this 
harder with the group.

> i guess the other 2 options, which might speed it up a little
> (but are maybe not using the expressiveness of swift so much)
> would be to either 1) continue working on a single-site
> execution changing sites when necessary or 2) passing a tar of
> the entire tree for each job.

Right - that option 2 is not so bad, but let's see if it's needed.

Swift lets you declare an argument to be a dataset made up of multiple 
files.  I suspect the problem is simply that the names of the output 
files are not known before the application starts.  The remedy is a 
wrapper, and if that's the case, we might as well use the tarball solution 
instead of single files.
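
To make the tarball idea concrete, here is a rough sketch of such a wrapper 
in Python. It is only an illustration under assumptions: the function name 
run_and_pack, the "job-" prefix and the "tree" archive root are made up for 
this example, and nothing here is an existing swift or freesurfer interface. 
It runs an arbitrary command inside a fresh scratch directory and then packs 
whatever files the command left behind into one tar file, so the only output 
whose name must be known in advance is the tarball itself.

```python
import subprocess
import sys
import tarfile
import tempfile
from pathlib import Path


def run_and_pack(command, output_tar, scratch_root=None):
    """Run `command` in a fresh work directory, then tar up that directory.

    `command` is an argument list (program plus arguments). `scratch_root`
    is an optional parent directory for the work directory; by default the
    system temp location is used. Both names are illustrative only.
    """
    # Build the work directory wherever the job happens to land, so no
    # per-site pre-setup is assumed.
    workdir = Path(tempfile.mkdtemp(prefix="job-", dir=scratch_root))

    # Run the application with the work directory as its current directory,
    # so files it creates implicitly (logs and so on) end up inside the tree.
    result = subprocess.run(command, cwd=workdir)

    # Pack the whole tree, whatever file names the application produced.
    with tarfile.open(output_tar, "w:gz") as tar:
        tar.add(workdir, arcname="tree")

    # Hand the application's exit status back to the caller.
    return result.returncode
```

A caller would invoke something like run_and_pack(["some_app", "arg"], 
"out.tar.gz") and declare out.tar.gz as the single output file; the next 
stage would unpack it before running. The cost is that every stage moves the 
whole tree, which is why the file counts and sizes matter.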

- Mike



