[Swift-devel] Re: Meet friday to discuss CNARI status?
Michael Wilde
wilde at mcs.anl.gov
Thu Jan 31 16:08:18 CST 2008
Sarah, all - I had a chance to dig deeper into Sarah's email describing
problems in expressing her workflow in swift.
I've commented on her observations and am sending to the list for
discussion.
On 1/31/08 2:05 PM, skenny at uchicago.edu wrote:
...
> regarding freesurfer implementation in swift...so, my previous
> implementation in vds (which has been pretty stable) worked in
> single-site execution. the reason for this was that freesurfer
> produces LOTS of files (including logs and sigfiles that are
> sometimes created implicitly but will cause a failure later in
> the workflow if they're not there) which vary somewhat
> depending on your run and are extremely difficult to map
> individually.
Can you send us a sample of their naming pattern and a note on how many
files there are and how large they are?
> so what i've done is build the directory tree on
> the remote site on the shared filesystem and then let jobs go
> out, crunch away on that tree and then tar the whole thing up
> and send it back at the end.
Sounds reasonable for the moment. Are you already doing this with a
wrapper script that can build the directory anywhere, without requiring
pre-setup of each site you want to use?
> the freesurfer workflow (aka reconall) comes in 3
> stages--recon1, recon2 and recon3. i finished recon1 in swift
> using single-site execution in a similar fashion as i'd done
> in the past. but with many sites failing me (including uc/anl
> a lot now)
Can you describe these failures? We need to improve Swift's ability to
deal with site problems and, in some cases, to report site errors and/or
suggest site improvements.
> and due to the prompting of ben and mihael i've
> decided to try and rework it for multi-site execution...this
> means rewriting recon1 instead of moving on to recon2 which
> was my intent. so that's where i'm at...drudging my way thru a
> multi-site version of recon1.
It would be ideal if there were no difference between a 1-site and a
multi-site version (i.e., what you call the multi-site version is really
a general version with no site dependencies and fully explicit file
declarations).
One way to do this is with a wrapper script that tars up the
intermediate files and declares the tarball as an output of the Swift
function (as you suggest below).
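To make the idea concrete, here is a minimal sketch of such a wrapper in
Python. It is only an illustration under assumptions: the function name
run_and_pack and its arguments are hypothetical and are not part of
Swift or freesurfer. It runs a command in a fresh scratch directory and
packs whatever files the command left there into a single tarball, so
the individual file names never have to be declared.

```python
import subprocess
import tarfile
import tempfile
from pathlib import Path


def run_and_pack(command, output_tarball):
    """Run `command` in a fresh scratch directory, then tar up what it wrote.

    Hypothetical helper for illustration only; `command` is an argument
    list as accepted by subprocess.run.
    """
    output_tarball = Path(output_tarball).resolve()
    with tempfile.TemporaryDirectory() as scratch:
        # The application writes its files (logs, sigfiles, ...) under
        # the scratch directory; a non-zero exit status raises an error.
        subprocess.run(command, cwd=scratch, check=True)
        with tarfile.open(output_tarball, "w:gz") as tar:
            # arcname="." stores member paths relative to the scratch dir.
            tar.add(scratch, arcname=".")
    return output_tarball
```

The workflow would then only need to declare the one tarball as the
output of the job, which is the property that makes the same script
usable on any site.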
> incidentally, this relates back
> to our discussion of being able to pass a dir tree as
> input/output in swift. if that were possible we could have
> multi-site execution in swift more easily; but because it's
> not i need to rework every stage of the (hefty) workflow to
> make sure it's mapping all potential input and output files.
This is the part that I'd like to work through with you, because I
thought that in general passing a dir tree was straightforward. So I'd
like to work with you to understand why in this case it's not.
> so, admittedly, it's a little frustrating...it's not
> impossible it's just giving me flashbacks of starting
> freesurfer in vds from the very beginning and how trying it
> was, heh.
That's what I was hoping to avoid. I think we'll need to work harder on
this with the group.
> i guess the other 2 options, which might speed it up a little
> (but are maybe not using the expressiveness of swift so much)
> would be to either 1) continue working on a single-site
> execution changing sites when necessary or 2) passing a tar of
> the entire tree for each job.
Right - option 2 is not so bad, but let's see if it's needed.
Swift lets you declare an argument to be a dataset made up of multiple
files. I suspect the problem is simply that the names of the output
files are not known before the application starts. The remedy is a
wrapper, and if that's the case, we might as well use the tarball
solution instead of single files.
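As a rough sketch of the consuming side of the tarball approach (again
in Python, with hypothetical helper names list_outputs and unpack_tree
that are not part of Swift), the next stage could unpack the tree it
receives and, if needed, discover the output file names after the fact
rather than declaring them up front:

```python
import tarfile
from pathlib import Path


def list_outputs(tarball):
    """Return the regular-file paths stored in the tarball, sorted."""
    with tarfile.open(tarball, "r:*") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


def unpack_tree(tarball, dest):
    """Recreate the packed directory tree under `dest` for the next stage."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "r:*") as tar:
        # Assumes the tarball came from our own wrapper and is trusted.
        tar.extractall(dest)
    return dest
```

The cost is that the whole tree is transferred for each job, which is
why the number and size of the files matter.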
- Mike