[Swift-devel] multiple worker.sh in the same job
Mihael Hategan
hategan at mcs.anl.gov
Tue Jul 1 16:27:58 CDT 2008
This whole thing, I think, applies not only to MPI jobs, but also to any
job requesting more than one node. So I think the solution is not to
swap mpirun and wrapper.sh, but, along the lines of what Andriy did,
perform all the relevant wrapper functions in only one instance and have
a barrier right before running the executable as well as right after.
How exactly this would be done is a little hazy in my head, but I guess
that's what makes it interesting.
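A minimal sketch of that shape, under loud assumptions: threads stand in for the per-node wrapper instances, and `run_job`, `setup`, `executable` and `teardown` are hypothetical names, not anything in wrapper.sh. A real multi-node job would need a barrier that works across nodes, which this does not attempt. It only shows the ordering: one instance does the wrapper work, and everyone waits both before and after the executable.

```python
import threading

def run_job(num_ranks, setup, executable, teardown):
    """Simulate num_ranks wrapper instances; only rank 0 does wrapper work."""
    barrier = threading.Barrier(num_ranks)  # reusable for both waits
    events = []                             # ordered log of what happened
    lock = threading.Lock()

    def log(msg):
        with lock:
            events.append(msg)

    def wrapper(rank):
        if rank == 0:
            setup()                 # e.g. create working dir, link inputs
            log("setup")
        barrier.wait()              # no rank starts before setup is done
        executable(rank)
        log(f"run {rank}")
        barrier.wait()              # no rank proceeds until all have run
        if rank == 0:
            teardown()              # e.g. check outputs, write status
            log("teardown")

    threads = [threading.Thread(target=wrapper, args=(r,))
               for r in range(num_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

Calling `run_job(4, lambda: None, lambda r: None, lambda: None)` returns a log that starts with "setup" and ends with "teardown", with the four runs in between in whatever order the threads happened to go.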
On Tue, 2008-07-01 at 21:11 +0000, Ben Clifford wrote:
> Here is one problem with Swift + MPI, with workaround, that Andriy Fedorov
> <fedorov at cs.wm.edu> and I have uncovered. I'm interested in anyone's
> commentary.
>
> If you use GRAM with jobtype=mpi, then your job is run through mpirun, and
> thus executes on each node in the job rather than once.
>
> In the case of Swift submitting this way, 'your job' actually means the
> Swift server side code, wrapper.sh, not 'your (the user's) job'.
>
> This means there are multiple wrapper.sh jobs running, all trying to use
> the same working directory, input files and output files.
>
> Andriy tried making only one of the nodes create output files (e.g. the rank
> 0 node), and that appears to work in his case, though I think the
> following is happening:
>
> * Each worker will link the same input files into the same working
> directory. If this were a copy, it would be a potentially damaging
> race condition. As it's a link, I think there is still a race
> condition there that would cause some of the workers to fail (so
> perhaps this won't work at all in the presence of any input files - I
> think Andriy's test case does not have any input files).
>
> * I think that all wrapper scripts except the rank-0 one indicate
> failure (because of missing output files), and the rank-0 wrapper script
> indicates success. The Swift submit side checks for the success flag
> before the failure flag, so it regards the job as successful. I think
> this only works if at least one job succeeds, which pretty much means
> one job must generate all the output files, rather than different jobs
> generating different output files.
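One way the link race in the first point could be made harmless, as a sketch only. It assumes the failure is simply the link call finding the destination already present because another worker got there first. `link_input` is a hypothetical helper, not part of wrapper.sh.

```python
import os

def link_input(source, dest):
    """Symlink dest -> source, tolerating another worker having done it first.

    Returns True if this call created the link, False if an identical
    link was already there. Anything else at dest is still an error.
    """
    try:
        os.symlink(source, dest)
        return True
    except FileExistsError:
        # Another worker won the race; accept it only if the existing
        # entry is a link pointing at the same source.
        if os.path.islink(dest) and os.readlink(dest) == source:
            return False
        raise
```

This only removes the "already exists" failure. It does nothing about workers later writing to the same output files.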
>
> I haven't really tested the above out in great depth, but I think that is
> what is happening.
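The check order described in the second point can be sketched as follows. The flag names and the `job_status` function are hypothetical, and the "unknown" fallback is an assumption of this sketch, not something the submit side is known to do.

```python
def job_status(flags):
    """Decide a job's outcome from the markers the wrapper instances left.

    flags is a set of marker names; success is checked before failure,
    so a single success marker outweighs any number of failure markers.
    """
    if "success" in flags:
        return "success"
    if "failure" in flags:
        return "failure"
    return "unknown"  # assumption: neither marker was written
```

With both markers present, as when rank 0 succeeds and every other rank fails, `job_status({"success", "failure"})` gives "success", which matches the behaviour described above.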
>
> From a technical perspective, I think the way to address this is to swap
> the mpirun and wrapper.sh, so that one wrapper.sh runs, and inside that it
> runs mpirun which then spawns only the application executables.
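Roughly what the swapped arrangement looks like, as a sketch rather than a proposal for the actual wrapper. The function names are hypothetical, and `-np` is assumed as the process-count option; launcher options differ between MPI implementations and sites. The point is only that setup and teardown happen once, around a single mpirun invocation.

```python
import subprocess

def build_launch_command(executable, args, num_procs, launcher="mpirun"):
    """Build the launcher command line (assumes the launcher accepts -np)."""
    return [launcher, "-np", str(num_procs), executable, *args]

def run_wrapped(executable, args, num_procs, setup, teardown,
                run=subprocess.call):
    """Do the wrapper work once, with mpirun spawning only the application."""
    setup()                                   # single instance: dirs, inputs
    exit_code = run(build_launch_command(executable, args, num_procs))
    teardown()                                # single instance: outputs, status
    return exit_code
```

The `run` parameter exists only so the launch can be replaced by a stub when trying this out on a machine without MPI.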
>
> There you lose the abstraction from GRAM of being able to specify
> jobtype=mpi; instead you have to know how to do this yourself, and run the
> job as a normal, not mpi, job from GRAM's perspective.
>
> However, in the case of non-GRAM execution mechanisms, that
> abstraction is not in place anyway.
>