[Swift-user] Passing hostType for MPI jobs

Ben Clifford benc at hawaga.org.uk
Thu Jul 3 03:34:18 CDT 2008


On Wed, 2 Jul 2008, Andriy Fedorov wrote:

> Ok, I tried that. It does allocate the correct number of requested
> hosts. But there's still a problem: it appears that only one instance
> of the executable is running, at least when I set jobType to mpi.
> I am not sure it is being run as an MPI job.

I can replicate that with plain GRAM4 on TG UC. In the PBS Epilogue, I 
see:

 Limits:         nodes=5:ia64-compute:ppn=1,walltime=00:15:00
 Nodes:          tg-c053 tg-c034 tg-c020 tg-c011 tg-c007

but my code only has COMM_WORLD size 1.

This code doesn't run at all unless it is launched through MPI, so I think 
the code *is* being run as an MPI job, but the MPI node count is not being 
specified correctly.
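
To make the mismatch concrete, here is a minimal sketch that reads the two 
epilogue lines quoted above and compares the allocation with the observed 
COMM_WORLD size. The helper names (expected_ranks, allocated_hosts) are 
hypothetical, and treating nodes * ppn as the expected rank count is an 
assumption about how the job is meant to be launched, not something PBS or 
GRAM guarantees.

```python
import re


def expected_ranks(limits_line):
    """Return nodes * ppn parsed from a PBS 'Limits:' resource string.

    Assumption: one MPI rank per requested processor slot.
    """
    nodes = re.search(r"nodes=(\d+)", limits_line)
    ppn = re.search(r"ppn=(\d+)", limits_line)
    if nodes is None:
        raise ValueError("no nodes= field found")
    # Treat a missing ppn as 1 process per node.
    return int(nodes.group(1)) * (int(ppn.group(1)) if ppn else 1)


def allocated_hosts(nodes_line):
    """Return the host names listed after the 'Nodes:' label."""
    return nodes_line.split(":", 1)[1].split()


# The two lines from the PBS epilogue quoted in this message.
limits = "Limits:         nodes=5:ia64-compute:ppn=1,walltime=00:15:00"
nodes = "Nodes:          tg-c053 tg-c034 tg-c020 tg-c011 tg-c007"
observed_world_size = 1  # COMM_WORLD size reported by the test program

print("requested ranks:", expected_ranks(limits))
print("allocated hosts:", len(allocated_hosts(nodes)))
print("observed COMM_WORLD size:", observed_world_size)
```

If the first two numbers agree with each other but not with the third, the 
allocation is fine and the problem is more likely in how the MPI launcher is 
given its process count.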

My currently recommended way of doing MPI in Swift does not use jobtype=mpi 
in GRAM, though, so I don't want to spend too much time figuring this out. 
The gram-user at globus.org list and/or help at teragrid.org can probably 
offer more.

> By the way, I also discovered that sometimes the order of tags in the
> .xml makes a difference (meaning, with a certain order of "count",
> "walltime" and "hostCount", globusrun-ws will abort). I had no idea
> the order matters...

yes.

Those options are defined with an XML Schema <sequence> which means, to be 
valid, they must appear in the order they are defined in:

http://www.globus.org/toolkit/docs/4.0/execution/wsgram/schemas/gram_job_description.html#type_JobDescriptionType
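
As a rough illustration of what the <sequence> constraint means in practice, 
the sketch below checks whether the children of a job description appear in 
a given order. The ASSUMED_ORDER list is only my recollection of a few 
elements and is an assumption; take the authoritative order from the schema 
page linked above. The out_of_order helper and the sample documents are 
hypothetical, and this is not a substitute for real schema validation.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: a partial element order, for illustration only.
# Check it against the JobDescriptionType schema before relying on it.
ASSUMED_ORDER = ["executable", "count", "hostCount", "maxWallTime", "jobType"]


def out_of_order(xml_text, order=ASSUMED_ORDER):
    """Return (previous, current) tag pairs that violate the given order."""
    rank = {name: i for i, name in enumerate(order)}
    problems = []
    last = None
    for child in ET.fromstring(xml_text):
        tag = child.tag.split("}")[-1]  # drop any XML namespace prefix
        if tag not in rank:
            continue  # elements outside the list are ignored
        if last is not None and rank[tag] < rank[last]:
            problems.append((last, tag))
        last = tag
    return problems


good = ("<job><executable>/bin/hostname</executable><count>5</count>"
        "<hostCount>5</hostCount><maxWallTime>15</maxWallTime>"
        "<jobType>mpi</jobType></job>")
bad = ("<job><executable>/bin/hostname</executable><hostCount>5</hostCount>"
       "<count>5</count><jobType>mpi</jobType></job>")

print(out_of_order(good))  # no violations under the assumed order
print(out_of_order(bad))   # reports hostCount appearing before count
```

A description that fails a check like this would be rejected by a validating 
parser even though every individual element is well-formed.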
