[Swift-user] Passing hostType for MPI jobs
Ben Clifford
benc at hawaga.org.uk
Wed Jul 2 03:34:35 CDT 2008
So:
<job>
  <executable>/bin/hostname</executable>
  <directory>/home/benc/mpi</directory>
  <stdout>/home/benc/mpi/test.stdout</stdout>
  <stderr>/home/benc/mpi/test.stderr</stderr>
  <hostCount>3</hostCount>
</job>
allocates three hosts for me, without specifying the type. This seems to
give the correct behaviour.
<job>
  <executable>/bin/hostname</executable>
  <directory>/home/benc/mpi</directory>
  <stdout>/home/benc/mpi/test.stdout</stdout>
  <stderr>/home/benc/mpi/test.stderr</stderr>
  <hostCount>3</hostCount>
  <extensions>
    <resourceAllocationGroup>
      <hostType>ia64-compute</hostType>
    </resourceAllocationGroup>
  </extensions>
</job>
allocates one host for me (ignoring the hostCount), although it is of the
correct type, ia64-compute. This seems to be incorrect behaviour because
the hostCount is ignored.
A different approach is to put the hostCount inside the
resourceAllocationGroup, as the job extensions web page at
http://www.globus.org/toolkit/docs/4.0/execution/wsgram/WS_GRAM_Job_Desc_Extensions.html
suggests:
<job>
  <executable>/bin/hostname</executable>
  <directory>/home/benc/mpi</directory>
  <stdout>/home/benc/mpi/test.stdout</stdout>
  <stderr>/home/benc/mpi/test.stderr</stderr>
  <extensions>
    <resourceAllocationGroup>
      <hostType>ia64-compute</hostType>
      <hostCount>3</hostCount>
    </resourceAllocationGroup>
  </extensions>
</job>
results in:
[benc@tg-login1 mpi]$ globusrun-ws -submit -Ft PBS -F tg-grid.uc.teragrid.org -job-description-file ./gram4-dbg.rsl
Submitting job...Done.
Job ID: uuid:cc6b465e-4810-11dd-9981-0007e9d811ce
Termination time: 07/03/2008 08:28 GMT
Current job state: Failed
Destroying job...Done.
globusrun-ws: Job failed: The executable could not be started.
qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
The same failure occurs if I use this extension instead:
<resourceAllocationGroup>
  <hostType>ia64-compute</hostType>
  <cpuCount>3</cpuCount>
</resourceAllocationGroup>
But finally...
<extensions>
  <resourceAllocationGroup>
    <hostType>ia64-compute</hostType>
    <hostCount>5</hostCount>
    <cpusPerHost>1</cpusPerHost>
  </resourceAllocationGroup>
</extensions>
allocates 5 hosts.
So it looks like you need to specify both hostCount and cpusPerHost.
So that is how to specify it with GRAM4 direct submission.
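For reference, a complete job description for the working case would presumably look like the sketch below. Only the extensions block was shown for that run, so this assumes the executable, directory, stdout and stderr elements are unchanged from the earlier examples, and that there is no top-level hostCount (as in the resourceAllocationGroup attempt that failed).

```xml
<job>
  <!-- Assumed unchanged from the earlier examples -->
  <executable>/bin/hostname</executable>
  <directory>/home/benc/mpi</directory>
  <stdout>/home/benc/mpi/test.stdout</stdout>
  <stderr>/home/benc/mpi/test.stderr</stderr>
  <extensions>
    <resourceAllocationGroup>
      <!-- Node type to request -->
      <hostType>ia64-compute</hostType>
      <!-- hostCount and cpusPerHost both had to be present
           for the multi-host allocation to work -->
      <hostCount>5</hostCount>
      <cpusPerHost>1</cpusPerHost>
    </resourceAllocationGroup>
  </extensions>
</job>
```

This would be submitted with the same globusrun-ws command line as before.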
I'll have to have a play around to figure out how that can be specified in
Swift+GRAM4.