[Swift-user] Passing hostType for MPI jobs

Ben Clifford benc at hawaga.org.uk
Wed Jul 2 03:34:35 CDT 2008


So:

<job>
  <executable>/bin/hostname</executable>
  <directory>/home/benc/mpi</directory>
  <stdout>/home/benc/mpi/test.stdout</stdout>
  <stderr>/home/benc/mpi/test.stderr</stderr>
  <hostCount>3</hostCount>
</job>

allocates three hosts for me without specifying a host type. This seems 
to be the correct behaviour.

<job>
  <executable>/bin/hostname</executable>
  <directory>/home/benc/mpi</directory>
  <stdout>/home/benc/mpi/test.stdout</stdout>
  <stderr>/home/benc/mpi/test.stderr</stderr>
  <hostCount>3</hostCount>
  <extensions>
    <resourceAllocationGroup>
      <hostType>ia64-compute</hostType>
    </resourceAllocationGroup>
  </extensions>
</job>

allocates only one host for me, though it is of the correct type, 
ia64-compute. This seems to be incorrect behaviour, because the top-level 
hostCount is ignored.

A different approach is to put the hostCount inside the 
resourceAllocationGroup extension, as the job description extensions web 
page at 
http://www.globus.org/toolkit/docs/4.0/execution/wsgram/WS_GRAM_Job_Desc_Extensions.html 
suggests:

<job>
  <executable>/bin/hostname</executable>
  <directory>/home/benc/mpi</directory>
  <stdout>/home/benc/mpi/test.stdout</stdout>
  <stderr>/home/benc/mpi/test.stderr</stderr>
  <extensions>
    <resourceAllocationGroup>
      <hostType>ia64-compute</hostType>
      <hostCount>3</hostCount>
    </resourceAllocationGroup>
  </extensions>
</job>

results in:

[benc@tg-login1 mpi]$ globusrun-ws -submit -Ft PBS -F tg-grid.uc.teragrid.org -job-description-file ./gram4-dbg.rsl
Submitting job...Done.
Job ID: uuid:cc6b465e-4810-11dd-9981-0007e9d811ce
Termination time: 07/03/2008 08:28 GMT
Current job state: Failed
Destroying job...Done.
globusrun-ws: Job failed: The executable could not be started.
qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes

The same failure occurs if I use this extension instead:

    <resourceAllocationGroup>
      <hostType>ia64-compute</hostType>
      <cpuCount>3</cpuCount>
    </resourceAllocationGroup>

But finally, this:

  <extensions>
    <resourceAllocationGroup>
        <hostType>ia64-compute</hostType>
        <hostCount>5</hostCount>
        <cpusPerHost>1</cpusPerHost>
    </resourceAllocationGroup>
  </extensions>

allocates 5 hosts.

So it looks like you need to specify both hostCount and cpusPerHost, 
alongside hostType, inside resourceAllocationGroup.

That is how to specify it with GRAM4 direct submission.
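
Putting those pieces together, a complete job description for the working 
case would look something like the sketch below. It is assembled from the 
fragments above rather than copied from a tested file: the executable, 
directory and stdout/stderr paths are the ones from the earlier examples, 
and leaving out the top-level hostCount is an assumption carried over from 
the third example.

```xml
<job>
  <executable>/bin/hostname</executable>
  <directory>/home/benc/mpi</directory>
  <stdout>/home/benc/mpi/test.stdout</stdout>
  <stderr>/home/benc/mpi/test.stderr</stderr>
  <!-- Assumption: no top-level hostCount; the count is given only
       inside resourceAllocationGroup, as in the third example. -->
  <extensions>
    <resourceAllocationGroup>
      <!-- Host type to request from the scheduler -->
      <hostType>ia64-compute</hostType>
      <!-- Both hostCount and cpusPerHost appear to be required -->
      <hostCount>5</hostCount>
      <cpusPerHost>1</cpusPerHost>
    </resourceAllocationGroup>
  </extensions>
</job>
```

It would be submitted the same way as the failing example, with 
globusrun-ws and -job-description-file pointing at the file.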

I'll have to have a play around to figure out how that can be specified in 
Swift+GRAM4.
