[Swift-devel] Re: Error in pbs job submission with coasters in PADS fast queue

wilde at mcs.anl.gov wilde at mcs.anl.gov
Thu Jun 3 14:42:39 CDT 2010


Forgot to add:

For now, a remedy is to change your config:

  <pool handle="coasterpads">
    <execution provider="coaster" url="login1.pads.ci.uchicago.edu" jobmanager="ssh:pbs"/>
    <profile namespace="globus" key="maxtime">3000</profile>
    <profile namespace="globus" key="workersPerNode">8</profile>
    <profile namespace="globus" key="slots">1</profile>
    <profile namespace="globus" key="nodeGranularity">1</profile>
    <profile namespace="globus" key="maxNodes">10</profile>
    <profile namespace="globus" key="queue">short</profile>
    <profile namespace="karajan" key="jobThrottle">0.5</profile>
    <profile namespace="karajan" key="initialScore">10000</profile>
    <filesystem provider="ssh" url="login1.pads.ci.uchicago.edu"/>
    <workdirectory>/home/wilde/swift/lab/wwjbug.2010.0602</workdirectory>
  </pool>

from queue "fast" to queue "short",

or to set slots to 10 and maxNodes to 1.
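
As an untested sketch of that second option, only the following profile lines in the pool entry would differ. This assumes the queue stays "fast" and every other line is left unchanged:

    <profile namespace="globus" key="slots">10</profile>
    <profile namespace="globus" key="maxNodes">1</profile>
    <profile namespace="globus" key="queue">fast</profile>

With maxNodes at 1, each PBS job should request a single node, which stays within the fast queue's one-node limit, while slots at 10 should allow up to 10 such jobs to be submitted at once.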


This will enable you (and me) to pop up one level and debug the original coaster reply timeout problem that we were trying to re-create here.

- Mike


----- wilde at mcs.anl.gov wrote:

> Wenjun,
> 
> The error that was causing your "cat.swift" script to fail in the PADS
> fast queue as soon as the swift script generates more jobs than
> "workersPerNode" is that the fast queue is limited to one node.
> 
> Trapping the pbs submit script (I think debug=true in the
> 
> login1$ qsub PBS5612813565711306842.submit
> 
> which has:
> #PBS -l nodes=2
> #PBS -q fast
> 
> Gives:
> 
> qsub: Job exceeds queue resource limits MSG=cannot satisfy queue max
> nodect requirement
> 
> We (swift developers) need to find out why the error message wasn't
> made prominent.
> 
> We need to document this limitation, make sure the error message
> gets to the user, and document procedures for finding and debugging
> Swift's generated .submit files.
> 
> - Mike

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



