[Swift-devel] problem with coasters on pbs provider (on pads)

Mihael Hategan hategan at mcs.anl.gov
Thu Aug 5 16:23:35 CDT 2010


On Thu, 2010-08-05 at 17:14 -0400, Glen Hocky wrote:
> I'm having a problem running on PADS. It seems that when I submit jobs
> with workerspernode=8, the queuing system doesn't pick up on the fact
> that each job submitted by swift should have ppn=8 (specifically, that
> is missing from the qsub command registered by pbs)

It's not meant to.

At some point the meaning of "workerspernode" changed from "start n
instances of the worker" to "submit at most n concurrent jobs to one
worker".
Since this applies to SMP "nodes", the end result is similar (i.e. n jobs
per node), except that only one worker.pl instance (and therefore only
one TCP connection) is used per node.
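
To illustrate the newer semantics, here is a minimal sketch in Python. It
is not the actual coaster or worker.pl code; the function name
run_on_one_worker and its parameters are hypothetical. A single "worker"
(one pool object, standing in for one worker.pl process and its single
connection) accepts all the jobs but runs at most workers_per_node of
them at the same time.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_on_one_worker(jobs, workers_per_node):
    """Run all jobs through ONE worker that allows at most
    workers_per_node jobs to execute concurrently.

    Returns (results in submission order, peak observed concurrency).
    """
    lock = threading.Lock()
    running = 0  # jobs executing right now
    peak = 0     # highest value 'running' ever reached

    def run(job):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        try:
            return job()
        finally:
            with lock:
                running -= 1

    # One pool stands in for one worker; max_workers is the concurrency cap.
    with ThreadPoolExecutor(max_workers=workers_per_node) as pool:
        results = list(pool.map(run, jobs))
    return results, peak


if __name__ == "__main__":
    # 20 trivial jobs, at most 8 running at once on the single worker.
    jobs = [lambda i=i: i * i for i in range(20)]
    results, peak = run_on_one_worker(jobs, workers_per_node=8)
    print(len(results), "jobs done; peak concurrency", peak)
```

Under the old meaning there would instead have been n separate worker
processes per node, each with its own connection back to the service.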

> 
> 
> when I do a qstat -f on my running jobs I get
>         submit_args = -A CI-CCR000013 -l nodes=1,walltime=02:00:00,size=1 /home/hockyg/.globus/scripts/PBS4482066898055181239.submit
> 
> 
> so there's no ppn=8 and i think it should also say size=8.
> 
> 
> the result is that I get 56 jobs running on one node
> 

I lost you there. Do you get 56 jobs per node with coasters (which would
be bad) or with the manual qsub (whose degree of badness I cannot assess)?

Mihael




