[Swift-devel] local:slurm ppn and jobs per node not honored

Ketan Maheshwari ketancmaheshwari at gmail.com
Tue Oct 1 15:10:48 CDT 2013


Ahh, sorry: my sites file was missing the slots value, so Swift was using
the default slots value of 20. Rerunning with slots=1 gives the desired task
rate.
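
For anyone hitting the same thing, the fix amounts to one extra profile line
next to the others in the sites file. This is a sketch, assuming slots is set
in the globus namespace like the other coaster settings quoted below; the
comment reflects my understanding that slots caps the number of coaster
blocks (scheduler jobs), so with maxNodes=1 the default of 20 could still
spread work over several single-node blocks:

   <!-- assumed placement: alongside the existing globus profile entries -->
   <profile namespace="globus"  key="slots">1</profile>
   <profile namespace="globus"  key="jobsPerNode">1</profile>
   <profile namespace="globus"  key="maxNodes">1</profile>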

Thanks,
Ketan


On Tue, Oct 1, 2013 at 2:42 PM, Mihael Hategan <hategan at mcs.anl.gov> wrote:

> slots = 1?
>
> On Tue, 2013-10-01 at 14:35 -0500, Ketan Maheshwari wrote:
> > Hi,
> >
> > On stampede, I am observing that the sites parameters such as ppn and
> > jobspernode are being overridden by the throttle value. I intended to
> > run 1 job per node on a single node run. However, it seems more jobs are
> > running. My sites parameters are as follows:
> >
> >    <profile namespace="globus"  key="jobsPerNode">1</profile>
> >    <profile namespace="globus"  key="ppn">1</profile>
> >    <profile namespace="globus"  key="maxTime">7500</profile>
> >    <profile namespace="globus"  key="maxwalltime">00:10:00</profile>
> >    <profile namespace="globus"  key="lowOverallocation">100</profile>
> >    <profile namespace="globus"  key="highOverallocation">100</profile>
> >    <profile namespace="globus"  key="queue">normal</profile>
> >    <profile namespace="globus"  key="nodeGranularity">1</profile>
> >    <profile namespace="globus"  key="maxNodes">1</profile>
> >    <profile namespace="karajan" key="jobThrottle">.3199</profile>
> >
> > To reproduce, I run a catsnsleep example with 100 tasks and sleep value
> > 10 seconds. From the standard out and completion time, it seems Swift is
> > running more jobs in parallel despite jobsPerNode and ppn set to be 1.
> >
> > Attached is the stdout and log for this run. I am setting throttle high
> > since otherwise coasters will leave resources and try to reacquire on
> > each job burst.
> >
> > Thanks,
> > --
> > Ketan
> >
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>
>
>


-- 
Ketan