[Swift-devel] local:slurm ppn and jobs per node not honored

Mihael Hategan hategan at mcs.anl.gov
Tue Oct 1 14:42:19 CDT 2013


slots = 1?
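
A minimal sketch of what that could look like in the sites file, assuming "slots" here means the coaster profile key of that name in the globus namespace (which caps how many coaster blocks are submitted to the scheduler) and that it goes next to the other profile entries quoted below:

```xml
<!-- Assumption: "slots" is set like the other globus-namespace profile keys. -->
<!-- With a value of 1, at most one coaster block (scheduler job) is requested. -->
<profile namespace="globus"  key="slots">1</profile>
```

Whether this alone yields one job at a time has not been checked against the attached log.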

On Tue, 2013-10-01 at 14:35 -0500, Ketan Maheshwari wrote:
> Hi,
> 
> On Stampede, I am observing that site parameters such as ppn and
> jobsPerNode are being overridden by the throttle value. I intended to run 1
> job per node on a single-node run. However, it seems more jobs are running.
> My site parameters are as follows:
> 
>    <profile namespace="globus"  key="jobsPerNode">1</profile>
>    <profile namespace="globus"  key="ppn">1</profile>
>    <profile namespace="globus"  key="maxTime">7500</profile>
>    <profile namespace="globus"  key="maxwalltime">00:10:00</profile>
>    <profile namespace="globus"  key="lowOverallocation">100</profile>
>    <profile namespace="globus"  key="highOverallocation">100</profile>
>    <profile namespace="globus"  key="queue">normal</profile>
>    <profile namespace="globus"  key="nodeGranularity">1</profile>
>    <profile namespace="globus"  key="maxNodes">1</profile>
>    <profile namespace="karajan" key="jobThrottle">.3199</profile>
> 
> To reproduce, I run a catsnsleep example with 100 tasks and a sleep value of
> 10 seconds. From the standard output and completion time, it seems Swift is
> running more jobs in parallel despite jobsPerNode and ppn being set to 1.
> 
> Attached is the stdout and log for this run. I am setting throttle high
> since otherwise coasters will leave resources and try to reacquire on each
> job burst.
> 
> Thanks,
> -- 
> Ketan
> 
> 
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel




