[Swift-user] Question about nr of nodes

Lorenzo Pesce lpesce at uchicago.edu
Wed Apr 18 20:14:01 CDT 2012


Going over the emails produced during the famous Beagle crash =)

(While watching "Word World" with my son, so don't worry if it doesn't make too much sense. By now we've got to "Puss in Boots".)

> It all depends on how you want to shape your jobs.
> slots: The maximum number of coaster blocks to submit (PBS jobs).

Would there be a reason to keep the number of coaster blocks at 1? I would be tempted to set it equal to maxNodes.
The issue would then be how to keep the overall number of nodes used to, say, numnodes.
Of course I could set a combination of the two, like

<profile namespace="globus" key="slots">$nnodes</profile>
<profile namespace="globus" key="maxNodes">1</profile>
<profile namespace="globus" key="nodeGranularity">1</profile>

Like Glen did.


> nodeGranularity: How much to increment the node count by (nodeGranularity <= maxNodes)

Within a coaster block? Or either by adding a coaster block or by adding a node to an existing coaster block?

> Adjusting slots will provide more active coaster blocks.
> 
> In Glen's example he requests "$nodes" single-node jobs.  He could have said the same thing by setting:
> <profile namespace="globus" key="slots">1</profile>
> <profile namespace="globus" key="maxNodes">$nodes</profile>
> <profile namespace="globus" key="nodeGranularity">$nodes</profile>
> 
> In his example, several single-node coaster blocks would be submitted for execution.  With the above settings, a single multi-node coaster block would be submitted.  If the machine is overloaded and response time is slow, then Glen's approach would probably be better, as the scheduler may bias some single-node jobs to run over multi-node jobs to keep the entire machine busy.  This way progress will be made (even if it is slow progress).

His would be a good approach to make good use of the backfill, wouldn't it? The jobs can use any time and any number of coasters, so they can fit in the holes. Or am I wrong here?


> Another setting that should be set is:
> <profile namespace="globus" key="lowOverallocation">100</profile>
> <profile namespace="globus" key="highOverallocation">100</profile>
> 
> This will force the coaster blocks to be exactly the maxTime you asked for.  If those are not set, coasters dynamically chooses a wall time which is often lower than the time you specified.

But would this limit its ability to fit into the backfill? For this job, I can afford having many relatively short coaster blocks (say, longer than 20 minutes), but I also want to make sure that I am getting all the "total time I am asking for". By that I mean:
totnrnodes*walltime. The idea being that to run all my jobs I need about 2400 core hours, which requires 100 node hours. I don't care too much how that is sliced for this problem. Does that make sense? The more I can fit into the backfill holes, the better.
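To double-check my own arithmetic here (a quick sketch; the 24 cores per node figure is my assumption, backed only by the 2400 core hours = 100 node hours equivalence above):

```python
# Sanity check of the budget arithmetic above.
# Assumption: 24 cores per node (consistent with 2400 core hours == 100 node hours).
CORES_PER_NODE = 24
TOTAL_CORE_HOURS = 2400

total_node_hours = TOTAL_CORE_HOURS / CORES_PER_NODE
print(total_node_hours)  # 100.0

# Any slicing (nodes per block, walltime in hours) with the same product fits:
for nodes, walltime_h in [(100, 1.0), (10, 10.0), (4, 25.0)]:
    assert nodes * walltime_h == total_node_hours
```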


>>     <profile namespace="env" key="OMP_NUM_THREADS">$PPN</profile>

Added to the presentation.

>>     <profile namespace="globus" key="maxwalltime">$TIME</profile>
>>     <profile namespace="globus" key="maxTime">$MAXTIME</profile>

I am lost here. What is the difference? Is maxwalltime what I was talking about, i.e. the 100 node hours?
What I would like to be able to set is: a minimum time/size (say, no coaster block shorter than half an hour or smaller than 3 nodes); an increment (I understand that to be the granularity, so if blocks need to be at least 3 nodes for an MPI job, that is what I would set it to; presumably the minimum time would propagate here); and a grand total of CPU time to run all of them.
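If I am reading these knobs right, the shape I want might look something like this. This is only a sketch: I am assuming nodeGranularity makes blocks come in multiples of 3 nodes, that maxTime is in seconds, and the 12-node cap and 2-hour value are illustrative numbers of my own, not anything from this thread:

```xml
<!-- Sketch only: blocks in multiples of 3 nodes (MPI-friendly), up to 12 nodes -->
<profile namespace="globus" key="nodeGranularity">3</profile>
<profile namespace="globus" key="maxNodes">12</profile>
<!-- assuming maxTime is in seconds: 7200 s = 2 h per block -->
<profile namespace="globus" key="maxTime">7200</profile>
<!-- from earlier in this thread: force blocks to be exactly maxTime -->
<profile namespace="globus" key="lowOverallocation">100</profile>
<profile namespace="globus" key="highOverallocation">100</profile>
```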

>>     <profile namespace="karajan" key="jobThrottle">200.00</profile>
>>     <profile namespace="karajan" key="initialScore">10000</profile>

I need to read more about these two. I will look for the documentation and send you feedback about it.
I also need to read a lot more about mappers and all the rest...
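A note to myself while I go read: my current understanding (treat the formula as my assumption until I check it against the Swift user guide) is that jobThrottle caps the number of concurrent jobs at roughly jobThrottle * 100 + 1, and the huge initialScore just makes the full cap available immediately instead of ramping up slowly:

```python
# My (unverified) reading of the karajan throttle settings:
# max concurrent jobs ~= jobThrottle * 100 + 1
job_throttle = 200.00
max_concurrent_jobs = int(job_throttle * 100) + 1
print(max_concurrent_jobs)  # 20001
```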





