<div>David, Mike,</div><div>I'm now in a position to verify again whether this is working correctly.</div><div><br></div><div>I wanted my new Swift LAMMPS scripts to run 1 task per node using 16 cores. This sites file appears to do that correctly: with David's change, only one coaster worker is started per node (and one job is run per worker). In principle I should now be able to test packing different numbers of jobs in other ways.</div>
<div><br></div><div>-Glen</div><div><br></div><div> <pool handle="ranger"></div><div> <execution jobmanager="local:sge" provider="coaster" url="none"/></div>
<div> <profile namespace="globus" key="maxWallTime">00:29:00</profile></div><div> <profile namespace="globus" key="maxTime">3600</profile></div>
<div> <profile key="jobsPerNode" namespace="globus">1</profile></div><div> <profile key="coresPerNode" namespace="globus">16</profile></div>
<div> <profile key="slots" namespace="globus">200</profile></div><div> <profile key="nodeGranularity" namespace="globus">1</profile></div>
<div> <profile key="pe" namespace="globus">16way</profile></div><div> <profile key="maxNodes" namespace="globus">1</profile></div><div> <profile key="queue" namespace="globus">development</profile></div>
<div> <profile key="jobThrottle" namespace="karajan">1.99</profile></div><div> <profile key="initialScore" namespace="karajan">10000</profile></div>
<div> <profile namespace="globus" key="project">TG-CHE110004</profile></div><div> <scratch>/scratch/01021/hockyg/glass-lammps-runs</scratch></div><div> <workdirectory>/share/home/01021/hockyg/reichman/glassy_dynamics/code/swift_lammps/run/test/swiftwork</workdirectory></div>
<div> <filesystem provider="local" url="none" /></div><div> </pool></div><div><br></div><br><div class="gmail_quote">On Thu, Mar 8, 2012 at 10:21 AM, Michael Wilde <span dir="ltr"><<a href="mailto:wilde@mcs.anl.gov">wilde@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">David, thanks for addressing this problem. Does it affect any of the other local providers: pbs, condor, (sge), cobalt?<br>
<br>
(I need to do some cobalt runs on Eureka for a user today, so I hope that provider is OK).<br>
<br>
You should describe the issue and fix on swift-devel.<br>
<br>
We should establish a convention for documenting known issues for each release, so that users don't have to discover these bugs on their own. Can you make an action item to propose and start such a place (probably cross-linked from both Downloads and Documentation)? Not urgent for today, but next week would be good.<br>
<br>
Thanks,<br>
<br>
- Mike<br>
<br>
<br>
<br>
----- Original Message -----<br>
> From: "David Kelly" <<a href="mailto:davidk@ci.uchicago.edu">davidk@ci.uchicago.edu</a>><br>
> To: "Ketan Maheshwari" <<a href="mailto:ketancmaheshwari@gmail.com">ketancmaheshwari@gmail.com</a>><br>
> Cc: "Michael Wilde" <<a href="mailto:wilde@mcs.anl.gov">wilde@mcs.anl.gov</a>><br>
> Sent: Thursday, March 8, 2012 8:45:00 AM<br>
> Subject: Re: sites.xml for ranger sge coasters<br>
> 0.93 is frozen, but I committed the same change to 0.93.1 this<br>
> morning.<br>
><br>
> ----- Original Message -----<br>
> > From: "Ketan Maheshwari" <<a href="mailto:ketancmaheshwari@gmail.com">ketancmaheshwari@gmail.com</a>><br>
> > To: "David Kelly" <<a href="mailto:davidk@ci.uchicago.edu">davidk@ci.uchicago.edu</a>><br>
> > Cc: "Michael Wilde" <<a href="mailto:wilde@mcs.anl.gov">wilde@mcs.anl.gov</a>><br>
> > Sent: Wednesday, March 7, 2012 6:39:09 PM<br>
> > Subject: Re: sites.xml for ranger sge coasters<br>
> > is it committed in 0.93 too?<br>
> ><br>
> ><br>
> > On Wed, Mar 7, 2012 at 6:10 PM, David Kelly <<a href="mailto:davidk@ci.uchicago.edu">davidk@ci.uchicago.edu</a>> wrote:<br>
> ><br>
> ><br>
> > I submitted a fix to trunk for the SGE provider. The submit script<br>
> > was<br>
> > wrong - it started one worker per core, rather than one worker per<br>
> > host. (Oddly, it's been like that for years without anybody<br>
> > noticing.)<br>
> > I ran a few sleep/hostname tests and it seems to be working. Can you<br>
> > please give it a try?<br>
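[Editor's sketch] David's per-host vs. per-core distinction can be illustrated as follows. This is an illustrative Python sketch, not the actual submit-script code (which isn't shown in this thread); the only assumed fact is SGE's documented $PE_HOSTFILE format, which lists one line per allocated host.

```python
# Illustrative sketch only -- not the actual SGE provider submit script.
# SGE describes a parallel-environment allocation in $PE_HOSTFILE,
# one line per host:  hostname  ncores  queue  processor_range
# Simulate a 2-node, 16way allocation (hostnames are made up):
pe_hostfile = """\
i101-101 16 development@i101-101 UNDEFINED
i101-102 16 development@i101-102 UNDEFINED
"""
rows = [line.split() for line in pe_hostfile.splitlines()]

# Buggy behavior: one worker launched per allocated core
workers_per_core = sum(int(ncores) for _, ncores, *_ in rows)

# Fixed behavior: one worker per distinct host
workers_per_host = len({host for host, *_ in rows})

print(workers_per_core, workers_per_host)  # 32 workers vs. 2 workers
```

With jobsPerNode=16 the per-core behavior would have oversubscribed each node by a factor of 16, which is why one worker per host is the intended behavior.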
> ><br>
> > Below is the sites.xml I used for my test:<br>
> ><br>
> > <config><br>
> > <pool handle="ranger"><br>
> > <execution jobmanager="local:sge" provider="coaster" url="none"/><br>
> ><br>
> > <filesystem provider="local" url="none" /><br>
> > <profile namespace="globus" key="maxWallTime">5</profile><br>
> > <profile namespace="globus" key="maxTime">600</profile><br>
> > <profile key="jobsPerNode" namespace="globus">16</profile><br>
> > <profile key="slots" namespace="globus">1</profile><br>
> > <profile key="nodeGranularity" namespace="globus">3</profile><br>
> > <profile key="pe" namespace="globus">16way</profile><br>
> > <profile key="maxNodes" namespace="globus">3</profile><br>
> > <profile key="queue" namespace="globus">development</profile><br>
> > <profile key="jobThrottle" namespace="karajan">0.4799</profile><br>
> > <profile key="initialScore" namespace="karajan">10000</profile><br>
> > <profile namespace="globus" key="project">TG-DBS080004N</profile><br>
> > <workdirectory>/share/home/01503/davidkel/swiftwork</workdirectory><br>
> > </pool><br>
> > </config><br>
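[Editor's sketch] A sanity check on the throttle arithmetic in this config (my reading of the Swift user guide's jobThrottle description, not something stated in this thread): a karajan jobThrottle of t allows about t*100 + 1 concurrent jobs, truncated to an integer.

```python
# Hedged sketch of Swift's karajan jobThrottle arithmetic (per my reading
# of the Swift user guide; verify against your release). Decimal avoids
# float rounding surprises at the truncation boundary.
from decimal import Decimal
import math

def max_concurrent_jobs(job_throttle: str) -> int:
    return math.floor(Decimal(job_throttle) * 100 + 1)

print(max_concurrent_jobs("0.4799"))  # 48
print(max_concurrent_jobs("1.99"))    # 200
```

By that reading, 0.4799 caps concurrency at 48 jobs, matching maxNodes=3 × jobsPerNode=16 in this config, and the 1.99 in the config at the top of the thread yields 200, matching its slots=200.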
> ><br>
> > Thanks,<br>
> > David<br>
> ><br>
> ><br>
> ><br>
> ><br>
> > --<br>
> > Ketan<br>
<span class="HOEnZb"><font color="#888888"><br>
--<br>
Michael Wilde<br>
Computation Institute, University of Chicago<br>
Mathematics and Computer Science Division<br>
Argonne National Laboratory<br>
<br>
_______________________________________________<br>
Swift-devel mailing list<br>
<a href="mailto:Swift-devel@ci.uchicago.edu">Swift-devel@ci.uchicago.edu</a><br>
<a href="https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel" target="_blank">https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel</a><br>
</font></span></blockquote></div><br>