[Swift-devel] sites.xml for ranger sge coasters

Glen Hocky hockyg at uchicago.edu
Fri Mar 9 15:59:29 CST 2012


David, Mike
I'm now in a position again to verify whether this is working correctly.

I wanted my new Swift LAMMPS scripts to run 1 task per node using 16 cores.
The sites file below seems to do that correctly: with David's change, it
appears that only one coaster is started per node (and one job is run per
coaster). In principle I should also be able to test packing different
numbers of jobs per node in other ways.

-Glen

      <pool handle="ranger">
          <execution jobmanager="local:sge" provider="coaster" url="none"/>
          <profile namespace="globus" key="maxWallTime">00:29:00</profile>
          <profile namespace="globus" key="maxTime">3600</profile>
          <profile key="jobsPerNode" namespace="globus">1</profile>
          <profile key="coresPerNode" namespace="globus">16</profile>
          <profile key="slots" namespace="globus">200</profile>
          <profile key="nodeGranularity" namespace="globus">1</profile>
          <profile key="pe" namespace="globus">16way</profile>
          <profile key="maxNodes" namespace="globus">1</profile>
          <profile key="queue" namespace="globus">development</profile>
          <profile key="jobThrottle" namespace="karajan">1.99</profile>
          <profile key="initialScore" namespace="karajan">10000</profile>
          <profile namespace="globus" key="project">TG-CHE110004</profile>
          <scratch>/scratch/01021/hockyg/glass-lammps-runs</scratch>
          <workdirectory>/share/home/01021/hockyg/reichman/glassy_dynamics/code/swift_lammps/run/test/swiftwork</workdirectory>
          <filesystem provider="local" url="none" />
        </pool>
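
As a rough sanity check on the numbers in this entry, here is a minimal Python sketch that reads a trimmed copy of the pool and compares two limits. Both formulas are assumptions rather than anything taken from the provider code: the capacity is taken to be slots x maxNodes x jobsPerNode, and the throttle limit uses the jobThrottle * 100 + 1 rule of thumb as I understand it from the Swift user guide. The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Trimmed copy of the pool entry above (only the keys used below).
SITES_FRAGMENT = """
<pool handle="ranger">
  <profile key="jobsPerNode" namespace="globus">1</profile>
  <profile key="slots" namespace="globus">200</profile>
  <profile key="maxNodes" namespace="globus">1</profile>
  <profile key="jobThrottle" namespace="karajan">1.99</profile>
</pool>
"""


def read_profiles(xml_text):
    """Return {key: text} for every <profile> element in a pool entry."""
    pool = ET.fromstring(xml_text)
    return {p.get("key"): (p.text or "").strip() for p in pool.iter("profile")}


def coaster_capacity(profiles):
    """Assumed upper bound: blocks (slots) x nodes per block x jobs per node."""
    return (int(profiles["slots"])
            * int(profiles["maxNodes"])
            * int(profiles["jobsPerNode"]))


def throttle_limit(profiles):
    """Assumed rule of thumb: concurrent jobs = jobThrottle * 100 + 1.

    Decimal avoids binary floating-point rounding on values like 1.99.
    """
    return int(Decimal(profiles["jobThrottle"]) * 100 + 1)


if __name__ == "__main__":
    profiles = read_profiles(SITES_FRAGMENT)
    print("coaster capacity:", coaster_capacity(profiles))
    print("throttle limit:  ", throttle_limit(profiles))
```

Under those assumptions both values are 200 for this entry, so the throttle and the coaster settings agree. Applying the same rule to the test file quoted further down (jobThrottle 0.4799, 16 jobs per node, 3 nodes, 1 slot) gives 48 on both sides.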


On Thu, Mar 8, 2012 at 10:21 AM, Michael Wilde <wilde at mcs.anl.gov> wrote:

> David, thanks for addressing this problem.  Does it affect any of the
> other local providers: pbs, condor, (sge), cobalt?
>
> (I need to do some cobalt runs on Eureka for a user today, so I hope that
> provider is OK).
>
> You should describe the issue and fix on swift-devel.
>
> We should start a convention where we can document known issues for
> releases, so that users don't have to discover these bugs on their own.  Can
> you make an action item to propose and start such a place (probably
> crosslinked to both Downloads and Documentation). Not urgent for today, but
> next week would be good.
>
> Thanks,
>
> - Mike
>
>
>
> ----- Original Message -----
> > From: "David Kelly" <davidk at ci.uchicago.edu>
> > To: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>
> > Cc: "Michael Wilde" <wilde at mcs.anl.gov>
> > Sent: Thursday, March 8, 2012 8:45:00 AM
> > Subject: Re: sites.xml for ranger sge coasters
> >
> > 0.93 is frozen, but I committed the same change to 0.93.1 this
> > morning.
> >
> > ----- Original Message -----
> > > From: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>
> > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > Cc: "Michael Wilde" <wilde at mcs.anl.gov>
> > > Sent: Wednesday, March 7, 2012 6:39:09 PM
> > > Subject: Re: sites.xml for ranger sge coasters
> > >
> > > Is it committed in 0.93 too?
> > >
> > >
> > > On Wed, Mar 7, 2012 at 6:10 PM, David Kelly <davidk at ci.uchicago.edu> wrote:
> > >
> > >
> > > I submitted a fix to trunk for the SGE provider. The submit script was
> > > wrong: it started one worker per core, rather than one worker per
> > > host. (Oddly, it's been like that for years without anybody noticing.)
> > > I ran a few sleep/hostname tests and it seems to be working. Can you
> > > please give it a try?
> > >
> > > Below is the sites.xml I used for my test:
> > >
> > > <config>
> > > <pool handle="ranger">
> > > <execution jobmanager="local:sge" provider="coaster" url="none"/>
> > >
> > > <filesystem provider="local" url="none" />
> > > <profile namespace="globus" key="maxWallTime">5</profile>
> > > <profile namespace="globus" key="maxTime">600</profile>
> > > <profile key="jobsPerNode" namespace="globus">16</profile>
> > > <profile key="slots" namespace="globus">1</profile>
> > > <profile key="nodeGranularity" namespace="globus">3</profile>
> > > <profile key="pe" namespace="globus">16way</profile>
> > > <profile key="maxNodes" namespace="globus">3</profile>
> > > <profile key="queue" namespace="globus">development</profile>
> > > <profile key="jobThrottle" namespace="karajan">0.4799</profile>
> > > <profile key="initialScore" namespace="karajan">10000</profile>
> > > <profile namespace="globus" key="project">TG-DBS080004N</profile>
> > > <workdirectory>/share/home/01503/davidkel/swiftwork</workdirectory>
> > > </pool>
> > > </config>
> > >
> > > Thanks,
> > > David
> > >
> > >
> > >
> > >
> > > --
> > > Ketan
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>