[Swift-user] Re: [Swift-devel] burnin' up ranger w/the latest coasters

Mihael Hategan hategan at mcs.anl.gov
Fri Oct 23 11:37:10 CDT 2009


On Fri, 2009-10-23 at 10:45 -0500, skenny at uchicago.edu wrote:
> however...when i use the configs here and i try to run a
> workflow with 196,608 jobs it seems that coasters starts to
> ramp up nicely, but maybe a little too well :) as it begins
> requesting more cores than i'm allowed in the normal queue on
> ranger. that is, the limit is 4096. i tried changing maxNodes
> to 4096 which did not work.

Shouldn't that be 4096/workersPerNode?
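
As a rough sketch of that arithmetic (an assumption drawn from the question above, not a statement of how coasters computes its requests): if each requested node is counted as workersPerNode cores against the queue limit, then the largest safe maxNodes is the core limit divided by workersPerNode, rounded down. The function name below is hypothetical, and the sketch ignores any effect of the slots setting.

```python
def max_nodes_for_core_limit(core_limit: int, workers_per_node: int) -> int:
    """Largest maxNodes such that maxNodes * workers_per_node <= core_limit.

    Assumes (hypothetically) that every node counts as workers_per_node
    cores against the queue limit.
    """
    if workers_per_node <= 0:
        raise ValueError("workers_per_node must be positive")
    return core_limit // workers_per_node


if __name__ == "__main__":
    core_limit = 4096  # normal-queue core limit on ranger quoted in this thread
    for workers_per_node in (32, 16):
        nodes = max_nodes_for_core_limit(core_limit, workers_per_node)
        print(f"workersPerNode={workers_per_node} -> maxNodes={nodes}")
```

Under that assumption, workersPerNode=32 gives maxNodes=128 and workersPerNode=16 gives maxNodes=256.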

>  i'm wondering if workers per node
> should actually be 16 instead (?) but i know you've gotten it
> to work well with the setting at 32 so i'm not sure...

You could set it to 16. My reasoning for doubling it was that if the
processes you run are slightly I/O bound, you'd get somewhat better
performance by running two processes per core.

> 
> anyway, it ramped up nicely (and was only like 8 jobs away
> from finishing the whole thing) i just need to know how to cap
> it off so it won't ask for more than 4096 cores. 
> 
> thanks
> ~sk
> 
> ---- Original message ----
> >Date: Wed, 14 Oct 2009 15:39:15 -0500 (CDT)
> >From: <skenny at uchicago.edu>  
> >Subject: Re: [Swift-devel] burnin' up ranger w/the latest coasters
> >To: swift-user at ci.uchicago.edu, swift-devel at ci.uchicago.edu
> >
> >for those interested, here are the config files used for this run:
> >
> >swift.properties:
> >
> >sites.file=config/coaster_ranger.xml
> >tc.file=/ci/projects/cnari/config/tc.data
> >lazy.errors=false
> >caching.algorithm=LRU
> >pgraph=false
> >pgraph.graph.options=splines="compound", rankdir="TB"
> >pgraph.node.options=color="seagreen", style="filled"
> >clustering.enabled=false
> >clustering.queue.delay=4
> >clustering.min.time=60
> >kickstart.enabled=maybe
> >kickstart.always.transfer=false
> >wrapperlog.always.transfer=false
> >throttle.submit=3
> >throttle.host.submit=8
> >throttle.score.job.factor=64
> >throttle.transfers=16
> >throttle.file.operations=16
> >sitedir.keep=false
> >execution.retries=3
> >replication.enabled=false
> >replication.min.queue.time=60
> >replication.limit=3
> >foreach.max.threads=16384
> >
> >coaster_ranger.xml:
> >
> ><config>
> >  <import file="sys.xml"/>
> >  <import file="vdl-lib.xml"/>
> >  <set name="username">
> >    <arg><vdl:new type="string" value="user"/></arg>
> >  </set>
> >  <pool handle="RANGER">
> >    <profile namespace="karajan" key="jobThrottle">1000.0</profile>
> >    <filesystem provider="coaster" url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
> >    <profile namespace="globus" key="queue">normal</profile>
> >    <profile namespace="globus" key="workersPerNode">32</profile>
> >    <profile namespace="globus" key="nodeGranularity">1</profile>
> >    <profile namespace="globus" key="slots">16</profile>
> >    <profile namespace="globus" key="maxNodes">8192</profile>
> >    <profile namespace="globus" key="maxTime">72000</profile>
> >    <profile namespace="globus" key="project">TG-DBS080004N</profile>
> >    <execution provider="coaster" url="gatekeeper.ranger.tacc.teragrid.org" jobManager="gt2:gt2:SGE"/>
> >    <!-- workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory -->
> >    <workdirectory>/work/00926/tg459516/sidgrid_out/{username}</workdirectory>
> >  </pool>
> ></config>
> >
> >
> >
> >---- Original message ----
> >>Date: Tue, 13 Oct 2009 11:14:17 -0500 (CDT)
> >>From: <skenny at uchicago.edu>  
> >>Subject: [Swift-devel] burnin' up ranger w/the latest coasters  
> >>To: swift-devel at ci.uchicago.edu
> >>
> >>Final status:  Finished successfully:131072 
> >>
> >>re-running some of the workflows from our recent SEM
> >>paper with the latest swift...sadly, queue time on ranger has
> >>only gone up since those initial runs...but luckily coasters
> >>has speeded things up, so it ends up evening out for time to
> >>solution :)
> >>
> >>not sure i fully understand the plot:
> >>
> >>http://www.ci.uchicago.edu/~skenny/workflows/sem_131k/
> >>
> >>log is here:
> >>
> >>/ci/projects/cnari/logs/skenny/4reg_2cond-20091012-1607-ugidm2s2.log
> >>_______________________________________________
> >>Swift-devel mailing list
> >>Swift-devel at ci.uchicago.edu
> >>http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel



