[Swift-devel] persistent coaster service

Allan Espinosa aespinosa at cs.uchicago.edu
Mon Aug 9 17:09:16 CDT 2010


I tried it today on OSG.  The coaster service was running on bridled.ci, but
from the session below it looks like it's connecting to the site headnode
instead:

RunID: coaster
Progress:
Progress:  uninitialized:1  Selecting site:675  Initializing site shared directory:1
Progress:  Initializing:2  Selecting site:1444  Initializing site shared directory:1
Progress:  uninitialized:1  Selecting site:2499  Initializing site shared directory:1
Progress:  uninitialized:1  Selecting site:3818  Initializing site shared directory:1
Progress:  uninitialized:1  Initializing:1  Selecting site:4201  Initializing site shared directory:1
Progress:  Initializing:1  Selecting site:3  Stage in:4202
Progress:  uninitialized:1  Initializing:1  Selecting site:5  Submitting:4202
Progress:  Initializing:1  Selecting site:6  Stage in:2  Submitting:4202
Find: https://ff-grid2.unl.edu:1984
Find:  keepalive(120), reconnect - https://ff-grid2.unl.edu:1984
Progress:  Initializing:2  Selecting site:6  Stage in:144  Submitting:4303  Failed but can retry:16
Progress:  Initializing:2  Selecting site:31  Stage in:80  Submitting:4945  Failed but can retry:54
Progress:  Initializing:1  Selecting site:6  Stage in:2  Submitting:5222  Failed but can retry:68
Progress:  Initializing:1  Selecting site:6  Stage in:1  Submitting:5686  Submitted:1  Failed but can retry:95
...
...

Corresponding log entries (I think):
2010-08-09 17:01:31,690-0500 WARN  RemoteConfiguration Find: https://ff-grid2.unl.edu:1984
2010-08-09 17:01:31,690-0500 WARN  RemoteConfiguration Find:  keepalive(120), reconnect - https://ff-grid2.unl.edu:1984
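
To double-check which endpoint the client is actually contacting, the URLs can
be pulled out of the log mechanically. This is only a rough sketch: it assumes
each RemoteConfiguration entry sits on a single line in the real log file, and
find_endpoints is a hypothetical helper name, not something shipped with Swift
or CoG.

```python
import re
from urllib.parse import urlparse

# Matches the URL in lines such as
# "... WARN  RemoteConfiguration Find: https://ff-grid2.unl.edu:1984"
# "." does not match newlines here, so each match stays within one log line.
FIND_RE = re.compile(r"RemoteConfiguration Find:.*?(https?://\S+)")


def find_endpoints(log_text):
    """Return the set of (host, port) pairs seen in 'Find:' log lines."""
    endpoints = set()
    for match in FIND_RE.finditer(log_text):
        parsed = urlparse(match.group(1))
        endpoints.add((parsed.hostname, parsed.port))
    return endpoints


# The two entries quoted in this message, written one per line.
SAMPLE = (
    "2010-08-09 17:01:31,690-0500 WARN  RemoteConfiguration Find: "
    "https://ff-grid2.unl.edu:1984\n"
    "2010-08-09 17:01:31,690-0500 WARN  RemoteConfiguration Find:  "
    "keepalive(120), reconnect - https://ff-grid2.unl.edu:1984\n"
)

if __name__ == "__main__":
    for host, port in sorted(find_endpoints(SAMPLE)):
        print(host, port)
```

If the service on bridled.ci were the one being contacted, one would expect
its host to show up here rather than ff-grid2.unl.edu.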



sites.xml
  <pool handle="Firefly">
    <execution provider="coaster-persistent" url="ff-grid2.unl.edu"
      jobmanager="gt2:gt2:pbs" />

    <profile namespace="globus" key="maxTime">86400</profile>
    <profile namespace="globus" key="maxNodes">1290</profile>
    <profile namespace="globus" key="spread">0.8</profile>
    <profile namespace="globus" key="slots">10</profile>
    <profile namespace="globus" key="lowOverallocation">20</profile>
    <profile namespace="globus" key="remoteMonitorEnabled">true</profile>

    <profile namespace="karajan" key="initialScore">1500.0</profile>
    <profile namespace="karajan" key="jobThrottle">51.54</profile>

    <gridftp  url="gsiftp://ff-grid3.unl.edu"/>
    <workdirectory>/panfs/panasas/CMS/data/engage-scec/swift_scratch</workdirectory>
  </pool>
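
For comparison with the log, the host each pool hands to the execution provider
can be listed straight from the config. A minimal sketch, assuming a
namespace-free fragment like the one above (a full sites.xml with an xmlns
declaration would need qualified tag names); execution_targets is a
hypothetical helper, not part of Swift.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the pool entry above (profile elements omitted for brevity).
SITES_FRAGMENT = """
<pool handle="Firefly">
  <execution provider="coaster-persistent" url="ff-grid2.unl.edu"
    jobmanager="gt2:gt2:pbs" />
  <gridftp url="gsiftp://ff-grid3.unl.edu"/>
  <workdirectory>/panfs/panasas/CMS/data/engage-scec/swift_scratch</workdirectory>
</pool>
"""


def execution_targets(xml_text):
    """Return (pool handle, provider, url) for every <execution> element."""
    root = ET.fromstring(xml_text)
    targets = []
    # Element.iter() also yields the root itself, so a bare <pool> works too.
    for pool in root.iter("pool"):
        for execution in pool.findall("execution"):
            targets.append(
                (pool.get("handle"), execution.get("provider"), execution.get("url"))
            )
    return targets


if __name__ == "__main__":
    for handle, provider, url in execution_targets(SITES_FRAGMENT):
        print(handle, provider, url)
```

For this pool the url attribute is ff-grid2.unl.edu, the same host that shows
up in the Find: lines.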


-Allan

On Thu, Aug 05, 2010 at 10:34:34PM -0500, Mihael Hategan wrote:

> ... was added in cog r2834.
> 
> Despite having run a few jobs with it, I don't feel very confident about
> it. So please test.
> 
> Start with bin/coaster-service and use "coaster-persistent" as provider
> in sites.xml. Everything else would be the same as in the "coaster"
> case.
> 
> Mihael
> 
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> 


