[Swift-devel] persistent coaster service
Mihael Hategan
hategan at mcs.anl.gov
Mon Aug 9 17:36:59 CDT 2010
ff-grid2.unl.edu is the URL you are supplying in sites.xml, so that is
what it's connecting to. Though I'm surprised it works, given that you
are implying that there is no service running there.
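
A minimal sketch of what that implies for sites.xml, assuming the url
attribute of the execution element should name the machine where
bin/coaster-service was started. SERVICE_HOST is a placeholder, not a
real hostname, and whether an explicit port is needed is not settled
here (the quoted log only shows the client contacting port 1984 on the
host it was given). The other attributes are copied from the quoted
configuration:

```xml
<pool handle="Firefly">
  <!-- SERVICE_HOST is a hypothetical placeholder: substitute the host
       where bin/coaster-service is running (assumption, untested) -->
  <execution provider="coaster-persistent" url="SERVICE_HOST"
             jobmanager="gt2:gt2:pbs" />

  <!-- data staging still targets the site, unchanged from the original -->
  <gridftp url="gsiftp://ff-grid3.unl.edu"/>
  <workdirectory>/panfs/panasas/CMS/data/engage-scec/swift_scratch</workdirectory>
</pool>
```
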
On Mon, 2010-08-09 at 17:09 -0500, Allan Espinosa wrote:
> I tried it today on OSG. The coaster service was running on bridled.ci, but
> from the session below it looks like it's connecting to the site headnode
> instead:
>
> RunID: coaster
> Progress:
> Progress: uninitialized:1 Selecting site:675 Initializing site shared
> directory:1
> Progress: Initializing:2 Selecting site:1444 Initializing site shared
> directory:1
> Progress: uninitialized:1 Selecting site:2499 Initializing site shared
> directory:1
> Progress: uninitialized:1 Selecting site:3818 Initializing site shared
> directory:1
> Progress: uninitialized:1 Initializing:1 Selecting site:4201 Initializing
> site shared directory:1
> Progress: Initializing:1 Selecting site:3 Stage in:4202
> Progress: uninitialized:1 Initializing:1 Selecting site:5 Submitting:4202
> Progress: Initializing:1 Selecting site:6 Stage in:2 Submitting:4202
> Find: https://ff-grid2.unl.edu:1984
> Find: keepalive(120), reconnect - https://ff-grid2.unl.edu:1984
> Progress: Initializing:2 Selecting site:6 Stage in:144 Submitting:4303
> Failed but can retry:16
> Progress: Initializing:2 Selecting site:31 Stage in:80 Submitting:4945
> Failed but can retry:54
> Progress: Initializing:1 Selecting site:6 Stage in:2 Submitting:5222 Failed
> but can retry:68
> Progress: Initializing:1 Selecting site:6 Stage in:1 Submitting:5686
> Submitted:1 Failed but can retry:95
> ...
> ...
>
> Corresponding log entry (IMO):
> 2010-08-09 17:01:31,690-0500 WARN RemoteConfiguration Find:
> https://ff-grid2.unl.edu:1984
> 2010-08-09 17:01:31,690-0500 WARN RemoteConfiguration Find: keepalive(120),
> reconnect - https://ff-grid2.unl.edu:1984
>
>
>
> sites.xml
> <pool handle="Firefly">
> <execution provider="coaster-persistent" url="ff-grid2.unl.edu"
> jobmanager="gt2:gt2:pbs" />
>
> <profile namespace="globus" key="maxTime">86400</profile>
> <profile namespace="globus" key="maxNodes">1290</profile>
> <profile namespace="globus" key="spread">0.8</profile>
> <profile namespace="globus" key="slots">10</profile>
> <profile namespace="globus" key="lowOverallocation">20</profile>
> <profile namespace="globus" key="remoteMonitorEnabled">true</profile>
>
> <profile namespace="karajan" key="initialScore">1500.0</profile>
> <profile namespace="karajan" key="jobThrottle">51.54</profile>
>
> <gridftp url="gsiftp://ff-grid3.unl.edu"/>
> <workdirectory>/panfs/panasas/CMS/data/engage-scec/swift_scratch</workdirectory>
> </pool>
>
>
> -Allan
>
> On Thu, Aug 05, 2010 at 10:34:34PM -0500, Mihael Hategan wrote:
>
> > ... was added in cog r2834.
> >
> > Despite having run a few jobs with it, I don't feel very confident about
> > it. So please test.
> >
> > Start with bin/coaster-service and use "coaster-persistent" as provider
> > in sites.xml. Everything else would be the same as in the "coaster"
> > case.
> >
> > Mihael
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> >