[Swift-devel] Coaster persistent service issues - logs
Michael Wilde
wilde at mcs.anl.gov
Tue Aug 31 12:14:21 CDT 2010
----- Forwarded Message -----
From: "David Kelly" <dk0966 at cs.ship.edu>
To: "Michael Wilde" <wilde at mcs.anl.gov>
Cc: "Jonathan Monette" <jon.monette at gmail.com>, "Justin Wozniak" <wozniak at mcs.anl.gov>, "Mihael Hategan" <hategan at mcs.anl.gov>
Sent: Sunday, August 29, 2010 10:32:07 AM GMT -06:00 US/Canada Central
Subject: Re: change skype call time today - and some to-do notes
Hello all,
A few things I've noticed while trying out various coaster configurations this weekend:
I had similar problems with the -nosec option. Even with -nosec, the service tried to load my default credential, which had expired. Here is the output I got:
davidk at churn:~/cog/modules/swift/dist/swift-svn/bin$ coaster-service -nosec
Error loading credential: [JGLOBUS-10] Expired credentials (DC=org,DC=doegrids,OU=People,CN=David Kelly 16830,CN=753950975).
Error loading credential
org.globus.gsi.GlobusCredentialException: [JGLOBUS-10] Expired credentials (DC=org,DC=doegrids,OU=People,CN=David Kelly 16830,CN=753950975).
at org.globus.gsi.GlobusCredential.verify(GlobusCredential.java:321)
at org.globus.gsi.GlobusCredential.reloadDefaultCredential(GlobusCredential.java:593)
at org.globus.gsi.GlobusCredential.getDefaultCredential(GlobusCredential.java:575)
at org.globus.cog.abstraction.coaster.service.CoasterPersistentService.main(CoasterPersistentService.java:73)
I tested multiple connections using coasters-persistent in active mode. That seemed to work fine, with each new Swift connection waiting for the previous one to finish. I did notice some Java exceptions in the log files:
org.globus.cog.karajan.workflow.service.ReplyTimeoutException
at org.globus.cog.karajan.workflow.service.commands.Command.handleReplyTimeout(Command.java:280)
at org.globus.cog.karajan.workflow.service.commands.Command$Timeout.run(Command.java:285)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
Channel IOException
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:124)
at org.globus.gsi.gssapi.net.impl.GSIGssOutputStream.writeToken(GSIGssOutputStream.java:61)
at org.globus.gsi.gssapi.net.impl.GSIGssOutputStream.flush(GSIGssOutputStream.java:45)
at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Sender.send(AbstractStreamKarajanChannel.java:298)
at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Sender.run(AbstractStreamKarajanChannel.java:247)
I'm not sure whether this is important, but I will include the logs.
I can't quite get coasters-persistent working in passive mode. I am not sure whether this is a configuration issue, a Swift issue, or operator error. Here is what I am trying to do:
sites.xml:
<config>
<pool handle="churn">
<execution provider="coaster-persistent" url="churn.mcs.anl.gov" jobmanager="local:local"/>
<profile namespace="globus" key="workerManager">passive</profile>
<profile namespace="globus" key="workersPerNode">1</profile>
<profile namespace="globus" key="maxTime">3500</profile>
<profile namespace="globus" key="slots">1</profile>
<profile namespace="globus" key="nodeGranularity">1</profile>
<profile namespace="globus" key="maxNodes">1</profile>
<filesystem provider="local" url="none" />
<profile key="jobThrottle" namespace="karajan">.31</profile>
<profile namespace="karajan" key="initialScore">10000</profile>
<workdirectory>/home/davidk/swiftwork/churn</workdirectory>
</pool>
</config>
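A quick sanity check on a sites.xml like the one above can rule out simple configuration slips before looking at Swift itself. The following is only a minimal sketch, not part of Swift or CoG: the function name `check_sites` and the two checks it performs (stray whitespace in `url` attributes, and the same profile key set twice in one pool) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def check_sites(xml_text):
    """Return a list of warnings for a sites.xml document (hypothetical helper)."""
    warnings = []
    root = ET.fromstring(xml_text)
    for pool in root.iter("pool"):
        handle = pool.get("handle", "?")
        # Flag url attributes with leading or trailing whitespace.
        for elem in pool.iter():
            url = elem.get("url")
            if url is not None and url != url.strip():
                warnings.append(
                    f"{handle}: <{elem.tag}> url has surrounding whitespace: {url!r}"
                )
        # Flag profile keys that appear more than once in the same namespace.
        seen = Counter(
            (p.get("namespace"), p.get("key")) for p in pool.iter("profile")
        )
        for (namespace, key), count in seen.items():
            if count > 1:
                warnings.append(
                    f"{handle}: profile {namespace}/{key} is set {count} times"
                )
    return warnings
```

Calling `check_sites(open("sites.xml").read())` returns an empty list when neither problem is present. It does not validate the profile values themselves.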
I run grid-proxy-init on the submit host (login*.mcs.anl.gov) and on the remote host (churn.mcs.anl.gov). From churn I run coaster-service. When I run the catsn.swift script on login, I see messages like these in the coaster-service output:
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)
These messages seem to repeat several times per second and never stop, and the script never finishes. The configurations and log files are attached.
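When the service output is dominated by one repeated line, a per-handler count makes it easier to see whether anything other than heartbeats ever arrives. This is a minimal sketch that assumes every request line has the `REQ: Handler(NAME)` shape shown above; the function name `count_handlers` is hypothetical.

```python
import re
from collections import Counter

# Matches the handler name in lines such as
# "GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT)".
HANDLER = re.compile(r"REQ: Handler\((\w+)\)")


def count_handlers(lines):
    """Count how many times each request handler name appears in the log lines."""
    counts = Counter()
    for line in lines:
        match = HANDLER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, `count_handlers(open("coaster-service.out"))` (the file name is a placeholder) returns a `Counter` keyed by handler name. If HEARTBEAT is the only key, no job submission requests reached the service.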
Once I can get this configuration working manually, I will start working on a script to automate this process for multiple hosts to make things a little easier.
David
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: coaster-logs.tar.gz
Type: application/x-gzip
Size: 48618 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/swift-devel/attachments/20100831/b9ddc1d4/attachment.bin>