[Swift-devel] Coaster persistent service issues - logs

Mihael Hategan hategan at mcs.anl.gov
Tue Sep 7 00:31:35 CDT 2010


-nosec should be fixed in cog r2879.

That means that the service won't complain about missing credentials,
and swift should be able to submit to it as long as you say http:// or
tcp:// in the url.
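
A quick way to check a sites.xml against that rule, as a minimal sketch
only: the function name and the sample fragments below are hypothetical,
the host is the one from the quoted report, and no port is shown because
none is given in this thread. It only inspects the url attribute of each
<execution> element and says nothing about how swift itself parses it.

```python
import xml.etree.ElementTree as ET

# Schemes that, per the note above, should let swift submit to a -nosec service.
PLAIN_SCHEMES = ("http://", "tcp://")


def plain_scheme_by_pool(sites_xml):
    """Map each pool handle to True if its execution url starts with
    http:// or tcp://, and to False otherwise."""
    root = ET.fromstring(sites_xml)
    result = {}
    for pool in root.iter("pool"):
        execution = pool.find("execution")
        if execution is None:
            continue  # pool without an execution element: nothing to check
        url = execution.get("url", "").strip()
        result[pool.get("handle")] = url.startswith(PLAIN_SCHEMES)
    return result


# Hypothetical fragments for illustration; not taken from a working setup.
WITH_SCHEME = (
    '<config><pool handle="churn">'
    '<execution provider="coaster-persistent" '
    'url="http://churn.mcs.anl.gov" jobmanager="local:local"/>'
    '</pool></config>'
)
WITHOUT_SCHEME = WITH_SCHEME.replace("http://", "")

if __name__ == "__main__":
    print(plain_scheme_by_pool(WITH_SCHEME))
    print(plain_scheme_by_pool(WITHOUT_SCHEME))
```

A pool that maps to False has a bare host name (or some other scheme) in
its url and would need http:// or tcp:// added before trying it against
a -nosec service.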

I removed a small piece of code that supplied a default and was
interfering with some of the other logic. Long story short, please
also test the normal coasters to ensure that there are no unintended
consequences there.

Mihael

On Tue, 2010-08-31 at 11:14 -0600, Michael Wilde wrote:
> ----- Forwarded Message -----
> From: "David Kelly" <dk0966 at cs.ship.edu>
> To: "Michael Wilde" <wilde at mcs.anl.gov>
> Cc: "Jonathan Monette" <jon.monette at gmail.com>, "Justin Wozniak" <wozniak at mcs.anl.gov>, "Mihael Hategan" <hategan at mcs.anl.gov>
> Sent: Sunday, August 29, 2010 10:32:07 AM GMT -06:00 US/Canada Central
> Subject: Re: change skype call time today - and some to-do notes
> 
> Hello all, 
> 
> A few things I've noticed while trying out various coaster configurations this weekend: 
> 
> Had similar problems with the -nosec option. Here is the output I got: 
> 
> davidk@churn:~/cog/modules/swift/dist/swift-svn/bin$ coaster-service -nosec 
> Error loading credential: [JGLOBUS-10] Expired credentials (DC=org,DC=doegrids,OU=People,CN=David Kelly 16830,CN=753950975). 
> Error loading credential 
> org.globus.gsi.GlobusCredentialException: [JGLOBUS-10] Expired credentials (DC=org,DC=doegrids,OU=People,CN=David Kelly 16830,CN=753950975). 
> at org.globus.gsi.GlobusCredential.verify(GlobusCredential.java:321) 
> at org.globus.gsi.GlobusCredential.reloadDefaultCredential(GlobusCredential.java:593) 
> at org.globus.gsi.GlobusCredential.getDefaultCredential(GlobusCredential.java:575) 
> at org.globus.cog.abstraction.coaster.service.CoasterPersistentService.main(CoasterPersistentService.java:73) 
> 
> I tested multiple connections in coasters-persistent+active mode. That seemed to work fine, with each new swift connection waiting for the previous one to finish. I noticed some Java exceptions in the log files: 
> 
> org.globus.cog.karajan.workflow.service.ReplyTimeoutException 
> at org.globus.cog.karajan.workflow.service.commands.Command.handleReplyTimeout(Command.java:280) 
> at org.globus.cog.karajan.workflow.service.commands.Command$Timeout.run(Command.java:285) 
> at java.util.TimerThread.mainLoop(Timer.java:512) 
> at java.util.TimerThread.run(Timer.java:462) 
> Channel IOException 
> java.net.SocketException: Broken pipe 
> at java.net.SocketOutputStream.socketWrite0(Native Method) 
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92) 
> at java.net.SocketOutputStream.write(SocketOutputStream.java:124) 
> at org.globus.gsi.gssapi.net.impl.GSIGssOutputStream.writeToken(GSIGssOutputStream.java:61) 
> at org.globus.gsi.gssapi.net.impl.GSIGssOutputStream.flush(GSIGssOutputStream.java:45) 
> at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Sender.send(AbstractStreamKarajanChannel.java:298) 
> at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Sender.run(AbstractStreamKarajanChannel.java:247) 
> 
> Not sure if this is important or not, but I will include the logs. 
> 
> I can't quite get coasters-persistent working in passive mode. I am not sure if this is a configuration issue, a swift issue, or operator error. Here is what I am trying to do: 
> 
> sites.xml: 
> <config> 
> <pool handle="churn"> 
> <execution provider="coaster-persistent" url="churn.mcs.anl.gov" jobmanager="local:local"/> 
> <profile namespace="globus" key="workerManager">passive</profile> 
> <profile namespace="globus" key="workersPerNode">1</profile> 
> <profile namespace="globus" key="maxTime">3500</profile> 
> <profile namespace="globus" key="slots">1</profile> 
> <profile namespace="globus" key="nodeGranularity">1</profile> 
> <profile namespace="globus" key="maxNodes">1</profile> 
> <filesystem provider="local" url="none" /> 
> <profile key="jobThrottle" namespace="karajan">.31</profile> 
> <profile namespace="karajan" key="initialScore">10000</profile> 
> <workdirectory>/home/davidk/swiftwork/churn</workdirectory> 
> </pool> 
> </config> 
> 
> I run grid-proxy-init on the submit host (login*.mcs.anl.gov) and on the remote host (churn.mcs.anl.gov). From churn I run coaster-service. When I run the catsn.swift script on login, I notice this kind of output from coaster-service: 
> 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> GSSSChannel-null(1)[7990655: {}] REQ: Handler(HEARTBEAT) 
> 
> These messages seem to repeat several times per second and never stop. The script never finishes. The configurations and log files are attached. 
> 
> Once I can get this configuration working manually, I will start working on a script to automate this process for multiple hosts to make things a little easier. 
> 
> David 
> 
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel




