[Swift-devel] coaster-service warning messages
Tim Armstrong
tim.g.armstrong at gmail.com
Wed Aug 27 21:22:51 CDT 2014
I've been experimenting with the coaster C client on Midway and have been
seeing a lot of warning messages from the coaster service. Is this a known
problem, or could someone offer guidance on how to resolve it?
I'm running the latest GitHub master version.
I've been starting the coaster service with active coasters:
export GLOBUS_HOSTNAME=172.25.180.72
coaster-service -nosec -p 65001
I then have the coaster C client connect to it and submit jobs, with the
following settings: jobManager=slurm,jobQueue=sandyb,tasksperworker=16.
The jobs have mostly been completing successfully, but I've also seen some
instability and failures. I don't know if it's related to the many
warnings in the service log (attached), e.g. this one:
2014-08-28 01:06:34,174+0000 WARN TaskNotifier Client could not properly process notification: null
java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:147)
    at org.globus.cog.coaster.channels.Sender.send(Sender.java:149)
    at org.globus.cog.coaster.channels.Sender.run(Sender.java:85)
or this one:
2014-08-28 01:06:51,872+0000 WARN RemoteLogger Failed to send remote log message: BLOCK_SHUTDOWN id=0828-0601150-000000
org.globus.cog.coaster.channels.ChannelException: Channel died and no contact available
    at org.globus.cog.coaster.channels.ChannelManager.connect(ChannelManager.java:253)
    at org.globus.cog.coaster.channels.ChannelManager.reserveChannel(ChannelManager.java:274)
    at org.globus.cog.coaster.channels.ChannelManager.reserveChannel(ChannelManager.java:245)
    at org.globus.cog.abstraction.coaster.rlog.RemoteLogger.log(RemoteLogger.java:53)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:303)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdownIfEmpty(Block.java:236)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.suspend(Block.java:576)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.removeIdleBlocks(BlockQueueProcessor.java:472)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:750)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:161)
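To see which warnings dominate the log, a per-logger tally might help. The
following is only a rough sketch, not part of the coaster tooling: it assumes
every log entry starts with the timestamp, level, and logger name in the
layout of the two samples quoted here, and that stack-trace lines do not.
The names count_warnings and summarize are made up for illustration.

```python
import re
from collections import Counter

# Assumed entry layout, inferred from the quoted samples:
#   2014-08-28 01:06:34,174+0000 WARN TaskNotifier <message>
ENTRY = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}\s+(\w+)\s+(\S+)\s"
)

def count_warnings(lines):
    """Count WARN entries per logger name (e.g. TaskNotifier)."""
    counts = Counter()
    for line in lines:
        m = ENTRY.match(line)
        # Stack-trace and continuation lines do not match and are skipped.
        if m and m.group(1) == "WARN":
            counts[m.group(2)] += 1
    return counts

def summarize(path):
    """Print loggers ordered by number of WARN entries, most frequent first."""
    with open(path, errors="replace") as f:
        for logger, n in count_warnings(f).most_common():
            print(f"{n:6d}  {logger}")
```

Calling summarize("cps-2014-08-28_01-06-05.log") on the attached log would
show whether the TaskNotifier or RemoteLogger warnings account for most of
the noise.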
Any guidance or thoughts would be appreciated.
- Tim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cps-2014-08-28_01-06-05.log
Type: text/x-log
Size: 161663 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/swift-devel/attachments/20140827/dbfb7e4f/attachment.bin>