[Swift-devel] coaster-service warning messages

Tim Armstrong tim.g.armstrong at gmail.com
Wed Aug 27 21:22:51 CDT 2014


I've been experimenting with the coaster C client on midway and have been
seeing a lot of warning messages from the coaster service.  I'd like to
know whether this is a known problem, or to get some guidance on how to
resolve it.

I'm running the latest github master version.

I've been starting the coaster service with active coasters:

  export GLOBUS_HOSTNAME=172.25.180.72
  coaster-service -nosec -p 65001

I then have the coaster C client connect to it and submit jobs, with the
following settings: jobManager=slurm,jobQueue=sandyb,tasksperworker=16.

The jobs have mostly been completing successfully, but I've also seen some
instability and failures.  I don't know if it's related to the many
warnings in the service log (attached), e.g. this one:

2014-08-28 01:06:34,174+0000 WARN  TaskNotifier Client could not properly process notification: null
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:147)
        at org.globus.cog.coaster.channels.Sender.send(Sender.java:149)
        at org.globus.cog.coaster.channels.Sender.run(Sender.java:85)

or this one:

2014-08-28 01:06:51,872+0000 WARN  RemoteLogger Failed to send remote log message: BLOCK_SHUTDOWN id=0828-0601150-000000
org.globus.cog.coaster.channels.ChannelException: Channel died and no contact available
        at org.globus.cog.coaster.channels.ChannelManager.connect(ChannelManager.java:253)
        at org.globus.cog.coaster.channels.ChannelManager.reserveChannel(ChannelManager.java:274)
        at org.globus.cog.coaster.channels.ChannelManager.reserveChannel(ChannelManager.java:245)
        at org.globus.cog.abstraction.coaster.rlog.RemoteLogger.log(RemoteLogger.java:53)
        at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:303)
        at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdownIfEmpty(Block.java:236)
        at org.globus.cog.abstraction.coaster.service.job.manager.Block.suspend(Block.java:576)
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.removeIdleBlocks(BlockQueueProcessor.java:472)
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:750)
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:161)
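
To gauge how widespread these warnings are, one option is to tally the WARN
entries in the service log by logger name.  The following is only a minimal
sketch: it assumes every entry starts with the timestamp, level and logger
layout seen in the two excerpts above, and the names LOG_LINE and
tally_warnings are made up for illustration, not part of the coaster code.

```python
import re
from collections import Counter

# Assumed entry layout, taken from the excerpts above:
#   2014-08-28 01:06:34,174+0000 WARN  TaskNotifier Client could not ...
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}\s+"
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s+(?P<message>.*)$"
)


def tally_warnings(lines):
    """Count WARN entries per logger name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Stack-trace lines do not start with a timestamp, so they are skipped.
        if match and match.group("level") == "WARN":
            counts[match.group("logger")] += 1
    return counts


# Sample lines copied from the two excerpts above.
sample = [
    "2014-08-28 01:06:34,174+0000 WARN  TaskNotifier Client could not "
    "properly process notification: null",
    "java.net.SocketException: Broken pipe",
    "        at java.net.SocketOutputStream.socketWrite0(Native Method)",
    "2014-08-28 01:06:51,872+0000 WARN  RemoteLogger Failed to send remote "
    "log message: BLOCK_SHUTDOWN id=0828-0601150-000000",
    "org.globus.cog.coaster.channels.ChannelException: Channel died and no "
    "contact available",
]

warning_counts = tally_warnings(sample)
for logger, count in warning_counts.most_common():
    print(logger, count)
```

Passing an open file object for the attached log in place of sample would
give per-logger counts for the whole run, which could help show whether the
warnings cluster around block shutdown or are spread across the session.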


Any guidance or thoughts would be appreciated.

- Tim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cps-2014-08-28_01-06-05.log
Type: text/x-log
Size: 161663 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/swift-devel/attachments/20140827/dbfb7e4f/attachment.bin>