[Swift-user] Block task failed: Connection to worker lost
Ozik, Jonathan
jozik at anl.gov
Wed Dec 3 10:50:40 CST 2014
Hi all,
I’m trying to run a large set of simulations on Midway using Swift 0.95-RC5.
768 of the 2187 tasks completed successfully and then I got the exception:
exception @ swift-int.k, line: 530
Caused by: Block task failed: Connection to worker lost
org.globus.cog.coaster.TimeoutException: Channel timed out. lastTime=141203-145449.325, now=141203-145649.844, channel=TCPChannel [type: server, contact: 1202-5410010-000072-000000]
    at org.globus.cog.coaster.channels.AbstractCoasterChannel.checkTimeouts(AbstractCoasterChannel.java:133)
    at org.globus.cog.coaster.channels.AbstractCoasterChannel$1.run(AbstractCoasterChannel.java:124)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)
Progress: Wed, 03 Dec 2014 14:59:51+0000 Submitted:651 Failed:6 Finished successfully:768 Failed but can retry:762
Progress: Wed, 03 Dec 2014 14:59:52+0000 Submitted:651 Failed:44 Finished successfully:768 Failed but can retry:724
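For reference, the lastTime and now values in the TimeoutException can be compared to see how long the channel was idle before it was declared lost. The following is only a minimal sketch: it assumes the timestamps use a yyMMdd-HHmmss.SSS layout, which is my reading of the log line and not something I have confirmed, and the helper name idle_seconds is made up for illustration.

```python
from datetime import datetime

# Assumption: coaster channel timestamps look like yyMMdd-HHmmss.SSS,
# e.g. "141203-145449.325". This layout is inferred from the log line.
STAMP_FORMAT = "%y%m%d-%H%M%S.%f"


def idle_seconds(last_time: str, now: str) -> float:
    """Return the number of seconds between two channel timestamps."""
    last = datetime.strptime(last_time, STAMP_FORMAT)
    current = datetime.strptime(now, STAMP_FORMAT)
    return (current - last).total_seconds()


# Values copied from the exception message above.
gap = idle_seconds("141203-145449.325", "141203-145649.844")
print(f"channel idle for {gap:.3f} s")
```

Under that assumption the gap is a little over two minutes, which suggests the worker had been silent for about that long when the channel timed out.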
After that, the process seems to have stopped.
Which log file would be most helpful for diagnosing this?
Jonathan