<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Dec 3, 2014, at 11:04 AM, Yadu Nand Babuji &lt;<a href="mailto:yadunand@uchicago.edu" class="">yadunand@uchicago.edu</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class="">Hi Jonathan,<br class=""><br class="">The issue you are seeing sounds pretty close to what David reported a <br class="">while back.<br class="">Could you send us a tarball of your run directory from a failed run?<br class=""><br class="">Could you also check whether you've set lowOverAllocation and <br class="">highOverAllocation in your sites definition?<br class=""><br class="">Thanks,<br class="">Yadu<br class=""><br class="">On 12/03/2014 10:50 AM, Ozik, Jonathan wrote:<br class=""><blockquote type="cite" class="">Hi all,<br class=""><br class="">I’m trying to run a large set of simulations on Midway using Swift 0.95-RC5.<br class="">768 of the 2187 tasks completed successfully, and then I got this exception:<br class=""><br class=""><span class="Apple-tab-span" style="white-space:pre">   </span>exception @ swift-int.k, line: 530<br class="">Caused by: Block task failed: Connection to worker lost<br class="">org.globus.cog.coaster.TimeoutException: Channel timed out. 
lastTime=141203-145449.325, now=141203-145649.844, channel=TCPChannel [type: server, contact: 1202-5410010-000072-000000]<br class=""><span class="Apple-tab-span" style="white-space:pre">       </span>at org.globus.cog.coaster.channels.AbstractCoasterChannel.checkTimeouts(AbstractCoasterChannel.java:133)<br class=""><span class="Apple-tab-span" style="white-space:pre">       </span>at org.globus.cog.coaster.channels.AbstractCoasterChannel$1.run(AbstractCoasterChannel.java:124)<br class=""><span class="Apple-tab-span" style="white-space:pre">       </span>at java.util.TimerThread.mainLoop(Timer.java:555)<br class=""><span class="Apple-tab-span" style="white-space:pre">      </span>at java.util.TimerThread.run(Timer.java:505)<br class=""><br class="">Progress: Wed, 03 Dec 2014 14:59:51+0000  Submitted:651  Failed:6  Finished successfully:768  Failed but can retry:762<br class="">Progress: Wed, 03 Dec 2014 14:59:52+0000  Submitted:651  Failed:44  Finished successfully:768  Failed but can retry:724<br class=""><br class="">And the process seems to have stopped.<br class=""><br class="">What log file would be helpful for diagnosing this?<br class=""><br class="">Jonathan<br class=""><br class=""><br class="">_______________________________________________<br class="">Swift-user mailing list<br class=""><a href="mailto:Swift-user@ci.uchicago.edu" class="">Swift-user@ci.uchicago.edu</a><br class="">https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user<br class=""></blockquote></div></blockquote></div><br class=""></div></body></html>