[Swift-devel] Coaster run to UC3 dies with channel timeout
Michael Wilde
wilde at mcs.anl.gov
Tue Mar 12 08:51:01 CDT 2013
This demo (for the OSG all-hands meeting) had been running fairly reliably, sending hundreds to a few thousand 30-second tasks to UC3, with flocking to OSG and other pools.
But I just got a failure, so it looks like sporadic problems remain.
I'm running the latest revision of Swift 0.94.
The log is on Midway in:
/home/wilde/osgdemo/modis/svn/swiftdemo/test.uc3
-rw-rw-r-- 1 wilde wilde 11632001 Mar 12 08:42 saved/modis-20130312-1335-p30ylps9.log
I'll file a ticket once we get a sense of the frequency.
- Mike
Progress: time: Tue, 12 Mar 2013 13:42:28 +0000 Selecting site:461 Stage in:10 Submitted:782 Active:204 Stage out:4 Finished successfully:1539
Progress: time: Tue, 12 Mar 2013 13:42:29 +0000 Selecting site:453 Stage in:6 Submitted:779 Active:215 Finished successfully:1547
Progress: time: Tue, 12 Mar 2013 13:42:30 +0000 Selecting site:439 Stage in:16 Submitting:1 Submitted:776 Active:204 Stage out:2 Finished successfully:1562
Execution failed:
Exception in perl:
Arguments: [getlanduse.pl, input/h06v33.rgb]
Host: uc3
Directory: modis-20130312-1335-p30ylps9/jobs/7/perl-7s7qvh6l
Caused by:
Task failed: null
org.globus.cog.karajan.workflow.service.TimeoutException: Channel timed out. lastTime=130312-084030.762, now=130312-084231.763, channel=TCP-0312-3508510-000259-000000
at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:131)
at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel$1.run(AbstractKarajanChannel.java:122)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
getLandUse, modis.swift, line 24
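For reference, the idle gap implied by the exception can be worked out from its lastTime and now fields. The sketch below is only an illustration: the timestamp layout (yyMMdd-HHmmss.SSS) is an assumption inferred from the values in the message, not taken from the CoG source, and idle_seconds is a hypothetical helper name.

```python
import re
from datetime import datetime

# Exception line copied from the trace above.
LINE = (
    "org.globus.cog.karajan.workflow.service.TimeoutException: Channel timed out. "
    "lastTime=130312-084030.762, now=130312-084231.763, "
    "channel=TCP-0312-3508510-000259-000000"
)

# Assumed layout: two-digit year, month, day, then HHMMSS and milliseconds.
STAMP_FORMAT = "%y%m%d-%H%M%S.%f"


def idle_seconds(line):
    """Return the seconds between lastTime and now in a timeout message."""
    match = re.search(r"lastTime=([\d.-]+), now=([\d.-]+)", line)
    if match is None:
        raise ValueError("no lastTime/now pair found")
    last, now = (datetime.strptime(s, STAMP_FORMAT) for s in match.groups())
    return (now - last).total_seconds()


print(idle_seconds(LINE))
```

Under that assumption the channel had been idle for about 121 seconds, just over two minutes, when the check fired. The now value (08:42:31) also lines up with the last Progress line (13:42:30 +0000) if the exception timestamps are in local CDT time.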
swift$ pwd
/home/wilde/osgdemo/modis/svn/swiftdemo/test.uc3
swift$ ls
cf input/ modis-20130312-1335-p30ylps9.0.rlog modis-20130312-1335-p30ylps9.log saved/ tc
getlanduse.pl* landuse/ modis-20130312-1335-p30ylps9.d/ run* swift.log uc3.xml
swift$ e ../save
swift$ save
swift$ ls saved
modis-20130312-1326-n9rofj6e.d/ modis-20130312-1329-f2a2eic4.log modis-20130312-1335-p30ylps9.log
modis-20130312-1326-n9rofj6e.log modis-20130312-1335-p30ylps9.0.rlog swift.log
modis-20130312-1329-f2a2eic4.d/ modis-20130312-1335-p30ylps9.d/
swift$ ls saved/modis-20130312-1335-p30ylps9.log
saved/modis-20130312-1335-p30ylps9.log
swift$ pwd; ls -l saved/modis-20130312-1335-p30ylps9.log
/home/wilde/osgdemo/modis/svn/swiftdemo/test.uc3
-rw-rw-r-- 1 wilde wilde 11632001 Mar 12 08:42 saved/modis-20130312-1335-p30ylps9.log
swift$
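One way to get a sense of the frequency would be to count the timeout message across the saved logs. This is a minimal sketch, assuming the text "Channel timed out" appears on one line per failure as in the trace above; count_timeouts and timeouts_per_log are hypothetical helper names, not part of Swift.

```python
# Message text taken from the exception above.
MARKER = "Channel timed out"


def count_timeouts(lines):
    """Return how many of the given lines contain the timeout message."""
    return sum(1 for line in lines if MARKER in line)


def timeouts_per_log(paths):
    """Map each log path to the number of timeout lines found in it."""
    counts = {}
    for path in paths:
        # errors="replace" keeps the scan going if a log has stray bytes.
        with open(path, errors="replace") as handle:
            counts[path] = count_timeouts(handle)
    return counts
```

From the test.uc3 directory this could be called as timeouts_per_log(glob.glob("saved/*.log")) after importing glob.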
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory