[Swift-devel] [Bug 210] job exceeding wallclock limit -- error is not reported by swift
skenny at uchicago.edu
Tue Jul 14 12:38:36 CDT 2009
Can you try resubmitting your test to Ranger?
---- Original message ----
>Date: Tue, 14 Jul 2009 06:29:00 -0500 (CDT)
>From: bugzilla-daemon at mcs.anl.gov
>Subject: [Swift-devel] [Bug 210] job exceeding wallclock limit -- error is not reported by swift
>To: swift-devel at ci.uchicago.edu
>
>https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=210
>
>--- Comment #1 from Ben Clifford <benc at hawaga.org.uk> 2009-07-14 06:29:00 ---
>This bug is rather ambiguously described.
>
>In non-bugzilla discussion it has been reported as:
>
>> well, for some reason, when a job hits wallclock and is killed by the JM, swift just keeps saying "active"
>
>This is not the behaviour that I observe with Swift against NCSA using the
>SwiftScript and configuration below, with Swift swift-r3006 cog-r2430. In
>that case, I see the job fail three times in a row and then the example
>SwiftScript fails, as should happen.
>
>Please clarify this bug.
>
>s.swift:
>
>$ cat s.swift
>type messagefile;
>
>app (messagefile t) greeting() {
> sleep "999s" stdout=@filename(t);
>}
>
>messagefile outfile <"hello.txt">;
>
>outfile = greeting();
>
>
>
>tc.data:
>
>$ cat tc.data
>cat: tc.data: No such file or directory
>benc at communicado:~/tmp-walltime/cog/modules/swift !1055
>$ cat dist/swift-svn/etc/tc.data
>#This is the transformation catalog.
>#
>#It comes pre-configured with a number of simple transformations with
>#paths that are likely to work on a linux box. However, on some systems,
>#the paths to these executables will be different (for example, sometimes
>#some of these programs are found in /usr/bin rather than in /bin)
>#
>#NOTE WELL: fields in this file must be separated by tabs, not spaces; and
>#there must be no trailing whitespace at the end of each line.
>#
># sitename transformation path INSTALLED platform profiles
>hg echo  /bin/echo  INSTALLED INTEL32::LINUX null
>hg cat   /bin/cat   INSTALLED INTEL32::LINUX null
>hg ls    /bin/ls    INSTALLED INTEL32::LINUX null
>hg grep  /bin/grep  INSTALLED INTEL32::LINUX null
>hg sort  /bin/sort  INSTALLED INTEL32::LINUX null
>hg sleep /bin/sleep INSTALLED INTEL32::LINUX null
>
>
>site definition:
>
><pool handle="hg" >
> <gridftp url="gsiftp://grid-hg.ncsa.teragrid.org" />
> <jobmanager universe="vanilla" url="grid-hg.ncsa.teragrid.org/jobmanager-pbs" major="2" />
> <workdirectory >/home/ac/benc</workdirectory>
> <profile namespace="globus" key="queue">debug</profile>
> <profile namespace="globus" key="maxwalltime">1</profile>
></pool>
>
>
>the output:
>
>Swift svn swift-r3006 cog-r2430
>
>RunID: 20090714-0616-dgktv8b3
>Progress:
>Progress: Stage in:1
>Progress: Submitted:1
>Progress: Submitted:1
>Progress: Submitted:1
>Progress: Active:1
>Progress: Active:1
>Progress: Active:1
>Progress: Active:1
>Progress: Checking status:1
>Progress: Stage in:1
>Progress: Submitted:1
>Progress: Submitted:1
>Progress: Active:1
>Progress: Active:1
>Progress: Active:1
>Progress: Checking status:1
>Progress: Submitted:1
>Progress: Submitted:1
>Progress: Submitted:1
>Progress: Active:1
>Progress: Active:1
>Progress: Active:1
>Progress: Checking status:1
>Execution failed:
> Exception in sleep:
>Arguments: [999s]
>Host: hg
>Directory: s-20090714-0616-dgktv8b3/jobs/8/sleep-8h82cndj
>stderr.txt:
>stdout.txt:
>----
>
>Caused by:
> No status file was found. Check the shared filesystem on hg
>