[Swift-devel] Jobs being aborted by PBS server on tg-grid.uc.teragrid.org
Michael Wilde
wilde at mcs.anl.gov
Sun Nov 4 21:38:44 CST 2007
I've reported this to TG and Ti on the chance that it's on the server
side. If nothing else, a PBS log may pinpoint what we're doing
wrong, if it's us or me.
The two runs below are in ~benc/swift-logs/wilde/
7:46 PM - run142
8:57 PM - run142
I've started to add a 'comment' file to my log dirs there to note the
reason, and on occasion I copy output placed in cwd to _output.
I'm also adding find or ls output to each dir when it's relevant and I
remember. I'm trying to automate more of this as I go.
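A minimal sketch of that automation, assuming Python is available on the submit host. The function name annotate_run_dir and the file names 'comment' and 'listing.txt' are illustrative choices, not part of Swift or of the existing log dirs:

```python
import os
from pathlib import Path


def annotate_run_dir(run_dir, comment):
    """Write a 'comment' file and a recursive file listing into run_dir.

    The names 'comment' and 'listing.txt' are hypothetical; adjust to taste.
    Returns the sorted list of relative file paths that was recorded.
    """
    run_dir = Path(run_dir)
    (run_dir / "comment").write_text(comment + "\n")

    # Rough equivalent of `find . -type f`, sorted for stable output.
    # The listing file itself is skipped so reruns give the same result.
    files = sorted(
        str(Path(root, name).relative_to(run_dir))
        for root, _dirs, names in os.walk(run_dir)
        for name in names
        if name != "listing.txt"
    )
    (run_dir / "listing.txt").write_text("\n".join(files) + "\n")
    return files
```

Calling annotate_run_dir("run142", "3 jobs aborted by PBS server") after a run would leave both files next to the Swift logs, so the reason and the directory contents are recorded together.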
- Mike
On 11/4/07 9:20 PM, Michael Wilde wrote:
> I'm starting to see more frequent problems like this.
> It happened once last night, to 3 consecutive jobs, and twice tonight,
> to 6 jobs.
>
> Ti, could you look in the PBS logs, possibly on the related node(s), and
> see if it looks like a problem on tg-uc or on our side?
>
> Thanks,
>
> Mike
>
>
> 11/3 8:05 PM - 3 failures
> Job IDs 1571647, 1571648, and 1571649
> 11/4 7:46 PM - 3 failures
> Job IDs 1572031, 1572033, and 1572034
> 11/4 8:56 - 8:57 PM
> Job IDs 1572040, 1572042, and 1572043
>
> All errors have the format below.
>
> Swift retries failing jobs 3 times, hence the groups of 3 above.
>
>
> -------- Original Message --------
> Subject: PBS JOB 1572043.tg-master.uc.teragrid.org
> Date: Sun, 4 Nov 2007 20:57:11 -0600 (CST)
> From: adm at tg-master.uc.teragrid.org (root)
> To: wilde at tg-grid1.uc.teragrid.org
>
> PBS Job Id: 1572043.tg-master.uc.teragrid.org
> Job Name: STDIN
> Aborted by PBS Server
> Job cannot be executed
> See Administrator for help