[Swift-devel] mystery runs on ucanl

Michael Wilde wilde at mcs.anl.gov
Tue Jul 29 14:43:09 CDT 2008


On 7/29/08 2:34 PM, skenny at uchicago.edu wrote:
>>>>> yes (see below) and SOME of the jobs in the workflow do
>>>>> complete when we submit the whole workflow to ucanl.
>>>> Indeed. It seems like roughly half of them work and the other half
>>>> break. Could this be an ia32/ia64 issue? Like python being compiled for
>>>> the wrong platform or something?
> 
> well, i thought that sounded pretty likely (apparently some
> jobs were going to 32-bit machines even though 64 was
> specified in the sites file).

Is it possible that the property was misspelled? I recall some issues 
with this profile attribute in the past, when you first started running 
Swift last Oct-Nov.

> however, i've just sent a batch
> to the site and am getting failures on 64-bit nodes as
> well (and on varying nodes, so not just 1 or 2 bum
> nodes)...because there is still this odd behavior of jobs
> remaining in the queue even after they've been killed, i'm
> tempted to blame pbs (gotta blame someone ;) also, i'm getting
> emails from pbs like this:
> 
> PBS Job Id: 1759910.tg-master.uc.teragrid.org
> Job Name:   STDIN
> Exec host:  tg-c054/0
> Aborted by PBS Server 
> Job cannot be executed
> See Administrator for help
> 
> and the swift log simply gives "Failed Error code: 271,
> ProcessDied"

I also recall some similar issues on UC Teragrid last Nov (2007) as we 
were preparing Angle runs for SC07. Ti was involved in that debugging 
and had given us PBS diagnostic commands to capture log data on the 
problem at the time.  Ti, can you recall the details?
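
One possible reading of the 271 code, offered as an assumption rather than a
diagnosis: PBS/Torque conventionally reports an exit status above 256 as 256
plus the number of the signal that killed the job, and a negative status when
the job could not be started at all. Under that convention 271 would be
signal 15 (SIGTERM). Whether the "Error code" in the Swift log is that raw PBS
exit status is not confirmed in this thread. A minimal Python sketch of the
decoding (the function name is hypothetical):

```python
import signal


def decode_pbs_exit_status(status):
    """Interpret a PBS-style exit status as (kind, value, signal_name).

    Assumed convention (not confirmed for this site):
      status > 256  -> job killed by signal number (status - 256)
      status < 0    -> PBS could not execute the job
      otherwise     -> ordinary exit code of the job's top process
    """
    if status > 256:
        signum = status - 256
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "UNKNOWN"
        return ("signal", signum, name)
    if status < 0:
        return ("pbs_error", status, None)
    return ("exit", status, None)


if __name__ == "__main__":
    print(decode_pbs_exit_status(271))
```

If that reading holds, the jobs were terminated from outside rather than
failing on their own, which would fit the "Aborted by PBS Server" emails.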

- Mike

> 
> hence, i'm copying help at teragrid on this...if there are any
> other tests i can run to try and narrow down the bug let me
> know. i've tried submitting several globusrun-ws jobs but
> haven't gotten an error that way as of yet. 


