[Swift-devel] Swift run to Falkon hanging

Ben Clifford benc at hawaga.org.uk
Mon Oct 29 12:39:37 CDT 2007


Hmm.

You have a bunch of jobs in progress at the end of the log file.

I presume you have lazy errors turned off (that is the default). I'm not 
sure how eager the eager error handling is - Mihael might know off the top 
of his head, or I can have a poke. If an execute2 fails with 
APPLICATION_EXCEPTION, does that kill the whole workflow? I would have 
thought not but I realise I am not sure.

http://www.ci.uchicago.edu/~benc/report-awf2-20071029-0831-nouanbe8/

Specifically:

> Breakdown of last known status for execute2s:
>
>   1 APPLICATION_EXCEPTION
>   5 JOB_START

1 execute2 had failed, but 5 were still in progress (I think 4 of the 
original submissions and 1 retry).
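
For illustration, a breakdown like the one above can be produced by keeping 
only the last status seen for each execute2 and then counting. The Python 
below is a minimal sketch of that idea. It is not the code behind the report, 
and the (id, status) pairs and the job ids are made up for the example rather 
than taken from the real Swift log format.

```python
from collections import Counter


def last_status_breakdown(events):
    """Count execute2s by their last known status.

    `events` is an iterable of (execute2_id, status) pairs in log order.
    This input shape is an assumption for illustration only.
    """
    last = {}
    for execute2_id, status in events:
        # A later event for the same id overwrites the earlier one.
        last[execute2_id] = status
    return Counter(last.values())


# Hypothetical events: 5 submissions, one fails, and its retry starts
# as a separate execute2.
events = [
    ("job-0", "JOB_START"),
    ("job-1", "JOB_START"),
    ("job-2", "JOB_START"),
    ("job-3", "JOB_START"),
    ("job-4", "JOB_START"),
    ("job-0", "APPLICATION_EXCEPTION"),
    ("job-0-retry", "JOB_START"),
]

breakdown = last_status_breakdown(events)
for status, count in sorted(breakdown.items()):
    print(f"{count:3d} {status}")
```

Any execute2 whose last status is still JOB_START at the end of the log never 
reported completion or failure, which is what "in progress" means here.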



On Mon, 29 Oct 2007, Michael Wilde wrote:

> The attached logfile with my comments and questions (full logs in
> ~benc/swift-logs/wilde/run117) is for a small 5-job Angle test with about
> 1-week-old code to try to get a stable falkon config back on uc-teragrid.
> 
> Ioan has confirmed that there are issues causing the provisioner on the ia32
> login host to fail to connect to the service correctly. I'm still re-working
> around those issues.
> 
> I'm trying to just get this config, which worked moderately well last week,
> back to a working state before I jump to the latest code base.
> 
> I may give up after this last attempt, and upgrade both swift and falkon.
> 
> But my last run puzzles me. Can people take a look at the attached log with
> questions and comments and help me understand what's happening?
> 
> Basically I get one app exception, things hang, and I don't see what in the
> log, if anything, is pointing to the cause.
> 
> Thanks,
> 
> Mike
> 


