[Swift-devel] [Bug 72] Campaign for scaling wf up to 244 molecules

bugzilla-daemon at mcs.anl.gov bugzilla-daemon at mcs.anl.gov
Sat Jun 30 17:09:11 CDT 2007


http://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=72





------- Comment #5 from hategan at mcs.anl.gov  2007-06-30 17:09 -------
First of all, can you commit the changes to SVN?

(In reply to comment #4)
> We fixed the potential synchronization issue
> Mihael pointed out.

There were two.

> We also fixed a badly handled exception we had in the
> Falkon provider, that would give up very easily and exit the Falkon provider
> thread in case of an exception, even if it wasn't a fatal one.  This time
> around, we changed the logic to simply print the exception, if there were any,
> and not exit the Falkon provider, just continue.  Personally, I think this
> logic on handling exceptions in the Falkon provider was causing the Falkon
> provider to exit prematurely, and hence not send any more tasks to Falkon...

I can't find anything in the provider code that fits that description.
Can you be more specific? If the provider was setting the status of the task to
failed, then it doesn't matter: Swift retries failed tasks.
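
To make the distinction concrete, here is a minimal sketch of the behaviour being discussed. It is not the actual Falkon provider code; the names ProviderLoopSketch, Task, Status and drain are hypothetical and only illustrate the idea. A non-fatal exception marks the one affected task as failed, so that the caller can retry it, and the loop carries on with the remaining tasks instead of exiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from the Falkon provider.
public class ProviderLoopSketch {
    enum Status { SUBMITTED, COMPLETED, FAILED }

    static class Task {
        final String id;
        final Runnable work;
        volatile Status status = Status.SUBMITTED;

        Task(String id, Runnable work) {
            this.id = id;
            this.work = work;
        }
    }

    // Processes every queued task. A RuntimeException fails only the task
    // that raised it; the loop keeps going with the next task.
    static void drain(BlockingQueue<Task> queue) {
        Task task;
        while ((task = queue.poll()) != null) {
            try {
                task.work.run();
                task.status = Status.COMPLETED;
            } catch (RuntimeException e) {
                e.printStackTrace();          // report it, do not swallow it
                task.status = Status.FAILED;  // lets the caller retry the task
            }
        }
    }

    // Runs three tasks, the second of which throws, and returns their statuses.
    static List<Status> demo() {
        BlockingQueue<Task> queue = new LinkedBlockingQueue<>();
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("t1", () -> { }));
        tasks.add(new Task("t2", () -> {
            throw new IllegalStateException("transient failure");
        }));
        tasks.add(new Task("t3", () -> { }));
        queue.addAll(tasks);

        drain(queue);

        List<Status> statuses = new ArrayList<>();
        for (Task t : tasks) {
            statuses.add(t.status);
        }
        return statuses;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the task after the failing one still completes, and the failed task ends up with an explicit FAILED status rather than being silently dropped.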

> note that Swift was setting the status of tasks submitted to the Falkon
> provider in a separate thread,

Swift does not set status of tasks. That's what the provider is supposed to do.

> which was not necessarily exiting when the Falkon
> provider was, and hence we had the scenario in which Swift thought it sent out
> more tasks than Falkon really saw. 

Can you be more specific? If there is a problem in Swift, we need to fix it,
but your comment is too vague.

> 
> Now, the issue that I think stopped this experiment.  On the console of Swift,
> the last thing that it printed was a "stack overflow error"; I don't think this
> printed in the logs, just on the console.

Without the stack trace, the information is not very useful.
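
As an illustration of how such a trace could be captured so that it ends up in a log rather than only on the console, here is a minimal sketch. It assumes the overflow can be caught at some outer call site; the class StackTraceCapture and its methods are hypothetical and are not part of Swift or Falkon.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical sketch; not taken from Swift or the Falkon provider.
public class StackTraceCapture {
    // Unbounded recursion, used here only to provoke a StackOverflowError.
    static int recurse(int n) {
        return recurse(n + 1) + 1;
    }

    // Renders a throwable's stack trace into a String suitable for a log file.
    static String traceOf(Throwable t) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        t.printStackTrace(writer);
        writer.flush();
        return buffer.toString();
    }

    // Catches the overflow at an outer frame and returns the rendered trace.
    static String captureOverflow() {
        try {
            recurse(0);
            return "";
        } catch (StackOverflowError e) {
            return traceOf(e);
        }
    }

    public static void main(String[] args) {
        String trace = captureOverflow();
        // Print only the first line; the full trace can be very long.
        System.out.println(trace.split("\n")[0]);
    }
}
```

The repeating frames in the captured trace are what identify the recursive call path, which is the information the error message alone does not give.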

> 
> Ioan
> 

