[Swift-user] Looking for the cause of failure
Andriy Fedorov
fedorov at bwh.harvard.edu
Sat Jan 30 22:10:27 CST 2010
On Sat, Jan 30, 2010 at 22:46, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> In ~/.globus/coasters you will find a bunch of worker logs. If you can
> identify the ones for your run (based perhaps on the timestamp on the
> files), they may contain the reason for the failure.
>
Strangely, I don't have worker logs for these executions -- the latest
are from Jan 18.
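
For reference, a minimal sketch of how one might check which files under
~/.globus/coasters were modified around the time of a run. It matches on
modification time only and makes no assumption about how the worker logs are
named; the helper name recent_files and the 24-hour window are illustrative
choices, not part of Swift or the coaster service.

```python
import time
from pathlib import Path


def recent_files(directory, since_epoch):
    """Return (mtime, path) pairs for regular files modified at or after
    since_epoch (seconds since the Unix epoch), newest first."""
    root = Path(directory).expanduser()
    if not root.is_dir():
        # Nothing to report if the directory does not exist.
        return []
    hits = []
    for entry in root.iterdir():
        if entry.is_file():
            mtime = entry.stat().st_mtime
            if mtime >= since_epoch:
                hits.append((mtime, entry))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits


if __name__ == "__main__":
    # Assumption: the run of interest started within the last 24 hours.
    cutoff = time.time() - 24 * 3600
    for mtime, path in recent_files("~/.globus/coasters", cutoff):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        print(stamp, path)
```

An empty result for the window of the failed run would match what is described
above: no worker logs were written for these executions.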
>> Can anybody explain what happened? The same workflow ran earlier, but
>> with fewer (2) workers per node.
>
> Does it work if you set workers per node to 2 again? If yes, that may be
> an indication that the workers per node setting causes a problem, and
> that's a stronger statement than "it doesn't work right now".
>
I will try, and let you know. If this is indeed the case, is there any
particular reason why it may not work for 4 workers per node?
As Mike pointed out, the nodes actually have 8 cores.