[Swift-user] Looking for the cause of failure

Andriy Fedorov fedorov at bwh.harvard.edu
Sat Jan 30 22:10:27 CST 2010


On Sat, Jan 30, 2010 at 22:46, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> In ~/.globus/coasters you will find a bunch of worker logs. If you can
> identify the ones for your run (based perhaps on the timestamp on the
> files), they may contain the reason for the failure.
>

Strangely, I don't have worker logs for these executions -- the latest
are from Jan 18.
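
For reference, here is a minimal sketch of how I would check for logs by
timestamp. It assumes only that the worker logs are ordinary files directly
under ~/.globus/coasters; the helper name recent_logs and the 24-hour window
are my own choices for illustration, not anything Swift provides.

```python
import time
from pathlib import Path


def recent_logs(log_dir, since_epoch):
    """Return (mtime, path) pairs for regular files in log_dir that were
    modified at or after since_epoch, newest first."""
    directory = Path(log_dir).expanduser()
    if not directory.is_dir():
        # No such directory: nothing to report.
        return []
    hits = []
    for entry in directory.iterdir():
        if entry.is_file():
            mtime = entry.stat().st_mtime
            if mtime >= since_epoch:
                hits.append((mtime, entry))
    # Sort on the timestamp only, newest first.
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits


if __name__ == "__main__":
    # Hypothetical window: anything touched in the last 24 hours.
    cutoff = time.time() - 24 * 3600
    for mtime, path in recent_logs("~/.globus/coasters", cutoff):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        print(stamp, path)
```

An empty result for the window of the failed run would match what I see
here: nothing newer than Jan 18.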

>> Can anybody explain what happened? The same workflow ran earlier, but
>> with fewer (2) workers per node.
>
> Does it work if you set workers per node to 2 again? If yes, that may be
> an indication that the workers per node setting causes a problem, and
> that's a stronger statement than "it doesn't work right now".
>

I will try that and let you know. If this is indeed the case, is there any
particular reason why it may not work for 4 workers per node?

As Mike pointed out, the nodes actually have 8 cores.
