[Swift-devel] Swift issues and next steps on OOPS app
Michael Wilde
wilde at mcs.anl.gov
Mon Apr 6 09:13:10 CDT 2009
was: Re: status update
On 4/4/09 7:52 PM, Glen Hocky wrote:
> Things seem to be kind of working on all machines (including ranger,
> which picked up some speed) but not totally.
So for ranger at the moment we can run default params and hope for 640
cores at a time. We should queue up several science runs of full-scale
rounds, and assess the results and run times.
> Problems to investigate this week:
> swift choking after running lots of jobs successfully (should probably
> just ignore these things)
I'm not sure which errors you mean here - let's examine them first. Do
you mean the "successfully retried" errors?
> swift not balancing load across different sites (dumps all ones for my
> teragrid sites file onto one site, grr!)
Can you send a log of this to the Swift developers? They need that in
order to look at this problem.
I will do a sanity test of WS-GRAM with coasters on abe and queenbee. If
it works we should expand our science runs there.
These are good things to do today while BG/P is down.
- Mike
>
> Glen