[Swift-devel] Swift issues and next steps on OOPS app

Michael Wilde wilde at mcs.anl.gov
Mon Apr 6 09:13:10 CDT 2009


was: Re: status update

On 4/4/09 7:52 PM, Glen Hocky wrote:

> Things seem to be kind of working on all machines (including ranger, 
> which picked up some speed) but not totally.

So for ranger at the moment we can run default params and hope for 640 
cores at a time. We should queue up several science runs of full-scale 
rounds, and assess the results and run times.

> Problems to investigate this week:

> swift choking after running lots of jobs successfully (should probably 
> just ignore these things)

I'm not sure which errors you mean here - let's examine them first. Do 
you mean the "successfully retried" errors?

> swift not balancing load across different sites (dumps all ones for my 
> teragrid sites file onto one site, grr!)

Can you send a log of this to the Swift developers? They need that in 
order to look at this problem.

I will do a sanity test of WS-GRAM with coasters on abe and queenbee. If 
it works we should expand our science runs there.

These are good things to do today while BG/P is down.

- Mike

> 
> Glen


