[Swift-devel] Swift and BGP

Mon Oct 26 17:55:30 CDT 2009

On Mon, 2009-10-26 at 16:36 -0500, Ioan Raicu wrote:
> >   
> Here were our experiences with running scripts from GPFS. The numbers
> below are the throughput for invoking a script (a bash script that
> ran sleep 0) from GPFS on 4 workers, 256 workers, and 2048 workers.
> Number of Processors    Invoke script throughput (ops/sec)
>                    4                               125.214
>                  256                              109.3272
>                 2048                              823.0374
> 

Looks right. What I saw was that things were getting shitty at around
10000 cores. Lower if info writing, directory making, and file copying
were involved.

> > [...]  
> In our experience with Falkon, the limit came much sooner than 64K. In
> Falkon, using the C worker code (which runs on the BG/P), each worker
> consumes 2 TCP/IP connections to the Falkon service.

Well, the coaster workers use only one connection.

> In the centralized Falkon service version, this racks up connections
> pretty quickly. I don't recall at exactly what point we started having
> issues, but it was somewhere in the range of 10K~20K CPU cores.
> Essentially, we could establish all the connections (20K~40K TCP
> connections), but when the experiment would actually start, and data
> needed to flow over these connections, all sorts of weird stuff started
> happening: TCP connections would get reset, workers were failing (e.g.
> their TCP connection was being severed and not being re-established),
> etc. I want to say that 8K (maybe 16K) cores was the largest test we
> ran on the BG/P with a centralized Falkon service that was stable
> and successful.

Possible. I haven't properly tested above 12k workers. I was just
mentioning a theoretical limitation that doesn't seem possible to beat
without having things distributed.
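
To make the arithmetic in this exchange concrete, here is a rough sketch of how the per-worker connection count translates into a worker ceiling for a single centralized service. The helper names are made up for illustration, and reading "64K" as 65536 is an assumption; the connections-per-worker figures (2 for the Falkon C worker, 1 for a coaster worker) are the ones quoted above. It says nothing about the practical limits, which by all accounts show up much earlier.

```python
# Back-of-the-envelope connection arithmetic for a centralized service.
# ASSUMPTION: "64K" is taken to mean 65536 simultaneous connections.
# connections_needed and max_workers are hypothetical helper names.
LIMIT = 64 * 1024


def connections_needed(workers, per_worker):
    """Total TCP connections the service must hold open."""
    return workers * per_worker


def max_workers(limit, per_worker):
    """Largest worker count that stays within the connection limit."""
    return limit // per_worker


if __name__ == "__main__":
    # 10K~20K cores at 2 connections each gives the 20K~40K range quoted above.
    for cores in (10_000, 20_000):
        print(cores, "cores ->", connections_needed(cores, 2), "connections")

    # Theoretical ceilings under the assumed limit.
    for name, per_worker in (("Falkon C worker", 2), ("coaster worker", 1)):
        print(name, "->", max_workers(LIMIT, per_worker), "workers at most")
```

Under these assumptions, halving the connections per worker doubles the theoretical ceiling, but either way the ceiling is fixed unless the service itself is distributed.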

[...]
> 
> For the BG/P specifically, I think the distribution of the Falkon
> service to the I/O nodes gave us a low-maintenance, robust, and
> scalable solution!

Lower than if you only had to run one service on the head node?