[Swift-devel] Swift and BGP
Ioan Raicu
iraicu at cs.uchicago.edu
Mon Oct 26 23:27:24 CDT 2009
Mihael Hategan wrote:
> On Mon, 2009-10-26 at 16:36 -0500, Ioan Raicu wrote:
>
>>>
>>>
>> Here were our experiences with running scripts from GPFS. The numbers
>> below represent the throughput for invoking scripts (a bash script that
>> invoked a sleep 0) from GPFS on 4, 256, and 2048 workers:
>>
>>   Number of processors | Invoke-script throughput (ops/sec)
>>   ---------------------+-----------------------------------
>>                      4 | 125.214
>>                    256 | 109.3272
>>                   2048 | 823.0374
>>
>>
>
> Looks right. What I saw was that things were getting shitty at around
> 10000 cores. Lower if info writing, directory making, and file copying
> were involved.
>
Right.
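For reference, here is a minimal sketch of the kind of measurement those
numbers correspond to (the path and invocation count are hypothetical; the
real numbers were gathered through Falkon workers driving many nodes, not a
local loop like this):

  #!/usr/bin/env python
  # Hypothetical sketch: time how many times per second a trivial bash
  # script sitting on a shared filesystem (e.g. GPFS) can be invoked.
  import subprocess
  import time

  SCRIPT = "/gpfs/home/user/sleep0.sh"   # hypothetical path; contains "sleep 0"
  N = 1000                               # invocations to time

  start = time.time()
  for _ in range(N):
      subprocess.call(["/bin/bash", SCRIPT])
  elapsed = time.time() - start

  print("invoke-script throughput: %.2f ops/sec" % (N / elapsed))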
>
>>> [...]
>>>
>> In our experience with Falkon, the limit came much sooner than 64K. In
>> Falkon, using the C worker code (which runs on the BG/P), each worker
>> consumes 2 TCP/IP connections to the Falkon service.
>>
>
> Well, the coaster workers use only one connection.
>
Is it 1 connection per core, or per node? Zhao tried to reduce it to 1
connection per node, but the worker was not stable, so we left it alone
in the interest of time. The last time I looked at it, the workers used
2 connections per core, or 8 connections per node. That is quite
inefficient at scale, but not an issue given that each service only
handles 256 cores.
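Just to make the arithmetic explicit (my sketch; the core counts below are
illustrative, while the per-core and per-service figures are the ones above):
at 2 connections per core, a single central service crosses the 64K mark
mentioned earlier at roughly 32K cores, whereas each I/O-node service stays
at a constant 512 connections no matter how large the machine gets.

  # Connection-count sketch (core counts illustrative; per-core and
  # per-service figures as quoted above).
  CONNS_PER_CORE = 2        # what the Falkon C worker used at the time
  CORES_PER_SERVICE = 256   # cores handled by one I/O-node Falkon service

  for cores in (8 * 1024, 16 * 1024, 32 * 1024, 160 * 1024):
      central = cores * CONNS_PER_CORE                  # one central service
      services = (cores + CORES_PER_SERVICE - 1) // CORES_PER_SERVICE
      per_service = CORES_PER_SERVICE * CONNS_PER_CORE  # distributed case
      print("%6d cores: %6d connections centrally, or %3d services "
            "with %d connections each"
            % (cores, central, services, per_service))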
>
>> In the centralized Falkon service version, this racks up connections
>> pretty quickly. I don't recall at exactly what point we started having
>> issues, but it was somewhere in the range of 10K~20K CPU cores.
>> Essentially, we could establish all the connections (20K~40K TCP
>> connections), but when the experiment actually started and data needed
>> to flow over these connections, all sorts of weird things started
>> happening: TCP connections would get reset, workers were failing (e.g.
>> their TCP connections were being severed and not re-established), and
>> so on. I want to say that 8K (maybe 16K) cores was the largest test we
>> ran on the BG/P with a centralized Falkon service that was stable and
>> successful.
>>
>
> Possible. I haven't properly tested above 12k workers. I was just
> mentioning a theoretical limitation that doesn't seem possible to beat
> without having things distributed.
>
> [...]
>
>> For the BG/P specifically, I think the distribution of the Falkon
>> service to the I/O nodes gave us a low-maintenance, robust, and
>> scalable solution!
>>
>
> Lower than if you only had to run one service on the head node?
>
Yes, in fact it was for Falkon. If we ran Falkon on the head node, the
user would have to start it manually, on an available port, and then
shut it down when finished. Running things on the I/O nodes was tougher
at the beginning, but once we got it all configured and running, it was
great! The Falkon service starts up when the I/O node boots, on a
specific port (no need to check whether it is available, since the I/O
node is dedicated to the user), all compute nodes can easily find their
respective I/O nodes at the same well-known location (a 192.x.x.x
private address), and when the run is over, the I/O nodes terminate and
the services stop on their own. At least for Falkon, it really made the
difference between having a turn-key solution that always works and one
that would require constant tinkering (starting and stopping) and
configuration (e.g. ports).
Again, the downside of the distributed approach was the overhead of
implementing and testing it, plus the load balancing, which required a
bit of fine-tuning in Swift to get just right.
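To make the turn-key point concrete, here is a hypothetical sketch of what
the compute-node side amounts to (placeholder address and port; the real
worker on the BG/P is C code): no per-run configuration, just dial a fixed,
pre-agreed endpoint on the I/O node and retry until the boot-time service
answers.

  # Hypothetical sketch of the compute-node side of that arrangement:
  # connect to the service the I/O node started at boot, at a fixed
  # private address and port. The values below are placeholders.
  import socket
  import time

  IO_NODE_ADDR = "192.168.1.1"   # placeholder for the I/O node's private address
  SERVICE_PORT = 50000           # placeholder for the fixed, pre-agreed port

  def connect_with_retry(addr, port, retries=10, delay=2.0):
      """Keep trying until the I/O-node service is reachable."""
      for _ in range(retries):
          try:
              return socket.create_connection((addr, port), timeout=5.0)
          except socket.error:
              time.sleep(delay)
      raise RuntimeError("service at %s:%d never came up" % (addr, port))

  sock = connect_with_retry(IO_NODE_ADDR, SERVICE_PORT)
  # ... register as a worker and receive tasks over this connection ...
  sock.close()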
Ioan
--
=================================================================
Ioan Raicu, Ph.D.
NSF/CRA Computing Innovation Fellow
=================================================================
Center for Ultra-scale Computing and Information Security (CUCIS)
Department of Electrical Engineering and Computer Science
Northwestern University
2145 Sheridan Rd, Tech M384
Evanston, IL 60208-3118
=================================================================
Cel: 1-847-722-0876
Tel: 1-847-491-8163
Email: iraicu at eecs.northwestern.edu
Web: http://www.eecs.northwestern.edu/~iraicu/
https://wiki.cucis.eecs.northwestern.edu/
=================================================================
=================================================================