[Swift-devel] fakecnari on ranger without gridftp
Ioan Raicu
iraicu at cs.uchicago.edu
Sun Sep 28 22:29:36 CDT 2008
Hi,
Ben Clifford wrote:
> On Sun, 28 Sep 2008, Ian Foster wrote:
>
> ...
>
> From a raw CPU perspective, there's 65535 parallel tasks each of a few
> seconds long each; there's a fairly obvious, if naive, target there of 65k
> x speedup (mmm).
>
>
At that point, the main bottlenecks will be the scalability of the
mechanisms you use to drive the execution framework (i.e.
Coaster/Falkon), the time it takes to bootstrap the execution
framework, the throughput at which you can dispatch tasks and receive
results, and the speed at which the file system can read inputs and
write outputs. Assuming the execution framework scales, the rest can be
computed as long as you understand the performance of the machine and
the execution framework.
For example, here are two runs we made with Falkon on the BG/P recently
that might be similar to the fakecnari workload. We had 32K CPU-cores
and 128K tasks, and each task involved sleeping for 4 sec and writing
1KB of data. In an ideal world with zero cost for the execution
framework and zero cost for I/O, the workload would have taken 16
seconds (128K/32K*4sec); in total it amounts to 524288 CPU seconds of
work (128K*4sec). Running this workload on GPFS directly took 180
seconds (2912X speedup), and running the same workload through our
collective I/O framework took 61 seconds (8594X speedup). The
bottleneck in the GPFS case was the rate at which we could create files
and write the 1KB of data, with 32K CPUs doing this concurrently. The
bottleneck for collective I/O was the dispatch rate of Falkon, which in
this case was 2148 tasks/sec. Once you understand the performance of the
file system and the execution framework, these large-scale numbers can
be estimated quite nicely.
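
To make the arithmetic behind those figures easy to re-check, here is a
rough sketch in Python. The function and variable names are mine, purely
for illustration (they are not part of Falkon or Swift), and I am
assuming K = 1024 and that speedup is measured against running all tasks
back to back on a single core.

```python
# Back-of-the-envelope check of the BG/P numbers quoted above.
# All names here are hypothetical helpers, not Falkon or Swift APIs.

K = 1024  # assumption: "32K" and "128K" are binary multiples

CORES = 32 * K      # CPU-cores used in the runs
TASKS = 128 * K     # number of tasks
TASK_SEC = 4        # each task sleeps for 4 seconds


def ideal_makespan(tasks, cores, task_sec):
    """Run time with zero framework and zero I/O cost."""
    waves = -(-tasks // cores)  # ceiling division: rounds of tasks per core
    return waves * task_sec


def total_cpu_seconds(tasks, task_sec):
    """Total work, i.e. the run time on a single core."""
    return tasks * task_sec


def speedup(tasks, task_sec, measured_sec):
    """Speedup of a measured run relative to a single core."""
    return total_cpu_seconds(tasks, task_sec) / measured_sec


def dispatch_rate(tasks, measured_sec):
    """Average tasks completed per second over the whole run."""
    return tasks / measured_sec


if __name__ == "__main__":
    print("ideal makespan (s):", ideal_makespan(TASKS, CORES, TASK_SEC))
    print("total CPU seconds: ", total_cpu_seconds(TASKS, TASK_SEC))
    for label, measured in (("GPFS", 180), ("collective I/O", 61)):
        print(label, "speedup: %.1f" % speedup(TASKS, TASK_SEC, measured))
    print("dispatch rate (tasks/s): %.1f" % dispatch_rate(TASKS, 61))
```

The speedup and dispatch-rate figures in the text are these values
truncated to whole numbers.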
> I still don't really have a feel for what the filesystem on Ranger
> (lustre) will do - it's been behaving fairly well so far, I think, but I
> imagine that's because it hasn't been terribly heavily loaded in my
> testing.
>
And if it was built to support the entire machine at full scale (64K
CPU-cores), then I'd imagine that you'll need at least thousands, if not
tens of thousands, of CPU-cores to saturate the file system with small
files. One of these days, we'll probably start testing some of our BG/P
apps on Ranger as well, so then we can better exchange notes on our
experiences and the problems we are each facing.
Ioan
--
===================================================
Ioan Raicu
Ph.D. Candidate
===================================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
===================================================
Email: iraicu at cs.uchicago.edu
Web: http://www.cs.uchicago.edu/~iraicu
http://dev.globus.org/wiki/Incubator/Falkon
http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
===================================================