[Swift-devel] lots of very small files vs gridftp

Ian Foster foster at mcs.anl.gov
Mon Sep 29 20:34:18 CDT 2008


Remind me again why we aren't just using TAR and GridFTP?

Ian.

On Sep 29, 2008, at 6:37 PM, Mihael Hategan wrote:

> On Thu, 2008-09-25 at 13:10 +0000, Ben Clifford wrote:
>
>> To transfer 1000 files:
>>
>>   # concurrent connections  |   duration of copy (seconds, multiple runs)
>>                       16          7, 16, 16
>>                        4         14, 14, 14
>>                        2         26, 25
>>                        1         48, 52
>>
>
> I tried a similar experiment, this time with the Java libraries, to see
> how that works.
>
> The setup was to transfer 1024 files of 1024 bytes each with parallelism
> (at the Karajan level, though this should cause corresponding GridFTP
> connection parallelism) of 1 to 16 in powers of 2.
>
> I got this for Ranger (in ms):
> 1: 242030
> 2: 121916
> 4: 61787
> 8: 31903
> 16: died (probably trying to start too many connections concurrently)
>
> Then UC:
> 1: 212192
> 2: 106872
> 4: 54790
> 8: 28838
> 16: 18166
>
> Then I made a quick file provider for coasters, which sends the data
> over the same connection (and upped the parallelism):
> UC-coaster
> 1: 102624
> 2: 31388
> 4: 18042
> 8: 8823
> 16: 5510
> 32: 5053
> 64: 6686
> 128: 5551
>
> Then I ran the same, but instead of transferring to an NFS directory,
> things went to /dev/null:
> 1: 93997
> 2: 35694
> 4: 16269
> 8: 7349
> 16: 4462
> 32: 1865
> 64: 1332
> 128: 1304
>
> I suppose coasters are slower than they could be because the data goes
> over an encrypted connection, but it may be something else.
>
> So otherwise, if files are small, one can look at this as the task of
> sending (acknowledged) messages from one side to the other, where the
> communication lag is the problem and the way to solve it is by
> increasing parallelism (which essentially is what tarring things up
> does). That and whatever FS limitations the remote side has.
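
The latency-bound view in the paragraph above can be checked against the UC GridFTP numbers quoted earlier. What follows is only a rough sketch: it assumes an idealized model in which total time scales as 1/parallelism, and the function names (per_file_ms, predicted_ms, efficiency) are made up for illustration and are not part of Swift, Karajan, or GridFTP.

```python
# UC GridFTP timings copied from the message: parallelism -> total ms for 1024 files.
uc_gridftp_ms = {1: 212192, 2: 106872, 4: 54790, 8: 28838, 16: 18166}
N_FILES = 1024


def per_file_ms(total_ms, n_files=N_FILES):
    """Average cost of one file transfer, in milliseconds."""
    return total_ms / n_files


def predicted_ms(parallelism, serial_ms):
    """Idealized latency-bound model (an assumption): time scales as 1/parallelism."""
    return serial_ms / parallelism


def efficiency(parallelism, measured_ms, serial_ms):
    """Fraction of the ideal speedup that was actually achieved."""
    return predicted_ms(parallelism, serial_ms) / measured_ms


serial = uc_gridftp_ms[1]
print(f"per-file cost at parallelism 1: {per_file_ms(serial):.1f} ms")
for p, ms in sorted(uc_gridftp_ms.items()):
    print(f"{p:>3}: measured {ms} ms, "
          f"ideal {predicted_ms(p, serial):.0f} ms, "
          f"efficiency {efficiency(p, ms, serial):.2f}")
```

At parallelism 1 this works out to a bit over 200 ms per 1 KB file, which is far more than the payload itself should need and fits the point that per-file lag dominates.
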
>
> Mihael
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
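
For reference, the tar approach discussed above amounts to turning many small per-file round trips into one larger transfer. Below is a minimal sketch using Python's standard tarfile module; the helper names (bundle_small_files, unbundle) and the file names are hypothetical, and the transfer step itself (GridFTP or anything else) is deliberately left out.

```python
import io
import tarfile


def bundle_small_files(files):
    """Pack a {name: bytes} mapping into one in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unbundle(archive_bytes):
    """Unpack an archive produced by bundle_small_files back into {name: bytes}."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        for member in tar.getmembers():
            out[member.name] = tar.extractfile(member).read()
    return out


# Same shape as the experiment: 1024 files of 1024 bytes each (contents are dummy zeros).
files = {f"file{i:04d}.dat": bytes(1024) for i in range(1024)}
archive = bundle_small_files(files)
print(f"{len(files)} files bundled into one archive of {len(archive)} bytes")
assert unbundle(archive) == files
```

The archive would then be sent as a single file and unpacked on the remote side, so the per-file acknowledgement lag is paid once rather than 1024 times; any remote filesystem limits on creating many small files still apply when unpacking.
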



