[Swift-devel] transfers of small files

Ian Foster itf at mcs.anl.gov
Wed Nov 28 18:54:50 CST 2007


As mentioned in an email a few weeks ago, the GridFTP developers have implemented support for streaming many small files. I hope we will try that before implementing our own version.

Ian



-----Original Message-----
From: Mihael Hategan <hategan at mcs.anl.gov>

Date: Wed, 28 Nov 2007 18:31:58 
To: Ian Foster <foster at mcs.anl.gov>
Cc: swift-devel <swift-devel at ci.uchicago.edu>, John Bresnahan <bresnaha at mcs.anl.gov>
Subject: Re: [Swift-devel] transfers of small files



On Wed, 2007-11-28 at 18:24 -0600, Ian Foster wrote:
> Mihael:
> 
> It isn't clear to me--are you using the "lots of small files" 
> optimization here?

It depends what you mean by "lots of small files optimization".
Obviously this is an optimization for the lots of small files case.

I'm re-using clients with mode E and only sending PASV once per client.
Let's call this A. There was word of "pipelining". We'll call that B. I
assume it to be different from what I did (A) for the following reasons:
1. Jarek had tests for A in JGlobus, so A is nothing new.
2. Buzz recently committed some code to JGlobus to enable B, which
implies B was not possible before, therefore B != A.
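
To make the A/B distinction concrete, here is a rough cost model in Python. It is only a sketch under stated assumptions, not a description of what JGlobus or the GridFTP server actually does: the round-trip counts per file, the Link parameters, and all function names are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    rtt_s: float          # control-channel round-trip time in seconds (assumed)
    bandwidth_bps: float  # data-channel bandwidth in bytes per second (assumed)


def per_file_setup(n_files: int, size: int, link: Link) -> float:
    """Baseline: every file pays for PASV, a new data connection, and RETR.

    Assumes one round trip for each of those three steps.
    """
    per_file = 3 * link.rtt_s + size / link.bandwidth_bps
    return n_files * per_file


def reuse_channel(n_files: int, size: int, link: Link) -> float:
    """A: one PASV and one data connection per client, reused for all files.

    Each file still waits one round trip for its own RETR exchange.
    """
    setup = 2 * link.rtt_s
    per_file = link.rtt_s + size / link.bandwidth_bps
    return setup + n_files * per_file


def pipelined(n_files: int, size: int, link: Link) -> float:
    """B: like A, but commands are sent without waiting for earlier replies.

    Assumes the command latency is then paid roughly once, not once per file.
    """
    setup = 2 * link.rtt_s
    return setup + link.rtt_s + n_files * size / link.bandwidth_bps


def throughput(n_files: int, size: int, seconds: float) -> float:
    """Aggregate throughput in bytes per second."""
    return n_files * size / seconds


if __name__ == "__main__":
    # Hypothetical numbers: 100 files of 32 KB, 50 ms RTT, 10 MB/s link.
    link = Link(rtt_s=0.05, bandwidth_bps=10e6)
    n, size = 100, 32 * 1024
    for name, model in [("per-file setup", per_file_setup),
                        ("A: channel re-use", reuse_channel),
                        ("B: pipelining", pipelined)]:
        secs = model(n, size, link)
        print(f"{name}: {secs:.2f} s, {throughput(n, size, secs) / 1024:.0f} KB/s")
```

In this model A only removes the per-file setup round trips, while B also hides the per-file command latency, which is why I treat them as separate optimizations. Real transfers will deviate from this (TCP slow start, server-side overhead, parallel clients).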

> 
> I've CCed John Bresnahan so he can comment.
> 
> Ian.
> 
> Mihael Hategan wrote:
> > So I've been playing with that issue. I've made some measurements
> > outside Swift. Here's a summary:
> >
> > 32k files. From terminable to tg-uc
> >
> > 1 - karajan with connection caching. transfers in parallel. tops at
> > 200KB/s
> >
> > 2 - n*globus-url-copy - With 32 parallel transfers it starts failing and
> > gets about 10KB/s
> >
> > 3 - globus-url-copy with a list of files: around 300KB/s
> >
> > 4 - globus-url-copy with a list of files, E mode, and data channel
> > re-use: 500KB/s
> >
> > So I figured I should hack the GridFTP provider to re-use data channels
> > by default. This is where it gets strange. I get averages (over multiple
> > runs) of over 1MB/s, with a min of about 130KB/s and a max of 1.9MB/s,
> > so there is a lot of variability. I'll debug this. However, I think there is
> > still value in enabling this by default.
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> >
> >   
> 


