[Swift-devel] coaster transfer performance

Mihael Hategan hategan at mcs.anl.gov
Sat Jun 15 21:43:54 CDT 2013


I did more testing on this. I believe that the slow stage-out performance
is due to two factors. One is the default TCP send buffer size on beagle
(a whopping 8192 bytes) and the second is unexplained slowness of the
beagle -> swift.rcc connection.

I ran a netcat transfer from beagle to swift.rcc (sadly the netcat on
beagle doesn't allow me to specify custom buffer sizes) and what I get is
11s for a 256MB file. I get a similar time if I increase the send buffer
size in the coaster service -> swift connection.
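
For reference, here is a minimal sketch of how a larger send buffer can be
requested on a Java socket. It is not the coaster service code: the class
name SendBufferSketch, the helper requestSendBuffer, and the 1 MB value are
illustrative assumptions. Only java.net.Socket and its
setSendBufferSize/getSendBufferSize methods are real API. The requested
size is a hint, so the sketch reads back what the OS actually reports.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

// Hypothetical helper, not part of the coaster sources.
class SendBufferSketch {
    // Requests a send buffer size on an unconnected socket and returns
    // the size the OS reports afterwards (it may differ from the request).
    static int requestSendBuffer(int requestedBytes) {
        try (Socket s = new Socket()) {
            s.setSendBufferSize(requestedBytes); // a hint (SO_SNDBUF), not a guarantee
            return s.getSendBufferSize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int requested = 1 << 20; // 1 MB, an arbitrary illustrative value
        int reported = requestSendBuffer(requested);
        System.out.println("requested=" + requested + " reported=" + reported);
    }
}
```

If the reported value stays far below the request, the system-wide limits
on the host are probably capping it, which would have to be checked on the
machine itself.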

By contrast, the same file sent through netcat from swift.rcc to
login4.beagle takes 2 seconds.
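
To put the two netcat timings side by side, the sketch below converts them
to throughput and also evaluates the usual rule of thumb that a sender
limited to a fixed window of unacknowledged data cannot exceed
window / RTT. The assumptions are mine: MB is taken as 2^20 bytes, the RTT
values are hypothetical (none were measured here), and the names
TransferMath, throughputMBps and windowBoundMBps are made up for the
example.

```java
// Hypothetical helper for the arithmetic only; no network access.
class TransferMath {
    static final long MB = 1024L * 1024L; // assumption: 1 MB = 2^20 bytes

    // Average throughput in MB/s for a transfer of `bytes` taking `seconds`.
    static double throughputMBps(long bytes, double seconds) {
        return bytes / seconds / MB;
    }

    // Rule-of-thumb upper bound in MB/s when at most `windowBytes`
    // can be in flight per round trip of `rttSeconds`.
    static double windowBoundMBps(int windowBytes, double rttSeconds) {
        return windowBytes / rttSeconds / MB;
    }

    public static void main(String[] args) {
        long fileBytes = 256 * MB;
        System.out.printf("beagle -> swift.rcc: %.1f MB/s%n",
                throughputMBps(fileBytes, 11.0));
        System.out.printf("swift.rcc -> beagle: %.1f MB/s%n",
                throughputMBps(fileBytes, 2.0));
        // Hypothetical round-trip times, for illustration only.
        for (double rttMs : new double[] {0.1, 1.0, 10.0}) {
            System.out.printf("8192-byte window, RTT %.1f ms: at most %.2f MB/s%n",
                    rttMs, windowBoundMBps(8192, rttMs / 1000.0));
        }
    }
}
```

Under these assumptions the two directions come out at roughly 23 MB/s and
128 MB/s, so the asymmetry is about a factor of five.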

Any ideas?

Mihael

On Fri, 2013-06-14 at 13:29 -0500, Michael Wilde wrote:
> I agree about testing; note though that you need to test on a shared
> filesystem under load before you will see where the bigger buffers
> make a difference.  For example, during times when "ls" on /home
> or /project (on midway) or /lustre (on beagle) runs very slowly.  That's
> when reducing the number of trips to the file server makes the most
> difference.
> 
> - Mike
> 
> 
> ----- Original Message -----
> > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> > Sent: Friday, June 14, 2013 11:58:33 AM
> > Subject: Re: coaster transfer performance
> > 
> > I would like to point out that I have not seen a difference on
> > stage-out performance when adjusting the buffer size (on the worker
> > side). I have *not* tried a similar experiment for the stage-in
> > buffer size, and I believe that before we increase that permanently,
> > we should see if it does indeed make a difference.
> > 
> > Mihael
> > 
> > On Fri, 2013-06-14 at 08:19 -0500, Michael Wilde wrote:
> > > I'm moving this discussion to swift-devel.
> > > 
> > > Mihael's fix to the timeout problem works very well - that problem
> > > has not recurred so far as I can tell.
> > > 
> > > We then discussed performance improvements for coaster provider
> > > staging.
> > > 
> > > Mihael suggested this:
> > > 
> > > "there should be a commented-out line in GSSChannel that says
> > > gssContext.requestConf(false); (line 85)
> > > uncomment that and re-compile, it will disable encryption"
> > > 
> > > ...and observed this:
> > > 
> > > > Disabling encryption makes the stage-ins go from about 6 MB/s to
> > > > 70-90 MB/s.
> > > > 
> > > > However, stageouts are still slow (about 4 MB/s).
> > > 
> > > I tried that, and it speeds things up a great deal. (No
> > > measurements from the fMRI demo case yet; measurements from Mihael
> > > below.)
> > > 
> > > I also tried increasing the transfer buffer size. The current
> > > buffer is 32KB.  I tried 16X (512KB) and 4X (128KB).
> > > 
> > > At 512KB, staging-in to the app goes "very fast" (again, no
> > > measurements yet) but output hangs after about 25 of 100 transfers.
> > > At 128KB, it hasn't hung yet, and goes very fast.  I too am
> > > seeing slower stage-out times from the app back to swift than
> > > stage-in times.
> > > 
> > > I'm testing on midway's shared fs right now (GPFS?) and will test
> > > on hard disk next.
> > > 
> > > But with these two fixes, things are working with very nice
> > > reliability and performance in tests so far.
> > > 
> > > The mods for these changes are below, followed by Mihael's report
> > > on testing w/o encryption.
> > > 
> > > Yadu, please integrate your coaster-provider-staging tests into the
> > > test suite and test across a range of file sizes, durations, and
> > > endpoints. I'm happy to discuss this, on this list and/or in
> > > person.
> > > 
> > > Thanks,
> > > 
> > > - Mike
> > > 
> > > mid$ cat mods
> > > modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/buffers/Buffers.java
> > > modules/provider-coaster/resources/worker.pl
> > > modules/karajan/src/org/globus/cog/karajan/workflow/service/channels/GSSChannel.java
> > > mid$ svn diff
> > > Index: modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/buffers/Buffers.java
> > > ===================================================================
> > > --- modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/buffers/Buffers.java     (revision 3672)
> > > +++ modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/buffers/Buffers.java     (working copy)
> > > @@ -24,7 +24,7 @@
> > >  public class Buffers extends Thread {
> > >      public static final Logger logger = Logger.getLogger(Buffers.class);
> > >  
> > > -    public static final int ENTRY_SIZE = 32768;
> > > +  public static final int ENTRY_SIZE = 32768 * 4;
> > >      public static final int ENTRIES_PER_STREAM = 8;
> > >      public static final int MAX_ENTRIES = 1024; // 32 MB
> > >      public static final int PERFORMANCE_LOGGING_INTERVAL = 10000;
> > > Index: modules/provider-coaster/resources/worker.pl
> > > ===================================================================
> > > --- modules/provider-coaster/resources/worker.pl        (revision 3672)
> > > +++ modules/provider-coaster/resources/worker.pl        (working copy)
> > > @@ -134,7 +134,7 @@
> > >  my $JOB_COUNT = 0;
> > >  
> > >  use constant BUFSZ => 2048;
> > > -use constant IOBUFSZ => 32768;
> > > +use constant IOBUFSZ => 32768 * 4;
> > >  use constant IOBLOCKSZ => 8;
> > >  
> > >  # If true, enable a profile result that is written to the log
> > > Index: modules/karajan/src/org/globus/cog/karajan/workflow/service/channels/GSSChannel.java
> > > ===================================================================
> > > --- modules/karajan/src/org/globus/cog/karajan/workflow/service/channels/GSSChannel.java       (revision 3672)
> > > +++ modules/karajan/src/org/globus/cog/karajan/workflow/service/channels/GSSChannel.java       (working copy)
> > > @@ -82,7 +82,7 @@
> > >  
> > >                                 gssContext.requestAnonymity(false);
> > >                                 gssContext.requestCredDeleg(false);
> > > -                               //gssContext.requestConf(false);
> > > +                               gssContext.requestConf(false); // Uncommented to disable encryption
> > >                                 gssContext.setOption(GSSConstants.GSS_MODE, GSIConstants.MODE_SSL);
> > >                                 gssContext.setOption(GSSConstants.DELEGATION_TYPE,
> > >                                                 GSIConstants.DELEGATION_TYPE_LIMITED);
> > > mid$
> > > 
> > > 
> > > 
> > > ----- Original Message -----
> > > > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > Sent: Thursday, June 13, 2013 10:47:18 PM
> > > > Subject: coaster transfer performance
> > > > 
> > > > Hi,
> > > > 
> > > > This is what I'm seeing. Disabling encryption makes the stage-ins
> > > > go from about 6 MB/s to 70-90 MB/s.
> > > > 
> > > > However, stageouts are still slow (about 4 MB/s). It turns out
> > > > that this is not due to the shared FS. I disabled writing to disk
> > > > completely, and the performance is still around 6 MB/s.
> > > > 
> > > > I'll need to find out why, but probably not tonight.
> > > > 
> > > > Mihael
> > > > 
> > > > 
> > 
> > 
> > 




