[Swift-devel] coaster transfer performance

Michael Wilde wilde at mcs.anl.gov
Fri Jun 14 13:29:54 CDT 2013


I agree about testing; note, though, that you need to test on a shared filesystem under load before you will see where the bigger buffers make a difference.  For example, test during times when "ls" on /home or /project (on midway) or /lustre (on beagle) runs very slowly.  That's when reducing the number of trips to the file server makes the most difference.
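
As a rough illustration of what such a test could look like, here is a minimal sketch. It is not part of Swift or the coaster code; the class name BufferSweep and its methods copy and readsNeeded are made up for this example. It copies a file (or an in-memory stand-in if no path is given) using the three buffer sizes discussed in this thread, and reports how many read calls each one took and how long the copy lasted.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only; not part of the coaster provider.
public class BufferSweep {
    // Minimum number of read calls needed to move fileSize bytes
    // with a buffer of bufSize bytes (ceiling division).
    static long readsNeeded(long fileSize, int bufSize) {
        return (fileSize + bufSize - 1) / bufSize;
    }

    // Copies in to out through a bufSize-byte buffer and returns
    // the number of read calls that actually returned data.
    static long copy(InputStream in, OutputStream out, int bufSize) throws IOException {
        byte[] buf = new byte[bufSize];
        long reads = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            reads++;
        }
        return reads;
    }

    public static void main(String[] args) throws IOException {
        // Sink that discards everything, so only the read side is measured.
        OutputStream sink = new OutputStream() {
            @Override public void write(int b) { }
            @Override public void write(byte[] b, int off, int len) { }
        };
        byte[] data = new byte[1 << 20]; // 1 MB stand-in when no file is given
        int[] sizes = {32768, 32768 * 4, 32768 * 16}; // 32KB, 128KB, 512KB
        for (int size : sizes) {
            try (InputStream in = args.length > 0
                    ? new FileInputStream(args[0])
                    : new ByteArrayInputStream(data)) {
                long start = System.nanoTime();
                long reads = copy(in, sink, size);
                long ms = (System.nanoTime() - start) / 1_000_000;
                System.out.println(size + " byte buffer: " + reads + " reads, " + ms + " ms");
            }
        }
    }
}
```

Run it with a path on the shared filesystem as the only argument, at a time when the filesystem is busy. Two caveats: a read on a real file may return fewer bytes than the buffer holds, so the read count can exceed readsNeeded, and a second pass over the same file may be served from the OS cache, so use a different file per buffer size if you want comparable timings.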

- Mike


----- Original Message -----
> From: "Mihael Hategan" <hategan at mcs.anl.gov>
> To: "Michael Wilde" <wilde at mcs.anl.gov>
> Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> Sent: Friday, June 14, 2013 11:58:33 AM
> Subject: Re: coaster transfer performance
> 
> I would like to point out that I have not seen a difference on stage-out
> performance when adjusting the buffer size (on the worker side). I have
> *not* tried a similar experiment for the stage-in buffer size, and I
> believe that before we increase that permanently, we should see if it
> does indeed make a difference.
> 
> Mihael
> 
> On Fri, 2013-06-14 at 08:19 -0500, Michael Wilde wrote:
> > I'm moving this discussion to swift-devel.
> > 
> > Mihael's fix to the timeout problem works very well - that problem
> > has not recurred so far as I can tell.
> > 
> > We then discussed performance improvements for coaster provider
> > staging.
> > 
> > Mihael suggested this:
> > 
> > "there should be a commented-out line in GSSChannel that says
> > gssContext.requestConf(false); (line 85)
> > uncomment that and re-compile, it will disable encryption"
> > 
> > ...and observed this:
> > 
> > > Disabling encryption makes the stage-ins go from about 6 MB/s to
> > > 70-90 MB/s.
> > > 
> > > However, stageouts are still slow (about 4 MB/s).
> > 
> > I tried that, and it speeds things up a great deal. (No
> > measurements from the fMRI demo case yet; measurements from Mihael
> > below.)
> > 
> > I also tried increasing the transfer buffer size. The current
> > buffer is 32KB.  I tried 16X (512KB) and 4X (128KB).
> > 
> > At 512KB, staging-in to the app goes "very fast" (again, no
> > measurements yet) but output hangs after about 25 of 100 transfers.
> > At 128KB, it hasn't hung yet, and goes very fast.  I too am
> > seeing slower stage-out times from the app back to swift than
> > stage-in times.
> > 
> > I'm testing on midway's shared fs right now (GPFS?) and will test
> > on hard disk next.
> > 
> > But with these two fixes, things are working with very nice
> > reliability and performance in tests so far.
> > 
> > The mods for these changes are below, followed by Mihael's report
> > on testing w/o encryption.
> > 
> > Yadu, please integrate your coaster-provider-staging tests into the
> > test suite and test across a range of file sizes, durations, and
> > endpoints. I'm happy to discuss this, on this list and/or in
> > person.
> > 
> > Thanks,
> > 
> > - Mike
> > 
> > mid$ cat mods
> > modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/buffers/Buffers.java
> > modules/provider-coaster/resources/worker.pl
> > modules/karajan/src/org/globus/cog/karajan/workflow/service/channels/GSSChannel.java
> > mid$ svn diff
> > Index: modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/buffers/Buffers.java
> > ===================================================================
> > --- modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/buffers/Buffers.java      (revision 3672)
> > +++ modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/buffers/Buffers.java      (working copy)
> > @@ -24,7 +24,7 @@
> >  public class Buffers extends Thread {
> >      public static final Logger logger = Logger.getLogger(Buffers.class);
> >  
> > -    public static final int ENTRY_SIZE = 32768;
> > +    public static final int ENTRY_SIZE = 32768 * 4;
> >      public static final int ENTRIES_PER_STREAM = 8;
> >      public static final int MAX_ENTRIES = 1024; // 32 MB
> >      public static final int PERFORMANCE_LOGGING_INTERVAL = 10000;
> > Index: modules/provider-coaster/resources/worker.pl
> > ===================================================================
> > --- modules/provider-coaster/resources/worker.pl        (revision 3672)
> > +++ modules/provider-coaster/resources/worker.pl        (working copy)
> > @@ -134,7 +134,7 @@
> >  my $JOB_COUNT = 0;
> >  
> >  use constant BUFSZ => 2048;
> > -use constant IOBUFSZ => 32768;
> > +use constant IOBUFSZ => 32768 * 4;
> >  use constant IOBLOCKSZ => 8;
> >  
> >  # If true, enable a profile result that is written to the log
> > Index: modules/karajan/src/org/globus/cog/karajan/workflow/service/channels/GSSChannel.java
> > ===================================================================
> > --- modules/karajan/src/org/globus/cog/karajan/workflow/service/channels/GSSChannel.java        (revision 3672)
> > +++ modules/karajan/src/org/globus/cog/karajan/workflow/service/channels/GSSChannel.java        (working copy)
> > @@ -82,7 +82,7 @@
> >  
> >                                 gssContext.requestAnonymity(false);
> >                                 gssContext.requestCredDeleg(false);
> > -                               //gssContext.requestConf(false);
> > +                               gssContext.requestConf(false); // Uncommented to disable encryption
> >                                 gssContext.setOption(GSSConstants.GSS_MODE, GSIConstants.MODE_SSL);
> >                                 gssContext.setOption(GSSConstants.DELEGATION_TYPE,
> >                                                 GSIConstants.DELEGATION_TYPE_LIMITED);
> > mid$
> > 
> > 
> > 
> > ----- Original Message -----
> > > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > Sent: Thursday, June 13, 2013 10:47:18 PM
> > > Subject: coaster transfer performance
> > > 
> > > Hi,
> > > 
> > > This is what I'm seeing. Disabling encryption makes the stage-ins go
> > > from about 6 MB/s to 70-90 MB/s.
> > > 
> > > However, stageouts are still slow (about 4 MB/s). It turns out that
> > > this is not due to the shared FS. I disabled writing to disk
> > > completely, and the performance is still around 6 MB/s.
> > > 
> > > I'll need to find out why, but probably not tonight.
> > > 
> > > Mihael
> > > 
> > > 
> 
> 
> 


