[Swift-devel] coaster status summary

Mihael Hategan hategan at mcs.anl.gov
Tue Apr 8 10:00:48 CDT 2008


You may want to try lowering the TCP window size. The default is on the
order of 100K (as far as I understand from various sources). That may
add up to quite a bit of buffer space if you have many connections. It
may also be fairly useless for LAN connections used to send short
messages (i.e. smaller than the MTU/MSS).
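
For illustration, here is a minimal sketch of what lowering the window
could look like at the socket level. It is in Python rather than Java
only for brevity; the 8 KB figure and the helper names are assumptions
of mine, not anything taken from Falkon or the coaster code. The receive
buffer size bounds the window the kernel advertises, so it is set before
connect():

```python
import socket


def make_small_window_socket(buf_bytes=8192):
    """Create a TCP socket with reduced send/receive buffers.

    buf_bytes is a hypothetical value chosen for illustration; a real
    setting should be tuned against measured message sizes.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request smaller buffers before connecting, since the receive
    # buffer limits the TCP window advertised to the peer.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buf_bytes)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buf_bytes)
    return s


def effective_buffers(s):
    """Return the (receive, send) buffer sizes the kernel actually applied."""
    rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd


if __name__ == "__main__":
    sock = make_small_window_socket()
    print("applied (rcv, snd):", effective_buffers(sock))
    sock.close()
```

The kernel may round or adjust the requested sizes (Linux, for example,
reports double the requested value), so it is worth reading the values
back instead of assuming they took effect.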

On Tue, 2008-04-08 at 08:31 -0500, Ioan Raicu wrote:
> We use the default.  For the SiCortex, we had to tweak the TCP 
> keepalives to ensure that the TCP connections were not getting 
> disconnected by the firewall on the SiCortex, which only allowed 180 
> seconds of inactivity before it disconnected connections.  This meant 
> that any job that took more than 180 seconds, or any Falkon idleness for 
> more than 180 seconds resulted in TCP connection terminations.  BTW, we 
> did not run into this kind of firewall rule when running in other 
> environments, so it took us a week to debug and find the root of the 
> problem.  It also happened because the Falkon service was running 
> outside the SiCortex home network, but we had to do this as the SiCortex 
> doesn't support Java, and at the time we didn't have access to any system 
> within the internal network that supported Java.
> 
> Ioan
> 
> Mihael Hategan wrote:
> > Do you tweak the TCP window size or do you use the default?
> >
> > On Mon, 2008-04-07 at 13:18 -0500, Ioan Raicu wrote:
> >   
> >> I agree that the BG/P is the only system I can think of right now that
> >> won't work with the UDP scheme you currently have, assuming that you
> >> will run the service on a login node that has access to both the compute
> >> nodes and the external world (i.e. Swift).  The compute nodes don't
> >> support Java, so you'd have to have some C/Fortran code, or maybe some
> >> scripting language (though I don't know what kind of support there is).
> >> If you use C or Fortran, MPI becomes a viable alternative.  TCP has
> >> always been an alternative.  Anyways, if UDP doesn't work on the BG/P,
> >> and the BG/P is the only scale large enough (today) that warrants a
> >> connectionless protocol, then I suggest you switch to TCP (which has
> >> worked for us well on the BG/P, and is general enough to work in most
> >> environments) or even MPI (where you lose the generality of TCP, but
> >> might gain performance).
> >>
> >> Ioan
> >>
> >> Mihael Hategan wrote: 
> >>     
> >>> On Mon, 2008-04-07 at 12:49 +0000, Ben Clifford wrote:
> >>>   
> >>>       
> >>>> Wary of excessive optimisation of job completion notification speed in 
> >>>> order to get high 'trivial/useless job' numbers, when there also seem to 
> >>>> be problems getting shared filesystem access fast enough for non-useless 
> >>>> jobs. Getting a ridiculously high trivial job throughput is not (in my 
> >>>> eyes) a design goal of this coaster work.
> >>>>     
> >>>>         
> >>> 200 j/s should be enough for anybody.
> >>>
> >>> Joking aside, the issue was the ability to scale to a large number of
> >>> jobs rather than speed. But it looks like that is only an issue for
> >>> monsters such as the BG/P.
> >>>
> >>>   
> >>>
> >>>
> >>>   
> >>>       
> >> -- 
> >> ===================================================
> >> Ioan Raicu
> >> Ph.D. Candidate
> >> ===================================================
> >> Distributed Systems Laboratory
> >> Computer Science Department
> >> University of Chicago
> >> 1100 E. 58th Street, Ryerson Hall
> >> Chicago, IL 60637
> >> ===================================================
> >> Email: iraicu at cs.uchicago.edu
> >> Web:   http://www.cs.uchicago.edu/~iraicu
> >> http://dev.globus.org/wiki/Incubator/Falkon
> >> http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
> >> ===================================================
> >> ===================================================
> >>
> >>     
> >
> >
> >   
> 



