[Swift-devel] coaster io with NIO.

Michael Wilde wilde at mcs.anl.gov
Tue Apr 10 22:51:05 CDT 2012


Thanks, Ketan. David, can you try to reproduce the problem with jobsPerNode=1?

- Mike

----- Original Message -----
> From: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>
> To: "Michael Wilde" <wilde at mcs.anl.gov>
> Sent: Tuesday, April 10, 2012 9:31:34 PM
> Subject: Re: [Swift-devel] coaster io with NIO.
> The jobsPerNode setting was indeed 1 in the tests done on OSG.
> 
> 
> I do not recall seeing the blocking messages seen by David's
> current/recent tests.
> 
> 
> On Tuesday, April 10, 2012, Michael Wilde wrote:
> 
> 
> Mihael, while the scenario below seems plausible, I thought that the
> timeout problem was first detected on OSG nodes, which should have
> been running with jobsPerNode=1.
> 
> David, Ketan, can you comment on the jobsPerNode settings for the many
> tests you have done which encountered this problem?
> 
> - Mike
> 
> ----- Original Message -----
> > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > To: "David Kelly" <davidk at ci.uchicago.edu>
> > Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> > Sent: Tuesday, April 10, 2012 7:04:56 PM
> > Subject: Re: [Swift-devel] coaster io with NIO.
> > On Tue, 2012-04-10 at 17:25 -0500, David Kelly wrote:
> > > Yep, I gave it a try with automatic coasters, but am still seeing
> > > the timeouts.
> > >
> >
> > I think I see the problem. With multiple jobs per worker, the
> > situation may be such that both a stagein and a stageout happen at
> > the same time (on the same TCP connection). If the stageout runs out
> > of buffers, the write to the socket on the worker side blocks,
> > which keeps the read loop from running. This eventually fills the
> > other direction of the TCP link, and everything deadlocks.
> >
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> 
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
> 
> 
> 
> --
> Ketan

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory
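
For readers following the quoted diagnosis, here is a minimal sketch of the failure mode Mihael describes and one conventional way around it. It is not coaster code: a local socket.socketpair() stands in for the worker/service TCP connection, and the names fill_until_blocked and exchange are hypothetical. The first function shows that a writer whose peer is not reading eventually cannot write any more (a blocking socket would stall at that point, and a single-threaded worker would then never reach its read loop). The second moves data in both directions on one connection using select, so a full send buffer never stops the reads.

```python
import select
import socket


def fill_until_blocked(sock, chunk=b"x" * 4096):
    """Write to a non-blocking socket until the kernel buffer is full.

    Returns the number of bytes accepted. A blocking socket would
    simply stall at the point where this function returns.
    """
    sock.setblocking(False)
    sent = 0
    try:
        while True:
            sent += sock.send(chunk)
    except BlockingIOError:
        return sent


def exchange(a, b, nbytes, timeout=5.0):
    """Send nbytes in each direction over one connection at once.

    Reads are serviced on every pass, so neither side can get stuck
    in a write while the opposite direction fills up.
    """
    for s in (a, b):
        s.setblocking(False)
    to_send = {a: nbytes, b: nbytes}
    received = {a: 0, b: 0}
    chunk = b"x" * 16384
    while received[a] < nbytes or received[b] < nbytes:
        writers = [s for s in (a, b) if to_send[s] > 0]
        readable, writable, _ = select.select([a, b], writers, [], timeout)
        if not readable and not writable:
            raise TimeoutError("no progress on either direction")
        for s in readable:
            received[s] += len(s.recv(65536))
        for s in writable:
            try:
                to_send[s] -= s.send(chunk[:to_send[s]])
            except BlockingIOError:
                pass  # buffer filled between select and send; retry later
    return received[a], received[b]


if __name__ == "__main__":
    a, b = socket.socketpair()
    print("bytes accepted before the writer would block:", fill_until_blocked(a))
    a.close()
    b.close()

    a, b = socket.socketpair()
    print("bytes received on each side:", exchange(a, b, 1024 * 1024))
    a.close()
    b.close()
```

The payload size of 1 MiB is an assumption chosen to be larger than typical default socket buffers, so that the exchange cannot complete unless reads and writes are interleaved.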



