[Swift-devel] swift 0.93 deadlock

Michael Wilde wilde at mcs.anl.gov
Thu Sep 15 11:46:59 CDT 2011


David, it sounds like more analysis is needed here. If the SWAT runs are not showing a deadlock (but your runs are), then we likely have two different problems.

Another case we saw in 0.93 where scripts fail to progress is caused by the overAllocation parameter problem that Mihael fixed yesterday. The symptom there is that Swift starts a coaster with a time slot too small for the apps in the script, so no apps wind up running. I think that situation in general merits a separate ticket; it may have been discussed on swift-devel, but quite a while ago.

Can you determine if indeed Papia's SWAT runs are hanging for a reason other than a Java deadlock?

- Mike


----- Original Message -----
> From: "David Kelly" <davidk at ci.uchicago.edu>
> To: "Michael Wilde" <wilde at mcs.anl.gov>
> Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Papia Rizwan" <papia.rizwan at gmail.com>, "Mihael Hategan" <hategan at mcs.anl.gov>
> Sent: Thursday, September 15, 2011 8:03:09 AM
> Subject: Re: [Swift-devel] swift 0.93 deadlock
> The jstack log corresponds to the most recent log file -
> http://www.ci.uchicago.edu/~davidk/swat/cce_ua-20110914-1934-frd3thja.log.
> jstack does not report any deadlocks, but I thought it might be useful
> so I included it. Swift was not making any progress for about 5 hours
> before I sent the logs. I am running the latest 0.93 branch. I will
> try again today.
> 
> David
> 
> ----- Original Message -----
> > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > To: "David Kelly" <davidk at ci.uchicago.edu>
> > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Papia Rizwan" <papia.rizwan at gmail.com>, "Mihael Hategan" <hategan at mcs.anl.gov>
> > Sent: Thursday, September 15, 2011 5:54:11 AM
> > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > David, which of the many Swift logs in that /swat dir does the
> > jstack.log pertain to? How many of these runs deadlocked?
> >
> > And, did you verify that you (and Papia) are running on the latest rev
> > of the 0.93 branch?
> >
> > - Mike
> >
> > ----- Original Message -----
> > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > To: "Mihael Hategan" <hategan at mcs.anl.gov>
> > > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Papia Rizwan" <papia.rizwan at gmail.com>, "Michael Wilde" <wilde at mcs.anl.gov>
> > > Sent: Wednesday, September 14, 2011 11:04:41 PM
> > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > I was able to reproduce the problem with persistent coasters on the
> > > MCS servers.
> > >
> > > The jstack output is at
> > > http://www.ci.uchicago.edu/~davidk/swat/jstack.log
> > >
> > > The full collection of logs is at
> > > http://www.ci.uchicago.edu/~davidk/swat.
> > >
> > > David
> > >
> > > ----- Original Message -----
> > > > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Papia Rizwan" <papia.rizwan at gmail.com>
> > > > Sent: Wednesday, September 14, 2011 10:30:48 PM
> > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > Could you also forward the attachments please?
> > > >
> > > > Mihael
> > > >
> > > > On Wed, 2011-09-14 at 14:46 -0500, Michael Wilde wrote:
> > > > > I think I am seeing a similar deadlock on 0.93 in the ParVis script,
> > > > > and am trying to get a clean log and jstack to confirm.
> > > > >
> > > > > As far as I can tell, Papia is running the correct 0.93 code, but
> > > > > please verify.
> > > > >
> > > > > David will try to replicate this problem as well.
> > > > >
> > > > > - Mike
> > > > >
> > > > > ----- Original Message -----
> > > > > > From: "Papia Rizwan" <papia.rizwan at gmail.com>
> > > > > > To: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Michael Wilde" <wilde at mcs.anl.gov>, "Michael P. Shields" <mpshields at anl.gov>
> > > > > > Sent: Wednesday, September 14, 2011 1:56:13 PM
> > > > > > Subject: swift 0.93 deadlock
> > > > > > Attached are the jstack output and the log file.
> > > > > >
> > > > > > --
> > > > > > Papia Rizwan
> > > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Swift-devel mailing list
> > > > Swift-devel at ci.uchicago.edu
> > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >
> > --
> > Michael Wilde
> > Computation Institute, University of Chicago
> > Mathematics and Computer Science Division
> > Argonne National Laboratory

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



