[Swift-devel] Integrity of trunk in SVN (was: Re: swift 0.93 deadlock)
Michael Wilde
wilde at mcs.anl.gov
Fri Sep 16 11:44:54 CDT 2011
Sounds good, thanks, David.
3 questions on this:
- it's not related to the 0.93 deadlock that is holding back the SWAT app, is it?
(i.e., not related to that email thread?)
- can you trace back how the code got damaged in SVN, and see if anything else got similarly back-leveled that we may not yet have detected in trunk?
- can you take the action item of resuming nightly automated test suite execution on trunk, and periodic (or even nightly) testing on the latest release (to see if the suite catches occasional sporadically-occurring bugs)?
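
On that last point, one option is for the suite to check for JVM-level deadlocks itself rather than depending on someone running jstack by hand against the right process. The following is only a minimal sketch, assuming the check runs inside the same JVM as the code under test (for example from a watchdog thread). The class name DeadlockProbe and the method findDeadlocks are hypothetical and not part of Swift; the only API used is the standard java.lang.management package.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Swift: reports deadlocked threads
// in the JVM it is running in.
public class DeadlockProbe {

    /** Returns one description per deadlocked thread; empty if none are found. */
    public static List<String> findDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Looks for cycles over object monitors and ownable synchronizers;
        // returns null when no deadlock is detected.
        long[] ids = mx.findDeadlockedThreads();
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids, Integer.MAX_VALUE)) {
            if (info == null) {
                continue; // thread ended between the two calls
            }
            StringBuilder sb = new StringBuilder();
            sb.append(info.getThreadName())
              .append(" waiting on ").append(info.getLockName())
              .append(" held by ").append(info.getLockOwnerName());
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\n    at ").append(frame);
            }
            report.add(sb.toString());
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> deadlocks = findDeadlocks();
        if (deadlocks.isEmpty()) {
            System.out.println("No deadlock detected in this JVM.");
        } else {
            for (String d : deadlocks) {
                System.out.println(d);
            }
        }
    }
}
```

A test driver could call findDeadlocks() periodically while a run is in progress and fail the run if the list is ever non-empty. It only sees threads in its own JVM, so a hang caused by something other than a Java deadlock would still need a separate timeout.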
Thanks,
- Mike
----- Original Message -----
> From: "David Kelly" <davidk at ci.uchicago.edu>
> To: "Michael Wilde" <wilde at mcs.anl.gov>
> Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Papia Rizwan" <papia.rizwan at gmail.com>
> Sent: Thursday, September 15, 2011 6:55:25 PM
> Subject: Re: [Swift-devel] swift 0.93 deadlock
> Persistent coasters in trunk is fixed now. I think an older version of
> coaster-service somehow got checked in, so I ran a reverse merge and
> resolved the conflicts. I tested on mcs workstations with 1000 cats
> and all seems well.
>
> David
>
> ----- Original Message -----
> > From: "David Kelly" <davidk at ci.uchicago.edu>
> > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Papia
> > Rizwan" <papia.rizwan at gmail.com>
> > Sent: Thursday, September 15, 2011 4:34:02 PM
> > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > I was able to get it running on PADS with trunk. I ran into the same
> > issue.
> >
> > http://www.ci.uchicago.edu/~davidk/swat3/jstack.log
> > http://www.ci.uchicago.edu/~davidk/swat3/cce_ua-20110915-1617-sd4svyo2.log
> >
> > David
> >
> > ----- Original Message -----
> > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Papia
> > > Rizwan" <papia.rizwan at gmail.com>
> > > Sent: Thursday, September 15, 2011 2:39:47 PM
> > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > The sites.xml in /homes/papia/SwiftSCE2 seems to be using passive
> > > persistent coasters. Is there a way to use automatic coasters on the
> > > MCS workstations? I'll try copying this over to PADS and running there
> > > to see if I can reproduce it.
> > >
> > > David
> > >
> > > ----- Original Message -----
> > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Papia
> > > > Rizwan" <papia.rizwan at gmail.com>, "Mihael Hategan"
> > > > <hategan at mcs.anl.gov>
> > > > Sent: Thursday, September 15, 2011 2:18:17 PM
> > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > Can you make SWAT run under trunk? Papia is testing using standard
> > > > auto coasters, and doesn't need any of the missing coaster-service
> > > > options.
> > > >
> > > > - Mike
> > > >
> > > >
> > > > ----- Original Message -----
> > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Papia
> > > > > Rizwan" <papia.rizwan at gmail.com>, "Mihael Hategan"
> > > > > <hategan at mcs.anl.gov>
> > > > > Sent: Thursday, September 15, 2011 2:15:36 PM
> > > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > > I got past the compilation errors by renaming all the functions
> > > > > with capitalization, but ran into an issue with coaster-service.
> > > > > Last week I noticed coaster-service was missing options for dynamic
> > > > > ports. I found today that it is also missing -passive. I'll try to
> > > > > track down where this changed and restore the previous version.
> > > > >
> > > > > David
> > > > >
> > > > > ----- Original Message -----
> > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>,
> > > > > > "Papia
> > > > > > Rizwan" <papia.rizwan at gmail.com>, "Mihael Hategan"
> > > > > > <hategan at mcs.anl.gov>
> > > > > > Sent: Thursday, September 15, 2011 12:37:13 PM
> > > > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > > > Excellent, thanks - that's good. I also just verified that Papia
> > > > > > is not using the overAllocation tags in the sites file, so this
> > > > > > problem is clearly a Java deadlock and has nothing to do with the
> > > > > > scheduling problem that the (now fixed) overAllocation problem was
> > > > > > causing.
> > > > > >
> > > > > > My understanding is that this SWAT script is failing under trunk
> > > > > > because of the recent token case handling issue (I think the
> > > > > > camel-case one). Can you work with Papia to see if either that
> > > > > > issue is now fixed, or if her script can be changed to avoid that,
> > > > > > so that you can both test the SWAT script with trunk, to see if
> > > > > > the deadlock still occurs?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > - Mike
> > > > > >
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>,
> > > > > > > "Papia
> > > > > > > Rizwan" <papia.rizwan at gmail.com>, "Mihael Hategan"
> > > > > > > <hategan at mcs.anl.gov>
> > > > > > > Sent: Thursday, September 15, 2011 12:29:03 PM
> > > > > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > > > > I narrowed down the problem a bit. Last night I ran jstack on
> > > > > > > the wrong java process which is why it didn't report a deadlock.
> > > > > > >
> > > > > > > Papia and I are seeing the same issue.
> > > > > > >
> > > > > > > My jstack:
> > > > > > > http://www.ci.uchicago.edu/~davidk/swat2/jstack.log
> > > > > > > Papia's jstack:
> > > > > > > http://www.ci.uchicago.edu/~davidk/swat2/papia-jstack.log
> > > > > > >
> > > > > > > It happens in the same place:
> > > > > > >
> > > > > > > org.griphyn.vdl.karajan.lib.cache.File.lock(File.java:100)
> > > > > > > org.griphyn.vdl.karajan.lib.cache.LRUFileCache.addAndLockEntry(LRUFileCache.java:24)
> > > > > > >
> > > > > > > Filed as bug #559
> > > > > > >
> > > > > > > David
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>,
> > > > > > > > "Papia
> > > > > > > > Rizwan" <papia.rizwan at gmail.com>, "Mihael Hategan"
> > > > > > > > <hategan at mcs.anl.gov>
> > > > > > > > Sent: Thursday, September 15, 2011 11:46:59 AM
> > > > > > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > > > > > David, it sounds like more analysis is needed here. If the
> > > > > > > > SWAT runs are not showing a deadlock (but your runs are) then
> > > > > > > > likely we have two different problems here.
> > > > > > > >
> > > > > > > > Another case we saw in 0.93 with scripts failing to progress
> > > > > > > > is due to the overAllocation parameter problem that Mihael
> > > > > > > > fixed yesterday. The symptom there is that Swift starts a
> > > > > > > > coaster with a time slot too small for the apps in the script,
> > > > > > > > and no apps wind up running. I think that situation in general
> > > > > > > > merits a separate ticket, and may have been discussed on
> > > > > > > > swift-devel (but quite a while ago).
> > > > > > > >
> > > > > > > > Can you determine if indeed Papia's SWAT runs are hanging for
> > > > > > > > a reason other than a Java deadlock?
> > > > > > > >
> > > > > > > > - Mike
> > > > > > > >
> > > > > > > >
> > > > > > > > ----- Original Message -----
> > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>,
> > > > > > > > > "Papia
> > > > > > > > > Rizwan" <papia.rizwan at gmail.com>, "Mihael Hategan"
> > > > > > > > > <hategan at mcs.anl.gov>
> > > > > > > > > Sent: Thursday, September 15, 2011 8:03:09 AM
> > > > > > > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > > > > > > The jstack log corresponds to the most recent log file -
> > > > > > > > > http://www.ci.uchicago.edu/~davidk/swat/cce_ua-20110914-1934-frd3thja.log.
> > > > > > > > > jstack does not report any deadlocks, but I thought it might
> > > > > > > > > be useful so I included it. Swift was not making any progress
> > > > > > > > > for about 5 hours before I sent the logs. I am running the
> > > > > > > > > latest 0.93 branch. I will try again today.
> > > > > > > > >
> > > > > > > > > David
> > > > > > > > >
> > > > > > > > > ----- Original Message -----
> > > > > > > > > > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > To: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > Cc: "swift-devel Devel"
> > > > > > > > > > <swift-devel at ci.uchicago.edu>,
> > > > > > > > > > "Papia
> > > > > > > > > > Rizwan" <papia.rizwan at gmail.com>, "Mihael Hategan"
> > > > > > > > > > <hategan at mcs.anl.gov>
> > > > > > > > > > Sent: Thursday, September 15, 2011 5:54:11 AM
> > > > > > > > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > > > > > > > David, which of the many Swift logs in that /swat dir does
> > > > > > > > > > the jstack.log pertain to? How many of these runs
> > > > > > > > > > deadlocked?
> > > > > > > > > >
> > > > > > > > > > And, did you verify that you (and Papia) are running on the
> > > > > > > > > > latest rev of the 0.93 branch?
> > > > > > > > > >
> > > > > > > > > > - Mike
> > > > > > > > > >
> > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > > > > > > > > > To: "Mihael Hategan" <hategan at mcs.anl.gov>
> > > > > > > > > > > Cc: "swift-devel Devel"
> > > > > > > > > > > <swift-devel at ci.uchicago.edu>,
> > > > > > > > > > > "Papia
> > > > > > > > > > > Rizwan" <papia.rizwan at gmail.com>, "Michael Wilde"
> > > > > > > > > > > <wilde at mcs.anl.gov>
> > > > > > > > > > > Sent: Wednesday, September 14, 2011 11:04:41 PM
> > > > > > > > > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > > > > > > > > I was able to reproduce the problem with persistent
> > > > > > > > > > > coasters on the MCS servers.
> > > > > > > > > > >
> > > > > > > > > > > The jstack output is at
> > > > > > > > > > > http://www.ci.uchicago.edu/~davidk/swat/jstack.log
> > > > > > > > > > >
> > > > > > > > > > > The full collection of logs are at
> > > > > > > > > > > http://www.ci.uchicago.edu/~davidk/swat.
> > > > > > > > > > >
> > > > > > > > > > > David
> > > > > > > > > > >
> > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > > > > > > > > > > > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > > > > > > > > > > > Cc: "swift-devel Devel"
> > > > > > > > > > > > <swift-devel at ci.uchicago.edu>,
> > > > > > > > > > > > "Papia
> > > > > > > > > > > > Rizwan" <papia.rizwan at gmail.com>
> > > > > > > > > > > > Sent: Wednesday, September 14, 2011 10:30:48 PM
> > > > > > > > > > > > Subject: Re: [Swift-devel] swift 0.93 deadlock
> > > > > > > > > > > > Could you also forward the attachments please?
> > > > > > > > > > > >
> > > > > > > > > > > > Mihael
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, 2011-09-14 at 14:46 -0500, Michael Wilde
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > I think I am seeing a similar deadlock on 0.93 in
> > > > > > > > > > > > > the ParVis script, and am trying to get a clean log
> > > > > > > > > > > > > and jstack to confirm.
> > > > > > > > > > > > >
> > > > > > > > > > > > > As far as I can tell, Papia is running the correct
> > > > > > > > > > > > > 0.93 code, but please verify.
> > > > > > > > > > > > >
> > > > > > > > > > > > > David will try to replicate this problem as well.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Mike
> > > > > > > > > > > > >
> > > > > > > > > > > > > ----- Original Message -----
> > > > > > > > > > > > > > From: "Papia Rizwan"
> > > > > > > > > > > > > > <papia.rizwan at gmail.com>
> > > > > > > > > > > > > > To: "swift-devel Devel"
> > > > > > > > > > > > > > <swift-devel at ci.uchicago.edu>,
> > > > > > > > > > > > > > "Michael
> > > > > > > > > > > > > > Wilde" <wilde at mcs.anl.gov>, "Michael P.
> > > > > > > > > > > > > > Shields"
> > > > > > > > > > > > > > <mpshields at anl.gov>
> > > > > > > > > > > > > > Sent: Wednesday, September 14, 2011 1:56:13 PM
> > > > > > > > > > > > > > Subject: swift 0.93 deadlock
> > > > > > > > > > > > > > Attached are the jstack output and the log file.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > Papia Rizwan
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > _______________________________________________
> > > > > > > > > > > > Swift-devel mailing list
> > > > > > > > > > > > Swift-devel at ci.uchicago.edu
> > > > > > > > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Michael Wilde
> > > > > > > > > > Computation Institute, University of Chicago
> > > > > > > > > > Mathematics and Computer Science Division
> > > > > > > > > > Argonne National Laboratory
> > > > > > > >
> > > > > > > > --
> > > > > > > > Michael Wilde
> > > > > > > > Computation Institute, University of Chicago
> > > > > > > > Mathematics and Computer Science Division
> > > > > > > > Argonne National Laboratory
> > > > > >
> > > > > > --
> > > > > > Michael Wilde
> > > > > > Computation Institute, University of Chicago
> > > > > > Mathematics and Computer Science Division
> > > > > > Argonne National Laboratory
> > > >
> > > > --
> > > > Michael Wilde
> > > > Computation Institute, University of Chicago
> > > > Mathematics and Computer Science Division
> > > > Argonne National Laboratory
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory