[Swift-devel] Creating 0.94 RC with fixes for STOMP

Michael Wilde wilde at mcs.anl.gov
Wed Nov 21 08:45:19 CST 2012


was: Re: stripped down script still has some confusing issues

Karen, Mihael, all - thanks again for pushing through and solving these problems.

It's great to see confirmation here of the perhaps obvious thesis that if a system gives clear messages, users can far more readily resolve problems.

Mihael - David is planning to create a new 0.94 release candidate, and was deliberating how much of trunk to include. (My preference would be to start the RC with all of current trunk.) Can you two discuss what to include in the RC, and then David will create the release and start testing it?

Thanks,

- Mike

----- Original Message -----
> From: "Karen L Schuchardt" <Karen.Schuchardt at pnnl.gov>
> To: "Mihael Hategan" <hategan at mcs.anl.gov>
> Cc: "Michael Wilde" <wilde at mcs.anl.gov>, "Jared M Chase" <jared.chase at pnnl.gov>, "Khushbu Agarwal"
> <Khushbu.Agarwal at pnnl.gov>
> Sent: Wednesday, November 21, 2012 1:07:08 AM
> Subject: RE: Fwd: stripped down script still has some confusing issues
> Hey guys, I managed to get the script fully reconstructed and run it
> out quite a way in my test environment on my mac. The improved error
> messages really helped a lot. I had several errors due mainly to
> having to tweak things a bit for the test environment but they were
> easy to pinpoint. That is the good news. The bad news is that it looks
> like part of our work environment on hopper was cleaned out during the
> last disk purge so we will have to reconstruct it before I can truly
> test it. Awww, it's always something. But if you want to do an RC off
> this version, that seems reasonable to me. Thanks for your efforts.
> 
> Karen
> ________________________________________
> From: Mihael Hategan [hategan at mcs.anl.gov]
> Sent: Tuesday, November 20, 2012 3:20 AM
> To: Schuchardt, Karen L
> Cc: Michael Wilde; Chase, Jared M; Agarwal, Khushbu
> Subject: RE: Fwd: stripped down script still has some confusing issues
> 
> On Mon, 2012-11-19 at 20:09 -0800, Mihael Hategan wrote:
> > On Mon, 2012-11-19 at 19:56 -0800, Schuchardt, Karen L wrote:
> > > Mihael,
> > >
> > > [...]
> >
> > > And yet the script still dies in the same place. For me it is
> > > always iteration 32, and on either stomp.out or the stomp restart
> > > file. So something else is going on here.
> >
> > I'll give it a shot and see what's happening.
> 
> Runs fine with trunk.
> 
> Fails with 0.93, at iteration 40 for me, due to the deep linking issue.
> The symptom is "<some app> failed with an exit code of 1". A workaround
> is to specify a scratch tag in sites.xml (it forces Swift to copy files
> to scratch instead of linking them):
> <config>
>   <pool handle="localhost">
>     <filesystem provider="local"/>
>     <execution provider="local"/>
>     <workdirectory>/home/mike/tmp/swift</workdirectory>
>     <scratch>/home/mike/tmp/swift-scratch</scratch>
>   </pool>
> </config>
> 
> Runs fine after that.
> 
> If the above fix doesn't work for you, you might be running into a
> different problem. Please send me the logs in that case.
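Before rerunning, it may be worth confirming that every pool in sites.xml actually carries the scratch tag from the workaround above. The following is only a sketch: it assumes a sites.xml without an XML namespace, shaped like the example in this thread, and the helper name pools_without_scratch is made up for illustration; it is not part of Swift.

```python
import xml.etree.ElementTree as ET

# Example sites.xml content, copied from the workaround in this thread.
SITES_XML = """<config>
  <pool handle="localhost">
    <filesystem provider="local"/>
    <execution provider="local"/>
    <workdirectory>/home/mike/tmp/swift</workdirectory>
    <scratch>/home/mike/tmp/swift-scratch</scratch>
  </pool>
</config>"""


def pools_without_scratch(xml_text):
    """Return the handles of pools that have no non-empty <scratch> element.

    Hypothetical helper; assumes the document uses no XML namespace.
    """
    root = ET.fromstring(xml_text)
    missing = []
    for pool in root.iter("pool"):
        scratch = pool.find("scratch")
        if scratch is None or not (scratch.text or "").strip():
            missing.append(pool.get("handle"))
    return missing


if __name__ == "__main__":
    missing = pools_without_scratch(SITES_XML)
    if missing:
        print("Pools missing a scratch tag:", ", ".join(missing))
    else:
        print("All pools specify a scratch directory.")
```

To check a real file, read its contents and pass the text to pools_without_scratch; a file that declares an XML namespace would need the tag names adjusted accordingly.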
> 
> [...]
> 
> > >
> > > Also Jared had trouble compiling the trunk. Can you send us a
> > > revised distribution?
> >
> > Sure, I can do that. The improved error messages might help.
> 
> www.mcs.anl.gov/~hategan/swift-trunk-r6065-cog-r3515.tar.gz
> 
> As usual, it's trunk, so it might break in various places.
> However, I added some improved error reporting, including a detector of
> mapping badness (it only issues warnings, but they will most likely
> result in errors later). Here's a sample output:
> 
> RunID: 20121120-0311-6sbzpc39
> Progress: time: Tue, 20 Nov 2012 03:12:03 -0800
> SwiftScript trace: 0, 500
> SwiftScript trace: Starting iteration, 0
> 
> Duplicate mapping found:
> h5part_files (line 272) is used to read from
> file://localhost/run-0/sph-0/sph.output.h5part
> sphOut (line 234) is used to write to
> file://localhost/run-0/sph-0/sph.output.h5part
> Execution failed:
> Exception in gpg:
> Arguments: [-m, run-0/writeData.out, -r, 0, -p, run-0/stomp.plot, -f,
> run-0/sph-0/sph.output.h5part]
> Host: localhost
> Directory: kls1-20121120-0311-6sbzpc39/jobs/v/gpg-vs5hh81l
> Caused by:
> File not found:
> /home/mike/tmp/swift-bugs/pnnl/subsurface/./run-0/sph-0/sph.output.h5part
> gpg, kls1.swift, line 276
> hybridModel, kls1.swift, line 303
> iterate, kls1.swift, line 297
> 
> Mihael
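
For anyone wanting to check their own scripts for the kind of conflict the "Duplicate mapping found" warning describes, the idea can be sketched roughly as follows. This is an illustration only, not how Swift implements the detector: the tuple layout, the function name find_duplicate_mappings, and the third sample entry (variable name "plot") are assumptions made for the example. The first two entries are taken from the sample output above.

```python
from collections import defaultdict


def find_duplicate_mappings(mappings):
    """Group mappings by file URL and return URLs used by more than one variable.

    `mappings` is an iterable of (variable, line, mode, url) tuples, where
    mode is "read" or "write". This layout is an assumption for illustration.
    """
    by_url = defaultdict(list)
    for variable, line, mode, url in mappings:
        by_url[url].append((variable, line, mode))

    duplicates = {}
    for url, uses in by_url.items():
        distinct_variables = {variable for variable, _, _ in uses}
        if len(distinct_variables) > 1:
            duplicates[url] = uses
    return duplicates


# The first two entries come from the sample output in this thread;
# the third is a hypothetical, non-conflicting mapping.
SAMPLE = [
    ("h5part_files", 272, "read",
     "file://localhost/run-0/sph-0/sph.output.h5part"),
    ("sphOut", 234, "write",
     "file://localhost/run-0/sph-0/sph.output.h5part"),
    ("plot", 276, "write", "file://localhost/run-0/stomp.plot"),
]

if __name__ == "__main__":
    for url, uses in find_duplicate_mappings(SAMPLE).items():
        print("Duplicate mapping found:")
        for variable, line, mode in uses:
            verb = "read from" if mode == "read" else "write to"
            print(f"  {variable} (line {line}) is used to {verb} {url}")
```

Like the warning in trunk, this only reports the overlap; whether it leads to an error depends on the order in which the reader and the writer actually run.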

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



