[Swift-devel] Re: test v0.1rc1

Mihael Hategan hategan at mcs.anl.gov
Mon Feb 26 21:57:10 CST 2007


Hmm. I made a change to code that did not seem to be the cause, but that
fixed some other, smaller issue, and enabled some more debugging in log4j.
With this, I've been running the workflow in a loop on wiggum for two hours
now, and the failure has not shown up yet. I don't know what to make of it.

I'll keep running and eventually revert the changes to see if they are
the source.

Mihael

On Mon, 2007-02-26 at 14:47 -0600, Mihael Hategan wrote:
> On Mon, 2007-02-26 at 14:46 -0600, Veronika V. Nefedova wrote:
> > Some additional info: This failure happened on TG with 070219 when I was 
> > running 2 molecules at the same time (i.e. two executables at the same 
> > time). When I tried to run just one, it failed with the same exitcode, but 
> > didn't have that handle exception:
> 
> Right. This seems like a different problem, and I'm not sure if it's
> Swift or some problem with TP or the application. That needs to be
> investigated.
> 
> > 
> > 2007-02-26 14:34:41,986 DEBUG vdl:execute2 Application exception: Job chrm failed with an exit code of 174
> >          sys:throw @ vdl-int.k, line: 108
> >          vdl:checkexitcode @ vdl-int.k, line: 367
> >          vdl:execute2 @ execute-default.k, line: 22
> >          vdl:execute @ swift-MolDyn.kml, line: 69
> >          charmm @ swift-MolDyn.kml, line: 279
> >          vdl:mains @ swift-MolDyn.kml, line: 261
> > <here it re-tries it>
> > 
> > Again, the failure with 070219 happens only on TG; on localhost (wiggum) 
> > it's working just fine.
> > 
> > Nika
> > 
> > 
> > At 02:38 PM 2/26/2007, Mihael Hategan wrote:
> > >That's fine. Just wanted to be clear that we're talking about the same
> > >error. It's good that it also occurs in 070219, because there are no
> > >recent changes I could remember that could trigger it. It's also good to
> > >know that it may or may not occur, because I know approximately what
> > >class of problem we're dealing with.
> > >
> > >Mihael
> > >
> > >On Mon, 2007-02-26 at 14:37 -0600, Veronika V. Nefedova wrote:
> > > > > Yes, I didn't paste it -- it's all in the log. If you'd like I can send you
> > > > > the log as an attachment...
> > > >
> > > > Nika
> > > >
> > > > At 02:33 PM 2/26/2007, Mihael Hategan wrote:
> > > > >Wait, because I'm missing something. Wasn't the error supposed to be
> > > > >"TaskHandler can only handle unsubmitted tasks"?
> > > > >
> > > > >On Mon, 2007-02-26 at 14:26 -0600, Veronika V. Nefedova wrote:
> > > > > > > And now it's getting interesting!
> > > > > > >
> > > > > > > I now have the same failure (as below) with 070219 as I had on localhost
> > > > > > > with v0.1rc1 *BUT* when running on TG. It failed at the same point (while
> > > > > > > trying to run the last app in the workflow), with the same exceptions.
> > > > > > > Strange that 070219 worked on localhost (and still works).
> > > > > > >
> > > > > > > The log is on wiggum:
> > > > > > > /sandbox/ydeng/alamines/swift-MolDyn-690y7r1skc8z0.log
> > > > > >
> > > > > > 2007-02-26 14:10:16,543 INFO  vdl:execute2 Running job 
> > > chrm-rmnoet7i chrm
> > > > > > with arguments [system:solv_m001, title:solv, stitle:m001,
> > > > > > rtffile:parm03_gaff_all.rtf, paramfile:parm03_gaffnb_all.prm,
> > > > > > gaff:m001_am1, nwater:400, ligcrd:lyz, rforce:0, iseed:3131887, 
> > > rwater:15,
> > > > > > nstep:100, minstep:100, skipstep:100, startstep:10000] in
> > > > > > swift-MolDyn-690y7r1skc8z0/chrm-rmnoet7i on TG-NCSA
> > > > > > 2007-02-26 14:11:18,586 DEBUG vdl:execute2 Application exception: 
> > > Job chrm
> > > > > > failed with an exit code of 174
> > > > > > <snip>
> > > > > > All input files are staged in...
> > > > > >
> > > > > >
> > > > > > Nika
> > > > > >
> > > > > > At 02:17 PM 2/26/2007, Veronika  V. Nefedova wrote:
> > > > > > > >You can try to run my application, or look in the logs. I ran it all on
> > > > > > > >wiggum. The log is:
> > > > > > >/sandbox/ydeng/alamines/swift-MolDyn-8q6ygr7cy15c2.log
> > > > > > >
> > > > > > >the dtm file I am running is /sandbox/ydeng/alamines/swift-MolDyn.dtm
> > > > > > >
> > > > > > >Nika
> > > > > > >
> > > > > > >At 01:39 PM 2/26/2007, Mihael Hategan wrote:
> > > > > > >>That doesn't sound good. How do I reproduce this?
> > > > > > >>
> > > > > > >>Mihael
> > > > > > >>
> > > > > > >>On Mon, 2007-02-26 at 13:21 -0600, Veronika V. Nefedova wrote:
> > > > > > >> > The one Ben asked us all to test:
> > > > > > >> >
> > > > > > >> > >http://www.ci.uchicago.edu/swift/tests/vdsk-0.1rc1.tar.gz
> > > > > > >> >
> > > > > > >> > At 01:15 PM 2/26/2007, Mihael Hategan wrote:
> > > > > > >> > >On Mon, 2007-02-26 at 13:05 -0600, Veronika V. Nefedova wrote:
> > > > > > > >> > > > When I tried to run my working workflow with a new version, it gave me an
> > > > > > > >> > > > exception:
> > > > > > >> > >
> > > > > > >> > >Which new version?
> > > > > > >> > >
> > > > > > >> > >Mihael
> > > > > > >> > >
> > > > > > >> > > >
> > > > > > > >> > > > Warning: Task handler throws exception but does not set status
> > > > > > > >> > > > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:
> > > > > > > >> > > > TaskHandler can only handle unsubmitted tasks
> > > > > > > >> > > >          at org.globus.cog.abstraction.impl.common.task.CachingFileOperationTaskHandler.submit(CachingFileOperationTaskHandler.java:20)
> > > > > > > >> > > >          at org.globus.cog.karajan.scheduler.submitQueue.NonBlockingSubmit.run(NonBlockingSubmit.java:78)
> > > > > > > >> > > >          at java.lang.Thread.run(Thread.java:534)
> > > > > > >> > > >
> > > > > > >> > > > [349] wiggum /sandbox/ydeng/alamines > \\
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > > >> > > > I do not have this happening with the 070219 build.
> > > > > > >> > > >
> > > > > > >> > > > Nika
> > > > > > >> > > >
> > > > > > >> > > > At 06:12 AM 2/26/2007, Ben Clifford wrote:
> > > > > > >> > > >
> > > > > > >> > > > >On Mon, 26 Feb 2007, Ben Clifford wrote:
> > > > > > >> > > > >
> > > > > > >> > > > > >
> > > > > > > >> > > > > > v0.1rc1 was built at the end of last week. please spend some time testing
> > > > > > >> > > > >
> > > > > > >> > > > >here's the URL for download:
> > > > > > >> > > > >
> > > > > > >> > > > >http://www.ci.uchicago.edu/swift/tests/vdsk-0.1rc1.tar.gz
> > > > > > >> > > > >
> > > > > > >> > > > >--
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > _______________________________________________
> > > > > > >> > > > Swift-devel mailing list
> > > > > > >> > > > Swift-devel at ci.uchicago.edu
> > > > > > >> > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > > > > > >> > > >
> > > > > > >> >
> > > > > > >> >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > >
> > > >
> > 
> > 
> 
> 



