[Swift-devel] Coaster error in RaptorLoops run
Michael Wilde
wilde at mcs.anl.gov
Wed Apr 28 18:32:25 CDT 2010
Looking closely at these logs with Wenjun, it seems that his run hit the deepfield bug, which requires the two patches mentioned in my prior message.
So for now, this problem can be ignored. Wenjun is retesting with a patched stable Swift branch.
- Mike
----- "Wenjun Wu" <wwj at ci.uchicago.edu> wrote:
> Sure. My config files are under the folder:
> /gpfs/pads/oops/scienceportal/swift-svn/etc
>
> the logs can be found at
> /gpfs/pads/oops/scienceportal/scriptadmin/oops-raptorloop/test/RaptorLoops-20100426-1314-tna3q0a6.log
>
> Wenjun
> > Wenjun, can you post more details on the problem you describe below
> > to the swift-devel list (cc'ed here), pointing Mihael to a directory
> > with all your logs and config files?
> >
> > Thanks,
> >
> > Mike
> >
> > ----- "wenjun wu"<wwjag at mcs.anl.gov> wrote:
> >
> >
> >> Hi Mike,
> >> Now I can run raptorloop locally, but when I launch the jobs to PADS
> >> through coaster:ssh:pbs, I keep getting the following exceptions
> >> after Swift finishes most of the steps.
> >>
> >> 2010-04-26 13:27:25,408-0500 INFO AbstractStreamKarajanChannel 01173289853: Channel shut down
> >> java.lang.Throwable
> >>     at org.globus.cog.karajan.workflow.service.channels.AbstractTCPChannel.close(AbstractTCPChannel.java:97)
> >>     at org.globus.cog.karajan.workflow.service.channels.MetaChannel.close(MetaChannel.java:87)
> >>     at org.globus.cog.abstraction.impl.execution.coaster.ServiceManager.statusChanged(ServiceManager.java:232)
> >>     at org.globus.cog.abstraction.impl.common.task.TaskImpl.notifyListeners(TaskImpl.java:236)
> >>     at org.globus.cog.abstraction.impl.common.task.TaskImpl.setStatus(TaskImpl.java:224)
> >>     at org.globus.cog.abstraction.impl.common.task.TaskImpl.setStatus(TaskImpl.java:253)
> >>     at org.globus.cog.abstraction.impl.ssh.execution.JobSubmissionTaskHandler.SSHTaskStatusChanged(JobSubmissionTaskHandler.java:193)
> >>     at org.globus.cog.abstraction.impl.ssh.SSHRunner.notifyListeners(SSHRunner.java:84)
> >>     at org.globus.cog.abstraction.impl.ssh.SSHRunner.run(SSHRunner.java:43)
> >>     at java.lang.Thread.run(Thread.java:595)
> >> 2010-04-26 13:27:25,408-0500 INFO ChannelManager Handling channel exception
> >> java.io.IOException: Stream closed.
> >>     at java.net.PlainSocketImpl.available(PlainSocketImpl.java:428)
> >>     at java.net.SocketInputStream.available(SocketInputStream.java:217)
> >>     at org.globus.gsi.gssapi.net.GssInputStream.available(GssInputStream.java:107)
> >>     at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel.step(AbstractStreamKarajanChannel.java:113)
> >>     at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Multiplexer.run(AbstractStreamKarajanChannel.java:365)
> >>
> >> Progress: Finished successfully:7
> >> Progress: Active:1 Finished successfully:7
> >> Progress: Active:1 Finished successfully:7
> >> Progress: Active:1 Finished successfully:7
> >> Progress: Checking status:1 Finished successfully:7
> >> Progress: Finished successfully:8
> >> Progress: Finished successfully:8
> >> Progress: Finished successfully:8
> >> Progress: Finished successfully:8
> >> Progress: Finished successfully:8
> >> Progress: Finished successfully:8
> >> Progress: Finished successfully:8
> >>
> >> Wenjun
> >>
> >>> was: Re: notes from todays meeting
> >>>
> >>> Hi Aashish,
> >>>
> >>> Wenjun and Tom are integrating the latest OOPS scripts into the
> >>> portal for Web execution.
> >>>
> >>> Wenjun is getting errors, as below. I suspect he's missing some
> >>> parameters or has incorrect parameters or inputs.
> >>>
> >>> Can you send to Wenjun the latest parameters (i.e., shell calling
> >>> examples) to run Loops, RaptorLoops, and RaptorLoops with prep stage?
> >>>
> >>> Best thing to do is quickly update README with the latest shell
> >>> invocation lines and check it in; then Wenjun can verify that the
> >>> latest documented invocation instructions work for other people
> >>> (which will be useful for the OOPS group too!)
> >>>
> >>> I can't get to this till late today or early this weekend, so any
> >>> help you can offer will be great.
> >>>
> >>> Thanks!
> >>>
> >>> - Mike
> >>>
> >>> ----- "wenjun wu"<wwjag at mcs.anl.gov> wrote:
> >>>
> >>>
> >>>
> >>>> Hi Mike,
> >>>> I ran raptorloop.sh and got the following error. Any clue?
> >>>> Wenjun
> >>>>
> >>>> [wwj at login1 wwjtest]$ run.raptorloops.sh -target T1af7 -prepTar T1af7.prep.tar.gz -templatesPerJob 800
> >>>> Running in /gpfs/pads/oops/scienceportal/oops-svn/oops/protlib2-0422/wwjtest/run.raptorloops.9229
> >>>> Running RaptorLoops with settings: target=T1af7 seqFile= prepTar=T1af7.prep.tar.gz templatesPerJob=800 templateList= nModels= nSim=4 execsite=localhost maxSlots=16 resume= rlog=
> >>>> Running from host with compute-node reachable address of 172.5.86.5
> >>>> protlib2 home is /gpfs/pads/oops/scienceportal/oops-svn/oops/protlib2-0422
> >>>> cp: warning: source file `/gpfs/pads/oops/scienceportal/oops-svn/oops/protlib2-0422/swift/RaptorOut.map' specified more than once
> >>>> cp: warning: source file `/gpfs/pads/oops/scienceportal/oops-svn/oops/protlib2-0422/swift/TemplateList.map' specified more than once
> >>>> cp: missing destination file operand after `.'
> >>>> Try `cp --help' for more information.
> >>>> basename: missing operand
> >>>> Try `basename --help' for more information.
> >>>> Variable nModels defined in scope 7122710 shadows variable of same name in scope 4890830
> >>>> Variable tseg defined in scope 26460367 shadows variable of same name in scope 4890830
> >>>> Variable preparedInput defined in scope 26460367 shadows variable of same name in scope 4890830
> >>>> Variable nModels defined in scope 26460367 shadows variable of same name in scope 4890830
> >>>> Variable targetId defined in scope 12182618 shadows variable of same name in scope 4890830
> >>>> Variable modelIn defined in scope 12182618 shadows variable of same name in scope 4890830
> >>>> Variable targetId defined in scope 21925102 shadows variable of same name in scope 4890830
> >>>> Variable models defined in scope 21925102 shadows variable of same name in scope 4890830
> >>>> Swift svn swift-r3246 cog-r2721
> >>>>
> >>>> RunID: 20100422-1609-aqv1y329
> >>>> Progress:
> >>>> Execution failed:
> >>>> java.lang.NumberFormatException: For input string: ""
> >>>>
> >>>>
> >>>>
> >>>>> Wenjun,
> >>>>>
> >>>>> The first two we need are psim.loops.swift and RaptorLoops.swift,
> >>>>> and their corresponding run scripts.
> >>>>>
> >>>>> We run them from the corresponding .sh scripts in scripts/run
> >>>>>
> >>>>> I'll get back to you on this tonight with more details... after I
> >>>>> look for my 3rd script, which is RaptorLoops with an additional
> >>>>> pre-process step that takes a raw FASTA file as input. I may need to
> >>>>> check that in from my workspace.
> >>>>>
> >>>>> - Mike
> >>>>>
> >>>>>
> >>>>> ----- "wenjun wu"<wwjag at mcs.anl.gov> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>> Hi Mike:
> >>>>>> I installed the latest version of protlib from SVN. I'd like to
> >>>>>> clarify which swift scripts are needed in the portal.
> >>>>>>
> >>>>>> These are the swift scripts in the latest protlib2:
> >>>>>>
> >>>>>> -rw-r--r-- 1 wwj ci-users  737 Apr 22 11:48 SwiftLib.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 3237 Apr 22 11:48 psim.itfixex2.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 2127 Apr 22 11:48 psim.itfixex1.swift
> >>>>>> -rwxr-xr-x 1 wwj ci-users  509 Apr 22 11:48 psim.basicex1.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 2616 Apr 22 11:48 BoostThreader.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 1477 Apr 22 11:48 LoopLib.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 1193 Apr 22 11:48 BoostThreaderLib.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 8869 Apr 22 11:48 oops.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 1525 Apr 22 11:48 psim.sweepex1.swift
> >>>>>> -rwxr-xr-x 1 wwj ci-users 2188 Apr 22 11:48 psim.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 2933 Apr 22 11:48 psim.loops.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 6820 Apr 22 11:48 RaptorLoops.hanging.swift
> >>>>>> -rw-r--r-- 1 wwj ci-users 2943 Apr 22 11:48 RaptorLoops.swift
> >>>>>>
> >>>>>> I guess the right swift scripts should be: psim.loops, BoostThreader
> >>>>>> and RaptorLoop.
> >>>>>> I need to create packages for both Raptor-BoostThreader and RaptorLoop
> >>>>>> by grouping swift scripts and mapper scripts.
> >>>>>>
> >>>>>>
> >>>>>> Wenjun
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> DataPort 2010.0421
> >>>>>>>
> >>>>>>> Coaster proxy issue: can Mihael automate this?
> >>>>>>>
> >>>>>>> Coaster proxy issue - use long proxy for now.
> >>>>>>>
> >>>>>>> Swift run status reporter?
> >>>>>>>
> >>>>>>> Adding new scripts and forms
> >>>>>>> - how to shape the args? Like the email form?
> >>>>>>>
> >>>>>>> Need automation just for caps requests, then manual for Aashish
> >>>>>>> tests, then portal for Carl, Tobin et al
> >>>>>>>
> >>>>>>> Email notification
> >>>>>>>
> >>>>>>> Control over which swift the portal is running
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory