[Swift-devel] Coaster error in RaptorLoops run
Michael Wilde
wilde at mcs.anl.gov
Mon Apr 26 15:12:50 CDT 2010
Wenjun, can you post more details on the problem you describe below, to the swift-devel list (cc'ed here) pointing Mihael to a directory with all your logs and config files?
Thanks,
Mike
----- "wenjun wu" <wwjag at mcs.anl.gov> wrote:
> Hi Mike,
> Now I can run raptorloop locally, but when I launch the jobs to PADS
> through coaster:ssh:pbs, I keep getting the following exceptions
> after Swift finishes most of the steps.
>
> 2010-04-26 13:27:25,408-0500 INFO AbstractStreamKarajanChannel 01173289853: Channel shut down
> java.lang.Throwable
>     at org.globus.cog.karajan.workflow.service.channels.AbstractTCPChannel.close(AbstractTCPChannel.java:97)
>     at org.globus.cog.karajan.workflow.service.channels.MetaChannel.close(MetaChannel.java:87)
>     at org.globus.cog.abstraction.impl.execution.coaster.ServiceManager.statusChanged(ServiceManager.java:232)
>     at org.globus.cog.abstraction.impl.common.task.TaskImpl.notifyListeners(TaskImpl.java:236)
>     at org.globus.cog.abstraction.impl.common.task.TaskImpl.setStatus(TaskImpl.java:224)
>     at org.globus.cog.abstraction.impl.common.task.TaskImpl.setStatus(TaskImpl.java:253)
>     at org.globus.cog.abstraction.impl.ssh.execution.JobSubmissionTaskHandler.SSHTaskStatusChanged(JobSubmissionTaskHandler.java:193)
>     at org.globus.cog.abstraction.impl.ssh.SSHRunner.notifyListeners(SSHRunner.java:84)
>     at org.globus.cog.abstraction.impl.ssh.SSHRunner.run(SSHRunner.java:43)
>     at java.lang.Thread.run(Thread.java:595)
> 2010-04-26 13:27:25,408-0500 INFO ChannelManager Handling channel exception
> java.io.IOException: Stream closed.
>     at java.net.PlainSocketImpl.available(PlainSocketImpl.java:428)
>     at java.net.SocketInputStream.available(SocketInputStream.java:217)
>     at org.globus.gsi.gssapi.net.GssInputStream.available(GssInputStream.java:107)
>     at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel.step(AbstractStreamKarajanChannel.java:113)
>     at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Multiplexer.run(AbstractStreamKarajanChannel.java:365)
>
> Progress: Finished successfully:7
> Progress: Active:1 Finished successfully:7
> Progress: Active:1 Finished successfully:7
> Progress: Active:1 Finished successfully:7
> Progress: Checking status:1 Finished successfully:7
> Progress: Finished successfully:8
> Progress: Finished successfully:8
> Progress: Finished successfully:8
> Progress: Finished successfully:8
> Progress: Finished successfully:8
> Progress: Finished successfully:8
> Progress: Finished successfully:8
>
> Wenjun
> > was: Re: notes from todays meeting
> >
> > Hi Aashish,
> >
> > Wenjun and Tom are integrating the latest OOPS scripts into the
> > portal for Web execution.
> >
> > Wenjun is getting errors, as below. I suspect he's missing some
> > parameters or has incorrect parameters or inputs.
> >
> > Can you send Wenjun the latest parameters (i.e., shell calling
> > examples) to run Loops, RaptorLoops, and RaptorLoops with prep stage?
> >
> > The best thing to do is quickly update the README with the latest shell
> > invocation lines and check it in; then Wenjun can verify that the
> > latest documented invocation instructions work for other people (which
> > will be useful for the OOPS group too!)
> >
> > I can't get to this till late today or early this weekend, so any
> > help you can offer will be great.
> >
> > Thanks!
> >
> > - Mike
> >
> > ----- "wenjun wu"<wwjag at mcs.anl.gov> wrote:
> >
> >
> >> Hi Mike,
> >> I ran the raptorloop.sh and got the following error. Any clue?
> >> Wenjun
> >>
> >> [wwj at login1 wwjtest]$ run.raptorloops.sh -target T1af7 -prepTar T1af7.prep.tar.gz -templatesPerJob 800
> >> Running in /gpfs/pads/oops/scienceportal/oops-svn/oops/protlib2-0422/wwjtest/run.raptorloops.9229
> >> Running RaptorLoops with settings: target=T1af7 seqFile= prepTar=T1af7.prep.tar.gz templatesPerJob=800 templateList= nModels= nSim=4 execsite=localhost maxSlots=16 resume= rlog=
> >> Running from host with compute-node reachable address of 172.5.86.5
> >> protlib2 home is /gpfs/pads/oops/scienceportal/oops-svn/oops/protlib2-0422
> >> cp: warning: source file `/gpfs/pads/oops/scienceportal/oops-svn/oops/protlib2-0422/swift/RaptorOut.map' specified more than once
> >> cp: warning: source file `/gpfs/pads/oops/scienceportal/oops-svn/oops/protlib2-0422/swift/TemplateList.map' specified more than once
> >> cp: missing destination file operand after `.'
> >> Try `cp --help' for more information.
> >> basename: missing operand
> >> Try `basename --help' for more information.
> >> Variable nModels defined in scope 7122710 shadows variable of same name in scope 4890830
> >> Variable tseg defined in scope 26460367 shadows variable of same name in scope 4890830
> >> Variable preparedInput defined in scope 26460367 shadows variable of same name in scope 4890830
> >> Variable nModels defined in scope 26460367 shadows variable of same name in scope 4890830
> >> Variable targetId defined in scope 12182618 shadows variable of same name in scope 4890830
> >> Variable modelIn defined in scope 12182618 shadows variable of same name in scope 4890830
> >> Variable targetId defined in scope 21925102 shadows variable of same name in scope 4890830
> >> Variable models defined in scope 21925102 shadows variable of same name in scope 4890830
> >> Swift svn swift-r3246 cog-r2721
> >>
> >> RunID: 20100422-1609-aqv1y329
> >> Progress:
> >> Execution failed:
> >> java.lang.NumberFormatException: For input string: ""
> >>
> >>
> >>> Wenjun,
> >>>
> >>> The first two we need are psim.loops.swift and RaptorLoops.swift,
> >>> and their corresponding run scripts.
> >>>
> >>> We run them from the corresponding .sh scripts in scripts/run
> >>>
> >>> I'll get back to you on this tonight with more details...after I
> >>> look for my 3rd script, which is RaptorLoops with an additional
> >>> pre-process step that takes a raw fasta file as input. I may need to
> >>> check that in from my workspace.
> >>>
> >>> - Mike
> >>>
> >>>
> >>> ----- "wenjun wu"<wwjag at mcs.anl.gov> wrote:
> >>>
> >>>
> >>>
> >>>> Hi Mike:
> >>>> I installed the latest version of protlib from SVN. I'd like to
> >>>> clarify which swift scripts are needed in the portal.
> >>>>
> >>>> These are the swift scripts in the latest protlib2:
> >>>>
> >>>> -rw-r--r-- 1 wwj ci-users 737 Apr 22 11:48 SwiftLib.swift
> >>>> -rw-r--r-- 1 wwj ci-users 3237 Apr 22 11:48 psim.itfixex2.swift
> >>>> -rw-r--r-- 1 wwj ci-users 2127 Apr 22 11:48 psim.itfixex1.swift
> >>>> -rwxr-xr-x 1 wwj ci-users 509 Apr 22 11:48 psim.basicex1.swift
> >>>> -rw-r--r-- 1 wwj ci-users 2616 Apr 22 11:48 BoostThreader.swift
> >>>> -rw-r--r-- 1 wwj ci-users 1477 Apr 22 11:48 LoopLib.swift
> >>>> -rw-r--r-- 1 wwj ci-users 1193 Apr 22 11:48 BoostThreaderLib.swift
> >>>> -rw-r--r-- 1 wwj ci-users 8869 Apr 22 11:48 oops.swift
> >>>> -rw-r--r-- 1 wwj ci-users 1525 Apr 22 11:48 psim.sweepex1.swift
> >>>> -rwxr-xr-x 1 wwj ci-users 2188 Apr 22 11:48 psim.swift
> >>>> -rw-r--r-- 1 wwj ci-users 2933 Apr 22 11:48 psim.loops.swift
> >>>> -rw-r--r-- 1 wwj ci-users 6820 Apr 22 11:48 RaptorLoops.hanging.swift
> >>>> -rw-r--r-- 1 wwj ci-users 2943 Apr 22 11:48 RaptorLoops.swift
> >>>>
> >>>> I guess the right swift scripts should be: psim.loops,
> >>>> BoostThreader and RaptorLoop.
> >>>> I need to create packages for both Raptor-BoostThreader and
> >>>> RaptorLoop by grouping swift scripts and mapper scripts.
> >>>>
> >>>>
> >>>> Wenjun
> >>>>
> >>>>
> >>>>> DataPort 2010.0421
> >>>>>
> >>>>> Coaster proxy issue: can Mihael automate this?
> >>>>>
> >>>>> Coaster proxy issue - use long proxy for now.
> >>>>>
> >>>>> Swift run status reporter?
> >>>>>
> >>>>> Adding new scripts and forms
> >>>>> - how to shape the args? Like the email form?
> >>>>>
> >>>>> Need automation just for caps requests, then manual for Aashish
> >>>>> tests, then portal for Carl, Tobin et al
> >>>>>
> >>>>> Email notification
> >>>>>
> >>>>> Control over which swift the portal is running
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>
> >
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory
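
The NumberFormatException in the earliest quoted run (For input string: "") is what Java raises when an empty string is parsed as a number, and the "Running RaptorLoops with settings:" line of that run shows several settings with no value. Which of those settings the Swift script actually parses as a number is not stated in the thread, so the following is only a minimal Python sketch for spotting the empty ones before launching a run. The function name empty_settings is hypothetical, and the settings line is reassembled from the wrapped log text.

```python
def empty_settings(settings_line):
    """Return the names of key=value settings whose value is empty.

    Hypothetical helper: it only inspects the echoed settings line and
    knows nothing about how run.raptorloops.sh uses each setting.
    """
    # Keep only the text after the "settings:" label when it is present.
    _, _, tail = settings_line.partition("settings:")
    body = tail if tail else settings_line
    empty = []
    for token in body.split():
        key, sep, value = token.partition("=")
        if sep and not value:
            empty.append(key)
    return empty


# Settings line reassembled from the quoted run output.
line = ("Running RaptorLoops with settings: target=T1af7 seqFile= "
        "prepTar=T1af7.prep.tar.gz templatesPerJob=800 templateList= "
        "nModels= nSim=4 execsite=localhost maxSlots=16 resume= rlog=")

for name in empty_settings(line):
    print("empty setting:", name)
```

Any setting reported here that is later read as an integer would be a candidate for the failure; the thread itself does not confirm which one it was.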