[Swift-user] Kickstart executable not found
Jing Tie
tiejing at gmail.com
Mon Aug 20 11:13:03 CDT 2007
Right, the problem is with Condor. After replacing jobmanager-condor
with jobmanager, the job finished successfully.
Thanks,
Jing
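
For readers who hit the same symptom, here is a minimal sketch of how
unquoted arguments can shift position when a command line is rebuilt and
then re-parsed. It is only an illustration in Python, not the
jobmanager-condor code, and the argument list (wrapper, --name, and
"out file.Rdata") is hypothetical.

```python
import shlex


def naive_command_line(args):
    """Join arguments with spaces and no quoting (the failure mode)."""
    return " ".join(args)


def quoted_command_line(args):
    """Join arguments with shell-style quoting (requires Python 3.8+)."""
    return shlex.join(args)


def reparse(command_line):
    """Split a command line back into arguments, as a receiving shell would."""
    return shlex.split(command_line)


# Hypothetical argument list: one empty argument and one containing a space.
ARGS = ["wrapper", "--name", "", "wavelet.sh", "out file.Rdata"]

if __name__ == "__main__":
    # Unquoted: the empty argument vanishes and the last argument splits in two.
    print(reparse(naive_command_line(ARGS)))
    # Quoted: the original argument list survives the round trip.
    print(reparse(quoted_command_line(ARGS)))
```

Without quoting, the empty argument disappears and the argument containing
a space splits in two, so later arguments land in different slots than
intended. A wrapper that expects the executable at a fixed position could
then pick up some other argument in its place. That would be consistent
with the error in this thread, though the thread does not confirm the
exact mechanism.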
On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> Right. The condor job manager has a bug. It does not properly quote
> arguments. So you'll see strange things like this if you use it.
>
> Mihael
>
> On Mon, 2007-08-20 at 00:43 -0500, Jing Tie wrote:
> > Sure.
> >
> > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > It puzzles me. Can you attach that file?
> > >
> > > On Sun, 2007-08-19 at 21:37 -0500, Jing Tie wrote:
> > > > in $SWIFT_HOME/etc/swift.properties
> > > >
> > > >
> > > > Jing
> > > >
> > > > On 8/19/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > > On Sat, 2007-08-18 at 18:24 -0500, Jing Tie wrote:
> > > > > > Hi,
> > > > > >
> > > > > > > I am working on the SID application now. Job cwtsmall runs the
> > > > > > > script wavelet.sh on the AGLT2 site. In wavelet.sh, R runs
> > > > > > > runWaveletsAvg.R on the input data 101_FB-epochs.Rdata and should
> > > > > > > produce 28 output files, 101-FBchannel1_cwt-avgResults.Rdata
> > > > > > > through 101-FBchannel28_cwt-avgResults.Rdata.
> > > > > >
> > > > > > > But when I ran the Swift client with kickstart.enabled = false,
> > > > >
> > > > > Where did you set this?
> > > > >
> > > > > Mihael
> > > > >
> > > > > > > it failed with exit code 1024, and stderr.txt said: Kickstart
> > > > > > > executable (101-FBchannel18_cwt-avgResults.Rdata) not found. Details
> > > > > > > below:
> > > > > >
> > > > > > site: AGLT2
> > > > > > gatekeeper: gate01.aglt2.org
> > > > > > app_dir: /atlas/data08/OSG/APP/SIDGrid
> > > > > > data_dir: /atlas/data08/OSG/DATA
> > > > > > condor_dir: /opt/condor/bin
> > > > > > R_dir: /atlas/data08/OSG/APP/R-2.5.1/bin/R
> > > > > >
> > > > > > output:
> > > > > > Application exception: Job cwtsmall failed with an exit code of 1024
> > > > > > sys:throw @ vdl-int.k, line: 109
> > > > > > vdl:checkexitcode @ vdl-int.k, line: 370
> > > > > > vdl:execute2 @ execute-default.k, line: 22
> > > > > > vdl:execute @ sid-wf1.kml, line: 20
> > > > > > wavelettransf @ sid-wf1.kml, line: 362
> > > > > > batchtrials @ sid-wf1.kml, line: 402
> > > > > > vdl:mains @ sid-wf1.kml, line: 399
> > > > > > cwtsmall failed
> > > > > > Provenance graph saved in sid-wf1-8cnxmo0qetg10.dot
> > > > > > The following errors have occurred:
> > > > > > 1. Application "cwtsmall" failed (Job cwtsmall failed with an exit code of 1024)
> > > > > > Arguments: "scripts/runWaveletsAvg.R, 101, FB"
> > > > > > Host: NWICG_NotreDame
> > > > > > Directory: sid-wf1-8cnxmo0qetg10/cwtsmall-zeb72rfi
> > > > > > > STDERR: Kickstart executable (101-FBchannel18_cwt-avgResults.Rdata) not found
> > > > > > STDOUT:
> > > > > > Errors detected. Cleanup not done.
> > > > > > Execution completed with errors
> > > > > > sys:throw @ vdl.k, line: 140
> > > > > > vdl:mains @ sid-wf1.kml, line: 399
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:413)
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:417)
> > > > > > at org.globus.cog.karajan.workflow.nodes.GenerateErrorNode.post(GenerateErrorNode.java:28)
> > > > > > at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.childCompleted
> > > > > > at org.globus.cog.karajan.workflow.nodes.Sequential.notificationEvent(Sequential.java:33)
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:334)
> > > > > > at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> > > > > > at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.fireNotificationEvent(FlowNode.java:172)
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:298)
> > > > > > at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.executeChildren(AbstractFunction.java:37)
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63)
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:239)
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:280)
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.controlEvent(FlowNode.java:392)
> > > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:331)
> > > > > > at org.globus.cog.karajan.workflow.FlowElementWrapper.event(FlowElementWrapper.java:227)
> > > > > > at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> > > > > > at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> > > > > > at org.globus.cog.karajan.workflow.events.EventWorker.run(EventWorker.java:69)
> > > > > >
> > > > > > > I found that about 8 OSG sites have this problem.
> > > > > >
> > > > > > Many thanks,
> > > > > > Jing