[Swift-user] Kickstart executable not found

Jing Tie tiejing at gmail.com
Mon Aug 20 12:22:03 CDT 2007


Hi,

There is one site running the application successfully with
jobmanager-condor:

site: GLOW
gatekeeper: cmsgrid01.hep.wisc.edu
app_dir: /afs/hep.wisc.edu/osg/app
data_dir: /afs/hep.wisc.edu/osg/data
condor_dir: /condor/bin
R_dir: /afs/hep.wisc.edu/osg/app/R-2.5.1/bin/R

Maybe it has some special configuration or arguments.

Jing

On 8/20/07, Jing Tie <tiejing at gmail.com> wrote:
>
> Right, it's a Condor problem. After replacing jobmanager-condor
> with jobmanager, the job finished successfully.
>
> Thanks,
> Jing
>
> On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > Right. The condor job manager has a bug. It does not properly quote
> > arguments. So you'll see strange things like this if you use it.
> >
> > Mihael
> >
> > On Mon, 2007-08-20 at 00:43 -0500, Jing Tie wrote:
> > > Sure.
> > >
> > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > It puzzles me. Can you attach that file?
> > > >
> > > > On Sun, 2007-08-19 at 21:37 -0500, Jing Tie wrote:
> > > > > in $SWIFT_HOME/etc/swift.properties
> > > > >
> > > > >
> > > > > Jing
> > > > >
> > > > > On 8/19/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > > > On Sat, 2007-08-18 at 18:24 -0500, Jing Tie wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > > I am working on the SID application now. Job cwtsmall runs the
> > > > > > > > script wavelet.sh on the AGLT2 site. In wavelet.sh, R runs
> > > > > > > > runWaveletsAvg.R on the input data 101_FB-epochs.Rdata and
> > > > > > > > should output 28 files, 101-FBchannel1_cwt-avgResults.Rdata
> > > > > > > > through 101-FBchannel28_cwt-avgResults.Rdata.
> > > > > > >
> > > > > > > > But when I ran the Swift client with kickstart.enabled = false,
> > > > > >
> > > > > > Where did you set this?
> > > > > >
> > > > > > Mihael
> > > > > >
> > > > > > > >  it failed
> > > > > > > > with exit code 1024, and stderr.txt said: Kickstart
> > > > > > > > executable (101-FBchannel18_cwt-avgResults.Rdata) not found.
> > > > > > > > Details below:
> > > > > > >
> > > > > > > site: AGLT2
> > > > > > > gatekeeper: gate01.aglt2.org
> > > > > > > app_dir: /atlas/data08/OSG/APP/SIDGrid
> > > > > > > data_dir: /atlas/data08/OSG/DATA
> > > > > > > condor_dir: /opt/condor/bin
> > > > > > > R_dir: /atlas/data08/OSG/APP/R-2.5.1/bin/R
> > > > > > >
> > > > > > > output:
> > > > > > > > Application exception: Job cwtsmall failed with an exit code of 1024
> > > > > > > >         sys:throw @ vdl-int.k, line: 109
> > > > > > > >         vdl:checkexitcode @ vdl-int.k, line: 370
> > > > > > > >         vdl:execute2 @ execute-default.k, line: 22
> > > > > > >         vdl:execute @ sid-wf1.kml, line: 20
> > > > > > >         wavelettransf @ sid-wf1.kml, line: 362
> > > > > > >         batchtrials @ sid-wf1.kml, line: 402
> > > > > > >         vdl:mains @ sid-wf1.kml, line: 399
> > > > > > > cwtsmall failed
> > > > > > > Provenance graph saved in sid-wf1-8cnxmo0qetg10.dot
> > > > > > > The following errors have occurred:
> > > > > > > > 1. Application "cwtsmall" failed (Job cwtsmall failed with an exit code of 1024)
> > > > > > > >         Arguments: "scripts/runWaveletsAvg.R, 101, FB"
> > > > > > > >         Host: NWICG_NotreDame
> > > > > > > >         Directory: sid-wf1-8cnxmo0qetg10/cwtsmall-zeb72rfi
> > > > > > > >         STDERR: Kickstart executable (101-FBchannel18_cwt-avgResults.Rdata) not found
> > > > > > >         STDOUT:
> > > > > > > Errors detected. Cleanup not done.
> > > > > > > Execution completed with errors
> > > > > > >         sys:throw @ vdl.k, line: 140
> > > > > > >         vdl:mains @ sid-wf1.kml, line: 399
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:413)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:417)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.GenerateErrorNode.post(GenerateErrorNode.java:28)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.childCompleted
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.Sequential.notificationEvent(Sequential.java:33)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:334)
> > > > > > > >         at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> > > > > > > >         at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowNode.fireNotificationEvent(FlowNode.java:172)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:298)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.executeChildren(AbstractFunction.java:37)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:239)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:280)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowNode.controlEvent(FlowNode.java:392)
> > > > > > > >         at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:331)
> > > > > > > >         at org.globus.cog.karajan.workflow.FlowElementWrapper.event(FlowElementWrapper.java:227)
> > > > > > > >         at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> > > > > > > >         at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> > > > > > > >         at org.globus.cog.karajan.workflow.events.EventWorker.run(EventWorker.java:69)
> > > > > > >
> > > > > > > > I found that about 8 OSG sites have this problem.
> > > > > > >
> > > > > > > Many thanks,
> > > > > > > Jing
> > > > > > > _______________________________________________
> > > > > > > Swift-user mailing list
> > > > > > > Swift-user at ci.uchicago.edu
> > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> >
> >
>
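
Mihael's diagnosis in the thread (the Condor job manager does not properly quote arguments) describes a general failure mode that a small sketch can illustrate. The code below is not the jobmanager-condor code, and it does not reproduce the exact command line Swift generates. The script name run.sh, the flag -i, and the file names are hypothetical, chosen only to show how an argument list flattened into one unquoted string gets re-split into different tokens, so a later layer may take the wrong token as a file or executable name.

```python
import shlex

def build_command_unquoted(executable, args):
    """Join arguments with plain spaces and no quoting (the buggy pattern)."""
    return " ".join([executable] + args)

def build_command_quoted(executable, args):
    """Quote every token so it survives re-parsing by a shell-like layer."""
    return " ".join(shlex.quote(token) for token in [executable] + args)

# Hypothetical argument list; the second argument contains a space.
args = ["-i", "input file.dat", "output.dat"]

unquoted = build_command_unquoted("run.sh", args)
quoted = build_command_quoted("run.sh", args)

# Re-parse both strings the way a shell-like layer would.
reparsed_unquoted = shlex.split(unquoted)
reparsed_quoted = shlex.split(quoted)

print(reparsed_unquoted)  # the space-containing argument is split in two
print(reparsed_quoted)    # the original argument list is recovered
```

With quoting, the receiving side recovers the original argument list. Without it, the argument positions shift, which is consistent with the kind of "strange things" mentioned in the thread, such as an output file name being reported as a missing executable.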