[Swift-user] Exception in getFile
Mihael Hategan
hategan at mcs.anl.gov
Mon Aug 20 14:03:09 CDT 2007
Empty local files may be created even if the remote files don't exist,
so don't take them as a sign that the application has run.
In the meantime, I'll try to convince it not to create empty local
files when they don't exist remotely.
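
Until that changes, one way to tell real outputs from empty placeholders is to check file sizes on the client. This is only a minimal Python sketch. It assumes the outputs sit in the directory where swift was run and match *Results.Rdata. The function name find_empty_outputs is made up for illustration and is not part of Swift.

```python
from pathlib import Path


def find_empty_outputs(directory=".", pattern="*Results.Rdata"):
    """Return the sorted names of files matching `pattern` that are zero bytes long."""
    return sorted(
        p.name
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Zero-byte files here most likely mean the stage-out failed,
    # not that the application produced empty results.
    for name in find_empty_outputs():
        print("empty local file:", name)
```

A non-empty file is still no proof that the job succeeded, so the listing on the remote shared directory remains the thing to check.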
Mihael
On Mon, 2007-08-20 at 13:43 -0500, Jing Tie wrote:
> I think these files came from the job, because I deleted all the
> *Results.Rdata files before submitting the job and found these empty
> files after the execution.
>
> output of the process of execution:
> RunID: 3szhlhvg4seu0
> cwtsmall started
> Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status
> to Active
> Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status
> to Completed
> Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> to Submitted
> Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> to Active
> Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> to Completed
> ...
> Task(type=2, identity=urn:0-0-0-1-0-1-0-1-1187633646453) setting
> status to Completed
> Staged in scripts/runWaveletsAvg.R to
> sid-wf1-3szhlhvg4seu0/shared/scripts/ on MIT_CMS
> Running job cwtsmall-gt3062gi cwtsmall with arguments
> [scripts/runWaveletsAvg.R, 101, FB] in
> sid-wf1-3szhlhvg4seu0/cwtsmall-gt3062gi on MIT_CMS
> Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> to Submitted
> Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> to Active
> Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> to Completed
> Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status
> to Active
> Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status
> to Completed
> Completed job cwtsmall-gt3062gi cwtsmall with arguments
> [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> Staging out
> sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting
> status to Active
> Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting
> status to Completed
> ......
> Task(type=2, identity=urn:0-0-0-1-0-1-0-23-1187633646557) setting
> status to Active
> Task(type=2, identity=urn:0-0-0-1-0-1-0-22-1187633646554) setting
> status to Failed Exception in getFile
> Task(type=2, identity=urn:0-0-0-1-0-1-0-2-1187633646560) setting
> status to Submitted
> ......
>
> Thanks,
> Jing
>
> On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> But those are not from the same job.
>
> On Mon, 2007-08-20 at 12:28 -0500, Jing Tie wrote:
> > Yes. I saw 28 output files on the Swift client,
> > 101-FBchannel1_cwt-avgResults.Rdata to
> > 101-FBchannel28_cwt-avgResults.Rdata, but all the files were empty.
> >
> > Jing
> >
> >
> > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > On Mon, 2007-08-20 at 12:21 -0500, Jing Tie wrote:
> > > Yes. There is no *avgResults.Rdata under the shared directory,
> > > only the input file, scripts, wrapper.sh and seq.sh.
> >
> > Did the job actually run?
> >
> > >
> > > Jing
> > >
> > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > Not much we can do if the filesystem is broken.
> > > Did you check to confirm that the file is not there?
> > >
> > > Mihael
> > >
> > > On Mon, 2007-08-20 at 12:07 -0500, Jing Tie wrote:
> > > > Hi,
> > > >
> > > > Here is another problem. It seems like something is wrong with the
> > > > GFS system.
> > > >
> > > > site: MIT_CMS
> > > > gatekeeper: ce01.cmsaf.mit.edu
> > > > app_dir: /osg/app
> > > > data_dir: /osg/data
> > > > condor_dir: /usr/local/condor/bin
> > > > R_dir: /osg/app/R-2.5.1/bin/R
> > > >
> > > > output:
> > > > Application exception: Exception in getFile
> > > > task:transfer @ vdl-int.k, line: 235
> > > > vdl:dostageout @ vdl-int.k, line: 378
> > > > vdl:execute2 @ execute-default.k, line: 22
> > > > vdl:execute @ sid-wf1.kml, line: 20
> > > > wavelettransf @ sid-wf1.kml, line: 362
> > > > batchtrials @ sid-wf1.kml, line: 402
> > > > vdl:mains @ sid-wf1.kml, line: 399
> > > > Caused by: org.globus.cog.abstraction.impl.file.FileResourceException:
> > > > Exception in getFile
> > > > Caused by: org.globus.ftp.exception.ServerException: Server refused
> > > > performing the request. Custom message: (error code 1) cwtsmall
> > > > failed
> > > > Provenance graph saved in sid-wf1-7thy5mbfh09e1.dot
> > > > The following errors have occurred:
> > > > 1. Application "cwtsmall" failed (Exception in getFile
> > > > Caused by:
> > > > Server refused performing the request. Custom message: (error code 1)
> > > > [Nested exception message: Nested exception is
> > > > org.globus.ftp.exception.UnexpectedReplyCodeException:
> > > > Custom message: Unexpected reply:
> > > > 500-Command failed. : globus_gridftp_server_file.c:globus_l_gfs_file_send:2190:
> > > > 500-globus_l_gfs_file_open failed.
> > > > 500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694:
> > > > 500-globus_xio_register_open failed.
> > > > 500-globus_xio_file_driver.c:globus_l_xio_file_open:438:
> > > > 500-Unable to open file /osgfs/data/sid-wf1-7thy5mbfh09e1/shared//101-FBchannel16_cwt-avgResults.Rdata
> > > > 500-globus_xio_file_driver.c:globus_l_xio_file_open:381:
> > > > 500-System error in open: No such file or directory
> > > > 500-globus_xio: A system call failed: No such file or directory
> > > > 500 End.])
> > > > Arguments: "scripts/runWaveletsAvg.R, 101, FB"
> > > > Host: UCSDT2
> > > > Directory: sid-wf1-7thy5mbfh09e1/cwtsmall-mb3l3rfi
> > > > STDERR:
> > > > STDOUT:
> > > > Errors detected. Cleanup not done.
> > > > Execution completed with errors
> > > > sys:throw @ vdl.k, line: 140
> > > > vdl:mains @ sid-wf1.kml, line: 399
> > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:413)
> > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:417)
> > > > at org.globus.cog.karajan.workflow.nodes.GenerateErrorNode.post(GenerateErrorNode.java:28)
> > > > at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.childCompleted
> > > > at org.globus.cog.karajan.workflow.nodes.Sequential.notificationEvent(Sequential.java:33)
> > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:334)
> > > > at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> > > > at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.fireNotificationEvent(FlowNode.java:172)
> > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:298)
> > > > at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.executeChildren(AbstractFunction.java:37)
> > > > at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63)
> > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:239)
> > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:280)
> > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.controlEvent(FlowNode.java:392)
> > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:331)
> > > > at org.globus.cog.karajan.workflow.FlowElementWrapper.event(FlowElementWrapper.java:227)
> > > > at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> > > > at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> > > > at org.globus.cog.karajan.workflow.events.EventWorker.run(EventWorker.java:69)
> > > >
> > > > Many thanks,
> > > > Jing
> > > >
> > > > _______________________________________________
> > > > Swift-user mailing list
> > > > Swift-user at ci.uchicago.edu
> > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> > >
> > >
> >
> >
>
>