[Swift-user] Exception in getFile
Jing Tie
tiejing at gmail.com
Mon Aug 20 14:36:03 CDT 2007
I see. Could this output be taken as a sign that the application actually ran?
Completed job cwtsmall-gt3062gi cwtsmall with arguments [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
Staging out sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
Thanks,
Jing
On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>
> Local empty files may be created even if the remote files don't exist.
> So don't take that as a sign that the application has run.
>
> In the meantime I'll try to convince it not to create empty local
> files when they don't exist remotely.
>
> Mihael
>
> On Mon, 2007-08-20 at 13:43 -0500, Jing Tie wrote:
> > I think these files were from the job, because I deleted all the
> > *Results.Rdata files before submitting the job and found these empty
> > files after the execution.
> >
> > output of the process of execution:
> > RunID: 3szhlhvg4seu0
> > cwtsmall started
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status
> > to Active
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status
> > to Completed
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> > to Submitted
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> > to Active
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> > to Completed
> > ...
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-1-1187633646453) setting
> > status to Completed
> > Staged in scripts/runWaveletsAvg.R to
> > sid-wf1-3szhlhvg4seu0/shared/scripts/ on MIT_CMS
> > Running job cwtsmall-gt3062gi cwtsmall with arguments
> > [scripts/runWaveletsAvg.R, 101, FB] in
> > sid-wf1-3szhlhvg4seu0/cwtsmall-gt3062gi on MIT_CMS
> > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> > to Submitted
> > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> > to Active
> > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> > to Completed
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status
> > to Active
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status
> > to Completed
> > Completed job cwtsmall-gt3062gi cwtsmall with arguments
> > [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> > Staging out
> > sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> > 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting
> > status to Active
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting
> > status to Completed
> > ......
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-23-1187633646557) setting
> > status to Active
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-22-1187633646554) setting
> > status to Failed Exception in getFile
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-2-1187633646560) setting
> > status to Submitted
> > ......
> >
> > Thanks,
> > Jing
> >
> > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > But those are not from the same job.
> >
> > On Mon, 2007-08-20 at 12:28 -0500, Jing Tie wrote:
> > > Yes. I saw 28 output files on the Swift client,
> > > 101-FBchannel1_cwt-avgResults.Rdata through
> > > 101-FBchannel28_cwt-avgResults.Rdata, but all of them were empty.
> > >
> > > Jing
> > >
> > >
> > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > On Mon, 2007-08-20 at 12:21 -0500, Jing Tie wrote:
> > > > Yes. There is no *avgResults.Rdata under the shared directory,
> > > > only the input file, scripts, wrapper.sh and seq.sh.
> > >
> > > Did the job actually run?
> > >
> > > >
> > > > Jing
> > > >
> > > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > Not much we can do if the filesystem is broken.
> > > > Did you check to confirm that the file is not there?
> > > >
> > > > Mihael
> > > >
> > > > On Mon, 2007-08-20 at 12:07 -0500, Jing Tie wrote:
> > > > > Hi,
> > > > >
> > > > > Here is another problem. It seems like something is wrong with
> > > > > the GFS system.
> > > > >
> > > > > site: MIT_CMS
> > > > > gatekeeper: ce01.cmsaf.mit.edu
> > > > > app_dir: /osg/app
> > > > > data_dir: /osg/data
> > > > > condor_dir: /usr/local/condor/bin
> > > > > R_dir: /osg/app/R-2.5.1/bin/R
> > > > >
> > > > > output:
> > > > > Application exception: Exception in getFile
> > > > > task:transfer @ vdl-int.k, line: 235
> > > > > vdl:dostageout @ vdl-int.k, line: 378
> > > > > vdl:execute2 @ execute-default.k, line: 22
> > > > > vdl:execute @ sid-wf1.kml, line: 20
> > > > > wavelettransf @ sid-wf1.kml, line: 362
> > > > > batchtrials @ sid-wf1.kml, line: 402
> > > > > vdl:mains @ sid-wf1.kml, line: 399
> > > > > Caused by: org.globus.cog.abstraction.impl.file.FileResourceException:
> > > > > Exception in getFile
> > > > > Caused by: org.globus.ftp.exception.ServerException: Server refused
> > > > > performing the request. Custom message: (error code 1) cwtsmall
> > > > > failed
> > > > > Provenance graph saved in sid-wf1-7thy5mbfh09e1.dot
> > > > > The following errors have occurred:
> > > > > 1. Application "cwtsmall" failed (Exception in getFile
> > > > > Caused by:
> > > > > Server refused performing the request. Custom message: (error code 1)
> > > > > [Nested exception message: Nested exception is
> > > > > org.globus.ftp.exception.UnexpectedReplyCodeException:
> > > > > Custom message: Unexpected reply:
> > > > > 500-Command failed. : globus_gridftp_server_file.c:globus_l_gfs_file_send:2190:
> > > > > 500-globus_l_gfs_file_open failed.
> > > > > 500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694:
> > > > > 500-globus_xio_register_open failed.
> > > > > 500-globus_xio_file_driver.c:globus_l_xio_file_open:438:
> > > > > 500-Unable to open file /osgfs/data/sid-wf1-7thy5mbfh09e1/shared//101-FBchannel16_cwt-avgResults.Rdata
> > > > > 500-globus_xio_file_driver.c:globus_l_xio_file_open:381:
> > > > > 500-System error in open: No such file or directory
> > > > > 500-globus_xio: A system call failed: No such file or directory
> > > > > 500 End.])
> > > > > Arguments: "scripts/runWaveletsAvg.R, 101, FB"
> > > > > Host: UCSDT2
> > > > > Directory: sid-wf1-7thy5mbfh09e1/cwtsmall-mb3l3rfi
> > > > > STDERR:
> > > > > STDOUT:
> > > > > Errors detected. Cleanup not done.
> > > > > Execution completed with errors
> > > > > sys:throw @ vdl.k, line: 140
> > > > > vdl:mains @ sid-wf1.kml, line: 399
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:413)
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:417)
> > > > > at org.globus.cog.karajan.workflow.nodes.GenerateErrorNode.post(GenerateErrorNode.java:28)
> > > > > at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.childCompleted
> > > > > at org.globus.cog.karajan.workflow.nodes.Sequential.notificationEvent(Sequential.java:33)
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:334)
> > > > > at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> > > > > at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.fireNotificationEvent(FlowNode.java:172)
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:298)
> > > > > at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.executeChildren(AbstractFunction.java:37)
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63)
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:239)
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:280)
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.controlEvent(FlowNode.java:392)
> > > > > at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:331)
> > > > > at org.globus.cog.karajan.workflow.FlowElementWrapper.event(FlowElementWrapper.java:227)
> > > > > at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> > > > > at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> > > > > at org.globus.cog.karajan.workflow.events.EventWorker.run(EventWorker.java:69)
> > > > >
> > > > > Many thanks,
> > > > > Jing
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>
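
Following up on the point made in the thread: since empty local files can be created even when the remote file does not exist, the mere presence of a local output file says little about whether the application ran. A minimal sketch in Python (not part of Swift) that sorts the expected outputs into present, empty and missing might look like the following. The function names classify_outputs and expected_outputs are hypothetical, and the file-name pattern and the channel range 1 to 28 are assumptions taken from the names quoted in the thread.

```python
from pathlib import Path


def expected_outputs(prefix="101-FB", channels=range(1, 29)):
    """Build the expected output names.

    Assumption: names follow the pattern quoted in the thread,
    e.g. 101-FBchannel15_cwt-avgResults.Rdata, for channels 1..28.
    """
    return [f"{prefix}channel{n}_cwt-avgResults.Rdata" for n in channels]


def classify_outputs(directory, expected_names):
    """Sort expected output files into 'ok', 'empty' and 'missing'."""
    report = {"ok": [], "empty": [], "missing": []}
    for name in expected_names:
        path = Path(directory) / name
        if not path.is_file():
            report["missing"].append(name)
        elif path.stat().st_size == 0:
            # A zero-byte file may have been created locally even though
            # the remote file never existed, so treat it as suspect.
            report["empty"].append(name)
        else:
            report["ok"].append(name)
    return report


if __name__ == "__main__":
    # Check the current directory, where the staged-out files would land.
    result = classify_outputs(".", expected_outputs())
    for status, names in result.items():
        print(f"{status}: {len(names)}")
```

Any name reported under "empty" or "missing" would suggest that the stage-out did not deliver real data, regardless of what the console log printed.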