[Swift-user] Exception in getFile

Jing Tie tiejing at gmail.com
Mon Aug 20 14:36:03 CDT 2007


I see. Could this output be taken as a sign that the job actually ran?

Completed job cwtsmall-gt3062gi cwtsmall with arguments
[scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
Staging out
sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
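
Since empty local files can appear even when nothing exists remotely, one way to sanity-check the staged-out results is to look for zero-byte outputs on the client. The following is only a minimal sketch and not part of Swift: the helper name `find_empty_outputs` and the glob pattern `*_cwt-avgResults.Rdata` are assumptions based on the file names in this thread.

```python
from pathlib import Path


def find_empty_outputs(directory, pattern="*_cwt-avgResults.Rdata"):
    """Return the sorted names of files matching `pattern` that are zero bytes.

    Both the function name and the default pattern are hypothetical; adjust
    the pattern to whatever output names the workflow actually produces.
    """
    return sorted(
        p.name
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Check the current working directory, where the outputs were staged.
    for name in find_empty_outputs("."):
        print("empty output:", name)
```

A non-empty file does not prove the application ran correctly either, but a zero-byte one is a strong hint that the remote file was missing at stage-out time.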

Thanks,
Jing

On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>
> Local empty files may be created even if the remote files don't exist.
> So don't take that as a sign that the application has run.
>
> In the meantime I'll try to convince it not to create empty local
> files when they don't exist remotely.
>
> Mihael
>
> On Mon, 2007-08-20 at 13:43 -0500, Jing Tie wrote:
> > I think these files were from the job, because I deleted all the
> > *Results.Rdata files before submitting the job and found these empty
> > files after the execution.
> >
> > Output from the run:
> > RunID: 3szhlhvg4seu0
> > cwtsmall started
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status
> > to Active
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status
> > to Completed
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> > to Submitted
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> > to Active
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> > to Completed
> > ...
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-1-1187633646453) setting
> > status to Completed
> > Staged in scripts/runWaveletsAvg.R to
> > sid-wf1-3szhlhvg4seu0/shared/scripts/ on MIT_CMS
> > Running job cwtsmall-gt3062gi cwtsmall with arguments
> > [scripts/runWaveletsAvg.R, 101, FB] in
> > sid-wf1-3szhlhvg4seu0/cwtsmall-gt3062gi on MIT_CMS
> > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> > to Submitted
> > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> > to Active
> > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> > to Completed
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status
> > to Active
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status
> > to Completed
> > Completed job cwtsmall-gt3062gi cwtsmall with arguments
> > [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> > Staging out
> > sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> > 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting
> > status to Active
> > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting
> > status to Completed
> > ......
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-23-1187633646557) setting
> > status to Active
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-22-1187633646554) setting
> > status to Failed Exception in getFile
> > Task(type=2, identity=urn:0-0-0-1-0-1-0-2-1187633646560) setting
> > status to Submitted
> > ......
> >
> > Thanks,
> > Jing
> >
> > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >         But those are not from the same job.
> >
> >         On Mon, 2007-08-20 at 12:28 -0500, Jing Tie wrote:
> >         > Yes. I saw 28 output files on the swift client,
> >         > 101-FBchannel1_cwt-avgResults.Rdata to
> >         > 101-FBchannel28_cwt-avgResults.Rdata, but all the files were empty.
> >         >
> >         > Jing
> >         >
> >         >
> >         > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >         >         On Mon, 2007-08-20 at 12:21 -0500, Jing Tie wrote:
> >         >         > Yes. There is no *avgResults.Rdata under the shared directory,
> >         >         > only the input file, scripts, wrapper.sh and seq.sh.
> >         >
> >         >         Did the job actually run?
> >         >
> >         >         >
> >         >         > Jing
> >         >         >
> >         >         > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >         >         >         Not much we can do if the filesystem is broken.
> >         >         >         Did you check to confirm that the file is not there?
> >         >         >
> >         >         >         Mihael
> >         >         >
> >         >         >         On Mon, 2007-08-20 at 12:07 -0500, Jing Tie wrote:
> >         >         >         > Hi,
> >         >         >         >
> >         >         >         > Here is another problem. It seems like something is wrong
> >         >         >         > with the GFS system.
> >         >         >         >
> >         >         >         > site: MIT_CMS
> >         >         >         > gatekeeper: ce01.cmsaf.mit.edu
> >         >         >         > app_dir: /osg/app
> >         >         >         > data_dir: /osg/data
> >         >         >         > condor_dir: /usr/local/condor/bin
> >         >         >         > R_dir: /osg/app/R-2.5.1/bin/R
> >         >         >         >
> >         >         >         > output:
> >         >         >         > Application exception: Exception in getFile
> >         >         >         >         task:transfer @ vdl-int.k, line: 235
> >         >         >         >         vdl:dostageout @ vdl-int.k, line: 378
> >         >         >         >         vdl:execute2 @ execute-default.k, line: 22
> >         >         >         >         vdl:execute @ sid-wf1.kml, line: 20
> >         >         >         >         wavelettransf @ sid-wf1.kml, line: 362
> >         >         >         >         batchtrials @ sid-wf1.kml, line: 402
> >         >         >         >         vdl:mains @ sid-wf1.kml, line: 399
> >         >         >         > Caused by: org.globus.cog.abstraction.impl.file.FileResourceException: Exception in getFile
> >         >         >         > Caused by: org.globus.ftp.exception.ServerException: Server refused performing the request. Custom message:  (error code 1)
> >         >         >         > cwtsmall failed
> >         >         >         > Provenance graph saved in sid-wf1-7thy5mbfh09e1.dot
> >         >         >         > The following errors have occurred:
> >         >         >         > 1. Application "cwtsmall" failed (Exception in getFile
> >         >         >         > Caused by:
> >         >         >         > Server refused performing the request. Custom message:  (error code 1)
> >         >         >         > [Nested exception message:  Nested exception is org.globus.ftp.exception.UnexpectedReplyCodeException: Custom message: Unexpected reply:
> >         >         >         > 500-Command failed. : globus_gridftp_server_file.c:globus_l_gfs_file_send:2190:
> >         >         >         > 500-globus_l_gfs_file_open failed.
> >         >         >         > 500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694:
> >         >         >         > 500-globus_xio_register_open failed.
> >         >         >         > 500-globus_xio_file_driver.c:globus_l_xio_file_open:438:
> >         >         >         > 500-Unable to open file /osgfs/data/sid-wf1-7thy5mbfh09e1/shared//101-FBchannel16_cwt-avgResults.Rdata
> >         >         >         > 500-globus_xio_file_driver.c:globus_l_xio_file_open:381:
> >         >         >         > 500-System error in open: No such file or directory
> >         >         >         > 500-globus_xio: A system call failed: No such file or directory
> >         >         >         > 500 End.])
> >         >         >         >         Arguments: "scripts/runWaveletsAvg.R, 101, FB"
> >         >         >         >         Host: UCSDT2
> >         >         >         >         Directory: sid-wf1-7thy5mbfh09e1/cwtsmall-mb3l3rfi
> >         >         >         >         STDERR:
> >         >         >         >         STDOUT:
> >         >         >         > Errors detected. Cleanup not done.
> >         >         >         > Execution completed with errors
> >         >         >         >         sys:throw @ vdl.k, line: 140
> >         >         >         >         vdl:mains @ sid-wf1.kml, line: 399
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:413)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:417)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.GenerateErrorNode.post(GenerateErrorNode.java:28)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.childCompleted
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.Sequential.notificationEvent(Sequential.java:33)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:334)
> >         >         >         >         at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> >         >         >         >         at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.fireNotificationEvent(FlowNode.java:172)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:298)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.executeChildren(AbstractFunction.java:37)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:239)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:280)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.controlEvent(FlowNode.java:392)
> >         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:331)
> >         >         >         >         at org.globus.cog.karajan.workflow.FlowElementWrapper.event(FlowElementWrapper.java:227)
> >         >         >         >         at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
> >         >         >         >         at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
> >         >         >         >         at org.globus.cog.karajan.workflow.events.EventWorker.run(EventWorker.java:69)
> >         >         >         >
> >         >         >         > Many thanks,
> >         >         >         > Jing