[Swift-user] Exception in getFile

Mihael Hategan hategan at mcs.anl.gov
Mon Aug 20 14:03:09 CDT 2007


Empty local files may be created even when the remote files don't exist,
so don't take them as a sign that the application has run.

In the meantime, I'll try to convince it not to create empty local
files when they don't exist remotely.
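
One way to tell such placeholders from real output is to check file sizes
on the client side. A minimal sketch in Python, assuming the outputs match
the *Results.Rdata pattern used in this thread; the function name
find_empty_outputs is made up for illustration and is not part of Swift:

```python
from pathlib import Path


def find_empty_outputs(directory, pattern="*Results.Rdata"):
    """Return the sorted names of zero-byte files in `directory` matching `pattern`.

    The default pattern is an assumption taken from the file names in this
    thread; adjust it to match your own workflow's outputs.
    """
    return sorted(
        p.name
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Scan the current directory and report files that are only placeholders.
    for name in find_empty_outputs("."):
        print("empty placeholder, not real output:", name)
```

A zero-byte file found this way only says the stage-out produced nothing
usable; whether the job itself ran still has to be checked on the remote
site.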

Mihael

On Mon, 2007-08-20 at 13:43 -0500, Jing Tie wrote:
> I think these files were from the job, because I deleted all the
> *Results.Rdata files before submitting the job and found these empty
> files after the execution.
> 
> output of the process of execution:
> RunID: 3szhlhvg4seu0 
> cwtsmall started
> Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status
> to Active
> Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status
> to Completed
> Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> to Submitted 
> Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> to Active
> Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status
> to Completed
> ...
> Task(type=2, identity=urn:0-0-0-1-0-1-0-1-1187633646453) setting
> status to Completed 
> Staged in scripts/runWaveletsAvg.R to
> sid-wf1-3szhlhvg4seu0/shared/scripts/ on MIT_CMS
> Running job cwtsmall-gt3062gi cwtsmall with arguments
> [scripts/runWaveletsAvg.R, 101, FB] in
> sid-wf1-3szhlhvg4seu0/cwtsmall-gt3062gi on MIT_CMS 
> Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> to Submitted
> Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> to Active
> Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status
> to Completed 
> Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status
> to Active
> Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status
> to Completed
> Completed job cwtsmall-gt3062gi cwtsmall with arguments
> [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS 
> Staging out
> sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting
> status to Active
> Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting
> status to Completed
> ......
> Task(type=2, identity=urn:0-0-0-1-0-1-0-23-1187633646557) setting
> status to Active
> Task(type=2, identity=urn:0-0-0-1-0-1-0-22-1187633646554) setting
> status to Failed Exception in getFile 
> Task(type=2, identity=urn:0-0-0-1-0-1-0-2-1187633646560) setting
> status to Submitted
> ......
> 
> Thanks,
> Jing
> 
> On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>         But those are not from the same job. 
>         
>         On Mon, 2007-08-20 at 12:28 -0500, Jing Tie wrote:
>         > Yes. I saw 28 output files on the Swift client,
>         > 101-FBchannel1_cwt-avgResults.Rdata to
>         > 101-FBchannel28_cwt-avgResults.Rdata, but all the files were empty.
>         >
>         > Jing
>         >
>         >
>         > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>         >         On Mon, 2007-08-20 at 12:21 -0500, Jing Tie wrote:
>         >         > Yes. There is no *avgResults.Rdata under the shared directory,
>         >         > only the input file, scripts, wrapper.sh and seq.sh.
>         >
>         >         Did the job actually run?
>         >
>         >         >
>         >         > Jing 
>         >         >
>         >         > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>         >         >         Not much we can do if the filesystem is broken.
>         >         >         Did you check to confirm that the file is not there?
>         >         >
>         >         >         Mihael
>         >         >
>         >         >         On Mon, 2007-08-20 at 12:07 -0500, Jing Tie wrote:
>         >         >         > Hi,
>         >         >         >
>         >         >         > Here is another problem. It seems like something is wrong with the
>         >         >         > GFS system.
>         >         >         >
>         >         >         > site: MIT_CMS
>         >         >         > gatekeeper: ce01.cmsaf.mit.edu
>         >         >         > app_dir: /osg/app 
>         >         >         > data_dir: /osg/data
>         >         >         > condor_dir: /usr/local/condor/bin
>         >         >         > R_dir: /osg/app/R-2.5.1/bin/R
>         >         >         > 
>         >         >         > output:
>         >         >         > Application exception: Exception in getFile
>         >         >         >         task:transfer @ vdl-int.k, line: 235
>         >         >         >         vdl:dostageout @ vdl-int.k, line: 378
>         >         >         >         vdl:execute2 @ execute-default.k, line: 22
>         >         >         >         vdl:execute @ sid-wf1.kml, line: 20
>         >         >         >         wavelettransf @ sid-wf1.kml, line: 362
>         >         >         >         batchtrials @ sid-wf1.kml, line: 402
>         >         >         >         vdl:mains @ sid-wf1.kml, line: 399
>         >         >         > Caused by: org.globus.cog.abstraction.impl.file.FileResourceException:
>         >         >         > Exception in getFile
>         >         >         > Caused by: org.globus.ftp.exception.ServerException: Server refused
>         >         >         > performing the request. Custom message:  (error code 1)  cwtsmall
>         >         >         > failed
>         >         >         > Provenance graph saved in sid-wf1-7thy5mbfh09e1.dot
>         >         >         > The following errors have occurred:
>         >         >         > 1. Application "cwtsmall" failed (Exception in getFile
>         >         >         > Caused by:
>         >         >         > Server refused performing the request. Custom message:  (error code 1)
>         >         >         > [Nested exception message:  Nested exception is
>         >         >         > org.globus.ftp.exception.UnexpectedReplyCodeException:
>         >         >         > Custom message: Unexpected reply:
>         >         >         > 500-Command failed. :
>         >         >         > globus_gridftp_server_file.c:globus_l_gfs_file_send:2190:
>         >         >         > 500-globus_l_gfs_file_open failed.
>         >         >         > 500-globus_gridftp_server_file.c:globus_l_gfs_file_open:1694:
>         >         >         > 500-globus_xio_register_open failed.
>         >         >         > 500-globus_xio_file_driver.c:globus_l_xio_file_open:438:
>         >         >         > 500-Unable to open file /osgfs/data/sid-wf1-7thy5mbfh09e1/shared//101-FBchannel16_cwt-avgResults.Rdata
>         >         >         > 500-globus_xio_file_driver.c:globus_l_xio_file_open:381:
>         >         >         > 500-System error in open: No such file or directory
>         >         >         > 500-globus_xio: A system call failed: No such file or directory
>         >         >         > 500 End.])
>         >         >         >         Arguments: "scripts/runWaveletsAvg.R, 101, FB"
>         >         >         >         Host: UCSDT2
>         >         >         >         Directory: sid-wf1-7thy5mbfh09e1/cwtsmall-mb3l3rfi
>         >         >         >         STDERR: 
>         >         >         >         STDOUT:
>         >         >         > Errors detected. Cleanup not done.
>         >         >         > Execution completed with errors
>         >         >         >         sys:throw @ vdl.k, line: 140
>         >         >         >         vdl:mains @ sid-wf1.kml, line: 399
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:413)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:417)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.GenerateErrorNode.post(GenerateErrorNode.java:28)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.childCompleted
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.Sequential.notificationEvent(Sequential.java:33)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:334)
>         >         >         >         at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
>         >         >         >         at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.fireNotificationEvent(FlowNode.java:172)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:298)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.executeChildren(AbstractFunction.java:37)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:239)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:280)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.controlEvent(FlowNode.java:392)
>         >         >         >         at org.globus.cog.karajan.workflow.nodes.FlowNode.event(FlowNode.java:331)
>         >         >         >         at org.globus.cog.karajan.workflow.FlowElementWrapper.event(FlowElementWrapper.java:227)
>         >         >         >         at org.globus.cog.karajan.workflow.events.EventBus.send(EventBus.java:123)
>         >         >         >         at org.globus.cog.karajan.workflow.events.EventBus.sendHooked(EventBus.java:97)
>         >         >         >         at org.globus.cog.karajan.workflow.events.EventWorker.run(EventWorker.java:69)
>         >         >         >
>         >         >         > Many thanks,
>         >         >         > Jing
>         >         >         >
>         >         >         > _______________________________________________
>         >         >         > Swift-user mailing list
>         >         >         > Swift-user at ci.uchicago.edu
>         >         >         > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>         >         >
>         >         >
>         >
>         >
>         
> 



