[Swift-user] Exception in getFile

Jing Tie tiejing at gmail.com
Mon Aug 20 14:55:15 CDT 2007


I see. So at this point, the problem could have one of two causes:
1. GFS is broken and is missing the output files;
2. Swift has a problem creating the output files.

Is that right?
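
For reference, a minimal sketch of how the empty local files could be listed on the client side. It assumes the outputs follow the *Results.Rdata naming seen in this thread and all sit in one directory; the function name find_empty_outputs is made up for illustration and is not part of Swift. As noted further down in the thread, an empty local file says nothing about whether the application actually ran.

```python
from pathlib import Path


def find_empty_outputs(directory, pattern="*Results.Rdata"):
    """Return the sorted names of zero-byte files in `directory` matching `pattern`.

    Both the default pattern and the single-directory layout are assumptions
    based on the file names quoted in this thread.
    """
    return sorted(
        p.name
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )
```

Calling find_empty_outputs(".") in the directory where Swift staged the results would return the names of the zero-byte files, which can then be compared against what is (or is not) present in the remote shared directory.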

Thanks,
Jing

On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>
> No. Swift will always try to stage out the output files if it has no
> indication that something went wrong with the job. But if the filesystem
> is broken, and the files are not actually there, well, that's what you
> seem to be observing.
>
> On Mon, 2007-08-20 at 14:36 -0500, Jing Tie wrote:
> > I see. Could this output be viewed as a sign?
> >
> > Completed job cwtsmall-gt3062gi cwtsmall with arguments
> > [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> > Staging out
> > sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> > 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> >
> > Thanks,
> > Jing
> >
> > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >         Local empty files may be created even if the remote files
> >         don't exist.
> >         So don't take that as a sign that the application has run.
> >
> >         In the meantime I'll try to convince it not to create empty
> >         local files if they don't exist remotely.
> >
> >         Mihael
> >
> >         On Mon, 2007-08-20 at 13:43 -0500, Jing Tie wrote:
> >         > I think these files were from the job, because I deleted all the
> >         > *Results.Rdata files before submitting the job and found these
> >         > empty files after the execution.
> >         >
> >         > output of the process of execution:
> >         > RunID: 3szhlhvg4seu0
> >         > cwtsmall started
> >         > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status to Active
> >         > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status to Completed
> >         > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Submitted
> >         > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Active
> >         > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Completed
> >         > ...
> >         > Task(type=2, identity=urn:0-0-0-1-0-1-0-1-1187633646453) setting status to Completed
> >         > Staged in scripts/runWaveletsAvg.R to
> >         > sid-wf1-3szhlhvg4seu0/shared/scripts/ on MIT_CMS
> >         > Running job cwtsmall-gt3062gi cwtsmall with arguments
> >         > [scripts/runWaveletsAvg.R, 101, FB] in
> >         > sid-wf1-3szhlhvg4seu0/cwtsmall-gt3062gi on MIT_CMS
> >         > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Submitted
> >         > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Active
> >         > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Completed
> >         > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status to Active
> >         > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status to Completed
> >         > Completed job cwtsmall-gt3062gi cwtsmall with arguments
> >         > [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> >         > Staging out
> >         > sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> >         > 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> >         > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting status to Active
> >         > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting status to Completed
> >         > ......
> >         > Task(type=2, identity=urn:0-0-0-1-0-1-0-23-1187633646557) setting status to Active
> >         > Task(type=2, identity=urn:0-0-0-1-0-1-0-22-1187633646554) setting status to Failed Exception in getFile
> >         > Task(type=2, identity=urn:0-0-0-1-0-1-0-2-1187633646560) setting status to Submitted
> >         > ......
> >         >
> >         > Thanks,
> >         > Jing
> >         >
> >         > On 8/20/07, Mihael Hategan < hategan at mcs.anl.gov> wrote:
> >         >         But those are not from the same job.
> >         >
> >         >         On Mon, 2007-08-20 at 12:28 -0500, Jing Tie wrote:
> >         >         > Yes. I saw 28 output files on the Swift client,
> >         >         > 101-FBchannel1_cwt-avgResults.Rdata to
> >         >         > 101-FBchannel28_cwt-avgResults.Rdata, but all the files
> >         >         > were empty.
> >         >         >
> >         >         > Jing
> >         >         >
> >         >         >
> >         >         > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >         >         >         On Mon, 2007-08-20 at 12:21 -0500, Jing Tie wrote:
> >         >         >         > Yes. There is no *avgResults.Rdata under the shared
> >         >         >         > directory, only the input file, scripts, wrapper.sh
> >         >         >         > and seq.sh.
> >         >         >
> >         >         >         Did the job actually run?
> >         >         >
> >         >         >         >
> >         >         >         > Jing
> >         >         >         >
> >         >         >         > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >         >         >         >         Not much we can do if the filesystem is broken.
> >         >         >         >         Did you check to confirm that the file is not there?
> >         >         >         >
> >         >         >         >         Mihael
> >         >         >         >
>
>