[Swift-user] Exception in getFile
Jing Tie
tiejing at gmail.com
Mon Aug 20 14:55:15 CDT 2007
I see. So at this point, the problem could have one of two causes:
1. The GFS file system is broken, and the output files are missing;
2. Swift has a problem creating the output files.
Is that right?
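
Since empty local files can show up even when the remote output never existed, one quick check on the client side is to list the zero-byte result files. This is only a minimal sketch, assuming the outputs match *Results.Rdata in the directory given; the function name and the default pattern are illustrative and not part of Swift:

```python
import glob
import os
import sys


def find_empty_outputs(directory, pattern="*Results.Rdata"):
    """Return the sorted paths of zero-byte files in `directory` matching `pattern`.

    The pattern is an assumption based on the file names in this thread.
    """
    paths = glob.glob(os.path.join(directory, pattern))
    return sorted(
        p for p in paths
        if os.path.isfile(p) and os.path.getsize(p) == 0
    )


if __name__ == "__main__":
    # Directory to scan; defaults to the current directory.
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_empty_outputs(directory):
        print("empty:", path)
```

A zero-byte file here only says that nothing was transferred; it does not say whether the job failed to write the file or the filesystem lost it.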
Thanks,
Jing
On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>
> No. Swift will always try to stage out the output files if it has no
> indication that something went wrong with the job. But if the filesystem
> is broken, and the files are not actually there, well, that's what you
> seem to be observing.
>
> On Mon, 2007-08-20 at 14:36 -0500, Jing Tie wrote:
> > I see. Could this output be viewed as a sign?
> >
> > Completed job cwtsmall-gt3062gi cwtsmall with arguments
> > [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> > Staging out
> > sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> > 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> >
> > Thanks,
> > Jing
> >
> > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > Local empty files may be created even if the remote files don't exist.
> > So don't take that as a sign that the application has run.
> >
> > In the meantime I'll try to convince it not to create empty local
> > files if they don't exist remotely.
> >
> > Mihael
> >
> > On Mon, 2007-08-20 at 13:43 -0500, Jing Tie wrote:
> > > I think these files were from the job, because I deleted all the
> > > *Results.Rdata files before submitting the job, and found these empty
> > > files after the execution.
> > >
> > > Output from the execution:
> > > RunID: 3szhlhvg4seu0
> > > cwtsmall started
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status to Active
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status to Completed
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Submitted
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Active
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Completed
> > > ...
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1-1187633646453) setting status to Completed
> > > Staged in scripts/runWaveletsAvg.R to
> > > sid-wf1-3szhlhvg4seu0/shared/scripts/ on MIT_CMS
> > > Running job cwtsmall-gt3062gi cwtsmall with arguments
> > > [scripts/runWaveletsAvg.R, 101, FB] in
> > > sid-wf1-3szhlhvg4seu0/cwtsmall-gt3062gi on MIT_CMS
> > > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Submitted
> > > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Active
> > > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Completed
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status to Active
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status to Completed
> > > Completed job cwtsmall-gt3062gi cwtsmall with arguments
> > > [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> > > Staging out
> > > sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> > > 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting status to Active
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting status to Completed
> > > ......
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-23-1187633646557) setting status to Active
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-22-1187633646554) setting status to Failed Exception in getFile
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-2-1187633646560) setting status to Submitted
> > > ......
> > >
> > > Thanks,
> > > Jing
> > >
> > > On 8/20/07, Mihael Hategan < hategan at mcs.anl.gov> wrote:
> > > But those are not from the same job.
> > >
> > > On Mon, 2007-08-20 at 12:28 -0500, Jing Tie wrote:
> > > > > Yes. I saw 28 output files on the Swift client,
> > > > > 101-FBchannel1_cwt-avgResults.Rdata to
> > > > > 101-FBchannel28_cwt-avgResults.Rdata, but all the files were empty.
> > > >
> > > > Jing
> > > >
> > > >
> > > > > On 8/20/07, Mihael Hategan < hategan at mcs.anl.gov> wrote:
> > > > > On Mon, 2007-08-20 at 12:21 -0500, Jing Tie wrote:
> > > > > > Yes. There is no *avgResults.Rdata under shared directory,
> > > > > > only input file, scripts, wrapper.sh and seq.sh.
> > > >
> > > > Did the job actually run?
> > > >
> > > > >
> > > > > Jing
> > > > >
> > > > > > On 8/20/07, Mihael Hategan < hategan at mcs.anl.gov> wrote:
> > > > > > Not much we can do if the filesystem is broken.
> > > > > > Did you check to confirm that the file is not there?
> > > > >
> > > > > Mihael
> > > > >
>
>