[Swift-user] Exception in getFile
Jing Tie
tiejing at gmail.com
Tue Aug 28 14:00:45 CDT 2007
Hi,
Can we tell yet whether the problem is caused by 1 or 2?
1. The GFS filesystem is broken, and the output files went missing;
2. the application did not actually run correctly, and thus did not
produce the output files.
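
A minimal sketch, not part of Swift and assuming Python is available on the client, for listing which local output files are zero bytes. As Mihael notes in the quoted thread below, empty local files may be created even if the remote files don't exist, so the files this reports are not a sign that the application ran. The function name `find_empty_outputs` and the default pattern are illustrative assumptions only.

```python
from pathlib import Path


def find_empty_outputs(directory, pattern="*Results.Rdata"):
    """Return the sorted names of zero-byte files in `directory` matching `pattern`.

    Hypothetical helper: it only inspects the local client directory and
    cannot tell whether the remote files ever existed.
    """
    return sorted(
        p.name
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


# Example use: report empty result files in the current directory.
# for name in find_empty_outputs("."):
#     print("empty:", name)
```

This only narrows down which outputs are placeholders; the remote shared directory and the job's own output would still need to be checked to separate 1 from 2.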
Thanks,
Jing
On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> On Mon, 2007-08-20 at 14:55 -0500, Jing Tie wrote:
> > I see. So at this point, the problem could be caused by two reasons:
> > 1. GFS system is broken, and missed the output files;
> > 2. Swift has problem to create output files.
> >
> > Is it right?
>
> Swift doesn't really create output files. It's the application that
> does. So I don't see how (2) can be the problem.
>
> There are other possibilities, including the application not actually
> having run correctly, and thus not having produced the output files.
>
> >
> > Thanks,
> > Jing
> >
> > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > No. Swift will always try to stage out the output files if it has no
> > indication that something went wrong with the job. But if the filesystem
> > is broken, and the files are not actually there, well, that's what you
> > seem to be observing.
> >
> > On Mon, 2007-08-20 at 14:36 -0500, Jing Tie wrote:
> > > I see. Could this output be viewed as a sign?
> > >
> > > Completed job cwtsmall-gt3062gi cwtsmall with arguments [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> > > Staging out sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> > >
> > > Thanks,
> > > Jing
> > >
> > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > Local empty files may be created even if the remote files don't exist.
> > > So don't take that as a sign that the application has run.
> > >
> > > In the meantime I'll try to convince it to not create empty local
> > > files, if they don't exist remotely.
> > >
> > > Mihael
> > >
> > > On Mon, 2007-08-20 at 13:43 -0500, Jing Tie wrote:
> > > > > I think these files were from the job. Because I deleted all the
> > > > > *Results.Rdata before the job submitting, and found these empty files
> > > > > after the execution.
> > > >
> > > > output of the process of execution:
> > > > RunID: 3szhlhvg4seu0
> > > > cwtsmall started
> > > > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status to Active
> > > > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status to Completed
> > > > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Submitted
> > > > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Active
> > > > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Completed
> > > > > ...
> > > > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1-1187633646453) setting status to Completed
> > > > > Staged in scripts/runWaveletsAvg.R to sid-wf1-3szhlhvg4seu0/shared/scripts/ on MIT_CMS
> > > > > Running job cwtsmall-gt3062gi cwtsmall with arguments [scripts/runWaveletsAvg.R, 101, FB] in sid-wf1-3szhlhvg4seu0/cwtsmall-gt3062gi on MIT_CMS
> > > > > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Submitted
> > > > > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Active
> > > > > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Completed
> > > > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status to Active
> > > > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status to Completed
> > > > > Completed job cwtsmall-gt3062gi cwtsmall with arguments [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> > > > > Staging out sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> > > > > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting status to Active
> > > > > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting status to Completed
> > > > > ......
> > > > > Task(type=2, identity=urn:0-0-0-1-0-1-0-23-1187633646557) setting status to Active
> > > > > Task(type=2, identity=urn:0-0-0-1-0-1-0-22-1187633646554) setting status to Failed Exception in getFile
> > > > > Task(type=2, identity=urn:0-0-0-1-0-1-0-2-1187633646560) setting status to Submitted
> > > > ......
> > > >
> > > > Thanks,
> > > > Jing
> > > >
> > > > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > But those are not from the same job.
> > > >
> > > > > On Mon, 2007-08-20 at 12:28 -0500, Jing Tie wrote:
> > > > > > Yes. I saw 101-FBchannel1_cwt-avgResults.Rdata to
> > > > > > 101-FBchannel28_cwt-avgResults.Rdata 28 output files on the swift
> > > > > > client, but all the files were empty.
> > > > >
> > > > > Jing
> > > > >
> > > > >
> > > > > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > > > On Mon, 2007-08-20 at 12:21 -0500, Jing Tie wrote:
> > > > > > > Yes. There is no *avgResults.Rdata under shared directory, only input
> > > > > > > file, scripts, wrapper.sh and seq.sh.
> > > > >
> > > > > Did the job actually run?
> > > > >
> > > > > >
> > > > > > Jing
> > > > > >
> > > > > > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > > > > Not much we can do if the filesystem is broken.
> > > > > > > Did you check to confirm that the file is not there?
> > > > > >
> > > > > > Mihael
> > > > > >
> >
> >
>
>