[Swift-user] Exception in getFile
Mihael Hategan
hategan at mcs.anl.gov
Mon Aug 20 14:58:47 CDT 2007
On Mon, 2007-08-20 at 14:55 -0500, Jing Tie wrote:
> I see. So at this point, the problem could have one of two causes:
> 1. The GFS system is broken and lost the output files;
> 2. Swift has a problem creating the output files.
>
> Is that right?
Swift doesn't really create output files. It's the application that
does. So I don't see how (2) can be the problem.
There are other possibilities, including the application not actually
having run correctly, and thus not having produced the output files.
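
One quick client-side check for that last possibility is to look for zero-byte output files, since an empty local file says nothing about whether the application actually wrote anything remotely. The following is only a minimal sketch: the function name `find_empty_outputs` and the default pattern `*Results.Rdata` are illustrative assumptions (the pattern is borrowed from the file names quoted in this thread), and nothing here is part of Swift itself.

```python
from pathlib import Path


def find_empty_outputs(directory, pattern="*Results.Rdata"):
    """Return the sorted names of zero-byte files in `directory` matching `pattern`.

    Hypothetical helper, not a Swift API: it only inspects local files
    after stage-out and reports the ones that are empty.
    """
    return sorted(
        p.name
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )
```

Calling `find_empty_outputs(".")` in the directory where the results were staged out lists the empty ones. Any name it returns points to an output the application probably never produced, which is worth confirming on the remote site before suspecting Swift.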
>
> Thanks,
> Jing
>
> On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> No. Swift will always try to stage out the output files if it has no
> indication that something went wrong with the job. But if the filesystem
> is broken, and the files are not actually there, well, that's what you
> seem to be observing.
>
> On Mon, 2007-08-20 at 14:36 -0500, Jing Tie wrote:
> > I see. Could this output be viewed as a sign?
> >
> > Completed job cwtsmall-gt3062gi cwtsmall with arguments
> > [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> > Staging out sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> > 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> >
> > Thanks,
> > Jing
> >
> > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > Local empty files may be created even if the remote files don't exist.
> > So don't take that as a sign that the application has run.
> >
> > In the meantime I'll try to convince it to not create empty local
> > files, if they don't exist remotely.
> >
> > Mihael
> >
> > On Mon, 2007-08-20 at 13:43 -0500, Jing Tie wrote:
> > > I think these files were from the job, because I deleted all the
> > > *Results.Rdata files before submitting the job and found these
> > > empty files after the execution.
> > >
> > > output of the process of execution:
> > > RunID: 3szhlhvg4seu0
> > > cwtsmall started
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status to Active
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646429) setting status to Completed
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Submitted
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Active
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1187633646432) setting status to Completed
> > > ...
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-1-1187633646453) setting status to Completed
> > > Staged in scripts/runWaveletsAvg.R to sid-wf1-3szhlhvg4seu0/shared/scripts/ on MIT_CMS
> > > Running job cwtsmall-gt3062gi cwtsmall with arguments
> > > [scripts/runWaveletsAvg.R, 101, FB] in
> > > sid-wf1-3szhlhvg4seu0/cwtsmall-gt3062gi on MIT_CMS
> > > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Submitted
> > > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Active
> > > Task(type=1, identity=urn:0-0-0-1-0-1-0-1187633646457) setting status to Completed
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status to Active
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-1187633646459) setting status to Completed
> > > Completed job cwtsmall-gt3062gi cwtsmall with arguments
> > > [scripts/runWaveletsAvg.R, 101, FB] on MIT_CMS
> > > Staging out sid-wf1-3szhlhvg4seu0/shared//101-FBchannel15_cwt-avgResults.Rdata to
> > > 101-FBchannel15_cwt-avgResults.Rdata from MIT_CMS
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting status to Active
> > > Task(type=4, identity=urn:0-0-0-1-0-1-0-7-1187633646462) setting status to Completed
> > > ......
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-23-1187633646557) setting status to Active
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-22-1187633646554) setting status to Failed Exception in getFile
> > > Task(type=2, identity=urn:0-0-0-1-0-1-0-2-1187633646560) setting status to Submitted
> > > ......
> > >
> > > Thanks,
> > > Jing
> > >
> > > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > But those are not from the same job.
> > > >
> > > > On Mon, 2007-08-20 at 12:28 -0500, Jing Tie wrote:
> > > > > Yes. I saw 28 output files, 101-FBchannel1_cwt-avgResults.Rdata to
> > > > > 101-FBchannel28_cwt-avgResults.Rdata, on the swift client, but
> > > > > all the files were empty.
> > > >
> > > > Jing
> > > >
> > > >
> > > > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > > On Mon, 2007-08-20 at 12:21 -0500, Jing Tie wrote:
> > > > > > Yes. There is no *avgResults.Rdata under the shared directory,
> > > > > > only the input file, scripts, wrapper.sh and seq.sh.
> > > >
> > > > Did the job actually run?
> > > >
> > > > >
> > > > > Jing
> > > > >
> > > > > > On 8/20/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > > > Not much we can do if the filesystem is broken.
> > > > > > Did you check to confirm that the file is not there?
> > > > >
> > > > > Mihael
> > > > >