[Swift-devel] Provider staging is failing

wilde at mcs.anl.gov wilde at mcs.anl.gov
Mon Aug 30 23:52:44 CDT 2010


Nope - I was wrong again. The "-d |outdir" form has been generated all along. The real problem is that it causes a mkdir -p in _swiftwrap.staging to be invoked with a null value. In _swiftwrap this was masked: the jobdir is prepended to the null input dir, so mkdir -p receives a valid path and the null entry is silently ignored.
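
For anyone following along, here is a minimal sketch of the difference between the two cases. It is not the actual wrapper code: the variable names are made up, and a throwaway temp directory stands in for the job directory.

```shell
#!/bin/bash
# Illustration only; jobdir and indir are hypothetical names.
jobdir=$(mktemp -d)   # stands in for the job directory
indir=""              # the null "in dirs" entry from -d '|outdir'

# _swiftwrap-style: jobdir is prefixed, so mkdir sees "$jobdir/" and succeeds.
if mkdir -p "$jobdir/$indir" 2>/dev/null; then prefixed=ok; else prefixed=failed; fi

# _swiftwrap.staging-style: the bare null value is passed, so mkdir sees "".
if mkdir -p "$indir" 2>/dev/null; then bare=ok; else bare=failed; fi

echo "prefixed: $prefixed, bare: $bare"
rm -rf "$jobdir"
```

With GNU mkdir the second call fails because an empty string is not a valid path, while the first is a no-op on a directory that already exists.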

I committed a fix (skip the mkdir if the dir is null), but please keep an eye on _swiftwrap.staging in case the change causes other issues.
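
The idea behind the fix can be sketched roughly as follows. This is not the committed code: make_dirs is a hypothetical helper, and it assumes the -d value is a "|"-separated list of directories relative to the current directory.

```shell
#!/bin/bash
# Hypothetical helper: create every directory named in a "|"-separated list,
# skipping null entries such as the empty first field of '|outdir'.
make_dirs() {
    local list=$1 dir
    local -a dirs
    # Split on "|"; a leading "|" yields an empty first element.
    IFS='|' read -r -a dirs <<< "$list"
    for dir in "${dirs[@]}"; do
        # Skip null entries instead of calling mkdir -p "".
        [ -z "$dir" ] && continue
        mkdir -p "$dir" || return 1
    done
}

# Usage: run in a scratch directory so nothing real is touched.
work=$(mktemp -d) && cd "$work" || exit 1
make_dirs '|outdir' && echo "created: $(ls)"
```

Skipping the null entry keeps the "|" separator in the argument harmless, whether or not there are input directories.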

There was also a typo in a variable name: $STDER instead of $STDERR.

- Mike



----- wilde at mcs.anl.gov wrote:

> ----- "Justin M Wozniak" <wozniak at mcs.anl.gov> wrote:
> 
> > I think that's ok.
> 
> Right: Mihael pointed out to me in IM that the exec'ed program is
> /bin/bash with _swiftwrap.staging as an arg.
> 
> Digging deeper it looks like _swiftwrap.staging is getting run with
> this command line:
> 
> /bin/bash _swiftwrap.staging -e /bin/cat -out outdir/f.0001.out -err
> stderr.txt -i -d '|outdir' -if data.txt -of outdir/f.0001.out -k
> -cdmfile  -status provider -a data.txt
> 
> and the extra "|" separator in the -d 'outdir' arg (quotes mine) is
> causing a spurious mkdir to get invoked for what would have been the
> "in dirs" argument. That in turn is causing the return code 254.
> 
> I think that extra | separator is not supposed to be there when there
> are no input directories (as in this case). vdl-int.staging has:
>   "-d", flatten(each(fileDirs)),
> and I now suspect a null value for the dirs of stagein is not being
> handled right, somewhere around:
>    fileDirs := fileDirs(stagein, stageout)
> 
> - Mike
> 
> 
> 
> 
> > Do you have the wrapper.log/info files?
> > 
> > On Mon, 30 Aug 2010, Michael Wilde wrote:
> > 
> > > _swiftwrap.staging didn't seem to get marked executable:
> > 
> > 
> > > ----- "Michael Wilde" <wilde at mcs.anl.gov> wrote:
> > >
> > >> With proxy the stageins seem to complete. Then I get a 254 when it
> > >> tries to run; I'm looking at that now:
> > >>
> > >> 1283218480.397 DEBUG 000000 CWD: /
> > >> 1283218480.397 DEBUG 000000 Running /bin/bash
> > >> 1283218480.397 DEBUG 000000 Directory: /home/wilde/swiftwork/catsn-20100830-2034-hotqv61h-o-cat-oyun22yj
> > >> 1283218480.397 DEBUG 000000 Command: _swiftwrap.staging -e /bin/cat -out outdir/f.0001.out -err stderr.txt -i -d |outdir -if data.txt -of outdir/f.0001.out -k -cdmfile -status provider -a data.txt
> > >> 1283218480.397 DEBUG 000000 Command: /bin/bash _swiftwrap.staging -e /bin/cat -out outdir/f.0001.out -err stderr.txt -i -d |outdir -if data.txt -of outdir/f.0001.out -k -cdmfile  -status provider -a data.txt
> > >> 1283218480.397 DEBUG 000000 1283218479990 Forked process 17949. Waiting for its completion
> > >> 1283218480.408 DEBUG 000000 Checking jobs status (1 active)
> > >> 1283218480.408 DEBUG 000000 1283218479990 Checking pid 17949
> > >> 1283218480.408 DEBUG 000000 1283218479990 Job 17949 still running
> > >> 1283218480.408 TRACE 000000  IN: len=2, actuallen=2, tag=4, flags=3, OK
> > >> 1283218480.408 DEBUG 000000 Fin flag set
> > >> 1283218480.508 DEBUG 000000 Checking jobs status (1 active)
> > >> 1283218480.508 DEBUG 000000 1283218479990 Checking pid 17949
> > >> 1283218480.508 DEBUG 000000 1283218479990 Child process 17949 terminated. Status is 254.
> > >>
> > >>
> > >> - Mike
> > >>
> > >> ----- "Mihael Hategan" <hategan at mcs.anl.gov> wrote:
> > >>
> > >>> On Mon, 2010-08-30 at 19:26 -0600, wilde at mcs.anl.gov wrote:
> > >>>> I turned on the TRACE output level in worker.pl. I need to dig
> > >>>> deeper, but it looks to me like the pathnames it's trying to fetch
> > >>>> are getting mangled/confused with the file:// portion of the URI:
> > >>>>
> > >>>> org.globus.cog.karajan.workflow.service.ProtocolException:
> > >>>> java.io.FileNotFoundException:
> > >>>> /autonfs/home/wilde/./file:/localhost/home/wilde/swift/rev/trunk/bin/../libexec/_swiftwrap.staging
> > >>>> (No such file or directory)
> > >>>>
> > >>>> The file "/home/wilde/swift/rev/trunk/bin/../libexec/_swiftwrap.staging"
> > >>>> does exist on the client side.
> > >>>
> > >>> Seems to. I gather "file" is broken.
> > >>>
> > >>> Can you try "proxy", and see if it fails? If not, I'll know a bit
> > >>> better where to look.
> > >>>
> > >>> Mihael
> > >>
> > >> --
> > >> Michael Wilde
> > >> Computation Institute, University of Chicago
> > >> Mathematics and Computer Science Division
> > >> Argonne National Laboratory
> > >
> > >
> > 
> > -- 
> > Justin M Wozniak
> 
> -- 
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
> 

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



