[Swift-user] Error message on Cray XE6

Michael Wilde wilde at mcs.anl.gov
Sat Apr 14 10:13:40 CDT 2012


stackoverflow says this should work:

java -Duser.home=<new_location> <your_program>

Need to get that in via the swift command.
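
A minimal sketch to check this outside of Swift, assuming a stock JVM on Linux: it prints the HOME environment variable next to the user.home system property, plus the .globus/coasters directory derived from the property (the directory named in the original error message). The class name UserHomeDemo and the helper coasterScriptDir are made up for illustration and are not part of Swift.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Run it twice and compare the output:
//   java UserHomeDemo
//   java -Duser.home=<new_location> UserHomeDemo
class UserHomeDemo {
    // Hypothetical helper: builds <home>/.globus/coasters from a given home directory.
    static Path coasterScriptDir(String userHome) {
        return Paths.get(userHome, ".globus", "coasters");
    }

    public static void main(String[] args) {
        // What the shell exported; may be null if HOME is unset.
        String envHome = System.getenv("HOME");
        // What the JVM reports; -Duser.home=... on the command line overrides it.
        String propHome = System.getProperty("user.home");

        System.out.println("HOME env var       : " + envHome);
        System.out.println("user.home property : " + propHome);
        System.out.println("coaster scripts dir: " + coasterScriptDir(propHome));
    }
}
```

If the swift launcher turns out to have no hook for extra JVM flags, the JAVA_TOOL_OPTIONS environment variable, which the HotSpot JVM reads at startup, might be another way to pass -Duser.home in. Whether that is enough for the coaster provider on Beagle is untested here.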

- Mike


----- Original Message -----
> From: "Michael Wilde" <wilde at mcs.anl.gov>
> To: "Jonathan Monette" <jonmon at mcs.anl.gov>
> Cc: "Lorenzo Pesce" <lpesce at uchicago.edu>, swift-user at ci.uchicago.edu
> Sent: Saturday, April 14, 2012 10:10:00 AM
> Subject: Re: [Swift-user] Error message on Cray XE6
> I just tried both setting HOME=/lustre/beagle/wilde and setting
> user.home to the same thing. Neither works. I think user.home is
> coming from the Java property, and that doesn't seem to be influenced
> by the HOME env var. I was about to look at whether Java can be asked
> to change home, maybe by setting a command-line arg to Java.
> 
> - Mike
> 
> ----- Original Message -----
> > From: "Jonathan Monette" <jonmon at mcs.anl.gov>
> > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > Cc: "Lorenzo Pesce" <lpesce at uchicago.edu>,
> > swift-user at ci.uchicago.edu
> > Sent: Saturday, April 14, 2012 10:02:14 AM
> > Subject: Re: [Swift-user] Error message on Cray XE6
> > That is an easy fix I believe. I know where the code is so I will
> > change and test.
> >
> > In the meantime could you try something? Try setting
> > user.home=<someplace.on.lustre>
> > in your config file and try again.
> >
> > On Apr 14, 2012, at 9:58, Michael Wilde <wilde at mcs.anl.gov> wrote:
> >
> > > /home is no longer mounted by the compute nodes, per the
> > > post-maintenance summary:
> > >
> > > "External filesystem dependencies minimized: Compute nodes and the
> > > scheduler should now continue to process and complete jobs without
> > > the threat of interference of external filesystem outages.
> > > /gpfs/pads is only available on login1 through login5; /home is on
> > > login and mom nodes only."
> > >
> > > So we need to (finally) remove Swift's dependence on $HOME/.globus
> > > and $HOME/.globus/scripts in particular.
> > >
> > > I suggest - since the swift command already needs to write to "." -
> > > that we create a scripts/ directory in "." instead of
> > > $HOME/.globus.
> > > And this should be used by any provider that would have previously
> > > created files below .globus.
> > >
> > > I'll echo this to swift-devel and start a thread there to discuss.
> > > It's possible there's already a property to cause scripts/ to be
> > > created elsewhere. If not, I think we should make one. I think
> > > grouping the scripts created by a run into the current dir, along
> > > with the swift log, _concurrent, and (in the conventions I use in
> > > my run scripts) swiftwork/, makes sense.
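
A rough sketch of the proposal above, under stated assumptions: the scripts directory defaults to scripts/ under the current working directory and can be redirected with a property. The property name swift.scripts.dir and the class ScriptsDir are hypothetical, invented for this illustration; they are not existing Swift settings or code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ScriptsDir {
    // Hypothetical property name; not an existing Swift property.
    static final String PROP = "swift.scripts.dir";

    // Use the override if one is set, otherwise ./scripts under the launch directory.
    static Path resolve() {
        String override = System.getProperty(PROP);
        Path dir = (override != null && !override.isEmpty())
                ? Paths.get(override)
                : Paths.get(System.getProperty("user.dir"), "scripts");
        return dir.toAbsolutePath().normalize();
    }

    public static void main(String[] args) throws IOException {
        // Create the directory if needed, then place a worker-style script file in it.
        Path dir = Files.createDirectories(resolve());
        Path script = Files.createTempFile(dir, "cscript", ".pl");
        System.out.println("script written to: " + script);
    }
}
```

Because the default is derived from the launch directory rather than from user.home, a run started from lustre would keep its generated scripts on a filesystem the compute nodes can read.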
> > >
> > > Lorenzo, hopefully we can at least get you a workaround for this
> > > soon.
> > >
> > > You *might* be able to trick swift into doing this by setting
> > > HOME=/lustre/beagle/$USER. I already tried a symlink under .globus
> > > and that didn't work, as /home is not even readable by the compute
> > > nodes, which in this case need to run the coaster worker (.pl)
> > > script.
> > >
> > > - Mike
> > >
> > >
> > > ----- Original Message -----
> > >> From: "Lorenzo Pesce" <lpesce at uchicago.edu>
> > >> To: "Jonathan Monette" <jonmon at mcs.anl.gov>
> > >> Cc: swift-user at ci.uchicago.edu
> > >> Sent: Saturday, April 14, 2012 8:15:39 AM
> > >> Subject: Re: [Swift-user] Error message on Cray XE6
> > >> In principle the access to the /home filesystem should still be
> > >> there.
> > >>
> > >> The only thing I did was to change the cf file to remove some
> > >> errors I had in it, so that might also be the source of the
> > >> problem. This is what it looks like now:
> > >> (BTW, the comments are not mine, I run swift only from lustre)
> > >>
> > >>
> > >> # Whether to transfer the wrappers from the compute nodes
> > >> # I like to launch from my home dir, but keep everything on
> > >> # lustre
> > >> wrapperlog.always.transfer=false
> > >>
> > >> #Indicates whether the working directory on the remote site
> > >> # should be left intact even when a run completes successfully
> > >> sitedir.keep=true
> > >>
> > >> #try only once
> > >> execution.retries=1
> > >>
> > >> # Attempt to run as much as possible, i.e., ignore non-fatal errors
> > >> lazy.errors=true
> > >>
> > >> # to reduce filesystem access
> > >> status.mode=provider
> > >>
> > >> use.provider.staging=false
> > >>
> > >> provider.staging.pin.swiftfiles=false
> > >>
> > >> foreach.max.threads=100
> > >>
> > >> provenance.log=false
> > >>
> > >>
> > >>
> > >>
> > >> On Apr 14, 2012, at 12:10 AM, Jonathan Monette wrote:
> > >>
> > >>> The perl script is the worker script that is submitted with PBS.
> > >>> I have not tried to run on Beagle since the maintenance period
> > >>> ended, so I am not exactly sure why the error popped up. One
> > >>> reason could be that the home file system is no longer mounted on
> > >>> the compute nodes. I know they spoke about that being a
> > >>> possibility, but I am not sure they implemented that during the
> > >>> maintenance period. Do you know if the home file system is still
> > >>> mounted on the compute nodes?
> > >>>
> > >>> On Apr 13, 2012, at 17:18, Lorenzo Pesce <lpesce at uchicago.edu>
> > >>> wrote:
> > >>>
> > >>>> Hi --
> > >>>> I haven't seen this one before:
> > >>>>
> > >>>> Can't open perl script
> > >>>> "/home/lpesce/.globus/coasters/cscript7176272791806289394.pl":
> > >>>> No such file or directory
> > >>>>
> > >>>> The config of the Cray has changed; might this have anything to
> > >>>> do with it?
> > >>>> I have no idea what perl script it is talking about or why it
> > >>>> is looking in home.
> > >>>>
> > >>>> Thanks a lot,
> > >>>>
> > >>>> Lorenzo
> > >>>>
> > >>>>
> > >>>>
> > >>>> _______________________________________________
> > >>>> Swift-user mailing list
> > >>>> Swift-user at ci.uchicago.edu
> > >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
> > >>
> > >
> > > --
> > > Michael Wilde
> > > Computation Institute, University of Chicago
> > > Mathematics and Computer Science Division
> > > Argonne National Laboratory
> > >
> 
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



