[Swift-user] Error message on Cray XE6

Michael Wilde wilde at mcs.anl.gov
Sat Apr 14 10:10:00 CDT 2012


I just tried both setting HOME=/lustre/beagle/wilde and setting user.home to the same thing. Neither works. I think user.home is coming from the Java system property, and that doesn't seem to be influenced by the HOME env var. I was about to check whether Java can be asked to change its home, maybe by setting a command-line arg to Java.
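
A minimal sketch for checking this, assuming a stock JVM on the login node. HomeCheck is a made-up class name for illustration, not part of Swift. It only prints the two values side by side, so you can see whether they differ:

```java
// Hypothetical helper (not part of Swift): compares Java's idea of the
// home directory with the HOME environment variable.
final class HomeCheck {

    // The JVM's "user.home" system property. On Unix JVMs this is
    // typically derived from the account database rather than $HOME
    // (assumption; behavior can vary by JVM and version).
    static String javaHome() {
        return System.getProperty("user.home");
    }

    // The HOME environment variable as seen by the process; may be null.
    static String envHome() {
        return System.getenv("HOME");
    }

    public static void main(String[] args) {
        System.out.println("user.home (Java property) = " + javaHome());
        System.out.println("HOME (environment)        = " + envHome());
    }
}
```

Running it as `java -Duser.home=/lustre/beagle/wilde HomeCheck` should report the overridden value, since -D sets a system property at JVM startup. Whether and how the swift launcher passes such an option through to its JVM is not something this sketch shows.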

- Mike

----- Original Message -----
> From: "Jonathan Monette" <jonmon at mcs.anl.gov>
> To: "Michael Wilde" <wilde at mcs.anl.gov>
> Cc: "Lorenzo Pesce" <lpesce at uchicago.edu>, swift-user at ci.uchicago.edu
> Sent: Saturday, April 14, 2012 10:02:14 AM
> Subject: Re: [Swift-user] Error message on Cray XE6
> That is an easy fix I believe. I know where the code is so I will
> change and test.
> 
> In the meantime could you try something? Try setting
> user.home=<someplace.on.lustre>
> in your config file and try again.
> 
> On Apr 14, 2012, at 9:58, Michael Wilde <wilde at mcs.anl.gov> wrote:
> 
> > /home is no longer mounted by the compute nodes, per the
> > post-maintenance summary:
> >
> > "External filesystem dependencies minimized: Compute nodes and the
> > scheduler should now continue to process and complete jobs without
> > the threat of interference of external filesystem outages.
> > /gpfs/pads is only available on login1 through login5; /home is on
> > login and mom nodes only."
> >
> > So we need to (finally) remove Swift's dependence on $HOME/.globus
> > and $HOME/.globus/scripts in particular.
> >
> > I suggest - since the swift command already needs to write to "." -
> > that we create a scripts/ directory in "." instead of $HOME/.globus.
> > And this should be used by any provider that would have previously
> > created files below .globus.
> >
> > I'll echo this to swift-devel and start a thread there to discuss.
> > It's possible there's already a property to cause scripts/ to be
> > created elsewhere. If not, I think we should make one. I think we
> > should group the scripts created by a run in the current dir, along
> > with the swift log, _concurrent, and (in the conventions I use in my
> > run scripts) swiftwork/.
> >
> > Lorenzo, hopefully we can at least get you a workaround for this
> > soon.
> >
> > You *might* be able to trick swift into doing this by setting
> > HOME=/lustre/beagle/$USER. I already tried a symlink under .globus
> > and that didn't work, as /home is not even readable by the compute
> > nodes, which in this case need to run the coaster worker (.pl)
> > script.
> >
> > - Mike
> >
> >
> > ----- Original Message -----
> >> From: "Lorenzo Pesce" <lpesce at uchicago.edu>
> >> To: "Jonathan Monette" <jonmon at mcs.anl.gov>
> >> Cc: swift-user at ci.uchicago.edu
> >> Sent: Saturday, April 14, 2012 8:15:39 AM
> >> Subject: Re: [Swift-user] Error message on Cray XE6
> >> In principle the access to the /home filesystem should still be
> >> there.
> >>
> >> The only thing I did was to change the cf file to remove some
> >> errors I had in it, so that might also be the source of the
> >> problem. This is what it looks like now:
> >> (BTW, the comments are not mine, I run swift only from lustre)
> >>
> >>
> >> # Whether to transfer the wrappers from the compute nodes
> >> # I like to launch from my home dir, but keep everything on
> >> # lustre
> >> wrapperlog.always.transfer=false
> >>
> >> #Indicates whether the working directory on the remote site
> >> # should be left intact even when a run completes successfully
> >> sitedir.keep=true
> >>
> >> #try only once
> >> execution.retries=1
> >>
> >> # Attempt to run as much as possible, i.e., ignore non-fatal errors
> >> lazy.errors=true
> >>
> >> # to reduce filesystem access
> >> status.mode=provider
> >>
> >> use.provider.staging=false
> >>
> >> provider.staging.pin.swiftfiles=false
> >>
> >> foreach.max.threads=100
> >>
> >> provenance.log=false
> >>
> >>
> >>
> >>
> >> On Apr 14, 2012, at 12:10 AM, Jonathan Monette wrote:
> >>
> >>> The perl script is the worker script that is submitted with PBS. I
> >>> have not tried to run on Beagle since the maintenance period
> >>> ended, so I am not exactly sure why the error popped up. One reason
> >>> could be that the home file system is no longer mounted on the
> >>> compute nodes. I know they spoke about that being a possibility,
> >>> but I am not sure they implemented it during the maintenance
> >>> period. Do you know if the home file system is still mounted on
> >>> the compute nodes?
> >>>
> >>> On Apr 13, 2012, at 17:18, Lorenzo Pesce <lpesce at uchicago.edu>
> >>> wrote:
> >>>
> >>>> Hi --
> >>>> I haven't seen this one before:
> >>>>
> >>>> Can't open perl script
> >>>> "/home/lpesce/.globus/coasters/cscript7176272791806289394.pl": No
> >>>> such file or directory
> >>>>
> >>>> The config of the Cray has changed; might this have anything to do
> >>>> with it?
> >>>> I have no idea what perl script it is talking about or why it is
> >>>> looking in home.
> >>>>
> >>>> Thanks a lot,
> >>>>
> >>>> Lorenzo
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Swift-user mailing list
> >>>> Swift-user at ci.uchicago.edu
> >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
> >>
> >
> > --
> > Michael Wilde
> > Computation Institute, University of Chicago
> > Mathematics and Computer Science Division
> > Argonne National Laboratory
> >

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



