[Swift-devel] next hurdle
Mike Kubal
mikekubal at yahoo.com
Tue Feb 12 17:00:28 CST 2008
One of the applications (antechamber) being launched
by swift on the uc-teragrid is failing with an exit
code of 1 and a message of 'cannot execute binary
file'. It sounds like it might be attempting to run on
one of the 32-bit nodes, though my tc-file specifies
that it should run only on the 64-bit nodes.
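One way to test the 32-bit hypothesis would be to compare
the binary's ELF class with the architecture of the node it
lands on. The following is only a minimal sketch, assuming
antechamber is an ELF executable; the script name and the
path in the usage comment are hypothetical.

```python
import platform
import sys


def elf_class(header: bytes) -> str:
    """Classify a binary from the first five bytes of its file header."""
    # ELF files start with the magic bytes 0x7f 'E' 'L' 'F';
    # byte 4 (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return "not ELF"
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")


if __name__ == "__main__":
    # Hypothetical usage: python elf_class.py /path/to/antechamber
    print("node architecture:", platform.machine())
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, elf_class(f.read(5)))
```

Running it as the job itself (in place of antechamber) on
the failing site would show which kind of node the job
actually reached.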
The only difference between a successful run and the
error above is the following line in the sites-file:
(with this line I get the error above)
<execution provider="gt4" jobmanager="PBS"
url="tg-grid1.uc.teragrid.org" />
(with this line instead the job succeeds)
<jobmanager universe="vanilla"
url="tg-grid1.uc.teragrid.org" major="4" minor="0"
patch="0"/>
I rsynced the log and kickstart files to
/home/benc/swift-logs at UC.
Cheers,
Mike
--- Mihael Hategan <hategan at mcs.anl.gov> wrote:
> Would it be worth trying to find out why it worked
> with pre-WS GRAM?
>
> Mihael
>
> On Tue, 2008-02-12 at 14:20 -0800, Mike Kubal wrote:
> > Thanks Joe. This solved the account id problem.
> >
> > --- joseph insley <insley at mcs.anl.gov> wrote:
> >
> > > Mike K,
> > >
> > > looks like you have the wrong value in your .tg_default_project file:
> > >
> > > insley at tg-viz-login1:~> more ~kubal/.tg_default_project
> > > TG-MCA01S018
> > >
> > > you should be using: TG-MCB010025N
> > >
> > > insley at tg-viz-login1:~> tgusage -i -u kubal
> > >
> > > [snip]
> > >
> > > Account: TG-MCA01S018
> > > Title: Computational Studies of Complex Processes in Biological Macromolecular Systems
> > > Resource: teragrid
> > >
> > > ****
> > > Local project name on dtf.anl.teragrid is TG-MCB010025N
> > > ****
> > >
> > > Allocation Period: 2007-08-03 to 2008-03-31
> > >
> > > Name (Last First) or Account Total      Remaining    Usage
> > > ---------------------------- ---------- ------------ ----------
> > > Kubal Michael                101880 SU  99358 SU     296 SU
> > > ----------------------------------------------------------------------
> > > TG-MCA01S018                 101880 SU  99358 SU     2522 SU
> > >
> > >
> > >
> > > On Feb 12, 2008, at 2:37 PM, Mihael Hategan wrote:
> > >
> > > > You should probably remove the line completely.
> > > >
> > > > Did you choose a default project on the login node with tgprojects?
> > > >
> > > > On Tue, 2008-02-12 at 12:34 -0800, Mike Kubal wrote:
> > > >> I tried running with the account id removed from the
> > > >> sites.file as in the following line:
> > > >>
> > > >> <profile namespace="globus" key=""></profile>
> > > >>
> > > >> but received the same error.
> > > >>
> > > >>
> > > >>
> > > >> --- Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > >>
> > > >>> Is this the same for pre-WS GRAM?
> > > >>>
> > > >>> On Tue, 2008-02-12 at 14:20 -0600, Stuart Martin wrote:
> > > >>>> that's right, qsub is used for PBS (and some others too)
> > > >>>> bsub is LSF
> > > >>>> condor_q for condor
> > > >>>> ...
> > > >>>>
> > > >>>> -Stu
> > > >>>>
> > > >>>> On Feb 12, 2008, at Feb 12, 2:15 PM, Mihael Hategan wrote:
> > > >>>>
> > > >>>>>
> > > >>>>> On Tue, 2008-02-12 at 12:09 -0800, Mike Kubal wrote:
> > > >>>>>> I'll give it a try.
> > > >>>>>>
> > > >>>>>> When using GRAM4, is qsub the method used to
> > > >>>>>> ultimately put the job in the queue?
> > > >>>>>
> > > >>>>> Looks like it. I also believe it's the case with pre-ws gram.
> > > >>>>> Stu may be able to clarify.
> > > >>>>>
> > > >>>>>>
> > > >>>>>> MikeK
> > > >>>>>> --- Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > >>>>>>
> > > >>>>>>> While this doesn't solve the underlying problem, it may help you get
> > > >>>>>>> this to work: log into tg-login1.uc..., set this project as default,
> > > >>>>>>> then remove the project spec from the sites file and try again.
> > > >>>>>>>
> > > >>>>>>> Mihael
> > > >>>>>>>
> > > >>>>>>> On Tue, 2008-02-12 at 11:36 -0800, Mike Kubal wrote:
> > > >>>>>>>> Yes, I believe you are right. The kickstart message
> > > >>>>>>>> may be only a warning. After digging a little deeper
> > > >>>>>>>> it appears the job is failing due to a project/account
> > > >>>>>>>> id problem. I get the following error:
> > > >>>>>>>>
> > > >>>>>>>> Caused by:
> > > >>>>>>>> The executable could not be started., qsub:
> > > >>>>>>>> Invalid Account MSG=invalid account
> > > >>>>>>>>
> > > >>>>>>>> I am specifying the same TG-account in my site-file
> > > >>>>>>>> for the gram4 run that fails, as in the site-file for
> > > >>>>>>>> the pre-ws job that succeeds. This is the same project,
> > > >>>>>>>> TG-MCA01S018, that is set in my .tg_default_project
> > > >>>>>>>> file in ~kubal/ on the UC teragrid.
> > > >>>>>>>>
> > > >>>>>>>> --- Ben Clifford <benc at hawaga.org.uk> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> yeah, run that same without kickstart. the error reported is that
> > > >>>>>>>>> kickstart didn't work right - but there's perhaps
> > > >>>>>>>>> some underlying error.
> > > >>>>>>>>> --
=== message truncated ===