[Swift-devel] Re: latest attempt with GRAM4

Mihael Hategan hategan at mcs.anl.gov
Tue Feb 12 17:53:50 CST 2008


On Tue, 2008-02-12 at 17:41 -0600, Michael Wilde wrote:
> That would certainly explain why this clobbered the head node.
> I'm sorry that we all missed this last week.


This is what Joe killed at the time:

> kubal    28202 19438  1 16:50 ?        00:00:00 /usr/bin/perl /soft/prews-gram-4.0.1-r3/libexec/globus-job-manager-script.pl -m pbs -f /tmp/gram_SPsdme -c poll

Looks like PBS.

> 
> If true, we would have seen the applications running on the headnode.
> I wonder if anyone noticed?
> 
> Mike, here's a sample entry I've used in the past for UC-TG:
> 
> <pool handle="UC" gridlaunch="/home/wilde/swift/tools/mystart" 
> sysinfo="INTEL32::LINUX">
>      <gridftp url="gsiftp://tg-gridftp.uc.teragrid.org" 
> storage="/home/wilde/swiftdata/UC/storage" major="2" minor="2" />
>      <jobmanager universe="vanilla" 
> url="tg-grid.uc.teragrid.org/jobmanager-pbs" major="2" minor="2"/>
>      <workdirectory>/home/wilde/swiftdata/UC/work</workdirectory>
> </pool>
> 
> 
> The missing part is the "/jobmanager-pbs" in the url= tag of the 
> <jobmanager> element.

I think we may want to discourage that since it's not portable. I'd say
instead of <jobmanager>, one should use <execution provider="gt2"
jobManager="pbs" url="tg-grid.uc.teragrid.org"/>
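
As a rough, untested sketch, Mike's sample entry with that change would
look something like the following. Everything except the <execution> line
is copied from his entry; I'm assuming the other elements can stay as
they are.

```xml
<!-- Sketch only: the UC pool entry with <jobmanager> swapped for <execution>. -->
<pool handle="UC" gridlaunch="/home/wilde/swift/tools/mystart"
      sysinfo="INTEL32::LINUX">
  <gridftp url="gsiftp://tg-gridftp.uc.teragrid.org"
           storage="/home/wilde/swiftdata/UC/storage" major="2" minor="2" />
  <!-- The job manager is named by an attribute instead of being appended to the URL. -->
  <execution provider="gt2" jobManager="pbs" url="tg-grid.uc.teragrid.org" />
  <workdirectory>/home/wilde/swiftdata/UC/work</workdirectory>
</pool>
```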

Mihael

> 
> - mikew
> 
> 
> On 2/12/08 5:17 PM, Mihael Hategan wrote:
> > Are you sure you were using PBS with pre-ws GRAM and not fork?
> > 
> > On Tue, 2008-02-12 at 15:14 -0800, Mike Kubal wrote:
> >> With pre-WS-GRAM, it doesn't seem to matter which
> >> account/project id I use where. I can have the
> >> TG-MCB010025N specified in the sites-files and
> >> TG-MCA01S018 specified in ~kubal/.tg_default_project
> >> on the uc teragrid, or vice versa and it still works,
> >> or having them match in both places.
> >>
> >> With WS-GRAM, I have to use TG-MCB010025N, the local
> >> uc-teragrid project id, in both places. Using
> >> TG-MCA01S018, the teragrid wide charge number/account
> >> number, causes the qsub failure error.
> >>
> >>
> >>
> >>
> >> --- Michael Wilde <wilde at mcs.anl.gov> wrote:
> >>
> >>> Mike, did you do a recent test with pre-WS-GRAM with
> >>> the 
> >>> .tg_default_project file set *incorrectly*?
> >>>
> >>> I think the puzzle was why this would cause WS-GRAM
> >>> to fail but not 
> >>> pre-WS-GRAM, as it would seem they would both get
> >>> the TG account to use 
> >>> in the same manner.
> >>>
> >>> - mikew
> >>>
> >>> On 2/12/08 4:43 PM, Mike Kubal wrote:
> >>>> Just to be sure I tested with pre-WS and it worked
> >>>> also. 
> >>>>  
> >>>> --- Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >>>>
> >>>>> Would it be worth trying to find out why it
> >>> worked
> >>>>> with pre-WS GRAM?
> >>>>>
> >>>>> Mihael
> >>>>>
> >>>>> On Tue, 2008-02-12 at 14:20 -0800, Mike Kubal
> >>> wrote:
> >>>>>> Thanks Joe. This solved the account id problem.
> >>>>>>
> >>>>>> --- joseph insley <insley at mcs.anl.gov> wrote:
> >>>>>>
> >>>>>>> Mike K,
> >>>>>>>
> >>>>>>> looks like you have the wrong value in your
> >>>>>>> .tg_default_project file:
> >>>>>>>
> >>>>>>> insley at tg-viz-login1:~> more
> >>>>>>> ~kubal/.tg_default_project
> >>>>>>> TG-MCA01S018
> >>>>>>>
> >>>>>>> you should be using: TG-MCB010025N
> >>>>>>>
> >>>>>>> insley at tg-viz-login1:~> tgusage -i -u kubal
> >>>>>>>
> >>>>>>> [snip]
> >>>>>>>
> >>>>>>> Account: TG-MCA01S018
> >>>>>>> Title: Computational Studies of Complex
> >>>>> Processes in
> >>>>>>> Biological  
> >>>>>>> Macromolecular Systems
> >>>>>>> Resource: teragrid
> >>>>>>>
> >>>>>>> ****
> >>>>>>> Local project name on dtf.anl.teragrid is
> >>>>>>> TG-MCB010025N
> >>>>>>> ****
> >>>>>>>
> >>>>>>> Allocation Period: 2007-08-03 to 2008-03-31
> >>>>>>>
> >>>>>>> Name (Last First) or Account       Total     
> >>>>>>> Remaining        Usage
> >>>>>>> ----------------------------     ---------- 
> >>>>>>> ------------   ----------
> >>>>>>>     Kubal  Michael                101880 SU    
> >>>>>>> 99358 SU       296 SU
> >>>>>>>
> >> ----------------------------------------------------------------------
> >>>>>>>     TG-MCA01S018                  101880 SU    
> >>>>>>> 99358 SU      2522 SU
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Feb 12, 2008, at 2:37 PM, Mihael Hategan
> >>>>> wrote:
> >>>>>>>> You should probably remove the line
> >>>>> completely.
> >>>>>>>> Did you choose a default project on the login
> >>>>> node
> >>>>>>> with tgprojects?
> >>>>>>>> On Tue, 2008-02-12 at 12:34 -0800, Mike Kubal
> >>>>>>> wrote:
> >>>>>>>>> I tried running with the account id removed
> >>>>> from
> >>>>>>> the
> >>>>>>>>> sites.file as in the following line:
> >>>>>>>>>
> >>>>>>>>> <profile namespace="globus" key=""></profile>
> >>>>>>>>>
> >>>>>>>>> but received the same error.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --- Mihael Hategan <hategan at mcs.anl.gov>
> >>>>> wrote:
> >>>>>>>>>> Is this the same for pre-WS GRAM?
> >>>>>>>>>>
> >>>>>>>>>> On Tue, 2008-02-12 at 14:20 -0600, Stuart
> >>>>> Martin
> >>>>>>>>>> wrote:
> >>>>>>>>>>> that's right, qsub is used for PBS (and
> >>>>> some
> >>>>>>>>>> others too)
> >>>>>>>>>>> bsub is LSF
> >>>>>>>>>>> condor_q for condor
> >>>>>>>>>>> ...
> >>>>>>>>>>>
> >>>>>>>>>>> -Stu
> >>>>>>>>>>>
> >>>>>>>>>>> On Feb 12, 2008, at Feb 12, 2:15 PM, Mihael
> >>>>>>>>>> Hategan wrote:
> >>>>>>>>>>>> On Tue, 2008-02-12 at 12:09 -0800, Mike
> >>>>> Kubal
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>> I'll give it a try.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> When using GRAM4, is qsub the method used
> >>>>> to
> >>>>>>>>>>>>> ultimately put the job in the queue?
> >>>>>>>>>>>> Looks like it. I also believe it's the
> >>>>> case
> >>>>>>> with
> >>>>>>>>>> pre-ws gram. Stu
> >>>>>>>>>>>> may be
> >>>>>>>>>>>> able to clarify.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> MikeK
> >>>>>>>>>>>>> --- Mihael Hategan <hategan at mcs.anl.gov>
> >>>>>>> wrote:
> >>>>>>>>>>>>>> While this doesn't solve the underlying
> >>>>>>>>>> problem, it
> >>>>>>>>>>>>>> may help you get
> >>>>>>>>>>>>>> this to work: log into tg-login1.uc...,
> >>>>> set
> >>>>>>>>>> this
> >>>>>>>>>>>>>> project as default,
> >>>>>>>>>>>>>> then remove the project spec from the
> >>>>> sites
> >>>>>>>>>> file and
> >>>>>>>>>>>>>> try again.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Mihael
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, 2008-02-12 at 11:36 -0800, Mike
> >>>>>>> Kubal
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>> Yes, I believe you are right. The
> >>>>> kickstart
> >>>>>>>>>>>>>> message
> >>>>>>>>>>>>>>> may be only a warning. After digging a
> >>>>>>> little
> >>>>>>>>>>>>>> deeper
> >>>>>>>>>>>>>>> it appears the job is failing due to a
> >>>>>>>>>>>>>> project/account
> >>>>>>>>>>>>>>> id problem. I get the following error:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Caused by:
> >>>>>>>>>>>>>>>        The executable could not be
> >>>>>>> started.,
> >>>>>>>>>>>>>> qsub:
> >>>>>>>>>>>>>>> Invalid Account MSG=invalid account
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I am specifying the same TG-account in
> >>>>> my
> >>>>>>>>>>>>>> site-file
> >>>>>>>>>>>>>>> for the gram4 run that fails, as in the
> >>>>>>>>>> site-file
> >>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>> the pre-ws job that succeeds. This is
> >>>>> the
> >>>>>>> same
> >>>>>>>>>>>>>> project,
> >>>>>>>>>>>>>>> TG-MCA01S018, that is set in my
> >>>>>>>>>>>>>> .tg_default_project
> >>>>>>>>>>>>>>> file in ~kubal/ on the UC teragrid.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --- Ben Clifford <benc at hawaga.org.uk>
> >>>>>>> wrote:
> >>>>>>>>>>>>>>>> yeah, run that same without kickstart.
> >>>>> the
> >>>>>>>>>> error
> >>>>>>>>>>>>>>>> reported is that
> >>>>>>>>>>>>>>>> kickstart didn't work right - but
> >>>>> there's
> >>>>>>>>>>>>>> perhaps
> >>>>>>>>>>>>>>>> some underlying error.
> >>>>>>>>>>>>>>>> -- 
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >> === message truncated ===
> >>
> >>
> >>
> >>
> > 
> > 
> 



