[Swift-devel] coastersPerNode not recognized by GT2 GRAM

Michael Wilde wilde at mcs.anl.gov
Wed Jul 30 18:59:42 CDT 2008


Ben applied a fix. I will test:

-------- Original Message --------
Subject: Re: [Swift-devel] coastersPerNode not recognized by GT2 GRAM
Date: Wed, 30 Jul 2008 16:54:47 +0000 (GMT)
From: Ben Clifford <benc at hawaga.org.uk>
To: Michael Wilde <wilde at mcs.anl.gov>
CC: swift-devel <swift-devel at ci.uchicago.edu>
References: <489053A9.4080906 at mcs.anl.gov> 
<48907DF1.8020009 at mcs.anl.gov> <48909148.1010504 at mcs.anl.gov>

Try cog r2123. I just tested it against NCSA TeraGrid. It now filters out
the coastersPerNode attribute before sending the job on to GRAM2.
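
As a rough illustration of that kind of filtering, here is a minimal Python
sketch. It is not the actual Java change in cog r2123: the names
COASTER_ONLY_ATTRIBUTES, filter_for_gram2 and to_rsl are hypothetical, the
maxWallTime entry is only sample data, and the RSL generation is heavily
simplified.

```python
# Hypothetical sketch, not the CoG source: drop coaster-only profile
# attributes before the remaining ones are turned into GT2 RSL.

# Assumption: keys are compared case-insensitively.
COASTER_ONLY_ATTRIBUTES = {"coasterspernode"}


def filter_for_gram2(attributes):
    """Return a copy of the attributes without coaster-only keys."""
    return {
        key: value
        for key, value in attributes.items()
        if key.lower() not in COASTER_ONLY_ATTRIBUTES
    }


def to_rsl(executable, attributes):
    """Build a simplified RSL string from the given attributes."""
    parts = ["&", "(executable={})".format(executable)]
    for key in sorted(attributes):
        parts.append("({}={})".format(key, attributes[key]))
    return "".join(parts)


# Sample globus-namespace profile entries (maxWallTime is illustrative).
profile = {"coastersPerNode": "8", "maxWallTime": "60"}
rsl = to_rsl("/bin/sh", filter_for_gram2(profile))
print(rsl)
```

The original profile is left untouched, so in this sketch the coaster side
could still read coastersPerNode while GRAM2 only sees attributes it
understands.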
-- 

On 7/30/08 6:32 PM, Mihael Hategan wrote:
> On Wed, 2008-07-30 at 09:42 -0500, Michael Wilde wrote:
>> I need to set this aside for now, but would appreciate any help in 
>> debugging it.
>>
>> My sites.xml file is:
>>
>> <config>
>> <pool handle="abe" >
>>    <execution provider="coaster" url="grid-abe.ncsa.teragrid.org" jobManager="gt2:gt2:pbs" />
>>    <gridftp url="gsiftp://gridftp-abe.ncsa.teragrid.org"/>
>>    <workdirectory>/u/ac/wilde/swiftwork</workdirectory>
>>    <profile namespace="globus" key="coastersPerNode">8</profile>
>> </pool>
>> </config>
>>
>> The logs are on CI net at /home/wilde/coast/run17.
>>
>> I don't see how this works. Is the code picking up the parameter from the
>> globus RSL and then passing it to bootstrap.sh, which in turn passes it to
>> the coaster server? It needs to be stripped off the globus profile
>> before the GT2 job that launches bootstrap.sh is run, right? Otherwise
>> Globus will complain that it's not valid RSL?
> 
> Right. Actually the coaster code should explicitly avoid passing that
> attribute to gt2. So I consider this a bug.
> 
>> I see the one test case for this in tests/sites/coaster is for 
>> localhost. Was it tested on gt2:gt2:pbs?
>>
>> - Mike
>>
>>
>>
>> On 7/30/08 6:42 AM, Michael Wilde wrote:
>>> When I use coastersPerNode on Abe I get an error:
>>>
>>> 2008-07-30 01:06:43,498-0500 DEBUG vdl:execute2 APPLICATION_EXCEPTION jobid=echo-9g3mu8xi - Application exception: Cannot submit job
>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot submit job
>>> at org.globus.cog.abstraction.impl.execution.gt2.JobSubmissionTaskHandler.submitSingleJob(JobSubmissionTaskHandler.java:162)
>>>
>>> ...
>>> Caused by: org.globus.gram.GramException: Parameter not supported
>>>
>>> Do the globus namespace options get put into RSL in GRAM2, and do these 
>>> need to be valid Globus options?
>>>
>>> I assume this option is in the globus namespace because that was a
>>> convenient way of adding per-site options? Or was this meant to be in
>>> the karajan namespace?
>>>
>>> I'll dig deeper, but wonder if this has been tested on TeraGrid sites.
>>>
>>> - Mike
>>>
>>>
>>> On 7/27/08 2:58 PM, Ben Clifford wrote:
>>>  > cog svn r2094 introduces a profile property coastersPerNode which allows
>>>  > you to spawn multiple coaster workers on a node. this should allow you to
>>>  > take advantage of sites which have multicore CPUs but allocate the whole
>>>  > node, rather than an individual core, when a job is submitted.
>>>  >
>>>  > When using coasters, add this to the site definition:
>>>  >
>>>  >     <profile namespace="globus" key="coastersPerNode">5</profile>
>>>  >
>>>  >
>>>  > to get, e.g., 5 workers on each node.
>>>  >
>>> _______________________________________________
>>> Swift-devel mailing list
>>> Swift-devel at ci.uchicago.edu
>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> 


