[Swift-devel] Problems running coaster
Michael Wilde
wilde at mcs.anl.gov
Sun Jul 27 23:16:22 CDT 2008
On 7/27/08 11:07 PM, Mihael Hategan wrote:
> There's something I'm missing here. What sites.xml file are you using?
<config>
  <pool handle="abe">
    <execution provider="coaster" url="grid-abe.ncsa.teragrid.org"
               jobManager="gt2:pbs" />
    <profile namespace="karajan" key="jobThrottle">4</profile>
    <gridftp url="gsiftp://gridftp-abe.ncsa.teragrid.org"/>
    <workdirectory>/u/ac/wilde/swiftwork</workdirectory>
    <profile namespace="globus" key="project">TG-MCA01S018</profile>
    <!--altworkdirectory>/cfs/scratch/users/wilde/swiftwork</altworkdirectory-->
    <!--SwiftDACprofile namespace="globus"
        key="project">TG-CCR080002N</SwiftDACprofile-->
  </pool>
</config>
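
To double-check which settings in this file are actually active (the
commented-out entries are ignored), here is a minimal sketch that parses the
same XML with Python's standard library. It is not part of Swift; the
function name summarize_pools and the layout of the returned dict are made up
for illustration, and the XML string simply repeats the active entries from
the file above.

```python
import xml.etree.ElementTree as ET

# The active (non-commented) entries from the sites.xml shown above.
SITES_XML = """<config>
  <pool handle="abe">
    <execution provider="coaster" url="grid-abe.ncsa.teragrid.org"
               jobManager="gt2:pbs" />
    <profile namespace="karajan" key="jobThrottle">4</profile>
    <gridftp url="gsiftp://gridftp-abe.ncsa.teragrid.org"/>
    <workdirectory>/u/ac/wilde/swiftwork</workdirectory>
    <profile namespace="globus" key="project">TG-MCA01S018</profile>
  </pool>
</config>"""


def summarize_pools(xml_text):
    """Return one dict per <pool> (hypothetical helper, not a Swift API)."""
    root = ET.fromstring(xml_text)
    pools = []
    for pool in root.findall("pool"):
        execution = pool.find("execution")
        gridftp = pool.find("gridftp")
        # Index profiles by (namespace, key) so lookups are unambiguous.
        profiles = {
            (p.get("namespace"), p.get("key")): (p.text or "").strip()
            for p in pool.findall("profile")
        }
        pools.append({
            "handle": pool.get("handle"),
            "provider": execution.get("provider") if execution is not None else None,
            "url": execution.get("url") if execution is not None else None,
            "jobManager": execution.get("jobManager") if execution is not None else None,
            "gridftp": gridftp.get("url") if gridftp is not None else None,
            "workdirectory": (pool.findtext("workdirectory") or "").strip(),
            "profiles": profiles,
        })
    return pools


if __name__ == "__main__":
    for pool in summarize_pools(SITES_XML):
        print(pool)
```
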
When I run against localhost, I get a coaster-boot logfile in my home
directory on the submit host (the Swift host).
When I run against abe, I don't get such a log.
Is there something I can turn on in the coaster bootstrap phase to get
more logging?
Is there anything unusual in the GRAM request that starts the coaster
service that might not work on abe?
- Mike
>
> On Sun, 2008-07-27 at 22:50 -0500, Michael Wilde wrote:
>> On 7/27/08 2:51 PM, Ben Clifford wrote:
>>> I got the logs eventually.
>>>
>>> On the Abe log, I see this error:
>>>
>>> Caused by:
>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:
>>> Task ended before registration was received.
>>> STDOUT: This node is in dedicated user mode.
>>>
>>> The string 'This node is in dedicated user mode.' is coming from
>>> something outside of Swift, perhaps the local scheduler getting upset by
>>> coasters. Could you submit successfully without coasters at around the
>>> same time (within a few minutes) as you were unable to submit with coasters?
>> Yes. Just *before* I tried abe with coasters, I did a simple
>> globus-job-run to its pbs jobmanager. That worked fine.
>>
>> I've sent a ticket to TG Help asking if they recognize the "dedicated" message.
>>
>> Is the coaster server started with any special GRAM attributes that I
>> could provide to globus-job-run or globusrun to try to re-create the
>> problem?
>>
>>> For the localhost run, have a look in your home directory root for coaster
>>> and coaster worker log files that were generated at the same time as you
>>> did that run and send those / look in them.
>> I found the localhost problem in these logs: I didn't realize I needed a
>> grid proxy for localhost coaster runs. I made one, and that works now.