[Swift-devel] Coasters and std's on ranger
Michael Wilde
wilde at mcs.anl.gov
Tue Jul 14 14:46:17 CDT 2009
So are any of the following reasonable ways to proceed?
1) Develop an SGE provider (hopefully heavily based on the PBS provider)
and run on Ranger locally.
2) Debug getting Coasters, GRAM and SGE to coexist nicely (i.e., the
debugging route in progress now)
3) Start the coaster service manually in one block allocation and have
it rendezvous with Swift
For (2), can we create a GRAM test job outside of Swift that we can
debug, to try to find a set of GRAM options that work? I need to read
the thread more carefully, but I don't understand whether the problem is
in Ranger SGE, the GRAM SGE jobmanager, or the interaction between them.
I'll re-read the thread before asking for more clarification; I
didn't get it on first read.
- Mike
On 7/14/09 2:33 PM, Mihael Hategan wrote:
> This isn't the app stdout/stderr, but the job stdout/stderr. They are
> redirected in coasters for debugging/accounting purposes, and with SGE
> because the [censored] thing doesn't work otherwise.
>
> On Tue, 2009-07-14 at 14:29 -0500, Michael Wilde wrote:
>> Will the current code work for Swift programs that don't use stdout or
>> stderr? (I.e., where the app wrappers redirect these to a file?)
>>
>> - Mike
>>
>> On 7/14/09 2:21 PM, Mihael Hategan wrote:
>>> On Tue, 2009-07-14 at 13:22 -0500, skenny at uchicago.edu wrote:
>>>> darn...
>>>>
>>>> Execution failed:
>>>> Exception in RInvoke:
>>>> Arguments: [scripts/4reg_dummy.R,
>>>> matrices/4_reg/network1/gestspeech.cov, 29, 0.5, speech]
>>>> Host: RANGER
>>>> Directory:
>>>> 4reg_speech-20090714-1309-b650zi68/jobs/3/RInvoke-351ksndj
>>>> stderr.txt:
>>>>
>>>> stdout.txt:
>>>>
>>>> ----
>>>>
>>>> Caused by:
>>>> Block task failed: 0714-090151-000000Block task ended
>>>> prematurely
>>>>
>>>> Cleaning up...
>>>> Shutting down service at https://129.114.50.163:38571
>>>>
>>>> i can file a bug report with TG if need be, but i'm not quite
>>>> sure the best thing to tell them (?) also, i'm wondering how
>>>> coasters was previously able to work around this bug?
>>> By redirecting stdout+stderr to memory, but that causes the "job manager
>>> could not stage out a file" problem.
>>>
>>>
>