[mpich-discuss] Hydra issues

Pavan Balaji balaji at mcs.anl.gov
Wed Aug 26 14:37:42 CDT 2009


stdin should not affect this, for multiple reasons: (1) all sockets are 
in a poll(), (2) if you don't have stdin, there's no reading from stdin, 
and (3) two different processes deal with stdout/stderr and stdin: the 
proxies read stdout/stderr and the UI (mpiexec) reads stdin.
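
To illustrate point (1), here is a minimal sketch of what having every
descriptor in a single poll() buys you: the caller only reads from a
descriptor that is already readable, so an idle one (stdin, say) cannot
stall the others. This is not Hydra's actual code; first_readable() and
the 16-descriptor limit are made up for the example.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, not Hydra code: wait until at least one of the
 * given descriptors is readable (or has hung up) and return its index
 * in fds[]. Returns -1 on timeout, on error, or if nfds is outside the
 * illustrative 1..16 range. */
int first_readable(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd pfds[16];
    int i;

    if (nfds <= 0 || nfds > 16)
        return -1;

    for (i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    /* One call covers all descriptors; nothing blocks on a single fd. */
    if (poll(pfds, (nfds_t) nfds, timeout_ms) <= 0)
        return -1;

    for (i = 0; i < nfds; i++)
        if (pfds[i].revents & (POLLIN | POLLHUP))
            return i;

    return -1;
}
```

With two pipes where only the second has data, the call returns index 1
without ever waiting on the first.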

  -- Pavan

On 08/26/2009 02:33 PM, Scott Atchley wrote:
> Only rank 0 gets stdin. Is it blocking on read()?
> 
> Scott
> 
> On Aug 26, 2009, at 3:22 PM, Darius Buntinas wrote:
> 
>> This may be related:  On trunk, I get the output from rank 1
>> immediately, but I only get rank 0 output when the app exits.
>>
>> Maybe the buffer settings on the proxy aren't set correctly for rank 0's
>> stdout socket?
>>
>> -d
>>
>> On 08/26/2009 01:55 PM, Scott Atchley wrote:
>>> On Aug 26, 2009, at 1:38 PM, Dave Goodell wrote:
>>>
>>>> On Aug 26, 2009, at 9:19 AM, Dave Goodell wrote:
>>>>
>>>>> On Aug 26, 2009, at 8:58 AM, Scott Atchley wrote:
>>>>>
>>>>>> Ok, I was not patient enough. If I let it run, it eventually starts.
>>>>>> The actual walltime is nearly the same as when I use proxies, but
>>>>>> the stdout is delayed until the application completes.
>>>>>
>>>>> I think I know why this happens.  It ought to be fixed once I finish
>>>>> up some Hydra I/O work that was mostly done by one of our students
>>>>> this summer.  The work is basically done and lives on a development
>>>>> branch but it needs a little bit of cleanup and to be rebased onto
>>>>> the current trunk.
>>>>
>>>> I was mistaken on this particular issue.  There was another buffering
>>>> issue that we ran into during the student's work and I was thinking of
>>>> that one.  There isn't any current hydra-side fix for this that I know
>>>> of.
>>>>
>>>> However you could use one of the various workarounds for this, such as
>>>> an LD_PRELOADed setvbuf call:
>>>> http://lists.gnu.org/archive/html/bug-coreutils/2008-11/msg00164.html
>>>
>>> This does not change the behavior.
>>>
>>> I am still stumped as to why there is no delay when using persistent
>>> proxies (launch-mode=2) versus a delay with no proxies (launch-mode=1).
>>>
>>> Scott
> 
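
For reference, the LD_PRELOADed setvbuf workaround mentioned in the quoted
thread can be sketched roughly as below. This is an illustration only,
not the code from the linked post: make_line_buffered() and the file names
in the build line are made up, and the constructor attribute is a
GCC/Clang extension. Note that Scott reported above that this approach
did not change the behavior in his case.

```c
#include <stdio.h>

/* Hypothetical helper: switch a stream to line buffering. Like
 * setvbuf() itself, it returns 0 on success and nonzero on failure,
 * and must be called before any other operation on the stream. */
int make_line_buffered(FILE *stream)
{
    return setvbuf(stream, NULL, _IOLBF, BUFSIZ);
}

/* When this file is built as a shared object and named in LD_PRELOAD,
 * the dynamic loader runs this constructor before the target program's
 * main(), so stdout is line buffered even when it is a pipe or socket
 * rather than a terminal. */
__attribute__((constructor))
static void preload_line_buffer_stdout(void)
{
    make_line_buffered(stdout);
}
```

Assuming the file is saved as linebuf.c, something like
"gcc -shared -fPIC -o linebuf.so linebuf.c" followed by
"LD_PRELOAD=./linebuf.so ./app" would apply it to a single process.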

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

