[mpich-discuss] Problem with spawning child with same executable name

Pavan Balaji balaji at mcs.anl.gov
Wed Feb 9 16:31:19 CST 2011


So the first set of spawned processes has terminated before the next set is started, is that right?

Pavan Balaji @ iPhone
(Big fingers. Small email.)

On Feb 9, 2011, at 2:20 PM, Yauheni Zelenko <zelenko at cadence.com> wrote:

> But there is still a living process on host1. I think taking this fact into account would be more correct Hydra behaviour.
> 
> Eugene.
> ________________________________________
> From: Pavan Balaji [balaji at mcs.anl.gov]
> Sent: Wednesday, February 09, 2011 2:11 PM
> To: mpich-discuss at mcs.anl.gov
> Cc: Yauheni Zelenko
> Subject: Re: [mpich-discuss] Problem with spawning child with same executable name
> 
> On 02/09/2011 04:06 PM, Yauheni Zelenko wrote:
>> Then I ran the program with Hydra: mpiexec -host "host1:2,host2:2"
>> 
>> The master process ran on host1. On the first spawn, 1 child ran on
>> host1 and 2 on host2, but on subsequent spawns, 2 children ran on
>> host1 and 1 on host2.
>> 
>> I think such resource allocation may create balancing problems, and
>> Hydra should not spawn child processes on hosts that are still in use.
> 
> That sounds correct to me. Hydra looks at the host list as:
> 
> host1, host1, host2, host2, ..., [wrap around].
> 
> The master process is launched on the first "host1". When you spawn
> three processes the first time, it launches them on "host1", "host2",
> and "host2". When you spawn three processes the second time, it launches
> them on "host1", "host1", "host2". The next spawn of three processes
> will be "host2", "host1", "host1", etc.
> 
>  -- Pavan
> 
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
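
The wrap-around placement described in the quoted message can be reproduced with a small simulation. This is only a sketch of the behaviour as explained in this thread, not Hydra's actual code: the helper names expand_hosts and simulate are hypothetical, and it assumes that each launched process simply takes the next slot in the expanded host list, whether or not earlier processes have exited.

```python
from itertools import cycle, islice


def expand_hosts(spec):
    """Expand "host1:2,host2:2" into ["host1", "host1", "host2", "host2"]."""
    slots = []
    for entry in spec.split(","):
        name, _, count = entry.partition(":")
        # A host without an explicit count is assumed to contribute one slot.
        slots.extend([name] * int(count or 1))
    return slots


def simulate(spec, spawn_sizes):
    """Assign the master and each spawned group to consecutive slots."""
    slots = cycle(expand_hosts(spec))  # wrap around at the end of the list
    master = next(slots)               # the master takes the first slot
    spawns = [list(islice(slots, n)) for n in spawn_sizes]
    return master, spawns


master, spawns = simulate("host1:2,host2:2", [3, 3, 3])
print("master:", master)
for i, placement in enumerate(spawns, 1):
    print(f"spawn {i}:", placement)
```

Under these assumptions, the three spawns land on (host1, host2, host2), (host1, host1, host2) and (host2, host1, host1), matching the sequence given in the quoted message. The sketch never checks whether a slot is still occupied, which is the behaviour the original report questions.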

