[mpich-discuss] Problem with spawning child with same executable name

Yauheni Zelenko zelenko at cadence.com
Thu Feb 10 20:41:59 CST 2011


Hi, Pavan!

I added some debugging output with timestamps.

A new set of children was spawned after the previous set of children had called MPI_Finalize. However, all processes exited only after the master terminated.

It definitely could lead to higher resource usage in the intended use of the program, since the children still hold some amount of system resources.

Also, I'm not sure that at this stage Hydra will have enough information to launch new child processes on the freed hosts.

Eugene.
________________________________________
From: Pavan Balaji [balaji at mcs.anl.gov]
Sent: Wednesday, February 09, 2011 2:31 PM
To: Yauheni Zelenko
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Problem with spawning child with same executable name

So the first set of spawned processes has terminated before the next set is started, is that right?

Pavan Balaji @ iPhone
(Big fingers. Small email.)

On Feb 9, 2011, at 2:20 PM, Yauheni Zelenko <zelenko at cadence.com> wrote:

> But there is still a living process on host1. I think taking this fact into account would be more correct Hydra behaviour.
>
> Eugene.
> ________________________________________
> From: Pavan Balaji [balaji at mcs.anl.gov]
> Sent: Wednesday, February 09, 2011 2:11 PM
> To: mpich-discuss at mcs.anl.gov
> Cc: Yauheni Zelenko
> Subject: Re: [mpich-discuss] Problem with spawning child with same executable name
>
> On 02/09/2011 04:06 PM, Yauheni Zelenko wrote:
>> Then I ran the program with Hydra: mpiexec -host "host1:2,host2:2"
>>
>> The master process ran on host1. On the first spawn, 1 child ran on
>> host1 and 2 on host2, but on subsequent spawns, 2 children ran on
>> host1 and 1 on host2.
>>
>> I think such resource allocation may create balancing problems, and
>> Hydra should not spawn child processes on hosts that are still in use.
>
> That sounds correct to me. Hydra looks at the host list as:
>
> host1, host1, host2, host2, ..., [wrap around].
>
> The master process is launched on the first "host1". When you spawn
> three processes the first time, it launches them on "host1", "host2",
> and "host2". When you spawn three processes the second time, it launches
> them on "host1", "host1", "host2". The next spawn of three processes
> will be "host2", "host1", "host1", etc.
>
>  -- Pavan
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
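
The wrap-around placement Pavan describes in the quoted message can be modelled with a small sketch. This is only an illustration of the behaviour as described in the thread, not Hydra's actual code: the names HOSTS, next_slot, slot_to_host, host_name and place are hypothetical, and the sketch assumes the host list "host1:2,host2:2" expands to the four slots host1, host1, host2, host2 with a running position that is never rewound when earlier processes exit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the placement described in the thread.
 * This is NOT Hydra code; all names here are made up for illustration. */
const char *const HOSTS[] = { "host1", "host2" };
enum { NUM_HOSTS = 2, SLOTS_PER_HOST = 2, NUM_SLOTS = NUM_HOSTS * SLOTS_PER_HOST };

/* Running position in the slot list host1, host1, host2, host2.
 * Assumption: it only moves forward, even after earlier processes exit. */
size_t next_slot = 0;

/* Map a slot number to an index into HOSTS, wrapping around the list:
 * slots 0,1 -> host1; slots 2,3 -> host2; slot 4 wraps back to host1. */
size_t slot_to_host(size_t slot)
{
    return (slot % NUM_SLOTS) / SLOTS_PER_HOST;
}

/* Return the host name for an index produced by slot_to_host(). */
const char *host_name(size_t host_idx)
{
    assert(host_idx < NUM_HOSTS);
    return HOSTS[host_idx];
}

/* Place `count` processes on consecutive slots and store the chosen
 * host indices in `host_idx` (caller provides `count` entries). */
void place(size_t count, size_t *host_idx)
{
    assert(host_idx != NULL);
    for (size_t i = 0; i < count; i++)
        host_idx[i] = slot_to_host(next_slot++);
}
```

Calling place(1, ...) once for the master and then place(3, ...) three times yields host1 for the master, followed by host1, host2, host2, then host1, host1, host2, then host2, host1, host1, which matches the sequence in Pavan's message. Under this model the position keeps advancing whether or not earlier children have exited, which is why later spawns can land on a host that is still busy.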
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Spawn.c
Type: application/octet-stream
Size: 2696 bytes
Desc: Spawn.c
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20110210/65abc774/attachment.obj>

