[MPICH] Changing the comm size at runtime

Rajeev Thakur thakur at mcs.anl.gov
Wed Mar 14 11:25:07 CDT 2007


You should be able to use the MPI_Comm_spawn functionality for this. You can
run a single parent process as the master, which spawns some number of
slaves using MPI_Comm_spawn. When the slaves are done, they (and the master)
call MPI_Comm_disconnect. Then the slaves can call MPI_Finalize and exit.
The master can then spawn a new number of slaves and repeat. 
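
A minimal sketch of that cycle in C, not taken from the original reply. It
makes several assumptions for illustration: a single executable plays both
roles (it acts as the master when MPI_Comm_get_parent returns MPI_COMM_NULL
and as a slave otherwise) instead of separate master.exe and slave.exe; the
master respawns argv[0]; the "work" is a placeholder integer; and the slave
counts 1 and 4 are chosen only to mirror the SIZE = 2 and SIZE = 5 example
in the question. The helper names run_slave and run_round are hypothetical,
and error handling is omitted.

```c
#include <mpi.h>
#include <stdio.h>

/* Slave side: do one unit of placeholder work, report it to the master,
 * then detach from the master. */
static void run_slave(MPI_Comm parent)
{
    int rank, value;

    /* MPI_COMM_WORLD of a spawned process contains only the slaves
     * that were started by the same MPI_Comm_spawn call. */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    value = rank + 1;  /* placeholder for the real calculation */

    /* On an intercommunicator the destination rank refers to the remote
     * group, so rank 0 here is the master. */
    MPI_Send(&value, 1, MPI_INT, 0, 0, parent);

    MPI_Comm_disconnect(&parent);
}

/* Master side: spawn nslaves slaves, collect one result from each,
 * then disconnect so that a new round can use a different count. */
static void run_round(char *command, int nslaves)
{
    MPI_Comm slaves;  /* intercommunicator: master <-> spawned slaves */
    int i, value, sum = 0;

    MPI_Comm_spawn(command, MPI_ARGV_NULL, nslaves, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &slaves, MPI_ERRCODES_IGNORE);

    for (i = 0; i < nslaves; i++) {
        /* Source rank i refers to slave i in the remote group. */
        MPI_Recv(&value, 1, MPI_INT, i, 0, slaves, MPI_STATUS_IGNORE);
        sum += value;
    }
    printf("round with %d slave(s): sum = %d\n", nslaves, sum);

    MPI_Comm_disconnect(&slaves);
}

int main(int argc, char *argv[])
{
    MPI_Comm parent;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);

    if (parent == MPI_COMM_NULL) {
        /* Started directly: this process is the master. MPI_Init is
         * called only once; each round spawns a fresh set of slaves. */
        run_round(argv[0], 1);  /* master + 1 slave  (SIZE = 2) */
        run_round(argv[0], 4);  /* master + 4 slaves (SIZE = 5) */
    } else {
        /* Started by MPI_Comm_spawn: this process is a slave. */
        run_slave(parent);
    }

    MPI_Finalize();
    return 0;
}
```

The sketch would be started as a single process, e.g. with
"mpiexec -n 1" followed by the executable name. Note that the master and the
slaves do not share one MPI_COMM_WORLD here; they talk over the
intercommunicator returned by MPI_Comm_spawn, so "size" in the sense of the
question corresponds to the number of spawned slaves plus the master.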

Rajeev


> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Patrick Gräbel
> Sent: Wednesday, March 14, 2007 9:33 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] Changing the comm size at runtime
> 
> Hi!
> 
> I have two executables (Windows/MPICH-2):
> 
> master.exe
> slave.exe
> 
> Both programs set several environment variables (e.g. PMI_SIZE) to
> specify the communicator size, rank number and so on, so that the
> initial calls to MPI::Init block until the number of available ranks
> matches the size. After MPI::Init de-blocks, the cooperative
> calculation between master and slave (successfully) begins.
> 
> My problem is that I want to change the size while the master is
> running, and I can't call MPI::Init again. I can't afford to close
> the master program, i.e. I need to reuse the master so that I can do
> another calculation with a different number of slaves involved.
> Here is an example of what I mean:
> 
> 1. master starts with SIZE = 2 (RANK 0), blocks
> 2. slave starts with RANK 1
> 3. master and slave de-block as 2 available procs match SIZE
> 4. calculation starts
> 5. calculation is done, result is available
> 6. slave leaves.
> 7. master now wants to start again, but with SIZE = 5
> 
> Is there any chance to implement this behaviour with MPICH-2? What  
> functions/tools do I need for a solution?
> 
> Thanks
> Patrick



