[mpich-discuss] problem with collective on sub-communicator
Pavan Balaji
balaji at mcs.anl.gov
Thu Nov 10 22:39:20 CST 2011
Ok, there seems to be a problem with your application. After you do the
final MPI_Send from a process, you don't wait for a response before
doing the MPI_Reduce. Here's what's happening. Consider a three-process
run, where rank 0 is the master. Ranks 1 and 2 first send regular
messages to rank 0, and then rank 1 alone sends a final message.
rank 0:
    MPI_Recv(REGULAR_MESSAGE);
    MPI_Recv(REGULAR_MESSAGE);
    MPI_Recv(FINAL_SUM);

rank 1:
    MPI_Send(REGULAR_MESSAGE);
    MPI_Barrier(between ranks 1 and 2);
    MPI_Send(FINAL_SUM);

rank 2:
    MPI_Send(REGULAR_MESSAGE);
    MPI_Barrier(between ranks 1 and 2);

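In this example, suppose all other communication is fast, but the regular
message from rank 2 to rank 0 is really slow. MPI_Send is allowed to return
on rank 2 once the message has been buffered, which can be before it reaches
rank 0. So before rank 2's message reaches rank 0, rank 2 can finish the
barrier with rank 1, and rank 1 can send the final message to rank 0. Rank 0
then sees FINAL_SUM ahead of rank 2's REGULAR_MESSAGE.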
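
To make the reordering concrete, here is a minimal sketch in plain C (no MPI
calls) that models only the order in which messages reach rank 0. It assumes
the master takes messages in arrival order, as it would with a wildcard
receive; the thread does not show the master's actual receive calls, so that
is an assumption. The names msg_kind and regulars_overtaken are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message kinds, named after the tags in the pseudocode. */
enum msg_kind { REGULAR_MESSAGE, FINAL_SUM };

/*
 * Model of the master's view. `arrivals` lists the messages in the order
 * they reach rank 0. The function scans up to the first FINAL_SUM and then
 * counts the REGULAR_MESSAGEs that arrive after it, i.e. the ones that the
 * final message overtook. A return value of 0 means nothing was overtaken.
 */
size_t regulars_overtaken(const enum msg_kind *arrivals, size_t n)
{
    size_t i = 0;
    size_t late = 0;

    assert(arrivals != NULL || n == 0);

    /* Skip everything that arrived before the final message. */
    while (i < n && arrivals[i] != FINAL_SUM)
        i++;

    /* Count regular messages that show up only after the final one. */
    for (size_t j = i + 1; j < n; j++) {
        if (arrivals[j] == REGULAR_MESSAGE)
            late++;
    }
    return late;
}
```

With the arrival order REGULAR_MESSAGE, REGULAR_MESSAGE, FINAL_SUM the
function returns 0. With REGULAR_MESSAGE (from rank 1), FINAL_SUM,
REGULAR_MESSAGE (from rank 2) it returns 1, which is the slow-message case
described above. The barrier between ranks 1 and 2 constrains neither
arrival order at rank 0, because rank 0 is not part of that barrier.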
Does that make sense?
-- Pavan
On 11/10/2011 10:24 PM, Miguel Oliveira wrote:
> Hi,
>
> Here is why....
>
> 93:master_slave miguel$ mpicc -o master_slave master_slave.c
> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
>
> Average=5.478889e+00
>
> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
>
> Average=5.563333e+00
>
> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
>
> Average=5.384444e+00
>
> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
>
> Average=5.565556e+00
>
> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
> Oops...
>
> This output is generated with the attached code which only differs from the one I sent previously by the if statement that prints the "Oops..." in the deadlock case....
>
> Cheers,
>
> MAO
>
>
> On Nov 11, 2011, at 04:16 , Pavan Balaji wrote:
>
>>
>> Even if I remove it, the program just prints a number. What makes you believe that the MPI_Send is overtaking the Reduce?
>>
>> -- Pavan
>>
>> On 11/10/2011 10:12 PM, Miguel Oliveira wrote:
>>> Hi,
>>>
>>> Sorry, sent the version with the only correction I found that makes it work. Remove the MPI_Barrier(world) in both the master and the slaves and you should be able to reproduce the problem.
>>>
>>> Cheers,
>>>
>>> MAO
>>>
>>> On Nov 11, 2011, at 03:31 , Pavan Balaji wrote:
>>>
>>>>
>>>> The test code you sent seems to work fine for me. I don't see any such problem.
>>>>
>>>> -- Pavan
>>>>
>>>> On 11/10/2011 03:14 PM, Miguel Oliveira wrote:
>>>>> Hi all,
>>>>>
>>>>> I wrote a very simple master/slave code in MPI and I'm having problems with MPI_Reduce, or even MPI_Barrier, inside a subset of the world communicator.
>>>>> These operations don't seem to be waiting for all the processes in the subgroup.
>>>>>
>>>>> The code is a straightforward master/slave case where the master generates random numbers when requested and then retrieves the sum of these numbers, which is
>>>>> reduced on the slaves.
>>>>>
>>>>> When run on more than three processes, it sometimes happens that the message sent after the reduction, by one of the slaves to inform the master of the final
>>>>> result, gets to the master before some of the requests for random numbers... This ought to be impossible with a blocking reduction...
>>>>>
>>>>> Am I missing something?
>>>>>
>>>>> Code is attached.
>>>>>
>>>>> Help is appreciated.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> MAO
>>>>
>>>> --
>>>> Pavan Balaji
>>>> http://www.mcs.anl.gov/~balaji
>>>
>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji
>
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji