[mpich-discuss] problem with collective on sub-communicator

Miguel Oliveira m.a.oliveira at coimbra.lip.pt
Fri Nov 11 06:25:32 CST 2011


Hi Pavan,

You were crystal clear!!! So much so that I tested three solutions: a global barrier, using MPI_Ssend, and restructuring the master so it can cope with "out-of-order" messages. All three worked!
Performance-wise, the restructured version and the barrier are basically the same, while MPI_Ssend is clearly the worst.
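
For the record, here is a minimal sketch of the restructured-master idea, since that version performed best. The tag names, the three-requests-per-slave count, and the uniform(0,10) payload are illustrative assumptions, not the exact code from my attachments:

/* Sketch only: the master dispatches on the message tag instead of
 * assuming an arrival order, so an "early" final sum is simply stored
 * rather than mistaken for a request. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define TAG_REQUEST 1        /* slave asks the master for a number    */
#define TAG_DONE    2        /* slave has issued its last request     */
#define TAG_FINAL   3        /* reduced sum from the slave-group root */
#define NREQ        3        /* requests per slave (illustrative)     */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* sub-communicator holding every rank except the master (rank 0) */
    MPI_Comm slaves;
    MPI_Comm_split(MPI_COMM_WORLD, rank == 0 ? MPI_UNDEFINED : 0, rank, &slaves);

    if (rank == 0) {
        /* Master: keep serving until every slave has said DONE and the
         * final sum has arrived, in whatever order messages show up. */
        int done = 0, have_final = 0;
        double x, final_sum = 0.0;
        MPI_Status st;
        while (done < size - 1 || !have_final) {
            MPI_Recv(&x, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_REQUEST) {
                double r = 10.0 * rand() / RAND_MAX;
                MPI_Send(&r, 1, MPI_DOUBLE, st.MPI_SOURCE, TAG_REQUEST,
                         MPI_COMM_WORLD);
            } else if (st.MPI_TAG == TAG_DONE) {
                done++;                    /* that slave asks no more */
            } else {                       /* TAG_FINAL */
                final_sum = x;
                have_final = 1;
            }
        }
        printf("Average=%e\n", final_sum / (NREQ * (size - 1)));
    } else {
        /* Slave: request NREQ numbers, reduce the sum over the slave
         * group, and let the group's root report it to the master. */
        double sum = 0.0, dummy = 0.0, r, total;
        int i, srank;
        for (i = 0; i < NREQ; i++) {
            MPI_Send(&dummy, 1, MPI_DOUBLE, 0, TAG_REQUEST, MPI_COMM_WORLD);
            MPI_Recv(&r, 1, MPI_DOUBLE, 0, TAG_REQUEST, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            sum += r;
        }
        MPI_Send(&dummy, 1, MPI_DOUBLE, 0, TAG_DONE, MPI_COMM_WORLD);
        MPI_Reduce(&sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, slaves);
        MPI_Comm_rank(slaves, &srank);
        if (srank == 0)
            MPI_Send(&total, 1, MPI_DOUBLE, 0, TAG_FINAL, MPI_COMM_WORLD);
        MPI_Comm_free(&slaves);
    }
    MPI_Finalize();
    return 0;
}

(The global-barrier variant would amount to an MPI_Barrier over MPI_COMM_WORLD, master included, between the last request and the final send: the master only reaches it after receiving every request, so the final message cannot overtake them.)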

Cheers,

MAO
On Nov 11, 2011, at 04:39, Pavan Balaji wrote:

> 
> Ok, there seems to be a problem with your application. After you do the final MPI_Send from a process, you don't wait for a response before doing the MPI_Reduce. Here's what's happening. Consider a three process run, where rank 0 is the master. Ranks 1 and 2 first send regular messages to rank 0, and then rank 1 alone sends a final message.
> 
> rank 0:
> MPI_Recv(REGULAR_MESSAGE);
> MPI_Recv(REGULAR_MESSAGE);
> MPI_Recv(FINAL_SUM);
> 
> rank 1:
> MPI_Send(REGULAR_MESSAGE);
> MPI_Barrier(between ranks 1 and 2);
> MPI_Send(FINAL_SUM);
> 
> rank 2:
> MPI_Send(REGULAR_MESSAGE);
> MPI_Barrier(between ranks 1 and 2);
> 
> In this example, suppose all communication is fast, but assume that the MPI_Send from rank 2 to rank 0 is really, really slow. In this case, before rank 2's message reaches rank 0, rank 2 can finish the barrier with rank 1, and rank 1 can then send the final message to rank 0, where it matches the second MPI_Recv; rank 2's regular message then lands in the receive that was meant for the final sum.
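> 
> (A minimal sketch of why a synchronous send closes this race; the tags and buffers here are illustrative, not the attached code. MPI_Ssend does not complete until the receiver has started to receive the matching message, so by the time both ranks leave the barrier, rank 0 has matched both regular receives and the final message can only land in the third MPI_Recv:)
> 
> /* ranks 1 and 2 (sketch) */
> MPI_Ssend(&msg, 1, MPI_DOUBLE, 0, TAG_REGULAR, MPI_COMM_WORLD);
> /* returns only once rank 0 has started receiving this message */
> MPI_Barrier(slave_comm);      /* both regular messages now matched */
> if (rank == 1)                /* only rank 1 reports the final sum */
>     MPI_Send(&sum, 1, MPI_DOUBLE, 0, TAG_FINAL, MPI_COMM_WORLD);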
> 
> Does that make sense?
> 
> -- Pavan
> 
> On 11/10/2011 10:24 PM, Miguel Oliveira wrote:
>> Hi,
>> 
>> Here is why....
>> 
>> 93:master_slave miguel$ mpicc -o master_slave master_slave.c
>> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
>> 
>> Average=5.478889e+00
>> 
>> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
>> 
>> Average=5.563333e+00
>> 
>> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
>> 
>> Average=5.384444e+00
>> 
>> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
>> 
>> Average=5.565556e+00
>> 
>> 93:master_slave miguel$ mpiexec -n 10 ./master_slave
>> Oops...
>> 
>> This output was generated with the attached code, which differs from the one I sent previously only by the if statement that prints "Oops..." in the deadlock case....
>> 
>> Cheers,
>> 
>> MAO
>> 
>> On Nov 11, 2011, at 04:16, Pavan Balaji wrote:
>> 
>>> 
>>> Even if I remove it, the program just prints a number. What makes you believe that the MPI_Send is overtaking the Reduce?
>>> 
>>> -- Pavan
>>> 
>>> On 11/10/2011 10:12 PM, Miguel Oliveira wrote:
>>>> Hi,
>>>> 
>>>> Sorry, I sent the version with the only correction I found that makes it work. Remove the MPI_Barrier(world) in both the master and the slaves, and you should be able to reproduce the problem.
>>>> 
>>>> Cheers,
>>>> 
>>>> MAO
>>>> 
>>>> On Nov 11, 2011, at 03:31, Pavan Balaji wrote:
>>>> 
>>>>> 
>>>>> The test code you sent seems to work fine for me. I don't see any such problem.
>>>>> 
>>>>> -- Pavan
>>>>> 
>>>>> On 11/10/2011 03:14 PM, Miguel Oliveira wrote:
>>>>>> Hi all,
>>>>>> 
>>>>>> I wrote a very simple master/slave code in MPI and I'm having problems with MPI_Reduce, or even MPI_Barrier, inside a subset of the world communicator.
>>>>>> These operations don't seem to be waiting for all the processes in the subgroup.
>>>>>> 
>>>>>> The code is a straightforward master/slave case: the master generates random numbers on request and then retrieves the sum of these numbers, reduced on the slaves.
>>>>>> 
>>>>>> When run on more than three processes, it sometimes happens that the message sent after the reduction, from one of the slaves to inform the master of the final result, reaches the master before some of the requests for random numbers... This ought to be impossible with a blocking reduction...
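>>>>>> 
>>>>>> In outline, the pattern is roughly this (a reconstructed sketch, since the attachment is not inline; the tags, counts, and names are illustrative):
>>>>>> 
>>>>>> /* slave: request NREQ numbers, then reduce over the slave sub-communicator */
>>>>>> for (i = 0; i < NREQ; i++) {
>>>>>>     MPI_Send(&dummy, 1, MPI_DOUBLE, 0, TAG_REQUEST, MPI_COMM_WORLD);
>>>>>>     MPI_Recv(&r, 1, MPI_DOUBLE, 0, TAG_REQUEST, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>>>>>     sum += r;
>>>>>> }
>>>>>> MPI_Reduce(&sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, slave_comm);
>>>>>> if (slave_rank == 0)        /* root of the slave group reports back */
>>>>>>     MPI_Send(&total, 1, MPI_DOUBLE, 0, TAG_FINAL, MPI_COMM_WORLD);
>>>>>> 
>>>>>> /* master: expects all the requests first, then the final sum */
>>>>>> for (i = 0; i < NREQ * nslaves; i++) {
>>>>>>     MPI_Recv(&dummy, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
>>>>>>     /* if TAG_FINAL overtakes a pending request here, this run deadlocks */
>>>>>>     r = 10.0 * rand() / RAND_MAX;
>>>>>>     MPI_Send(&r, 1, MPI_DOUBLE, st.MPI_SOURCE, TAG_REQUEST, MPI_COMM_WORLD);
>>>>>> }
>>>>>> MPI_Recv(&total, 1, MPI_DOUBLE, MPI_ANY_SOURCE, TAG_FINAL, MPI_COMM_WORLD, &st);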
>>>>>> 
>>>>>> Am I missing something?
>>>>>> 
>>>>>> Code is attached.
>>>>>> 
>>>>>> Help is appreciated.
>>>>>> 
>>>>>> Cheers,
>>>>>> 
>>>>>> MAO
>>>>>> 
>>>>> 
>>>>> --
>>>>> Pavan Balaji
>>>>> http://www.mcs.anl.gov/~balaji
>>>> 
>>> 
>>> --
>>> Pavan Balaji
>>> http://www.mcs.anl.gov/~balaji
>> 
> 
> -- 
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
