[mpich-discuss] problem with collective on sub-communicator

Pavan Balaji balaji at mcs.anl.gov
Thu Nov 10 22:16:05 CST 2011


Even if I remove it, the program just prints a number. What makes you 
believe that the MPI_Send is overtaking the Reduce?

  -- Pavan

On 11/10/2011 10:12 PM, Miguel Oliveira wrote:
> Hi,
>
> Sorry, I sent the version with the only correction I found that makes it work. Remove the MPI_Barrier(world) call in both the master and the slaves and you should be able to reproduce the problem.
>
> Cheers,
>
> MAO
>
> On Nov 11, 2011, at 03:31 , Pavan Balaji wrote:
>
>>
>> The test code you sent seems to work fine for me. I don't see any such problem.
>>
>> -- Pavan
>>
>> On 11/10/2011 03:14 PM, Miguel Oliveira wrote:
>>> Hi all,
>>>
>>> I wrote a very simple master/slave code in MPI and I'm having problems with MPI_Reduce, and even MPI_Barrier, inside a subset of the world communicator:
>>> these operations don't seem to wait for all of the processes in the subgroup.
>>>
>>> The code is a straightforward master/slave case: the master generates random numbers on request, and then retrieves the reduced sum of those
>>> numbers, computed on the slaves.
>>>
>>> When run on more than three processes, it sometimes happens that the message sent after the reduction, from one of the slaves to inform the master of the final
>>> result, reaches the master before some of the requests for random numbers... This ought to be impossible with a blocking reduction...
>>>
>>> Am I missing something?
>>>
>>> Code is attached.
>>>
>>> Help is appreciated.
>>>
>>> Cheers,
>>>
>>> MAO
>>>
>>>
>>>
>>>
>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji
>
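[Editor's note: the original attachment is not included in the archive. The following is a hypothetical reconstruction of the master/slave pattern described above -- a slave sub-communicator split off from MPI_COMM_WORLD, per-number request/reply with the master, an MPI_Reduce over the slaves only, and a final result message back to the master. All tags, counts, and names are illustrative, not taken from the original code.]

```c
/* Sketch of the described pattern (hypothetical, not the original
 * attachment).  Build: mpicc demo.c -o demo   Run: mpiexec -n 4 ./demo */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define TAG_REQUEST 1
#define TAG_NUMBER  2
#define TAG_RESULT  3
#define N_PER_SLAVE 4            /* numbers each slave requests */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Comm slaves;             /* sub-communicator: all ranks but 0 */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Collective split: master (rank 0) vs. slaves (everyone else). */
    MPI_Comm_split(MPI_COMM_WORLD, rank == 0 ? 0 : 1, rank, &slaves);

    if (rank == 0) {             /* master */
        int served = 0, total;
        srand(12345);
        /* Serve exactly one reply per request, using distinct tags so
         * the final-result message cannot be confused with a request. */
        while (served < (size - 1) * N_PER_SLAVE) {
            MPI_Status st;
            int dummy, r;
            MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE, TAG_REQUEST,
                     MPI_COMM_WORLD, &st);
            r = rand() % 100;
            MPI_Send(&r, 1, MPI_INT, st.MPI_SOURCE, TAG_NUMBER,
                     MPI_COMM_WORLD);
            served++;
        }
        MPI_Recv(&total, 1, MPI_INT, MPI_ANY_SOURCE, TAG_RESULT,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("master: total = %d\n", total);
    } else {                     /* slave */
        int i, srank, sum = 0, total = 0;
        for (i = 0; i < N_PER_SLAVE; i++) {
            int dummy = 0, r;
            MPI_Send(&dummy, 1, MPI_INT, 0, TAG_REQUEST, MPI_COMM_WORLD);
            MPI_Recv(&r, 1, MPI_INT, 0, TAG_NUMBER, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            sum += r;
        }
        /* Blocking reduce over the slave sub-communicator only.  Note
         * that MPI_Reduce is not required to synchronize: a non-root
         * slave may return before the other slaves have even entered
         * the call.  Only the sub-communicator root is guaranteed to
         * hold all contributions when it returns. */
        MPI_Reduce(&sum, &total, 1, MPI_INT, MPI_SUM, 0, slaves);
        MPI_Comm_rank(slaves, &srank);
        if (srank == 0)          /* one slave reports the final result */
            MPI_Send(&total, 1, MPI_INT, 0, TAG_RESULT, MPI_COMM_WORLD);
    }

    MPI_Comm_free(&slaves);
    MPI_Finalize();
    return 0;
}
```

Using separate tags (or sources) for requests and the final result avoids the ambiguity under discussion: MPI's non-overtaking guarantee orders messages only between one sender/receiver pair on one communicator, so a result message from slave A can legitimately arrive before a request from slave B.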

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

