[MPICH] MPI_REDUCE with MPI_IN_PLACE fails with memory error

Martin Kleinschmidt mk at theochem.uni-duesseldorf.de
Wed Mar 14 06:27:14 CDT 2007


On Tue, 13 Mar 2007, Rajeev Thakur wrote:

>> Does this mean that mpich will allocate its own buffer regardless of
>> what is passed as recvbuf?
>
>There is no recvbuf on non-root nodes (that argument is ignored), so
>allocation is necessary. 

So there is really no way to force MPI to use a user-supplied buffer,
and it will always allocate its own?
I must say: I don't like that ;-)
Is this behaviour intended? If so, for what reason?

>> I rewrote my code by identifying a vector  which can easily be swapped
>> to disk, and using this vector as the recvbuf argument, then rereading
>> this vector from disk:
>
>You don't need to do that. Just split the one big reduce into 5 smaller
>reduces. As an example for a 100 element buffer, you could do:
>
>call MPI_Reduce(buf(1), count=20,...)
>call MPI_Reduce(buf(21), count=20,...)
>call MPI_Reduce(buf(41), count=20,...)
>call MPI_Reduce(buf(61), count=20,...)
>call MPI_Reduce(buf(81), count=20,...)
>
>You can continue using MPI_IN_PLACE as before. 

Yes, OK.
But, as I said in another post, we are very often running at the
limit of the available memory. So I would have to break the reduce down
into really small pieces in a loop, say 1 MB blocks to be on the safe side.
Wouldn't that noticeably decrease throughput?
For my current problem this is not an issue, because this point in the
calculation is only reached after every 4 hours of computing on these
600 MB, but there could be other scenarios...
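
For reference, a minimal sketch of the blocked loop discussed above. It is
an illustration only, not MPICH's or anyone's actual code: the routine name
reduce_in_blocks and the block_len parameter are made up, and the data type
(double precision) and operation (MPI_SUM) are assumptions, since the
thread does not state them. The idea is that each call reduces only one
block, so any temporary buffer the library allocates should be roughly
block-sized rather than full-sized.

```fortran
! Sketch only: reduce_in_blocks and block_len are hypothetical names,
! not part of MPI or MPICH. Assumes double precision data, MPI_SUM,
! and block_len >= 1.
subroutine reduce_in_blocks(buf, n, block_len, root, comm)
  implicit none
  include 'mpif.h'
  integer, intent(in) :: n, block_len, root, comm
  double precision, intent(inout) :: buf(n)
  double precision :: unused(1)   ! recvbuf is ignored on non-root ranks
  integer :: rank, ierr, first, cnt

  call MPI_Comm_rank(comm, rank, ierr)

  do first = 1, n, block_len
     ! The last block may be shorter than block_len.
     cnt = min(block_len, n - first + 1)
     if (rank == root) then
        ! Root passes MPI_IN_PLACE as sendbuf; the result lands in buf.
        call MPI_Reduce(MPI_IN_PLACE, buf(first), cnt, &
                        MPI_DOUBLE_PRECISION, MPI_SUM, root, comm, ierr)
     else
        ! Non-root ranks send their block; recvbuf is not used.
        call MPI_Reduce(buf(first), unused, cnt, &
                        MPI_DOUBLE_PRECISION, MPI_SUM, root, comm, ierr)
     end if
  end do
end subroutine reduce_in_blocks
```

With 8-byte reals, a 1 MB block would be block_len = 131072, e.g.
"call reduce_in_blocks(vec, n, 131072, 0, MPI_COMM_WORLD)", built with the
MPI compiler wrapper (mpif90 or similar). Every rank has to call it with
the same n and block_len so that the individual reduces match up. How much
throughput this costs compared to one large reduce would have to be
measured.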

   ...martin



