[mpich-discuss] Re: MPI_Brecv vs multiple MPI_Irecv

Robert Kubrick robertkubrick at gmail.com
Fri Aug 29 12:18:35 CDT 2008


On Wed, Aug 27, 2008 at 3:20 PM, Jeff Squyres <jsquyres at cisco.com> wrote:

> (Note: this thread is separately spanning the two different MPI
> implementation mailing lists...)
>
>
> On Aug 27, 2008, at 1:51 PM, Robert Kubrick wrote:
>
>  For mpich2, the internal buffer space is limited by available memory. For
>>> each unexpected small message (<=128K for ch3:sock) mpich2 does a malloc and
>>> receives the message into that buffer.  So even unexpected small messages
>>> shouldn't block program flow...but you'll eventually crash if you run out of
>>> memory.
>>>
>>
>> Good to know.
>>
>
> Most MPI implementations use a similar strategy.
>
>  Yes. If you have a process that sends many small messages, such as logging
>> strings to a spooler process, reading the MPI standard leaves you with the
>> impression that MPI_Send might block until a matching receive has been
>> posted on the other side.
>>
>
> You should always write your code to assume that MPI_SEND *will* block.
>  Failure to do so will almost certainly result in "my code runs properly in
> MPI implementation X, but hangs in MPI implementation Y" (because X and Y
> provide differing amounts of internal buffer space).  This is a common
> complaint among newbie MPI programmers, but the standard is fairly clear on
> this point.
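
[A minimal sketch of that pitfall, assuming exactly two ranks doing a symmetric
exchange; illustrative only, not code from the original posts:]

    /* Both ranks call MPI_Send before posting a receive. This only "works"
     * while the implementation buffers the sends internally; once count is
     * larger than the eager threshold, both ranks block in MPI_Send and the
     * program deadlocks. */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, peer;
        const int count = 1 << 20;   /* large enough to defeat eager buffering */
        double *sendbuf = malloc(count * sizeof(double));
        double *recvbuf = malloc(count * sizeof(double));

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        peer = 1 - rank;             /* assumes exactly 2 ranks */

        /* Unsafe: relies on internal buffering. A portable fix is
         * MPI_Sendrecv, or pairing MPI_Irecv with MPI_Send. */
        MPI_Send(sendbuf, count, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
        MPI_Recv(recvbuf, count, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

        MPI_Finalize();
        free(sendbuf);
        free(recvbuf);
        return 0;
    }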
>
>  If sender performance is a priority, the solution is to queue those log
>> messages somewhere (either on the sending side or, better, on the
>> receiving side) to let the process continue execution. MPI_Isend won't do,
>> because the overhead of managing hundreds of requests would probably slow
>> execution down even more.
>>
>
> Maybe, maybe not (I assume you mean Irecv?).  With MPI_Irecv, the
> implementation may receive the message directly into your buffer (vs. an
> intermediate buffer and a later memcpy).  Meaning: the assumption that the
> performance gain is offset is not necessarily true.


So what it all boils down to is that the only way to control buffering in the
current standard is through the use of multiple MPI_Irecv?
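
[For example, a receiver can take control of buffering by pre-posting a ring
of MPI_Irecv requests. This is only a sketch assuming fixed-size, NUL-terminated
log messages; NUM_SLOTS and MSG_LEN are made-up values:]

    #include <mpi.h>
    #include <stdio.h>

    #define NUM_SLOTS 64    /* how many messages we are willing to buffer */
    #define MSG_LEN   256   /* fixed message size, assumed for the sketch */

    void spooler_loop(MPI_Comm comm)
    {
        char bufs[NUM_SLOTS][MSG_LEN];
        MPI_Request reqs[NUM_SLOTS];
        int i, idx;
        MPI_Status status;

        /* Pre-post a ring of receives: the implementation can deliver
         * incoming messages straight into these buffers instead of its
         * own internal ones. */
        for (i = 0; i < NUM_SLOTS; i++)
            MPI_Irecv(bufs[i], MSG_LEN, MPI_CHAR, MPI_ANY_SOURCE, 0,
                      comm, &reqs[i]);

        for (;;) {
            /* Wait for any slot to complete, consume it, then re-post it. */
            MPI_Waitany(NUM_SLOTS, reqs, &idx, &status);
            printf("log from rank %d: %s\n", status.MPI_SOURCE, bufs[idx]);
            MPI_Irecv(bufs[idx], MSG_LEN, MPI_CHAR, MPI_ANY_SOURCE, 0,
                      comm, &reqs[idx]);
        }
    }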



>
>
>  If process priority is reversed (sending process has low priority,
>> receiving process high), it's probably better to use
>> MPI_Buffer_attach/MPI_Bsend to move the buffering copy overhead to the
>> sender?
>>
>
> If you have a slow sender and a fast receiver, why not send immediately?
>  (vs. forcing a buffered send, which will almost certainly slow down your
> overall performance)


If the library implementation is multi-threaded, there might be a slight
advantage in buffering messages and continuing execution. Then again, if the
sender is a low-priority process it might make more sense to simply send
messages right away, as you point out.
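
[For reference, a minimal sketch of the sender-side buffering idea with
MPI_Buffer_attach/MPI_Bsend; MSG_LEN, MAX_PENDING, and the helper names are
illustrative assumptions, not anything from the original posts:]

    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    #define MSG_LEN      256
    #define MAX_PENDING  128   /* messages allowed to be buffered at once */

    static char *bsend_buf;

    void attach_log_buffer(void)
    {
        /* Each buffered message needs room for the payload plus
         * MPI_BSEND_OVERHEAD. */
        int size = MAX_PENDING * (MSG_LEN + MPI_BSEND_OVERHEAD);
        bsend_buf = malloc(size);
        MPI_Buffer_attach(bsend_buf, size);
    }

    void log_message(const char *text, int spooler_rank, MPI_Comm comm)
    {
        char msg[MSG_LEN];
        strncpy(msg, text, MSG_LEN - 1);
        msg[MSG_LEN - 1] = '\0';
        /* MPI_Bsend copies into the attached buffer and returns right away;
         * it errors out if buffer space is exhausted rather than blocking. */
        MPI_Bsend(msg, MSG_LEN, MPI_CHAR, spooler_rank, 0, comm);
    }

    void detach_log_buffer(void)
    {
        /* MPI_Buffer_detach blocks until all buffered messages are sent. */
        char *buf;
        int size;
        MPI_Buffer_detach(&buf, &size);
        free(buf);
    }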


>
>
> --
> Jeff Squyres
> Cisco Systems
>
>

