[mpich-discuss] messages queue
Jarosław Bułat
kwant at agh.edu.pl
Fri Jul 11 03:16:38 CDT 2008
On Thu, 2008-07-10 at 14:16 -0500, Darius Buntinas wrote:
> The limits you are seeing (67msgs, 2K-4K msgs and 18 msgs) are due to
> the available buffers on unreceived sends. What happens is that in a
> blocking send, mpich2 tries to send the message, either over a shared
> memory queue (for nemesis and shm) or a tcp socket (for sock). If it
> can't send the message completely, it waits until it can. A message
> can't be sent if the shared memory queue is full (for nemesis or shm) or
> if the tcp buffers are full (for sock). And the queues or buffers fill
> up when the receiver is not receiving the messages fast enough.
>
> It seems like what's happening in your application is that the receiver
> processes are not calling an mpi call often enough to allow mpich2 to
> pull incoming messages off the queue or read messages from the socket.
> So once the socket buffers or queues fill up, the sender will block in
> mpi_send() until the receiver calls an mpi call and the messages are
> received. Note that even if the application doesn't post receives for
> the messages, MPICH2 will still receive them and buffer them internally
> as unexpected messages. The solution is to call an mpi function, like a
> send or receive function, or probe or iprobe from time to time to allow
> mpich2 to make progress and drain the incoming messages. Generally
> people use iprobe (because it's non-blocking and doesn't send or receive
> anything) to "poke" the mpi library and allow it to make progress.
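So on the receiver side an occasional "poke" would be enough, roughly like
this (my own sketch, not code taken from MPICH2; the probe result is simply
discarded):

#include <mpi.h>

/* Call this from the receiver's work loop from time to time so the MPI
 * library can make progress and drain incoming messages into its internal
 * (unexpected-message) buffers. */
static void poke_mpi(void)
{
    int flag;
    MPI_Status status;

    /* Non-blocking probe: receives nothing, but lets the library advance
     * its progress engine. The result is ignored on purpose. */
    MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &status);
}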
I understand this behaviour; in fact, I expected something similar.
However, I thought the internal buffer was much bigger, or at least that
it could be enlarged. My application works with video frames (~1MB). I
thought it would be possible to send a few messages (frames) into a queue
and receive them all at once (sometimes the sender is much faster than the
receiver, which is very busy and cannot receive all of the messages in
time).
I assume that using non-blocking sends instead of MPI_Send() would resolve
this problem. Is that true?
Is the internal MPICH2 buffer fixed in size, or can it be enlarged?
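To be concrete about what I mean by non-blocking sending, it would be
roughly this (just a sketch; NFRAMES, FRAME_BYTES and send_frames() are
placeholder names of mine):

#include <mpi.h>

#define NFRAMES      8                /* frames kept in flight (placeholder) */
#define FRAME_BYTES  (1024 * 1024)    /* ~1MB video frame (placeholder) */

/* Post several non-blocking sends and wait for all of them at once.
 * The sender no longer blocks inside a single MPI_Send(), although
 * MPI_Waitall() still blocks until the receiver has made enough MPI
 * calls for the data to be transferred. */
static void send_frames(char frames[NFRAMES][FRAME_BYTES], int dest, int tag)
{
    MPI_Request reqs[NFRAMES];
    int i;

    for (i = 0; i < NFRAMES; i++)
        MPI_Isend(frames[i], FRAME_BYTES, MPI_CHAR, dest, tag,
                  MPI_COMM_WORLD, &reqs[i]);

    MPI_Waitall(NFRAMES, reqs, MPI_STATUSES_IGNORE);
}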
> You can try to restructure your application so that either the receiver
> is calling an mpi function from time to time, or try using non-blocking
> sends then call wait on all of them at once, or even make all of your
> sends and receives non-blocking then call wait on everything. Another
> option is to create a "progress" thread which makes mpi calls to allow
> the library to make progress. E.g.:
>
> The progress thread would do something like this:
>
> void *prog_thread_func(void *arg) {
>     MPI_Recv(NULL, 0, MPI_INT, 0, DONE_TAG, MPI_COMM_SELF,
>              MPI_STATUS_IGNORE);
>     return NULL;  /* thread exits */
> }
>
> The main thread would start the progress thread at the beginning, then
> send a message to it just before joining with the progress thread:
>
> MPI_Send(NULL, 0, MPI_INT, 0, DONE_TAG, MPI_COMM_SELF);
> pthread_join(...);
>
> Note that the nemesis channel does busy polling, meaning that it is
> actively checking for incoming messages, even when it's in a blocking
> mpi call. This can have a performance impact if you have more threads
> than processors since this progress thread will be stealing cycles from
> the other threads doing real work. The sock channel doesn't have this
> issue.
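Just to check that I read the progress-thread idea right, the whole pattern
would be roughly the sketch below (my own reconstruction, not your exact
code; it assumes MPI_Init_thread() with MPI_THREAD_MULTIPLE, which I also
needed to stop ch3:nemesis from crashing when two of my threads make MPI
calls):

#include <mpi.h>
#include <pthread.h>

#define DONE_TAG 99   /* placeholder tag for the shutdown message */

/* Progress thread: blocks in MPI_Recv() on MPI_COMM_SELF, which keeps the
 * MPI progress engine running until the main thread sends the DONE_TAG
 * message. */
static void *prog_thread_func(void *arg)
{
    (void) arg;
    MPI_Recv(NULL, 0, MPI_INT, 0, DONE_TAG, MPI_COMM_SELF,
             MPI_STATUS_IGNORE);
    return NULL;   /* thread exits */
}

int main(int argc, char **argv)
{
    pthread_t prog_thread;
    int provided;

    /* Two threads make MPI calls concurrently, so request
     * MPI_THREAD_MULTIPLE instead of plain MPI_Init(); one should also
     * check that 'provided' really is MPI_THREAD_MULTIPLE. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    pthread_create(&prog_thread, NULL, prog_thread_func, NULL);

    /* ... real work: sending/receiving the video frames ... */

    /* Tell the progress thread to exit, then join it. */
    MPI_Send(NULL, 0, MPI_INT, 0, DONE_TAG, MPI_COMM_SELF);
    pthread_join(prog_thread, NULL);

    MPI_Finalize();
    return 0;
}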
I was wondering why MPI_Waitany() with the nemesis channel took 100% of the
CPU... I thought it was a bug; now I see it's by design. That is why I
replaced MPI_Waitany() with MPI_Testany() plus usleep(a few us).
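The replacement loop is roughly this (a sketch; wait_for_any() and the
10 us sleep are my own choices):

#include <mpi.h>
#include <unistd.h>   /* usleep() */

/* Poll for a completed request instead of blocking in MPI_Waitany(),
 * sleeping a few microseconds between polls so nemesis' busy waiting
 * does not occupy a whole core. */
static int wait_for_any(int n, MPI_Request reqs[], MPI_Status *status)
{
    int idx, flag;

    for (;;) {
        MPI_Testany(n, reqs, &idx, &flag, status);
        if (flag)
            return idx;   /* MPI_UNDEFINED if there were no active requests */
        usleep(10);       /* give the CPU back briefly */
    }
}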
Your explanation helps me a lot. Thanks!
Jarek!
>
> Hope this helps,
> -d
>
>
> On 07/07/2008 02:10 PM, Jarosław Bułat wrote:
> > On Mon, 2008-07-07 at 12:50 -0500, Rajeev Thakur wrote:
> >>>> What is happening is a flow control problem, and the above are ways
> >>>> to get around it.
> >>> Is it a problem of the MPICH library, or of my use of the library?
> >> One can blame it on the implementation, but an application can help by not
> >> sending too many messages without receiving them.
> >
> > I've partially resolved the problem. I found an error in my code: the
> > MPI_Send and MPI_Testany+MPI_Irecv calls were placed in different
> > threads while MPI was initialized with MPI_Init. Replacing MPI_Init
> > with MPI_Init_thread resolved the ch3:nemesis crash. When the queue is
> > "full", the system waits until the receiver processes at least one
> > message; however, the queue isn't long enough for my application.
> >
> > Is there any chance to increase the memory for the queue (i.e. enlarge
> > the number of messages that can be stored in it)?
> >
> >
> > Regards,
> > Jarek!