[mpich-discuss] messages queue

Rajeev Thakur thakur at mcs.anl.gov
Fri Jul 4 11:18:02 CDT 2008


In case you replaced each send by an isend followed immediately by a wait:
what I meant was to post all the sends as isends and then wait for all of
them with a single waitall. Not sure if it will help, but it is worth trying.
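
Roughly like this - just a sketch, with the buffer names, message count, and
tag made up for illustration, not taken from your code:

    #include <mpi.h>

    #define NMSG   100     /* number of outstanding sends (assumed) */
    #define MSGLEN 28      /* message size in bytes, as in your mail */

    void send_batch(char bufs[NMSG][MSGLEN], int dest, MPI_Comm comm)
    {
        MPI_Request reqs[NMSG];
        int i;

        /* Post every send as a nonblocking MPI_Isend ... */
        for (i = 0; i < NMSG; i++)
            MPI_Isend(bufs[i], MSGLEN, MPI_CHAR, dest, 0, comm, &reqs[i]);

        /* ... and complete the whole batch with a single MPI_Waitall. */
        MPI_Waitall(NMSG, reqs, MPI_STATUSES_IGNORE);
    }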

Another option is to insert a barrier after every 20 or 30 sends. If not all
processes are participating in the barrier, you could instead have the
receiver send a small message back and have the sender wait for it.
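
The second (acknowledgement) variant could look roughly like this sketch;
the interval of 20, the tag, and the function names are made up for
illustration:

    #include <mpi.h>

    #define ACK_INTERVAL 20   /* release the sender every 20 messages (assumed) */
    #define ACK_TAG      999  /* tag reserved for the go-ahead message (assumed) */

    /* Sender side: block on a zero-byte go-ahead from the receiver every
     * ACK_INTERVAL sends, so the receiver's queue cannot grow without bound. */
    void throttled_send(const char *buf, int len, int dest, MPI_Comm comm)
    {
        static int sent = 0;

        MPI_Send((void *) buf, len, MPI_CHAR, dest, 0, comm);
        if (++sent % ACK_INTERVAL == 0)
            MPI_Recv(NULL, 0, MPI_CHAR, dest, ACK_TAG, comm, MPI_STATUS_IGNORE);
    }

    /* Receiver side: call this after draining ACK_INTERVAL messages. */
    void release_sender(int src, MPI_Comm comm)
    {
        MPI_Send(NULL, 0, MPI_CHAR, src, ACK_TAG, comm);
    }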

What is happening is a flow control problem, and the above are ways to get
around it.
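
For reference, a self-contained sketch of the kind of pattern being described
(the message size, count, and artificial delay are assumptions): rank 0 keeps
issuing small eager sends while rank 1 drains them slowly with MPI_Irecv +
MPI_Waitany, so unmatched messages pile up at the receiver.

    #include <mpi.h>
    #include <stdio.h>
    #include <unistd.h>

    #define NMSG   2000
    #define MSGLEN 28

    int main(int argc, char **argv)
    {
        int rank, i;
        char buf[MSGLEN] = {0};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {                        /* fast sender */
            for (i = 0; i < NMSG; i++) {
                MPI_Send(buf, MSGLEN, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                printf("sent %d\n", i);
            }
        } else if (rank == 1) {                 /* slow receiver */
            MPI_Request req;
            int idx;
            for (i = 0; i < NMSG; i++) {
                MPI_Irecv(buf, MSGLEN, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &req);
                MPI_Waitany(1, &req, &idx, MPI_STATUS_IGNORE);
                usleep(10000);                  /* simulate slow processing */
            }
        }

        MPI_Finalize();
        return 0;
    }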

Rajeev

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Jaroslaw Bulat
> Sent: Friday, July 04, 2008 10:23 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: RE: [mpich-discuss] messages queue
> 
> Changing MPI_Send to MPI_Isend + MPI_Wait does not change anything.
> The system is too complex to send a small test program; however, I'm
> working on it - I'm trying to extract a small fragment of it.
> 
> I've tested a few configurations with the following results:
> 
> mpich2-1.0.7:
> 1) ch3:nemesis
> 2) ch3:sock
> 3) ch3:shm
> 
> The first configuration crashes after 67 unprocessed messages (every
> time), the second freezes after 2000-4000 unprocessed messages, and the
> third after 18 unprocessed messages (every time).
> Freeze means I wasn't able to send another message; however, I was able
> to receive and process queued messages, thus shortening the queue, and
> then send another message, which is the expected behaviour.
> 
> The length of one message described above is 28 bytes; however, before
> this test (second phase) several other, longer messages (up to 1.2
> MBytes) had been successfully sent, received and processed (first phase).
> 
> configuration of MPI:
> mpich2version
> MPICH2 Version:        1.0.7
> MPICH2 Release date:    Unknown, built on Fri Jul  4 15:06:00 CEST 2008
> MPICH2 Device:        ch3:shm
> MPICH2 configure:     --enable-sharedlibs=gcc -prefix=/usr/ --enable-cxx --with-device=ch3:shm
> MPICH2 CC:     gcc  -O2
> MPICH2 CXX:     g++  -O2
> MPICH2 F77:     g77  -O2
> 
> Everything is running on Ubuntu 8.04 with a Core Duo processor (2
> cores). MPICH as well as the program was compiled with gcc 4.2.3
> (Ubuntu 4.2.3-2-ubuntu7).
> 
> Any ideas? How can I test it more precisely?
> 
> 
> Regards,
> Jarek !
> 
> 
> On Thu, 2008-07-03 at 14:51 -0500, Rajeev Thakur wrote:
> > A queue of 100 messages of 100 bytes is not too big. What happens if
> > you replace MPI_Send with MPI_Isend? Can you send us a small test
> > program?
> > 
> > Rajeev
> >  
> > 
> > > -----Original Message-----
> > > From: owner-mpich-discuss at mcs.anl.gov 
> > > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Jaroslaw Bulat
> > > Sent: Thursday, July 03, 2008 7:57 AM
> > > To: mpich2
> > > Subject: [mpich-discuss] messages queue
> > > 
> > > Hi All!
> > > 
> > > I found the following problem using MPICH2 (1.0.6 and 1.0.7 with
> > > the sock and nemesis channels). There are 5 unique processes which
> > > exchange messages by means of MPI_Send() and MPI_Irecv() +
> > > MPI_Waitany() or MPI_Testany(). Since MPI_Send() doesn't wait until
> > > the receiver receives the message and processes it, it is possible
> > > to see a queue of messages waiting for processing at the receiver.
> > > In such a situation my sender process crashes while calling the
> > > MPI_Send() function. The queue of unprocessed messages is ~100
> > > messages long, each message 100 bytes.
> > > I cannot use MPI_Ssend(), which would resolve this problem, because
> > > that way my system is less responsive.
> > > 
> > > How can I control the length of the queue? 
> > > Is it possible to allocate more memory for the internal MPI buffer? 
> > > 
> > > 
> > > Regards,
> > > Jarek!