[mpich-discuss] messages queue
Rajeev Thakur
thakur at mcs.anl.gov
Fri Jul 4 11:18:02 CDT 2008
In case you replaced each send with an isend + wait pair: I meant posting
all the sends as isends first and then waiting for all of them with a single
waitall. Not sure if it will help, but it is worth trying.
Another option is to insert a barrier after every 20 or 30 sends. If not all
processes are participating, you could have the receiver send a small
message and have the sender wait for it.
What is happening is a flow control problem, and the above are ways to get
around it.
Rajeev
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Jaroslaw Bulat
> Sent: Friday, July 04, 2008 10:23 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: RE: [mpich-discuss] messages queue
>
> Changing MPI_Send to MPI_Isend + MPI_Wait does not change
> anything. The system is too complex to send as a small test
> program; however, I'm working on it - I'm trying to extract a
> small fragment of it.
>
> I've tested a few configurations with the following results:
>
> mpich2-1.0.7:
> 1) ch3:nemesis
> 2) ch3:sock
> 3) ch3:shm
>
> The first configuration crashes after 67 (every time) unprocessed
> messages, the second configuration freezes after 2000-4000
> unprocessed messages, and the third after 18 (every time)
> unprocessed messages.
> "Freeze" means I wasn't able to send another message;
> however, I was able to receive and process queued messages,
> thus shortening the queue, and then send another message, which
> is the expected behaviour.
>
> The length of one message described above is 28 bytes;
> however, before this test (second phase) several other
> (longer, up to 1.2 MBytes) messages were successfully
> sent, received and processed (first phase).
>
> configuration of MPI:
> mpich2version
> MPICH2 Version: 1.0.7
> MPICH2 Release date: Unknown, built on Fri Jul 4 15:06:00 CEST 2008
> MPICH2 Device: ch3:shm
> MPICH2 configure: --enable-sharedlibs=gcc -prefix=/usr/ --enable-cxx --with-device=ch3:shm
> MPICH2 CC: gcc -O2
> MPICH2 CXX: g++ -O2
> MPICH2 F77: g77 -O2
>
> Everything is running on Ubuntu 8.04 with a Core Duo
> processor (2 cores).
> MPICH as well as the program were compiled with gcc 4.2.3
> (Ubuntu 4.2.3-2-ubuntu7).
>
> Any ideas? How can I test it more precisely?
>
>
> Regards,
> Jarek !
>
>
> On Thu, 2008-07-03 at 14:51 -0500, Rajeev Thakur wrote:
> > A queue of 100 messages of 100 bytes is not too big. What happens if
> > you replace MPI_Send with MPI_Isend? Can you send us a small test
> > program?
> >
> > Rajeev
> >
> >
> > > -----Original Message-----
> > > From: owner-mpich-discuss at mcs.anl.gov
> > > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Jaroslaw Bulat
> > > Sent: Thursday, July 03, 2008 7:57 AM
> > > To: mpich2
> > > Subject: [mpich-discuss] messages queue
> > >
> > > Hi All!
> > >
> > > I found the following problem using MPICH2 (1.0.6 and 1.0.7 with
> > > the sock and nemesis channels). There are 5 distinct processes which
> > > exchange messages by means of MPI_Send() and MPI_Irecv() +
> > > MPI_Waitany() or MPI_Testany(). Since MPI_Send() doesn't wait
> > > until the receiver receives the message and processes it, it is
> > > possible to see a queue of messages waiting for processing at the
> > > receiver. In such a situation my sender process crashes while
> > > calling the MPI_Send() function. The queue of unprocessed messages
> > > is ~100 messages long, each 100 bytes in length.
> > > I cannot use MPI_Ssend(), which resolves this problem, because
> > > it makes my system less responsive.
> > >
> > > How can I control the length of the queue?
> > > Is it possible to allocate more memory for the internal MPI buffer?
> > >
> > >
> > > Regards,
> > > Jarek!
> > >
> > >
> > >
> > >
>
>