[MPICH] Out of memory problem

Rajeev Thakur thakur at mcs.anl.gov
Thu Oct 18 13:29:19 CDT 2007


I was able to reproduce this with 1.0.5p4 but not with 1.0.6. Try 1.0.6.

Rajeev 

> -----Original Message-----
> From: Dmitri Chubarov [mailto:dmitri.chubarov at gmail.com] 
> Sent: Thursday, October 18, 2007 10:12 AM
> To: Rajeev Thakur
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [MPICH] Out of memory problem
> 
> Hello, Rajeev,
> 
> I have managed to isolate the problem in some 40 lines.
> Unfortunately, MPI_Barrier did not help "out of the box".
> 
> I suspect that the problem might be due to a memory leak in MPICH2.
> When running the program I observe that the resident set is growing
> very fast, which normally should not happen.
> 
> I would be very grateful if someone could run the following code with
> mpich2 1.0.5 to see if the problem is indeed reproducible, and with
> 1.0.6 to see if it is fixed in the latest version. It takes about
> 10 minutes with 3 processes on a 2.4GHz Opteron DC system.
> 
> Thank you,
>   Dima
> 
> --- code sample starts ---
> ! "outofmemory.f90"
> include "mpif.h"
> 
> integer :: NMAX = 200001
> integer :: NSTEP = 1
> 
> real*8 psi0(1000000),psi(200000)
> real*8 dens(100000),dens0(200000)
> 
> integer myrank,mysize
> integer M
> integer ierr
> integer i
> 
>   call MPI_Init(ierr)
>   call MPI_Comm_size(MPI_COMM_WORLD,mysize,ierr)
>   call MPI_Comm_rank(MPI_COMM_WORLD,myrank,ierr)
> 
>   do i = 0,NMAX,NSTEP
> ! compute a varying (pseudo-random) message size M
>       M = abs(sin(i/100 + 1.0))*100000.0/mysize
> 
> ! Bcast M
>       call MPI_Bcast(M,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
> 
>       if ((myrank .eq. 0) .AND. (mod(i,100000) .eq. 0)) then
>         write (*,*) myrank, M
>       endif
> 
> ! Do a scatter
>       call MPI_Scatter(psi0,M*2,MPI_REAL8,psi,M*2,MPI_REAL8,0,MPI_COMM_WORLD,ierr)
> 
> ! Do a gather
>       call MPI_Gather(dens,M,MPI_REAL8,dens0,M,MPI_REAL8,0,MPI_COMM_WORLD,ierr)
> 
>       if (mod(i,100) .eq. 0) then
>          call MPI_Barrier(MPI_COMM_WORLD,ierr) ! Have a barrier
>       endif
> 
>    end do
>    call MPI_Finalize(ierr)
> 
> end
> 
> --- code sample ends ---
> 
> > >
> > > Here is the problem.
> > > We use MPICH 2 version 1.0.5 with SunStudio compilers on AMD Opterons.
> > >
> > > There is a code that fails with the following message:
> > >
> > > Fatal error in MPI_Scatter: Other MPI error, error stack:
> > > MPI_Scatter(760)..........: MPI_Scatter(sbuf=0xef0860, scount=2211,
> > > MPI_DOUBLE_COMPLEX, rbuf=0x4828fb0, rcount=2211, MPI_DOUBLE_COMPLEX,
> > > root=0, MPI_COMM_WORLD) failed
> > > MPIR_Scatter(253).........:
> > > MPIC_Send(36).............:
> > > MPIDI_EagerContigSend(146): failure occurred while attempting to send
> > > an eager message
> > > MPIDI_CH3_iStartMsgv(132).: Out of memory
> > >
> > > I wonder what might have caused "Out of memory" here.
> > >
> 
> 



