[MPICH] MPI_REDUCE with MPI_IN_PLACE fails with memory error

Martin Kleinschmidt mk at theochem.uni-duesseldorf.de
Tue Mar 13 04:08:03 CDT 2007


Hi,

my MPI program fails with the following error:

#################
[cli_0]: aborting job:
Fatal error in MPI_Reduce: Other MPI error, error stack:
MPI_Reduce(850).: MPI_Reduce(sbuf=MPI_IN_PLACE, rbuf=0x956c8008,
count=76160987, MPI_DOUBLE_PRECISION, MPI_SUM, root=0, MPI_COMM_WORLD)
failed
MPIR_Reduce(149): Unable to allocate 609287896 bytes of memory for
temporary buffer (probably out of memory)
##################

which is, of course, quite self-explanatory: even with MPI_IN_PLACE, the
implementation apparently tries to allocate a temporary buffer of the
full message size at the root.

The corresponding lines of code are:

#################
#ifdef PARALLEL
         if (myid .eq. 0) then
            call MPI_Reduce(MPI_IN_PLACE, vecf2(1),
     $           n*nneue,
     $           MPI_double_precision, MPI_SUM, 0,
     $           MPI_Comm_World, MPIerr)
         else
            call MPI_Reduce(vecf2(1),MPI_IN_PLACE,
     $           n*nneue,
     $           MPI_double_precision, MPI_SUM, 0,
     $           MPI_Comm_World, MPIerr)
         endif
#endif
#################
with n*nneue = 76160987, and 76160987 * 8 = 609287896 bytes, i.e. about 600 MB.
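
For what it's worth, my reading of the standard is that recvbuf should
be significant only at the root, so the MPI_IN_PLACE I pass on the
non-root ranks is just a placeholder; the pattern as I understand it
would be something like this (dummy is a name I invented for the
sketch):

#################
      double precision dummy
      if (myid .eq. 0) then
c        root: input is taken from vecf2 and the result overwrites it
         call MPI_Reduce(MPI_IN_PLACE, vecf2(1), n*nneue,
     $        MPI_DOUBLE_PRECISION, MPI_SUM, 0,
     $        MPI_COMM_WORLD, MPIerr)
      else
c        non-root: recvbuf should be ignored here, so a dummy
c        placeholder ought to work just as well as MPI_IN_PLACE
         call MPI_Reduce(vecf2(1), dummy, n*nneue,
     $        MPI_DOUBLE_PRECISION, MPI_SUM, 0,
     $        MPI_COMM_WORLD, MPIerr)
      endif
#################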

The point is: I thought I could avoid allocating additional memory by
using MPI_IN_PLACE, but that obviously does not work.

- am I using MPI_IN_PLACE correctly?
- why does MPI_IN_PLACE need additional memory?
- is it possible to rewrite this code so that no additional memory
  needs to be allocated? This part of the code is not time-critical -
  it is executed only once every few hours. (See the sketch below for
  what I have in mind.)
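
One idea I had for the last point: since this part is not
time-critical, I could perhaps chop the reduction into slices, so that
whatever temporary buffer the library allocates stays small. A rough
sketch of what I mean (all names here - chunk, off, nlen, total,
dummy - are invented, and the chunk size is an arbitrary guess):

#################
      integer chunk, off, nlen, total
      double precision dummy
c     arbitrary slice size: 4e6 doubles = about 32 MB per reduce
      parameter (chunk = 4000000)

      total = n*nneue
      off = 1
      do while (off .le. total)
         nlen = min(chunk, total - off + 1)
         if (myid .eq. 0) then
c           reduce one slice in place at the root; any temporary
c           buffer should now scale with chunk, not with total
            call MPI_Reduce(MPI_IN_PLACE, vecf2(off), nlen,
     $           MPI_DOUBLE_PRECISION, MPI_SUM, 0,
     $           MPI_COMM_WORLD, MPIerr)
         else
            call MPI_Reduce(vecf2(off), dummy, nlen,
     $           MPI_DOUBLE_PRECISION, MPI_SUM, 0,
     $           MPI_COMM_WORLD, MPIerr)
         endif
         off = off + nlen
      end do
#################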

(I'm using mpich2-1.0.5p2, with the Intel Fortran compiler 9.1.040 and
the Intel C compiler 9.1.045 used to compile both mpich and my code.)

   ...martin



