[mpich-discuss] Fatal error in PMPI_Alltoall: Other MPI error, error stack
Ryan Crocker
rcrocker at uvm.edu
Thu Oct 18 17:24:00 CDT 2012
I'm running MPICH2 1.3.1, compiled with gcc and gfortran in 64-bit mode, on a Linux cluster across 144 processors with 2 GB per node. I'm running an in-house flow solver written in Fortran, and for some reason I get this error:
MXMPI:FATAL-ERROR:0:Fatal error in PMPI_Alltoall: Other MPI error, error stack:
PMPI_Alltoall(773).....................: MPI_Alltoall(sbuf=0x273436e0, scount=22, MPI_DOUBLE_PRECISION, rbuf=0x1cd98fb0, rcount=22, MPI_DOUBLE_PRECISION, comm=0x84000001) failed
MPIR_Alltoall_impl(651)................:
MPIR_Alltoall(619).....................:
MPIR_Alltoall_intra(206)...............:
MPIR_Type_create_indexed_block_impl(48):
MPID_Type_vector(57)...................: Out of memory
APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)
It happens about 2000 iterations into my run. I've run the exact same simulation on a Mac workstation and I do not get this error. I've also watched the memory usage; it does not increase during the run on the workstation.
So far I've tried adding MPI_BARRIER calls in front of my MPI_Alltoall calls, but that does not seem to help. I've also built a local copy of MPICH2 1.5, and though it is slower, I've run into the same problem at the same iteration number.
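Roughly, the workaround looks like the following (the routine, buffer names, and communicator here are placeholders for illustration, not the actual solver code):

    ! Sketch of the MPI_BARRIER-before-MPI_Alltoall workaround described above.
    ! Buffer names, counts, and the communicator are placeholders.
    subroutine exchange(sbuf, rbuf, ncount, comm)
      use mpi
      implicit none
      double precision, intent(in)  :: sbuf(*)
      double precision, intent(out) :: rbuf(*)
      integer, intent(in) :: ncount, comm
      integer :: ierr

      ! Synchronize all ranks before the collective (the attempted fix).
      call MPI_BARRIER(comm, ierr)

      ! The collective that fails with "Out of memory" around iteration 2000.
      call MPI_ALLTOALL(sbuf, ncount, MPI_DOUBLE_PRECISION, &
                        rbuf, ncount, MPI_DOUBLE_PRECISION, &
                        comm, ierr)
    end subroutine exchange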
Ryan Crocker
University of Vermont, School of Engineering
Mechanical Engineering Department
rcrocker at uvm.edu
315-212-7331