[mpich-discuss] MPI-IO ERROR
Rob Latham
robl at mcs.anl.gov
Fri Oct 1 13:03:35 CDT 2010
On Mon, Sep 27, 2010 at 08:49:55AM -0700, Weiqiang Wang wrote:
> I'm trying to use MPI-IO on a BlueGene/P cluster by incorporating it into my Fortran 77 code.
Good to hear from you Weiqiang. I see you also emailed me, but I was
on vacation last week. I'm still working through the accumulated
emails.
> The program works fine, and has reduced file-writing time
> several-fold compared to writing out files from each core.
Glad to hear that.
> However, I found out that after I scaled my program to more CPUs
> (from 32,768 to 65,536), some problems started appearing. In both
> tests the system complained that sufficient memory could not be
> allocated on the I/O nodes. In these two tests, I tried to write
> out a total of 12,582,912 atoms' worth of info (x, y, z coordinates
> and velocities, all in double precision). These data are distributed
> uniformly among all the processors.
In the past, this type of error typically comes when all processes
perform a collective read of a small config file. I haven't seen this
error in the write path yet.
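If anything like that is happening on your read side, though, the
usual fix is to have rank 0 read the file and broadcast the contents
to everyone else. A minimal sketch of that pattern (the unit number,
file name, and buffer length here are made up):

      INCLUDE 'mpif.h'
      INTEGER RANK, IERROR
      CHARACTER*256 CONFIG
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, RANK, IERROR)
C     Only rank 0 touches the file; everyone else receives the
C     contents via broadcast, so the I/O nodes see a single reader.
      IF (RANK .EQ. 0) THEN
         OPEN(10, FILE='config.txt', STATUS='OLD')
         READ(10, '(A)') CONFIG
         CLOSE(10)
      END IF
      CALL MPI_BCAST(CONFIG, 256, MPI_CHARACTER, 0,
     &               MPI_COMM_WORLD, IERROR)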
> Here below are the details of the messages in the two tests:
>
> 1) ======================
> <Sep 24 22:40:53.496483> FE_MPI (Info) : Starting job 1636055
> <Sep 24 22:40:53.576159> FE_MPI (Info) : Waiting for job to terminate
> <Sep 24 22:40:55.770176> BE_MPI (Info) : IO - Threads initialized
> <Sep 24 22:40:55.784851> BE_MPI (Info) : I/O input runner thread terminated
> "remd22.f", line 903: 1525-037 The I/O statement cannot be processed because the I/O subsystem is unable to allocate sufficient memory for the oper
> ation. The program will stop.
> <Sep 24 22:42:06.025409> BE_MPI (Info) : I/O output runner thread terminated
> <Sep 24 22:42:06.069553> BE_MPI (Info) : Job 1636055 switched to state TERMINATED ('T')
This one I don't recognize, but as Dave suggests, you're likely
running out of memory on the compute node.
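A quick back-of-the-envelope check supports that (assuming six
doubles per atom: x, y, z plus three velocity components):
12,582,912 atoms * 6 * 8 bytes is only about 576 MiB total, or
roughly 9 KiB per rank at 65,536 processes. Your data itself is tiny
per node, so the pressure almost certainly comes from the library's
internal buffers, which is exactly what your second error shows: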
> 2) =======================
> Out of memory in file /bghome/bgbuild/V1R4M2_200_2010-100508P/ppc/bgp/comm/lib/dev/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_wrcoll.c, line 498
> Out of memory in file /bghome/bgbuild/V1R4M2_200_2010-100508P/ppc/bgp/comm/lib/dev/mpich2/src/mpi/romio/adio/ad_bgl/ad_bgl_wrcoll.c, line 498
This particular line (ad_bgl_wrcoll.c, line 498) is where the MPI-IO
library allocates the temporary buffer for the two-phase collective
I/O optimization. By default that temporary buffer is 16 MiB. You
could use a smaller size, though if memory pressure is that severe,
freeing up an extra 12 MiB may not be enough. Go much smaller than
4 MiB and you'll likely stop seeing those nice performance gains.
In my crude and limited understanding of Fortran, here's how you
might try setting a smaller hint to see if you get further:
      INCLUDE 'mpif.h'
      INTEGER INFO, FH, IERROR
C     MPI strips trailing blanks from Fortran info keys and values,
C     so fixed-length strings are fine here.
      CHARACTER*32 KEY, VALUE
      KEY = 'cb_buffer_size'
      VALUE = '4194304'
      CALL MPI_INFO_CREATE(INFO, IERROR)
      CALL MPI_INFO_SET(INFO, KEY, VALUE, IERROR)
      CALL MPI_FILE_OPEN(MPI_COMM_WORLD, 'myfile.dat',
     &     MPI_MODE_CREATE+MPI_MODE_RDWR, INFO, FH, IERROR)
      CALL MPI_INFO_FREE(INFO, IERROR)
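Implementations are free to ignore hints they don't like, so it can
be worth reading the hint back after the open to confirm ROMIO took
it. A small sketch using the standard MPI_FILE_GET_INFO and
MPI_INFO_GET calls:

      INTEGER INFO_USED, IERROR
      LOGICAL FLAG
      CHARACTER*32 HINTVAL
      CALL MPI_FILE_GET_INFO(FH, INFO_USED, IERROR)
      CALL MPI_INFO_GET(INFO_USED, 'cb_buffer_size', 32,
     &                  HINTVAL, FLAG, IERROR)
C     FLAG is .TRUE. if the key was found; HINTVAL then holds the
C     buffer size ROMIO will actually use.
      IF (FLAG) PRINT *, 'cb_buffer_size = ', HINTVAL
      CALL MPI_INFO_FREE(INFO_USED, IERROR)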
==rob
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA