possible bug in pnetcdf: cdf5 issue

Wei-keng Liao wkliao at ece.northwestern.edu
Mon Feb 18 13:38:30 CST 2013


I tried mpich2-1.2.1p1 and mpich2-1.5 on a Linux box. Both gave me the
same error. It is interesting that the first MPI_File_write works
in your case. Can you replace MPI_DOUBLE with MPI_DOUBLE_PRECISION and
try again? I can see the place in MPICH that generates the error. In your case,
it is line 100 of file mpich2-1.4.1p1/src/mpi/romio/mpi-io/write.c
    MPIO_CHECK_COUNT_SIZE(fh, count, datatype_size, myname, error_code);

That macro checks the product of count and datatype_size for possible 4-byte integer
overflow. It is defined in mpich2-1.4.1p1/src/mpi/romio/adio/include/adioi_error.h and
contains the statement below.

if (count*datatype_size != (ADIO_Offset)(unsigned)count*(ADIO_Offset)(unsigned)datatype_size) { \

This error will propagate from MPI-IO to PnetCDF.
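
To make the overflow concrete, here is a small standalone sketch of the same test
(my own illustration, not ROMIO source), using the 322437120-double request from
your program:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        int count         = 322437120;  /* number of doubles in the failing write */
        int datatype_size = 8;          /* sizeof(double) */

        /* 32-bit product, computed in unsigned arithmetic and then
         * reinterpreted as a signed 32-bit value */
        int32_t product32 = (int32_t)((uint32_t)count * (uint32_t)datatype_size);

        /* 64-bit product, the role ADIO_Offset plays in the macro above */
        int64_t product64 = (int64_t)count * (int64_t)datatype_size;

        printf("32-bit product: %d\n",   product32);             /* -1715470336 */
        printf("64-bit product: %lld\n", (long long)product64);  /* 2579496960  */

        if ((int64_t)product32 != product64)
            printf("overflow detected -> MPI-IO returns MPI_ERR_ARG\n");
        return 0;
    }

Note that the wrapped 32-bit value, -1715470336, is exactly the blocklen reported in
the MPI_Type_create_struct error message quoted further down in this thread.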


Regarding your comments on using MPI_Type_create_subarray, here is my explanation.

PnetCDF uses MPI_Type_create_subarray() to create a fileview, i.e. for reading/writing
a subarray of a bigger array defined in an nc file. The filetype is used differently
from the buffer type: the former describes the noncontiguous layout in the file, while
the latter describes the memory layout of the I/O buffer. PnetCDF creates a filetype
for each nonblocking request using MPI_Type_create_subarray() and combines them using
MPI_Type_create_struct(). For buffer types, since most of the I/O buffers are
contiguous (if they are not, they are packed into one), PnetCDF calls
MPI_Type_create_hindexed() to combine them.
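
To illustrate the distinction, below is a much simplified sketch (not actual PnetCDF
code; the variable shape, sizes, and names are made up): a subarray filetype goes
into the file view, while the write call itself only describes a contiguous memory
buffer.

    #include <mpi.h>

    void write_subarray(MPI_File fh, double *buf)
    {
        /* filetype: a 100x50 block starting at (0,50) of a 100x200 variable
         * stored in the file -- it describes the noncontiguous file layout */
        int gsizes[2]   = {100, 200};
        int subsizes[2] = {100, 50};
        int starts[2]   = {0, 50};
        MPI_Datatype filetype;
        MPI_Type_create_subarray(2, gsizes, subsizes, starts,
                                 MPI_ORDER_C, MPI_DOUBLE, &filetype);
        MPI_Type_commit(&filetype);

        /* the filetype is used only in the file view ... */
        MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);

        /* ... while the write call describes the memory buffer, which here
         * is simply contiguous: count elements of MPI_DOUBLE */
        MPI_Status status;
        MPI_File_write_all(fh, buf, 100 * 50, MPI_DOUBLE, &status);

        MPI_Type_free(&filetype);
    }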

Note that the buffer type is used in the MPI read/write calls, while the filetype is
used in MPI_File_set_view(). The code comment you point out comes from where the user's
buffer derived data type is defined, so MPI_Type_create_subarray() is not applicable
there. In your case, each process writes a single, large request from a contiguous
buffer. PnetCDF internally translates the count and datatype into a number of bytes,
and that byte count overflows a 4-byte integer. But even if PnetCDF created a derived
data type, the request still would not pass the MPI-IO call: when the product of the
count and datatype arguments of an MPI-IO call is too large (i.e. > 2^31 bytes),
MPI-IO throws an MPI_ERR_ARG error.
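
One possible workaround, along the lines of the splitting idea in my earlier message
quoted below, is to break a huge contiguous request into pieces that each stay under
2^31-1 bytes. A rough sketch (the function name and chunking policy here are only
illustrative, not PnetCDF code):

    #include <mpi.h>
    #include <limits.h>

    int write_large_contig(MPI_File fh, double *buf, MPI_Offset nelems)
    {
        /* largest number of doubles whose byte size fits in a signed int */
        const MPI_Offset max_elems = INT_MAX / (MPI_Offset)sizeof(double);
        MPI_Offset done = 0;

        while (done < nelems) {
            MPI_Offset chunk = nelems - done;
            if (chunk > max_elems)
                chunk = max_elems;

            MPI_Status status;
            int err = MPI_File_write(fh, buf + done, (int)chunk,
                                     MPI_DOUBLE, &status);
            if (err != MPI_SUCCESS)
                return err;
            done += chunk;
        }
        return MPI_SUCCESS;
    }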


Wei-keng

On Feb 18, 2013, at 12:48 PM, Jim Edwards wrote:

> I'm using mpich2 1.4.1p1  romio: 1.2.6
> 
> In your test program the first MPI_File_write works but the second gives:
> 
> Error: MPI_File_write (ddtype) Invalid argument, error stack:
> MPI_FILE_WRITE(100): Invalid count argument
> 
> 
> On Mon, Feb 18, 2013 at 10:52 AM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> Hi, Jim,
> 
> I tested your code with 4 MPI processes and got the error below.
> MPI_FILE_WRITE_ALL(105): Invalid count argument
> 
> Maybe you are using IBM's MPI-IO? (I am using MPICH.)
> 
> Can you try the attached Fortran program? (Run it on 1 process.)
> I got the error below.
> 
>  Error: MPI_File_write MPI_DOUBLE Invalid argument, error stack:
> MPI_FILE_WRITE(102): Invalid count argument
>  Error: MPI_File_write (ddtype) Invalid argument, error stack:
> MPI_FILE_WRITE(102): Invalid count argument
> 
> 
> 
> 
> Wei-keng
> 
> On Feb 18, 2013, at 9:00 AM, Jim Edwards wrote:
> 
> > Hi Wei-keng,
> >
> > This is just an interface problem and not a hard limit of MPI-IO.  For example, if I run the
> > same case on 4 tasks instead of 8, it works just fine (example attached).
> >
> > If I create an MPI derived type, for example with MPI_Type_contiguous, I can make the same call as below successfully.
> >
> >     int len = 322437120;
> >     double *buf = (double*) malloc(len * sizeof(double));
> >     MPI_Datatype elemtype;
> >     MPI_Status status;
> >     int err;
> >
> >     /* wrap the whole buffer in one contiguous derived type so the
> >        count argument passed to MPI_File_write can stay at 1 */
> >     err = MPI_Type_contiguous(len, MPI_DOUBLE, &elemtype);
> >     err = MPI_Type_commit(&elemtype);
> >     err = MPI_File_write(fh, buf, 1, elemtype, &status);
> >     if (err != MPI_SUCCESS) {
> >         int errorStringLen;
> >         char errorString[MPI_MAX_ERROR_STRING];
> >         MPI_Error_string(err, errorString, &errorStringLen);
> >         printf("Error: MPI_File_write() (%s)\n", errorString);
> >     }
> >
> >
> >
> >
> >
> > It seems to me that every operation that pnetcdf can do using start and count can be described with MPI_Type_create_subarray, which would both allow pnetcdf to avoid this interface limit and save a potentially considerable amount of memory.
> >
> > - Jim
> >
> > On Sun, Feb 17, 2013 at 10:10 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > Hi, Jim,
> >
> > In your test program, each process is writing 322437120 or 322437202 doubles,
> > so 322437120 * sizeof(double) = 2,579,496,960 bytes, which is larger than 2^31 - 1,
> > the maximum for a signed 4-byte integer. That causes a 4-byte integer overflow in
> > PnetCDF. But even MPI-IO will have a problem with a request of this size.
> >
> > If you try the code fragment below, ROMIO will throw an error class
> > MPI_ERR_ARG, and error string "Invalid count argument".
> >
> >     int len = 322437120;
> >     double *buf = (double*) malloc(len * sizeof(double));
> >     MPI_Status status;
> >
> >     int err = MPI_File_write(fh, buf, len, MPI_DOUBLE, &status);
> >     if (err != MPI_SUCCESS) {
> >         int errorStringLen;
> >         char errorString[MPI_MAX_ERROR_STRING];
> >         MPI_Error_string(err, errorString, &errorStringLen);
> >         printf("Error: MPI_File_write() (%s)\n", errorString);
> >     }
> >
> > A possible PnetCDF solution is to detect the overflow and divide a large request
> > into multiple, smaller ones, each with an upper bound of 2^31-1 bytes.
> > Or PnetCDF can simply throw an error, like MPI-IO.
> >
> > Any suggestion?
> >
> > Wei-keng
> >
> > On Feb 17, 2013, at 1:34 PM, Jim Edwards wrote:
> >
> > > Found the problem in the test program; a corrected program is attached. This reminds me of another issue: the interface to nfmpi_iput_vara is not defined in pnetcdf.mod.
> > >
> > > - Jim
> > >
> > > On Sun, Feb 17, 2013 at 11:43 AM, Jim Edwards <jedwards at ucar.edu> wrote:
> > > In my larger program I am getting an error:
> > >
> > > PMPI_Type_create_struct(139): Invalid value for blocklen, must be non-negative but is -1715470336
> > >
> > > I see a note about this in nonblocking.c:
> > >
> > >     for (j=0; j<reqs[i].varp->ndims; j++)
> > >         blocklens[i] *= reqs[i].count[j];
> > >     /* Warning! blocklens[i] might overflow */
> > >
> > >
> > > But when I tried to distill this into a small testcase I got a different error. I've attached the test program anyway because I can't spot any error there and think it must be in pnetcdf. Also, it seems that instead of
> > > calling MPI_Type_create_struct you should be calling MPI_Type_create_subarray, which would avoid the problem of blocklens overflowing.
> > >
> > > This test program is written for 8 mpi tasks, but it uses a lot of memory so you may need more than one node to run it.
> > >
> > > --
> > > Jim Edwards
> > >
> > > CESM Software Engineering Group
> > > National Center for Atmospheric Research
> > > Boulder, CO
> > > 303-497-1842
> > >
> > > <testpnetcdf5.F90>
> >
> >
> >
> >
> > --
> > Jim Edwards
> >
> >
> > <testpnetcdf5.F90>
> 
> 
> 
> 
> 
> -- 
> Jim Edwards
> 
> 


