pnetcdf 1.2.0 create file issue
Rob Latham
robl at mcs.anl.gov
Fri May 11 10:43:20 CDT 2012
On Thu, May 10, 2012 at 03:28:57PM -0600, Jim Edwards wrote:
> This occurs on the NCSA machine Blue Waters. I am using pnetcdf 1.2.0 and
> pgi 11.10.0
I need one more piece of information: the version of MPT you are using.
> The issue is that calling nfmpi_createfile would sometimes result in an
> error:
>
> MPI_File_open : Other I/O error , error stack:
> (unknown)(): Other I/O error
> 126: MPI_File_open : Other I/O error , error stack:
> (unknown)(): Other I/O error
> Error on create : 502 -32
>
> The error appears to be intermittent and I could not get it to occur at all
> on a small number of tasks (160) but it occurs with high frequency when
> using a larger number of tasks (>=1600). I traced the problem to the use
> of nf_clobber in the mode argument. Removing nf_clobber seems to have
> solved the problem, and I think that create implies clobber anyway, doesn't
> it?
> Can someone who knows what is going on under the covers enlighten me
> with some understanding of this issue? I suspect that one task is trying
> to clobber the file that another has just created or something of that
> nature.
Unfortunately, "under the covers" here means "inside the MPI-IO
library", which we don't have access to.
In the create case we call MPI_File_open with "MPI_MODE_RDWR |
MPI_MODE_CREATE", and if noclobber is set, we add MPI_MODE_EXCL.
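
As a rough illustration of that flag logic, here is a minimal sketch that
uses plain POSIX open(2) instead of MPI-IO, so it can be tried on a single
node without an MPI installation. It is an analogy, not the pnetcdf source:
O_RDWR | O_CREAT stands in for MPI_MODE_RDWR | MPI_MODE_CREATE and O_EXCL
stands in for MPI_MODE_EXCL. The helper names create_flags and create_file
are made up for this example, and whether a given MPI-IO library maps the
MPI modes onto these POSIX flags in exactly this way is an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: build the open(2) flags for the create case.
 * O_RDWR | O_CREAT is always used; O_EXCL is added only for noclobber,
 * mirroring the MPI_MODE_* combination described above. */
static int create_flags(int noclobber)
{
    int flags = O_RDWR | O_CREAT;
    if (noclobber)
        flags |= O_EXCL;   /* fail if the file already exists */
    return flags;
}

/* Hypothetical helper: open (and create if needed) the file at path.
 * Returns a file descriptor, or -1 with errno set on failure. */
static int create_file(const char *path, int noclobber)
{
    assert(path != NULL);
    return open(path, create_flags(noclobber), 0644);
}
```

With these flags, a clobbering create succeeds whether or not the file
already exists, while a noclobber create fails once the file is there. The
sketch runs in a single process, so it says nothing about what happens when
many tasks open the same file collectively.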
OK, so that's pnetcdf. What's going on in MPI-IO? Well, Cray based
their MPI-IO on our ROMIO, but I'm not sure which version.
Let me cook up a quick MPI-IO-only test case you can run to trigger
this problem, and then you can beat Cray over the head with it.
==rob
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA