Intermittent error with pnetcdf 1.6.x: One or more variable sizes violate format constraints
Wei-keng Liao
wkliao at eecs.northwestern.edu
Fri Nov 27 01:19:29 CST 2015
Hi, Michael
Could you please send me your config.log file (from building 1.6.1)?
It would also help if you could describe the variables (how many
there are, and their dimensions and sizes). Also, is any fixed-size
variable larger than 2GB?
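As a rough sketch of the kind of summary that would help, the snippet below computes the byte size of each fixed-size variable from its type and dimension lengths and flags any that exceed a threshold. It assumes "2GB" is read as 2 GiB (2**31 bytes). The variable names and dimensions are hypothetical placeholders, not taken from the reported application.

```python
from math import prod

# Byte widths of the classic netCDF external types.
TYPE_BYTES = {"byte": 1, "char": 1, "short": 2, "int": 4, "float": 4, "double": 8}

# Assumption: "2GB" in the question above is read as 2 GiB.
THRESHOLD = 2**31


def variable_bytes(nc_type, dims):
    """Size in bytes of one fixed-size variable with the given dimension lengths."""
    return TYPE_BYTES[nc_type] * prod(dims)


def summarize(variables, threshold=THRESHOLD):
    """Return (name, dims, bytes, exceeds_threshold) for each variable."""
    rows = []
    for name, (nc_type, dims) in variables.items():
        size = variable_bytes(nc_type, dims)
        rows.append((name, dims, size, size > threshold))
    return rows


if __name__ == "__main__":
    # Hypothetical variables, for illustration only.
    example = {
        "pressure": ("double", (1024, 1024, 512)),
        "cell_ids": ("int", (1000000,)),
    }
    for name, dims, size, big in summarize(example):
        flag = "LARGER than threshold" if big else "ok"
        print(f"{name}: dims={dims} bytes={size} {flag}")
```

Replacing the `example` dictionary with the real variable definitions gives a compact table that can be pasted into a reply.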
You can set the run-time environment variable PNETCDF_SAFE_MODE to 1
to enable the metadata consistency checking in PnetCDF. If an error
is detected, it may print additional messages to stdout.
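One way to set the variable for a single run, sketched here as a small Python wrapper: it copies the current environment, adds PNETCDF_SAFE_MODE=1, and starts the command with that environment. The helper names are made up for this illustration, and the child command is a stand-in that only echoes the variable. A real run would pass the MPI launch command instead, and depending on the launcher the variable may need to be forwarded to remote ranks explicitly.

```python
import os
import subprocess
import sys


def safe_mode_env(base_env=None):
    """Return a copy of the environment with PNETCDF_SAFE_MODE set to "1"."""
    env = dict(os.environ if base_env is None else base_env)
    env["PNETCDF_SAFE_MODE"] = "1"
    return env


def run_with_safe_mode(cmd):
    """Run cmd with safe mode enabled and return its captured stdout as text."""
    result = subprocess.run(
        cmd, env=safe_mode_env(), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in child process that just echoes the variable. A real run would
    # pass the MPI launch command instead (hypothetical example:
    # ["mpiexec", "-n", "32", "./app"]).
    child = [sys.executable, "-c", "import os; print(os.environ['PNETCDF_SAFE_MODE'])"]
    print(run_with_safe_mode(child), end="")
```

Capturing stdout this way also makes it easy to save any extra messages alongside the job log.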
Wei-keng
On Nov 26, 2015, at 11:04 PM, Schlottke-Lakemper, Michael wrote:
> Hi folks,
>
> With the 1.6.0/1.6.1 versions of Parallel netCDF, under some conditions we get -62 errors (One or more variable sizes violate format constraints) when working with NC_64BIT_OFFSET files in parallel. It occurs mostly with parallel jobs using more than 16 MPI ranks (and has been seen with up to 4k ranks so far) and was reproduced on both GPFS and Lustre file systems. Other than that, we could not find anything to narrow down the scope of the problem. Our current workaround is to use the 1.5.0 version of Parallel netCDF, which has not yet produced this error. From a user perspective, this therefore looks like a regression in the 1.6.x series.
>
> Any ideas what the problem could be or what we could do to narrow it down?
>
> Yours
>
> Michael
>
>
> --
> Michael Schlottke-Lakemper
>
> Chair of Fluid Mechanics and Institute of Aerodynamics
> RWTH Aachen University
> Wüllnerstraße 5a
> 52062 Aachen
> Germany
>
> Phone: +49 (241) 80 95188
> Fax: +49 (241) 80 92257
> Mail: m.schlottke-lakemper at aia.rwth-aachen.de
> Web: http://www.aia.rwth-aachen.de
>