[pnetcdf-devel] [parallel-netcdf] #29: cdf-2 max array size

Wei-keng Liao wkliao at eecs.northwestern.edu
Mon Mar 27 11:54:54 CDT 2017


Hi, Luke,

> This is indeed what happens.  We proceed through the code and attempt the write anyway, only issuing a warning, which may not be wise.

Let me make sure I understand your workflow. I assume one of the following two
possibilities. Please let me know which one is your case.

1. Use PnetCDF 1.7.0 to create a new netCDF file and continue to write variables.
2. Use PnetCDF 1.5.0 to create a new netCDF file. Then use 1.7.0 to write variables.

In the former case, you should have gotten error codes for every write attempted,
and the newly created file should be of size 0.

In the latter case, there will be no error reported and the writes will succeed.
Please confirm this case, as you might have just found a bug in both PnetCDF and NetCDF.

For the latter case, I tried a toy program with both PnetCDF and NetCDF.
Neither ncmpi_open nor nc_open reports an error. This indicates that
neither library checks variable sizes when opening a file.
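
In case it is useful, here is a minimal sketch along the lines of that toy
check on the PnetCDF side (the file name is just a placeholder for your
1.5.0 output; the nc_open check is analogous):

    #include <stdio.h>
    #include <mpi.h>
    #include <pnetcdf.h>

    int main(int argc, char **argv) {
        int err, ncid;
        MPI_Init(&argc, &argv);
        /* open the file written by PnetCDF 1.5.0 and see whether the
         * library complains about the CDF-2 variable size violation */
        err = ncmpi_open(MPI_COMM_WORLD, "output_1.5.0.nc", NC_NOWRITE,
                         MPI_INFO_NULL, &ncid);
        if (err != NC_NOERR)
            printf("ncmpi_open error: %s\n", ncmpi_strerror(err));
        else
            ncmpi_close(ncid);
        MPI_Finalize();
        return 0;
    }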

Wei-keng

On Mar 27, 2017, at 9:39 AM, Luke Van Roekel wrote:

> Hello,
>   
> From the file header, I can see variables edgeMask and normalVelocity are
> larger than 4GB-4 bytes.
> 
> Yes, I saw we are definitely violating the cdf-2 file size constraints.
> 
> If using PnetCDF 1.7.0, your program should have created a zero-size file
> with the error message: "NetCDF: One or more variable sizes violate format
> constraints" when ncmpi_enddef is called. Is this not your case?
> 
> This is indeed what happens.  We proceed through the code and attempt the write anyway, only issuing a warning, which may not be wise.
> 
> Note that ncdump and ncmpidump do not check the validity of a netCDF file.
> It is possible they (and paraview) do not complain about the size violation.
> 
> This is interesting.  I'm still surprised that the output looks reasonable even though we violate the constraint; there is nothing amiss in it.  Perhaps because I'm running on large HPC machines with lots of memory per node, the writes succeed even outside the cdf-2 constraints.  Even if we can get good output with pnetcdf/1.5.0, I don't think the appropriate solution is to change the cdf-2 constraints (or depend on a single version); instead I should have my code use cdf-5, especially since nco supports cdf-5.
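>
> (For what it's worth, switching looks to be mostly a matter of the create-mode
> flag; a minimal sketch with placeholder names, assuming a PnetCDF build with
> CDF-5 support:)
>
>     int ncid, err;
>     /* NC_64BIT_OFFSET selects CDF-2; NC_64BIT_DATA selects CDF-5,
>      * which lifts the 4GiB-4 per-variable limit */
>     err = ncmpi_create(MPI_COMM_WORLD, "output.nc",
>                        NC_CLOBBER | NC_64BIT_DATA, MPI_INFO_NULL, &ncid);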
> 
> Regards,
> Luke
> 
> On Sun, Mar 26, 2017 at 10:58 AM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
> Hi, Luke,
> 
> From the file header, I can see variables edgeMask and normalVelocity are
> larger than 4GB-4 bytes.
> 
> If using PnetCDF 1.7.0, your program should have created a zero-size file
> with the error message: "NetCDF: One or more variable sizes violate format
> constraints" when ncmpi_enddef is called. Is this not your case?
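>
> (A minimal sketch of that check, assuming ncid holds the id returned by
> ncmpi_create:)
>
>     err = ncmpi_enddef(ncid);
>     if (err != NC_NOERR)  /* NC_EVARSIZE for CDF-1/2 variable size violations */
>         fprintf(stderr, "ncmpi_enddef: %s\n", ncmpi_strerror(err));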
> 
> Note that ncdump and ncmpidump do not check the validity of a netCDF file.
> It is possible they (and paraview) do not complain about the size violation.
> 
> 
> Below is further information regarding the size limitation, in case you are
> interested in the CDF file format specification. Feel free to skip.
> 
> Your file produced by PnetCDF 1.5.0 is deemed an invalid CDF-2 file
> due to the size limit imposed by the netCDF folks. Their explanation for
> this size limitation is "... to permit aggregate access to all the data in
> a netCDF variable (or a record's worth of data) on 32-bit platforms."
> See this under "The 64-bit Offset Format" of
> http://www.unidata.ucar.edu/software/netcdf/docs/file_format_specifications.html
> 
> I myself am not sure this reason is strong enough. A 32-bit system
> can have at most 4GB of memory, so applications cannot allocate a memory
> buffer larger than 4GB anyway. Hence, imposing such a size limit on top of
> a file format specification does not make much sense.
> 
> Also note that all CDF file format specifications contain a metadata
> entity called "vsize" for each variable, to store the variable's size
> in bytes. For the CDF-1 and 2 formats, vsize is a 4-byte integer and thus
> limits a variable's size to 4GB-4 bytes (the minus 4 is due to alignment).
> However, the format specification also says that vsize is redundant (it can
> be calculated from other metadata entities), and neither the netCDF nor the
> PnetCDF library uses it at all in its implementation. If vsize is ignored,
> a variable can actually be larger than 4GB in a CDF-2 file, as long as none
> of its dimension lengths exceeds the 4G limit (which nc_def_dim again enforces).
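>
> As a concrete illustration (using the dimensions from the ncoffsets example
> further down in this thread): an int variable of shape 1023 x 1023 x 4104
> occupies 1023 * 1023 * 4104 * 4 = 17,179,820,064 bytes, far more than the
> 4,294,967,292 bytes a 4-byte vsize can represent, yet each dimension length
> is well within what nc_def_dim accepts, so a library that ignores vsize can
> still handle such a variable correctly.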
> 
> If you are interested in proposing to lift that size limit, I would suggest
> you post the request to the netcdf discussion group <netcdfgroup at unidata.ucar.edu>.
> There was a recent discussion there on lifting other limits (NC_MAX_DIMS,
> NC_MAX_VARS, etc.) as well.
> 
> Wei-keng
> 
> On Mar 25, 2017, at 10:52 PM, Luke Van Roekel wrote:
> 
> > Hi Wei-keng,
> >   Thanks for tipping me off to that interesting utility.  As my scratch calculation suggested, I do have a variable over the max allowed in bytes (about 2x; offsets.cdf attached).  What I don't understand is that I can still produce a valid netcdf file if I use pnetcdf/1.5.0.  I don't think my workflow was clear, so here is what I have done.
> >
> > I've run identical code.  Using pnetcdf/1.5.0, I can produce valid netcdf files.  I have verified this by running the ncdump utility (in nco) and visualizing in paraview, so I am confident the code is producing good output with pnetcdf 1.5.0.  If I do an identical run but substitute pnetcdf/1.7.0, I cannot visualize the output with paraview or run ncdump -h (it says invalid file format).  The ncoffsets command shows I certainly have a variable over the max size ( bytes), so I'm stumped.  I see why newer versions of pnetcdf do the check the way they do, but I don't understand why I can still produce a valid .nc file with pnetcdf/1.5.0.
> >
> > If you'd also like a ncdump -h I can send that to you as well.
> >
> > Regards,
> > Luke
> >
> >
> > On Sat, Mar 25, 2017 at 12:14 AM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
> >
> > FYI. PnetCDF provides a utility program called "ncoffsets" that
> > can report the size in bytes of individual variables defined in a file.
> > You can use that to check whether a variable is > 4294967292 bytes or not.
> >
> > Here is an example usage on a CDF-2 file.
> >
> > % ncoffsets -s testfile.nc
> > netcdf testfile.nc {
> > // file format: CDF-2
> >
> > file header:
> >         size   = 212 bytes
> >         extent = 512 bytes
> >
> > dimensions:
> >         Z = 1023
> >         Y = 1023
> >         X = 4104
> >
> > fixed-size variables:
> >         byte   var1(Z, Y, X):
> >                start file offset =         512
> >                end   file offset =  4294955528
> >                size in bytes     =  4294955016
> >         byte   var3(Z, Y, X):
> >                start file offset =  4294956032
> >                end   file offset =  8589911048
> >                size in bytes     =  4294955016
> >         int    var2(Z, Y, X):
> >                start file offset =  8589911552
> >                end   file offset = 25769731616
> >                size in bytes     = 17179820064
> > }
> >
> > Run "ncoffsets -h" to see all command-line options, or see its user guide:
> > http://cucis.ece.northwestern.edu/projects/PnetCDF/doc/pnetcdf-c/ncoffsets.html
> >
> >
> > Wei-keng
> >
> > On Mar 24, 2017, at 11:19 PM, Luke Van Roekel wrote:
> >
> > > I do a few things.  First, an ncdump -h on the file produced with pnetcdf >= 1.6.0 yields "invalid format", whereas the 1.5.0 file gives the expected metadata.  I've also been able to visualize our ocean-related quantities in paraview.  I'm convinced the 1.5.0-produced files are valid, which leads to my confusion, since a few individual fields are certainly > 4GB.
> > >
> > > On Fri, Mar 24, 2017 at 9:24 PM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
> > > Hi, Luke
> > >
> > > What software do you use to validate the netcdf files?
> > >
> > > FYI, the NetCDF library from Unidata now supports the CDF-5 format,
> > > which prompts other netCDF software, such as NCO, to adopt CDF-5 as well.
> > >
> > >
> > > Wei-keng
> > >
> > > On Mar 24, 2017, at 10:14 PM, Luke Van Roekel wrote:
> > >
> > > > Hello Wei-keng,
> > > >   Thanks for the reply.  What you say does make sense; I had seen this, and my output use case does indeed violate this criterion.  But I remain confused as to why I can produce valid netcdf output while violating the 4GB constraint.  Do you have any ideas?
> > > >
> > > > In the interim I am exploring the use of cdf5 to work around this as well.
> > > >
> > > > Thanks!
> > > > Luke
> > > >
> > > > On Fri, Mar 24, 2017 at 5:13 PM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
> > > > Hi, Luke
> > > >
> > > > There are some limits for CDF-2 file format. One of them is "No fixed-size
> > > > variable can require more than 2^32 - 4 bytes (i.e. 4GiB - 4 bytes, or
> > > > 4,294,967,292 bytes) of storage for its data, unless it is the last
> > > > fixed-size variable and there are no record variables. When there are no
> > > > record variables, the last fixed-size variable can be any size supported by
> > > > the file system, e.g. terabytes.)". See this URL.
> > > > http://www.unidata.ucar.edu/software/netcdf/docs/file_structure_and_performance.html#offset_format_limitations
> > > >
> > > > The size referred to there is for the whole variable in bytes, so the check
> > > > must also consider the variable's element size, xsz.
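> > > >
> > > > (A worked example with the numbers from your ticket, assuming 4-byte
> > > > elements: 1.2E9 elements * 4 bytes = 4.8E9 bytes, which exceeds
> > > > 2^32 - 4 = 4,294,967,292 bytes even though the element count alone,
> > > > 1.2E9, does not.)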
> > > >
> > > > PnetCDF 1.5.0 had a bug that failed to honor the above limitation.
> > > > The bug has been fixed in 1.6.0.
> > > >
> > > >
> > > > Wei-keng
> > > >
> > > > On Mar 24, 2017, at 4:08 PM, parallel-netcdf wrote:
> > > >
> > > > > #29: cdf-2 max array size
> > > > > --------------------------------------+-------------------------------------
> > > > > Reporter:  luke.vanroekel@…          |       Owner:  robl
> > > > >     Type:  defect/bug                |      Status:  new
> > > > > Priority:  major                     |   Milestone:
> > > > > Component:  parallel-netcdf           |     Version:  1.8.1
> > > > > Keywords:  CDF2 max size             |
> > > > > --------------------------------------+-------------------------------------
> > > > > Hello,
> > > > >   I've been using pnetcdf to write very large data sets (max array 1.2E9
> > > > > elements).  If I use pnetcdf/1.5.0, everything works fine with the format
> > > > > NC_64BIT_OFFSET.  But if I use a newer version (same data, same format),
> > > > > I receive the NC_EVARSIZE error and the file is not written.
> > > > >
> > > > > I spent a bit of time looking through the code in versions 1.5.0 and 1.8.1,
> > > > > and it seems the way you enforce the maximum size on CDF2 files has changed.
> > > > > In 1.5.0 you simply check whether the product of the dimensions of the biggest
> > > > > array is > 2^32-4, but in 1.8.1 this check seems to be against
> > > > > (2^32-4)/(array element size (xsz in check_vlens)).  I'm curious why this
> > > > > was changed.  The output from version 1.5.0 looks good, so it doesn't seem
> > > > > as though we are violating any CDF2 file size constraints.
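> > > > >
> > > > > (My reading of the change, as pseudocode rather than the actual source;
> > > > > only check_vlens is a real name here:)
> > > > >
> > > > >     /* 1.5.0: limit applied to the element count */
> > > > >     if (nelems > 4294967292ULL) return NC_EVARSIZE;
> > > > >     /* 1.8.1: limit applied to the size in bytes  */
> > > > >     if (nelems > 4294967292ULL / xsz) return NC_EVARSIZE;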
> > > > >
> > > > > Let me know if you need more information.
> > > > >
> > > > > I'm not sure whether this should be tagged as defect/bug or clarification,
> > > > > so I'll leave it as is.  My apologies if the type is inappropriate.
> > > > >
> > > > > Regards,
> > > > > Luke
> > > > >
> > > > > --
> > > > > Ticket URL: <http://trac.mcs.anl.gov/projects/parallel-netcdf/ticket/29>
> > > > > parallel-netcdf <https://trac.mcs.anl.gov/projects/parallel-netcdf>
> > > > >
> > > > > _______________________________________________
> > > > > pnetcdf-devel mailing list
> > > > > pnetcdf-devel at lists.mcs.anl.gov
> > > > > https://lists.mcs.anl.gov/mailman/listinfo/pnetcdf-devel
> > > >
> > > >
> > >
> > >
> >
> >
> > <offsets.cdf>
> 
> 


