[netCDF #KLB-596506]: apparent bug in netcdf-4.2
Wei-keng Liao
wkliao at ece.northwestern.edu
Mon Mar 4 23:10:13 CST 2013
This variable alignment is a PnetCDF behavior.
The default alignment value for each non-record variable is 512 bytes in PnetCDF.
According to CDF-1 and CDF-2 file format specifications, each variable has
a field named "begin" which is the variable's file starting location.
var := name nelems [dimid ...] vatt_array nc_type vsize begin
We believe PnetCDF's variable alignment does not violate the CDF spec., and hence
we implemented this default alignment in the hope of improving performance. This
alignment can be turned off by setting the two hints below.
MPI_Info_set(info, "nc_header_align_size", "1");
MPI_Info_set(info, "nc_var_align_size", "1");
I wonder if you can send us the file and program to reproduce the corruption problem.
Wei-keng
On Mar 4, 2013, at 6:49 PM, Jim Edwards wrote:
> Hi Russ,
>
> That turns out to have been the problem. The original file was created with pnetcdf.
>
> Jim
>
>
>
> On Mon, Mar 4, 2013 at 3:12 PM, Jim Edwards <jedwards at ucar.edu> wrote:
> Russ,
>
> We think that the original file may have been written with pnetcdf. We are going to try to recreate the file with netcdf and again with pnetcdf and see if that explains the issue.
>
> Jim
>
>
> On Mon, Mar 4, 2013 at 2:31 PM, Samuel Levis <slevis at ucar.edu> wrote:
> Not exactly. I tried 2-degree to 2-degree, 2-degree to 0.5, 2-degree to 0.25, and others. All cases worked except the ones with the 0.5-degree file as output.
>
> I also tried 0.5-degree to 0.5-degree (mapping the file into itself) and that failed. When I say failed, I mean that the output file ends up with junk in it.
>
> Sam
>
>
> On 03/04/2013 02:26 PM, Jim Edwards wrote:
>> Hi Russ,
>>
>> Another piece of information. This program interpolates data from a file of one resolution (2 degree in this case) to another. When the output file is low resolution, 1/2 degree or lower, the output file looks fine, no corruption that we can detect. It's only when the output file is higher resolution (1/4 degree) that this problem comes about.
>>
>> Jim
>>
>> On Mon, Mar 4, 2013 at 2:04 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>> Hi Russ,
>>
>> It looks like that file was originally created on bluefire on 11/21/11, I don't have any information about which netcdf library was used, but I think that some adjustment may have been made inside netcdf for performance on gpfs filesystems.
>>
>> But doesn't your own
>> int nc__enddef(int ncid, size_t h_minfree, size_t v_align,
>> size_t v_minfree, size_t r_align);
>>
>>
>> allow for changing this alignment? I don't know that that was done for this file, but it would seem to suggest that there is no assumption being violated about these alignments. Or that one part of netcdf is assuming something which another part is not.
>>
>>
>>
>> On Mon, Mar 4, 2013 at 12:53 PM, Unidata netCDF Support <support-netcdf at unidata.ucar.edu> wrote:
>> Hi Jim,
>>
>> I'm curious how the original file you provided was created and perhaps
>> modified. It has a peculiar alignment characteristic that I haven't
>> seen before, and if there are more netCDF files being created the same
>> way, we may need to adapt.
>>
>> Could you tell me the history of the file, what file system it was
>> written on, and whether the netCDF library with which it was written
>> was modified in any way?
>>
>> The file has this characteristic, which would indicate a non-POSIX
>> file system: it is using 512-byte alignment of data values rather than
>> the 4-byte alignment assumed by netCDF. So, for example, the data
>> block for fixed-size variables begins with 9 scalar integers that
>> should take 4 bytes each. The offsets computed for these values from
>> the beginning of the fixed-size data block are 0, 4, 8, 12, 16, 20,
>> 24, 28, 32, so there is no padding or wasted space. The offsets from
>> the beginning of the fixed-size data block that are actually stored in the
>> header for these variables are 0, 512, 1024, ... , 4096. If the file
>> system used to write the data originally could not write data on
>> 4-byte boundaries, I think that violates the assumption of netCDF and
>> POSIX I/O. Nevertheless, if the nc_enddef() call pays attention to the
>> file offsets for each variable that are stored in the header (as the
>> netCDF library does when reading the file), rather than computing them
>> from assuming 4-byte alignment, perhaps this file can be modified
>> correctly.
>>
>> The function where we might be able to adapt to this is
>> nc3internal.c:NC_begins(), which is called from
>> nc3internal.c:NC_enddef(). In any case it's a netCDF bug to write
>> something that can't be later read correctly, so if our unmodified
>> library wrote that file and we can't adapt to it, then it was a bug
>> to not emit an error message for trying to create a file on the original
>> non-POSIX file system. Also, the data seems to all be there in the
>> "corrupted" file, which can be fixed by just restoring the variable
>> offsets in the file header to the peculiar values in the original ...
>>
>> --Russ
>>
>> Russ Rew UCAR Unidata Program
>> russ at unidata.ucar.edu http://www.unidata.ucar.edu
>>
>>
>>
>> Ticket Details
>> ===================
>> Ticket ID: KLB-596506
>> Department: Support netCDF
>> Priority: Normal
>> Status: Closed
>>
>>
>>
>>
>> --
>> Jim Edwards
>>
>> CESM Software Engineering Group
>> National Center for Atmospheric Research
>> Boulder, CO
>> 303-497-1842
>
> --
> Samuel Levis - slevis at ucar.edu
>
> National Center for Atmospheric Research
> PO Box 3000, Boulder CO 80307-3000 <- use for mail
> 3090 Center Green Dr., Boulder CO 80301 <- vs. shipping
>
> tel 303 497-1627; fax -1348; skype: samuellevis2
>
> http://www.cgd.ucar.edu/tss
>
>
> Terrestrial Sciences Section in the
> Climate & Global Dynamics Division
>
>
>
>
> --
> Jim Edwards
>
> CESM Software Engineering Group
> National Center for Atmospheric Research
> Boulder, CO
> 303-497-1842