[EXTERNAL] Re: metadata consistency

Rob Latham robl at mcs.anl.gov
Thu Jul 18 14:24:15 CDT 2013


On Thu, Jul 18, 2013 at 01:17:00PM -0500, Wei-keng Liao wrote:
> A question to PnetCDF users. Say, if your program stops in the middle
> of PnetCDF I/O and files are not closed, is it acceptable to see the
> number of records from the file header smaller than the data written
> in the file body? Or would you just simply discard the files?
> The answer will determine what default settings should be used.

That's a good point.  I'd like to hear from our users, but our "worst
case" here is not a corrupt file as it might be in the HDF5 case, but
rather a large datafile with many (possibly all?) records unreachable.

The header would always declare "one record variable of the following
shape, with N records".  It's only when crashing before closing that
the number of records reported could be less than the number of records
actually in the file.
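
To make "the number of records reported" concrete, here is a minimal
sketch of reading that count straight from the start of a file.  It
assumes the classic file layout, where the header opens with the magic
bytes "CDF", a one-byte version, and then the record count as a
big-endian integer (4 bytes for CDF-1 and CDF-2, 8 bytes for CDF-5).
The function name read_numrecs is made up for illustration and is not
part of PnetCDF.

```python
import struct


def read_numrecs(header: bytes) -> int:
    """Return the record count stored at the start of a classic-format header.

    Hypothetical helper: assumes `header` holds at least the first
    12 bytes of the file.
    """
    if header[:3] != b"CDF":
        raise ValueError("missing 'CDF' magic bytes")
    version = header[3]
    if version in (1, 2):
        # CDF-1 and CDF-2: numrecs is a 4-byte big-endian unsigned integer.
        (numrecs,) = struct.unpack(">I", header[4:8])
    elif version == 5:
        # CDF-5: numrecs is an 8-byte big-endian unsigned integer.
        (numrecs,) = struct.unpack(">Q", header[4:12])
    else:
        raise ValueError(f"unexpected version byte: {version}")
    return numrecs
```

After a crash this returns whatever count was last flushed, which may
be smaller than the number of records physically present in the file.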

We could probably write a recovery tool that, based on the size of the
file, can make a pretty good guess as to the number of records that
should exist.
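
A rough sketch of the arithmetic such a tool might use, not an existing
utility.  It assumes the header has already been parsed to obtain two
values, both named here only for illustration: begin_rec, the file
offset where the record section starts, and recsize, the number of
bytes one record occupies across all record variables.

```python
def estimate_numrecs(file_size: int, begin_rec: int, recsize: int) -> int:
    """Guess how many complete records fit between begin_rec and end of file.

    Hypothetical helper: begin_rec and recsize are assumed to come
    from an already-parsed header.
    """
    if recsize <= 0:
        raise ValueError("recsize must be positive")
    record_bytes = file_size - begin_rec
    if record_bytes <= 0:
        # The file ends before the record section starts: no records.
        return 0
    # Floor division counts only whole records; a trailing partial
    # record is treated as not written.
    return record_bytes // recsize
```

The result is only a guess: the file size reflects the highest offset
written, so records below that offset that were never actually written
would still be counted.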

==rob

> Wei-keng
> 
> On Jul 18, 2013, at 12:33 PM, Rob Latham wrote:
> 
> > On Thu, Jul 18, 2013 at 11:47:57AM -0500, Wei-keng Liao wrote:
> >> There is an issue with flushing the number of records to file that I would
> >> like to discuss here. In r1364, since the number of records is part of the
> >> file header, it is flushed to file only when NC_SHARE is used. Otherwise, it
> >> is flushed at file close time. However, because flushing it only requires
> >> writing a 4-byte (CDF and CDF2) or 8-byte (CDF5) integer, flushing it should
> >> cost much less than other header changes, which require flushing the
> >> entire header. I can change the flushing to be performed whether or not
> >> NC_SHARE is set. Please let me know your preference on when the
> >> number of records should be flushed, as this can affect what you will
> >> see from the file header if the program stops/exits before file close.
> > 
> > As our HDF5 friends can attest, metadata consistency is a big deal. 
> > 
> > I'm nervous, though, about introducing an additional 8 byte write to
> > every record variable I/O operation.   I guess we should try it and
> > see what the costs are.  
> > 
> > ==rob
> > 
> > -- 
> > Rob Latham
> > Mathematics and Computer Science Division
> > Argonne National Lab, IL USA
> 

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


More information about the parallel-netcdf mailing list