Re: metadata consistency

Phil Miller mille121 at illinois.edu
Thu Jul 18 14:46:01 CDT 2013


On Thu, Jul 18, 2013 at 12:24 PM, Rob Latham <robl at mcs.anl.gov> wrote:
> On Thu, Jul 18, 2013 at 01:17:00PM -0500, Wei-keng Liao wrote:
>> A question for PnetCDF users. Say your program stops in the middle
>> of PnetCDF I/O and files are not closed: is it acceptable for the
>> number of records in the file header to be smaller than the number
>> of records actually written in the file body? Or would you simply
>> discard the files? The answer will determine what default settings
>> should be used.
>
> That's a good point.  I'd like to hear from our users, but our "worst
> case" here is not a corrupt file as it might be in the HDF5 case, but
> rather a large datafile with many (possibly all?) records unreachable.
>
> The header would always declare "one record variable of the following
> shape, with N records".  It's only when the program crashes before
> closing that the number of records reported could be less than the
> number of records actually in the file.
>
> We could probably write a recovery tool that, based on the size of the
> file, can make a pretty good guess as to the number of records that
> should exist.

This tool will need to be very carefully written, and might be
impossible to implement correctly. If a write to a record variable
gets committed far into a file, the file length will be at least the
end point of that write. However, that doesn't mean that all of the
data up to that point will also have been committed. The filesystem
could still have holes in the file: reads will return zeros anywhere
a write didn't reach stable storage before execution halted. The
protection against this is, of course,
having all writers successfully call one of fdatasync()/fsync()/sync()
and synchronize amongst themselves before the header gets updated.
Even then, we still have to trust that the OS and parallel filesystem
will behave themselves, and have actually made the data safe, when one
of those calls returns.
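
For concreteness, the ordering I mean looks roughly like the sketch
below. This is not PnetCDF's actual internals; NUMRECS_OFFSET is a
made-up placeholder, and a real implementation would also byte-swap
the count to the file's on-disk byte order:

#include <mpi.h>
#include <unistd.h>

/* Hypothetical location of the record count within the header. */
#define NUMRECS_OFFSET 4

/* Flush all record data, synchronize, and only then let rank 0
 * advertise the new record count in the header. */
void commit_numrecs(MPI_Comm comm, int fd, unsigned int new_numrecs)
{
    int rank;
    MPI_Comm_rank(comm, &rank);

    fdatasync(fd);       /* each writer's data to stable storage */
    MPI_Barrier(comm);   /* nobody proceeds until everyone has synced */

    if (rank == 0) {
        pwrite(fd, &new_numrecs, sizeof(new_numrecs), NUMRECS_OFFSET);
        fdatasync(fd);   /* make the header update durable too */
    }
}

If the crash happens anywhere before the final pwrite(), the header
keeps its old, smaller count, so the worst case is unreachable
records rather than a count that points past committed data.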

So, it would be one thing to rewrite the record count field for
visualization, with a warning to the user that the trailing records
might be bogus. It would be another thing entirely to let some other
process consume those potentially-invalid records. The latter
possibility is probably going to bite some users really hard, even if
they have been loudly warned.
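
For what it's worth, the arithmetic of the guess itself is trivial
once the header has been parsed; something like this hypothetical
helper, where data_begin and record_size both come from the header:

#include <sys/stat.h>

/* Sketch: bound the record count by the file length.  Rounding
 * down treats a partially written final record as absent. */
long long guess_numrecs(const char *path, long long data_begin,
                        long long record_size)
{
    struct stat st;
    if (stat(path, &st) != 0 || record_size <= 0)
        return -1;                    /* cannot guess */
    if (st.st_size <= data_begin)
        return 0;                     /* no record data at all */
    return (st.st_size - data_begin) / record_size;
}

But per the above, a file of the right length can still be full of
holes, so this is an upper bound on the usable records, not a
guarantee that any of them are intact.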

