[EXTERNAL] Re: metadata consistency

Sjaardema, Gregory D gdsjaar at sandia.gov
Thu Jul 11 16:32:38 CDT 2013


I use the put_att_int and put_att_double in data mode to change the value
of an existing attribute.  The values that are changed are a timestamp
indicating the "last written time" and a value listing the maximum name
length of any names on my database...  I'm good with making the call
collective, but requiring it to be made in define mode seems like a major
change.

--Greg
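
A minimal sketch of the usage pattern described above, written as a toy in-memory model rather than real PnetCDF calls. Everything in it (toy_header, toy_put_att, the ATT_* codes, and the attribute names used in the usage note) is hypothetical. It assumes the netCDF rule quoted below: in data mode an attribute may only be overwritten if it already exists and the new value needs no more space than the old one. Treating "same size" as allowed is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT PnetCDF types or error codes. */
enum { ATT_OK = 0, ATT_NOT_FOUND = -1, ATT_NEEDS_DEFINE_MODE = -2 };
enum { MAX_ATTS = 8, MAX_BYTES = 32 };

typedef struct {
    const char   *name;
    size_t        nbytes;            /* bytes currently stored */
    unsigned char value[MAX_BYTES];
} toy_att;

typedef struct {
    toy_att atts[MAX_ATTS];
    int     natts;
    int     in_define_mode;          /* 1 = define mode, 0 = data mode */
} toy_header;

/* Look up an attribute by name; returns NULL if it does not exist. */
toy_att *toy_find(toy_header *h, const char *name)
{
    for (int i = 0; i < h->natts; i++)
        if (strcmp(h->atts[i].name, name) == 0)
            return &h->atts[i];
    return NULL;
}

/* Write an attribute, applying the define-mode/data-mode rule. */
int toy_put_att(toy_header *h, const char *name,
                const void *buf, size_t nbytes)
{
    assert(nbytes <= MAX_BYTES);
    toy_att *a = toy_find(h, name);

    if (h->in_define_mode) {
        /* Define mode: create the attribute if needed, any size. */
        if (a == NULL) {
            assert(h->natts < MAX_ATTS);
            a = &h->atts[h->natts++];
            a->name = name;
        }
    } else {
        /* Data mode: only an in-place overwrite that fits is allowed. */
        if (a == NULL)
            return ATT_NOT_FOUND;
        if (nbytes > a->nbytes)
            return ATT_NEEDS_DEFINE_MODE;
    }
    memcpy(a->value, buf, nbytes);
    a->nbytes = nbytes;              /* toy: record the new length */
    return ATT_OK;
}
```

In this model, a double timestamp and an int maximum name length (say "last_written_time" and "max_name_length") would be created once in define mode. Later same-size overwrites succeed after in_define_mode is cleared, while a new attribute or a longer value is rejected.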

On 7/11/13 1:18 PM, "Wei-keng Liao" <wkliao at ece.northwestern.edu> wrote:

>In netCDF, the put_att_<type> is allowed in data mode only when
>it is used to change an existing attribute. I consider
>this case rare, and thus am asking the community if this is
>a common practice.
>
>The solution of "rank 0 wins" does not solve the read case.
>For example, one process changes a time stamp (as a global attribute)
>in data mode. The other processes will not be made aware of this change.
>So, when other processes call get_att(), they will get the old value.
>Will this be OK?
>
>One milder solution is to make the APIs collective, if they are
>called in data mode.
>
>Wei-keng
>
>On Jul 11, 2013, at 1:21 PM, Rob Latham wrote:
>
>> On Thu, Jul 11, 2013 at 12:09:54PM -0500, Wei-keng Liao wrote:
>>> One minor correction. The last API in the list I intended to name
>>> is the family of ncmpi_put_att_<type> APIs. The <type> can be one
>>> of text, uchar, schar, short, int, ...
>>> 
>> 
>> I'm a little wary of changing our semantics in such a mature piece of
>> software.  I think you are right that most people are already doing
>> this, but it makes me a bit nervous.
>> 
>> the put_att_<type> change has me the most nervous.
>> 
>> It's not as nice as your proposal, but could we just say "rank 0 wins
>> if there is ever inconsistent metadata" ?
>> 
>> ==rob
>> 
>>> Wei-keng
>>> 
>>> On Jul 11, 2013, at 12:03 PM, Wei-keng Liao wrote:
>>> 
>>>> Dear PnetCDF users,
>>>> 
>>>> I am working on strengthening PnetCDF's metadata consistency and
>>>> would like to change/limit the usage of APIs that modify the
>>>> metadata (file header) of a netCDF file. These APIs are:
>>>>   ncmpi_rename_dim(),
>>>>   ncmpi_rename_var(),
>>>>   ncmpi_copy_att(),
>>>>   ncmpi_rename_att(), and
>>>>   ncmpii_put_att().              <------- correction !
>>>> 
>>>> (The consistency here is referring to the consistency of file header
>>>> data stored in memory across all MPI processes.)
>>>> 
>>>> In netCDF, the above APIs are allowed in data mode if the space
>>>> required to store the new metadata (attributes, names, etc.) is
>>>> less than the old one. Otherwise, they must be called in the define
>>>> mode.
>>>> 
>>>> In PnetCDF, I would like to change that to allow these APIs only
>>>> in define mode. If your applications require the above APIs to
>>>> be called in data mode, please do let me know.
>>>> 
>>>> Here is my reason for the above change. In data mode, if metadata
>>>> is changed on one process's memory (or even the change is written
>>>> to the file by that process because NC_SHARE is set), there is no
>>>> way to propagate the change from this process to other processes,
>>>> until ncmpi_close() or ncmpi_sync() is called. If these APIs are
>>>> allowed in define mode only, we can rely on ncmpi_endef() to
>>>> ensure/check the consistency.
>>>> 
>>>> Please let me know if your applications will have a problem with
>>>> such a change.
>>>> 
>>>> (My plan is to make NC_SHARE the default mode for PnetCDF as
>>>> PnetCDF IS developed to handle parallel access to shared files.
>>>> The above suggested API changes are the first step of my plan.)
>>>> 
>>>> 
>>>> Wei-keng
>>>> 
>>> 
>> 
>> -- 
>> Rob Latham
>> Mathematics and Computer Science Division
>> Argonne National Lab, IL USA
>
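
The read-side problem raised in this thread can be sketched with a toy model in which each simulated process keeps its own in-memory copy of one global attribute. This is not MPI or PnetCDF code, and every name here (toy_world, toy_put_independent, toy_rank0_wins, and so on) is hypothetical. "Rank 0 wins" is modeled, as an assumption, as overwriting every copy with rank 0's copy at some later reconciliation point.

```c
#include <assert.h>

enum { NPROCS = 4 };   /* number of simulated processes */

/* One in-memory copy of a single global attribute per process. */
typedef struct {
    double timestamp[NPROCS];
} toy_world;

/* Start with every process holding the same value. */
void toy_init(toy_world *w, double value)
{
    for (int r = 0; r < NPROCS; r++)
        w->timestamp[r] = value;
}

/* Independent put in data mode: only the caller's copy changes. */
void toy_put_independent(toy_world *w, int rank, double value)
{
    assert(rank >= 0 && rank < NPROCS);
    w->timestamp[rank] = value;
}

/* Collective put: all processes take part, so every copy changes. */
void toy_put_collective(toy_world *w, double value)
{
    for (int r = 0; r < NPROCS; r++)
        w->timestamp[r] = value;
}

/* "Rank 0 wins": replace every copy with rank 0's copy. */
void toy_rank0_wins(toy_world *w)
{
    for (int r = 1; r < NPROCS; r++)
        w->timestamp[r] = w->timestamp[0];
}

/* What a get_att-style read on the given process would return. */
double toy_get(const toy_world *w, int rank)
{
    assert(rank >= 0 && rank < NPROCS);
    return w->timestamp[rank];
}

/* Returns 1 if all copies agree, 0 otherwise. */
int toy_is_consistent(const toy_world *w)
{
    for (int r = 1; r < NPROCS; r++)
        if (w->timestamp[r] != w->timestamp[0])
            return 0;
    return 1;
}
```

In this model, after an independent put on rank 2 the other ranks still read the old value until something reconciles the copies. If that reconciliation is toy_rank0_wins(), rank 2's update is discarded. A collective put leaves every copy identical immediately, which is the behavior the "make the call collective in data mode" suggestion is after.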



