Inquiry function bug?

Rob Ross rross at mcs.anl.gov
Fri Mar 12 09:30:00 CST 2004


On 12 Mar 2004, Roger Ting wrote:

> Spending a bit more time with the problem I described in my last
> message, I think the problem stems from the fact that the independent
> mode data functions don't update the file after each record is added.
> Therefore, the unlimited dimension is not incremented after one
> processor adds a record to the file. The other processor doesn't see
> the change and therefore overwrites the record added by the first.

There are no updates made to the header in the midst of data mode, 
independent or collective.

> Is there a way to flush out the addition after each turn? I realize
> that nfmpi_sync is a collective function.  Hence, it can't be used if
> I want to access the file independently.

Actually, I can't think of any reason why you can't call nfmpi_sync
while in independent mode, as long as you keep in mind that it is a
collective function, so every process has to call it.

Any comments from anyone on this?  I'd like to clarify this in the 
documentation once we have an agreement.

> The weird thing is that I can use nfmpi_inq_dimlen even though I am in
> independent mode. This causes some confusion. I thought all inquiry
> functions were collective operations.

They are.  From the API document:

  _These calls are all collective operations_ (see Appendix B for
  rationale).

  As in the original NetCDF interface, they may be called from either 
  define or data mode.  _They return information stored prior to the last 
  open, enddef, or sync._

This is why your approach of using nfmpi_inq_dimlen isn't going to work:
you would need to call nfmpi_sync between all your operations in order to
get the right dimension value.  Since that call is collective, your
processes would have to synchronize each time, at which point you'd be
better off just doing collective I/O anyway.
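
To make the stale-value problem concrete, here is a toy model in Python.
It is not PnetCDF code: SharedFile, Process, sync_all, and run are made-up
names, and the model only assumes what the API document says, namely that
an inquiry returns the value stored at the last open, enddef, or sync.

```python
# Toy model only: not PnetCDF code. It mimics a header value (the record
# count) that each process caches and refreshes only at open or sync.

class SharedFile:
    """Stands in for the file on disk: a header record count plus records."""
    def __init__(self):
        self.numrecs = 0
        self.records = {}


class Process:
    def __init__(self, name, shared):
        self.name = name
        self.shared = shared
        self.cached_numrecs = shared.numrecs  # header read at "open"

    def inq_dimlen(self):
        # Local lookup: returns what was stored at the last open or sync.
        return self.cached_numrecs

    def append_record(self, value):
        # Independent write: lands at the index this process believes is next.
        index = self.inq_dimlen()
        self.shared.records[index] = value
        self.cached_numrecs = index + 1
        return index


def sync_all(processes):
    # Collective step: every process takes part, the largest local record
    # count is written to the header, and all caches are refreshed.
    shared = processes[0].shared
    counts = [shared.numrecs] + [p.cached_numrecs for p in processes]
    shared.numrecs = max(counts)
    for p in processes:
        p.cached_numrecs = shared.numrecs


def run(with_sync):
    shared = SharedFile()
    p0, p1 = Process("p0", shared), Process("p1", shared)
    p0.append_record("from p0")
    if with_sync:
        sync_all([p0, p1])
    p1.append_record("from p1")
    sync_all([p0, p1])
    return shared.records


print(run(with_sync=False))  # p1 reuses index 0 and overwrites p0's record
print(run(with_sync=True))   # p1 sees the new count and writes at index 1
```

Without the sync in the middle, both processes believe the next free
record is 0, which matches the overwrite you are seeing.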

Just because a function is collective doesn't mean that it (a) requires 
communication or (b) forces synchronization.  At the moment it is up to 
you to call the function on all your processes; we don't enforce that.  
Enforcing it would require communication that we currently don't need, 
and that would make the implementation *slower*, which we certainly don't 
want.
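
As a rough illustration of how processes can agree on something without
communicating, here is a small Python sketch.  The layout is hypothetical
and not part of any PnetCDF API: it assumes every process appends exactly
one record per round, and that all processes know the same base record
count from the last open, enddef, or sync.  record_index is a made-up
helper name.

```python
# Hypothetical layout, not a PnetCDF API: every process appends one record
# per round, and records within a round are ordered by rank.

def record_index(base, round_number, nprocs, rank):
    """Record slot for `rank` in `round_number`, counted from `base`."""
    if not 0 <= rank < nprocs:
        raise ValueError("rank must be in [0, nprocs)")
    return base + round_number * nprocs + rank


nprocs = 4
base = 10  # record count all processes saw at the last open/enddef/sync
for round_number in range(2):
    slots = [record_index(base, round_number, nprocs, r) for r in range(nprocs)]
    print(round_number, slots)
```

Each process computes its own slot from values it already has, so no two
processes pick the same record and no inquiry call is needed between
writes.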

Hope this helps; let us know if we can help you reorganize things a bit to 
work a little more effectively.

Regards,

Rob



