Independent write

Rob Ross rross at mcs.anl.gov
Fri Mar 12 09:59:53 CST 2004


On Fri, 12 Mar 2004, Joachim Worringen wrote:

> Rob Ross:
> > To implement an append mode, there would have to be some sort of
> > communication between processes that kept everyone up to date about what
> > entry was the "last" one, and some mechanism for ensuring that only one
> > process got to write to that position.
> >
> > That is a generally difficult thing to implement in a low-overhead,
> > scalable way, regardless of the API.
> 
> One might define an "atomic mode" for pNetCDF, like in MPI-IO.

Ack no!!!!  I *hate* the MPI-IO atomic mode!!!  We'd have to augment the
interface to have an append mode as well, because of the race condition
mentioned.

> MPI-2 one-sided communication could be a way to implement the required
> synchronization (double-locking two windows to access some data in one
> of the windows atomically, like fetch&increment - not nice, but works).
> Its efficiency would depend on the progress characteristics of the
> underlying MPI implementation and/or the interconnect and/or the
> communication behaviour of the application, but it would work anyway.

Yeah, this is a clever way to implement this sort of functionality.  
There is at least one group looking at this sort of thing for consistent
caching in the MPI-IO layer.

Seriously though, the thing that I like best about PnetCDF is that it is 
*simple*.  There isn't anything complicated going on behind the scenes, 
and it is streamlined for the operations that it does provide.

If we were to add this kind of functionality, performance would likely be
terrible in that mode.  But users would use it rather than spending a
little extra time to do things in a more efficient way.  They would
complain about the performance of PnetCDF and would never know that
everything could have been quite a bit faster.  So I'd rather just not 
implement that and have the question come up the way it has, so we have an 
opportunity to help users do the right thing.

Maybe I should add a section to the document on "implementing append 
operations in PnetCDF", describing a couple of approaches and giving some 
code?  I would much rather spend a little time doing this and helping 
people do things the right way.

> As many interconnect today allow RMA, the performance of passive
> synchronization should become better with the MPI implementations
> exploiting such capabilities.

Agreed.  There are definitely lots of opportunities there!  RMA 
implementations need to be more widespread though...

> BTW, what's the state of the one-sided communication in MPICH2? ;-)

Ask Rajeev :).  I think it's pretty close.

Rob




More information about the parallel-netcdf mailing list