collective i/o on zero-dimensional data

Wei-keng Liao wkliao at ece.northwestern.edu
Mon Oct 4 23:36:41 CDT 2010


Hi, Max,

Switching between collective and independent data modes is expensive, because fsync will be called each time the mode switches. So, grouping writes together to reduce the number of switches is a good strategy.
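For illustration, a rough sketch of such grouping (the names ncid, rank, varid_a, varid_b are placeholders, not from your code): instead of entering and leaving independent data mode once per non-distributed variable, enter it once, issue all the independent writes, and leave it once.

    /* placeholder names: ncid, rank, varid_a, varid_b, a, b; error checking omitted */
    ncmpi_begin_indep_data(ncid);                 /* one collective -> independent switch */
    if (rank == 0) {                              /* only one process writes the scalars  */
        ncmpi_put_var_double(ncid, varid_a, &a);
        ncmpi_put_var_double(ncid, varid_b, &b);
    }
    ncmpi_end_indep_data(ncid);                   /* one independent -> collective switch */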

As for writing a 0D variable, PnetCDF ignores both the start[] and count[] arguments and always lets the calling process write one value (one element) to the variable.

So, if the call is collective and the writing processes have different values to write, the outcome in the file is undefined (usually the last process wins, but there is no way to know which process is last). One solution is to define the variable as a 1-D array of length 1 and set count[0] to zero on every process except the one whose value you want written to the file.
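A minimal sketch of that workaround (the names ncid, rank, root_rank, "scalar_dim", "scalar_var", and local_value are assumptions for illustration; error checking omitted):

    int dimid, varid;
    MPI_Offset start[1] = {0}, count[1];
    double local_value = 3.14;                      /* process-local candidate value */

    /* define the "scalar" as a 1-D variable of length 1 */
    ncmpi_def_dim(ncid, "scalar_dim", 1, &dimid);
    ncmpi_def_var(ncid, "scalar_var", NC_DOUBLE, 1, &dimid, &varid);
    ncmpi_enddef(ncid);

    /* only root_rank contributes one element; all other processes write zero elements */
    count[0] = (rank == root_rank) ? 1 : 0;
    ncmpi_put_vara_double_all(ncid, varid, start, count, &local_value);

This keeps every call collective while making explicit which process's value ends up in the file.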

As for recommending collective or independent I/O for 0D variables, it depends on your I/O pattern. Do you have a lot of 0D variables? Are they overwritten frequently and by different processes? Please note that good I/O performance usually comes from requests that are large and contiguous.

Using independent mode for all data can hurt performance for the "distributed" arrays, as the independent APIs may produce many small, noncontiguous requests to the file system.

Wei-keng

On Oct 4, 2010, at 6:42 PM, Maxwell Kelley wrote:

> 
> Hello,
> 
> Some code I ported from a GPFS to a Lustre machine was hit by the performance effects of switching back and forth between collective mode for distributed data and independent mode for non-distributed data. Converting the writes of non-distributed data like zero-dimensional (0D) variables to collective mode was straightforward, but with a small wrinkle. Since the start/count vectors passed to put_vara_double_all cannot be used to indicate which process possesses the definitive value of a 0D variable, I could only get correct results by ensuring that this datum is identical on all processes. Can I count on put_vara_double_all always behaving this way, or could future library versions refuse to write 0D data in collective mode? BTW the return code did not indicate an error when process-varying 0D data was passed to put_vara_double_all.
> 
> Grouping independent-mode writes could reduce the number of switches between collective and independent mode but would require significant code reorganization so I tried the all-collective option first. I could also declare 0D variables as 1D arrays of length 1.
> 
> Before going any further, I should also ask about the recommended method for writing a 0D variable.  Collective I/O?  Or independent I/O with system-specific MPI hints (I haven't explored the MPI hints)?  Or should I use independent mode for all data, including the distributed arrays?
> 
> -Max
> 
> 
