collective i/o on zero-dimensional data

Maxwell Kelley kelley at giss.nasa.gov
Mon Oct 4 18:42:07 CDT 2010


Hello,

Some code I ported from a GPFS machine to a Lustre machine took a 
performance hit from switching back and forth between collective mode 
for distributed data and independent mode for non-distributed data. 
Converting the writes of non-distributed data like zero-dimensional (0D) 
variables to collective mode was straightforward, but with a small 
wrinkle. Since the start/count vectors passed to put_vara_double_all 
cannot be used to indicate which process possesses the definitive value of 
a 0D variable, I could only get correct results by ensuring that this 
datum is identical on all processes. Can I count on put_vara_double_all 
always behaving this way, or could future library versions refuse to write 
0D data in collective mode? BTW, the return code did not indicate an error 
when process-varying 0D data were passed to put_vara_double_all.

Grouping independent-mode writes could reduce the number of switches 
between collective and independent mode, but it would require significant 
code reorganization, so I tried the all-collective option first. I could 
also declare 0D variables as 1D arrays of length 1.
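
To make the length-1 option concrete, here is a minimal sketch of how the 
start/count vectors could then name the owning process: the owner selects 
the single element and every other process makes a zero-length request. 
The helper name select_scalar_owner and the sketch_offset typedef are 
hypothetical; real code would use MPI_Offset from mpi.h. I am assuming, 
without having verified it, that the library accepts zero-length requests 
in a collective call.

```c
#include <assert.h>

/* Stand-in for MPI_Offset so the sketch compiles without mpi.h
 * (assumption: real code would use MPI_Offset instead). */
typedef long long sketch_offset;

/* Hypothetical helper: fill start/count for a 1D variable of length 1
 * so that only `owner` selects the element. Every other rank gets a
 * zero-length selection but can still take part in the collective call. */
void select_scalar_owner(int rank, int owner,
                         sketch_offset start[1], sketch_offset count[1])
{
    assert(rank >= 0 && owner >= 0);  /* ranks are non-negative */
    start[0] = 0;                     /* the only valid index */
    count[0] = (rank == owner) ? 1 : 0;
}
```

Each process would then pass start and count, together with the address 
of its local value, to put_vara_double_all in place of the 0D call, so 
that only the owner's value should reach the file.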

Before going any further, I should also ask about the recommended method 
for writing a 0D variable.  Collective I/O?  Or independent I/O with 
system-specific MPI hints (I haven't explored the MPI hints)?  Or should I 
use independent mode for all data, including the distributed arrays?

-Max



