Using parallel write with subset of processors
Latham, Robert J.
robl at mcs.anl.gov
Tue May 10 10:54:34 CDT 2022
On Mon, 2022-05-09 at 12:51 -0700, Pascale Garaud wrote:
> Is there a best way to do that efficiently? The code currently uses a
> collective PUT_VAR_ALL to write the 3D dataset to file, but that
> would not work for the slice (and hangs when I try).
These collective I/O calls require all processes to participate in the call... but not all processes need to have data.
For processes that do not have any data, you can set 'count' to zero. Consider the common "put_vara_float_all" call as one example:
int ncmpi_put_vara_float_all(int ncid, int varid, const MPI_Offset *start,
const MPI_Offset *count, const float *op);
That 'count' parameter can just be an N-dimensional array of zeros for the processes with no data.
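As a rough illustration of the zero-count idea, here is a minimal sketch of how each rank might fill 'start' and 'count' for a single-plane write. Everything about the layout is an assumption made for the example, not something taken from the original code: the variable has dimensions (z, y, x), the domain is split along z only, every rank owns the same number of planes, and the helper name slice_start_count and the type sketch_offset are hypothetical. sketch_offset stands in for MPI_Offset so the fragment compiles without MPI headers.

```c
#include <assert.h>

/* Stand-in for MPI_Offset (assumption: lets the sketch build without <mpi.h>).
   In a real PnetCDF program, use MPI_Offset directly. */
typedef long long sketch_offset;

/* Hypothetical helper. Assumes a 1-D decomposition along z in which each rank
   owns nz_local consecutive planes beginning at rank * nz_local.
   Fills start/count for writing the plane at global index k_slice.
   Returns 1 if this rank owns the plane, 0 otherwise. */
int slice_start_count(int rank, sketch_offset nz_local,
                      sketch_offset ny, sketch_offset nx,
                      sketch_offset k_slice,
                      sketch_offset start[3], sketch_offset count[3])
{
    assert(nz_local > 0);

    sketch_offset z0 = (sketch_offset)rank * nz_local;
    int owns = (k_slice >= z0 && k_slice < z0 + nz_local);

    /* The owner writes one full (y, x) plane at k_slice. */
    start[0] = owns ? k_slice : 0;
    start[1] = 0;
    start[2] = 0;

    /* Ranks with no data still get a valid request: count is all zeros. */
    count[0] = owns ? 1 : 0;
    count[1] = owns ? ny : 0;
    count[2] = owns ? nx : 0;

    return owns;
}

/* Every rank, owner or not, would then make the same collective call:
 *
 *     ncmpi_put_vara_float_all(ncid, varid, start, count, buf);
 *
 * with start and count declared as MPI_Offset arrays.
 */
```

The point of the sketch is that no rank skips the collective call; the ranks outside the slice simply ask to write nothing.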
> I could just copy the whole data for the slice into a single
> processor, and then do an "independent" write for that processor, but
> that doesn't seem to be very efficient.
Indeed! Please don't do this.
> I tried to understand how to use IPUT instead, but I am very confused
> about the syntax / procedure, especially given that all of the
> examples I have seen end up using all processors for the write.
IPUT is a fun optimization. Once you get the hang of the "blocking" versions, revisit the "non-blocking" routines, especially if you have writes to multiple variables.
==rob