pnetcdf and large transfers
Rob Latham
robl at mcs.anl.gov
Tue Jul 2 16:44:25 CDT 2013
On Tue, Jul 02, 2013 at 04:29:38PM -0500, Wei-keng Liao wrote:
> > Perhaps I misunderstand, but I think that in the case that the I/O is to a single variable and the variable size is such that the access cannot be too large, we can safely avoid the allreduce. Right?
>
> If the variables are fixed-sized (non-record) and the size is defined < 2GiB, then you are right
> we can avoid the allreduce (for blocking APIs only). Otherwise, I think allreduce is still
> necessary for RobL's approach.
Right. That won't catch all cases, but it will catch the important
ones: if the amount of data stored is small, then there's not a lot of
I/O over which to amortize the cost of the allreduce.
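
As a rough illustration of the "variable is too small to overflow" test discussed above, here is a minimal sketch in C. The helper name var_fits_in_int_count and its arguments are hypothetical (they are not part of PnetCDF), and it assumes a fixed-size variable whose dimension lengths are all known up front. If the whole variable occupies no more than INT_MAX bytes, no single access to it can exceed an int count of bytes, so no extra collective would be needed to agree on that.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not a PnetCDF routine: returns 1 if a fixed-size
 * variable with the given dimension lengths and element size occupies at
 * most INT_MAX bytes, and 0 otherwise. */
int var_fits_in_int_count(int ndims, const long long *dimlens,
                          long long elem_size)
{
    long long total = elem_size;
    int i;

    assert(ndims >= 0 && elem_size > 0);

    /* An empty variable transfers nothing, so it trivially fits. */
    for (i = 0; i < ndims; i++) {
        assert(dimlens[i] >= 0);
        if (dimlens[i] == 0)
            return 1;
    }

    for (i = 0; i < ndims; i++) {
        /* Test before multiplying so the running product never overflows. */
        if (total > INT_MAX / dimlens[i])
            return 0;
        total *= dimlens[i];
    }
    return total <= INT_MAX;
}
```

For example, a 1024 x 1024 x 256 variable of 4-byte elements is 1 GiB and passes the check, while the same shape with 8-byte elements is exactly 2 GiB and does not.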
> > Is there something additional that we could learn as an artifact of the collective (currently proposed as an allreduce) that might help us in optimizing I/O generally?
>
> I am not sure for optimization, just thinking about making it work.
> I wonder when the request size is that big, should we worry about the cost
> of that one additional allreduce?
> I just remembered that if the default collective I/O buffer size is used (cb_buffer_size=16MiB),
> then the largest individual read/write (made by aggregators) is 16 MiB. Thus,
> I don't think we will have a problem for collective I/O (where two-phase I/O is actually
> involved). The problem case is independent I/O. Is my understanding correct?
We still need to "feed" the MPI-IO routine, though. MPI_File_read_all
takes an integer 'count' parameter: presently we pass a count of N
elements of type MPI_BYTE, so N itself has to fit in an int.
Yes, once we get the request down into MPI-IO we're in good shape.
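
To make the 'count' limitation concrete, here is a small sketch of the arithmetic behind one common workaround: describe the request as a modest number of larger blocks plus a remainder, so that every integer handed to MPI fits in an int. The names big_count and split_big_count are hypothetical, the block size is an arbitrary choice, and the MPI calls that would build a datatype from these numbers (for instance with MPI_Type_contiguous) are left out so the fragment compiles without MPI. It is not necessarily the approach taken in PnetCDF.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical result type: nbytes == nblocks * block_size + remainder. */
typedef struct {
    int ok;         /* 1 if the split is representable with int counts */
    int nblocks;    /* number of full blocks */
    int remainder;  /* leftover bytes, always less than block_size */
} big_count;

/* Split a byte count that may exceed INT_MAX into int-sized pieces. */
big_count split_big_count(long long nbytes, int block_size)
{
    big_count r = {0, 0, 0};
    long long nblocks;

    assert(nbytes >= 0 && block_size > 0);

    nblocks = nbytes / block_size;
    if (nblocks > INT_MAX)
        return r;               /* block_size too small for this request */

    r.ok = 1;
    r.nblocks = (int)nblocks;
    r.remainder = (int)(nbytes % block_size);
    return r;
}
```

With a 1 MiB block, a request of 5 GiB plus 7 bytes becomes 5120 full blocks and a 7-byte remainder, both comfortably within int range.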
> > I would like to have a solution in ROMIO also, but prefer a
> > solution that is available soonest to our users, and a PnetCDF fix
> > is superior with respect to that metric (as RobL says)...
>
> In this case, we can use RobL's approach on blocking (independent?)
> APIs and for big variables. For other cases, return errors?
I'm going to do two things (and not do one thing):
- check for the _x variants of the type routines. If those exist, I
  will presume the implementation has given some thought to datatypes
  describing large amounts of memory.
- implement the RobR "only for large variables" optimization
- leave non-blocking alone for now, on the assumption that
  non-blocking's primary use case is to combine many small I/O
  requests into larger ones.
==rob
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA