bad performance

Tue Dec 1 10:34:47 CST 2009

On Thu, Nov 26, 2009 at 06:29:47PM +0100, Joeckel, Patrick wrote:
> Dear Rob,
> 
> here are some numbers (the wall-clock in seconds for one model
> output time step):
> 
> 1 serial netCDF with gather: 39.18
> 2 parallel-netCDF, no hints: 120.5
> 3 parallel-netCDF, IBM_io_buffer_size=2949120: 142.0
> 4 parallel-netCDF, IBM_sparse_access=true: 49.29
> 5 parallel-netCDF, 3 and 4: 58.45
> 
> It seems that ONLY setting IBM_sparse_access to true
> is the fastest option, but unfortunately still
> almost 26% slower than serial netCDF.

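Thanks for this information, Patrick.  Very thought-provoking.  Your
numbers show IBM_io_buffer_size making things slower (142.0 s vs.
120.5 s with no hints), so it looks like it's not such a good idea to
make pnetcdf set that buffer parameter automatically after all.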
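
For reference, here is a small sketch that works out how much slower
each configuration is relative to the serial baseline.  The timings are
copied from the quoted message; the labels and variable names are only
illustrative.

```python
# Wall-clock seconds per model output time step, from the quoted message.
timings = {
    "serial netCDF with gather": 39.18,
    "parallel-netCDF, no hints": 120.5,
    "parallel-netCDF, IBM_io_buffer_size=2949120": 142.0,
    "parallel-netCDF, IBM_sparse_access=true": 49.29,
    "parallel-netCDF, both hints": 58.45,
}

baseline = timings["serial netCDF with gather"]

# Percent slower than the serial baseline (0.0 for the baseline itself).
slowdown = {name: (t / baseline - 1.0) * 100.0 for name, t in timings.items()}

for name, pct in slowdown.items():
    print(f"{name:45s} {timings[name]:7.2f} s  {pct:+7.1f}%")
```

The IBM_sparse_access=true case comes out at roughly 25.8%, which
matches the "almost 26%" figure.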

Even with a 26% slowdown, are there maybe other benefits to using
pnetcdf?  For example, with the serial gather approach rank 0 needs
enough memory to temporarily hold all the data it gathers from the
other processes, whereas with pnetcdf each process writes its own
portion, so there is less memory pressure on rank 0.
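
A rough back-of-the-envelope sketch of that memory argument follows.
The process count and per-process array size are hypothetical, not
taken from Patrick's model, and the estimate ignores any buffering
inside the MPI-IO layer.

```python
def gather_root_bytes(nprocs, local_elems, elem_bytes=8):
    """Bytes rank 0 must hold to gather one variable from all ranks."""
    return nprocs * local_elems * elem_bytes


def local_bytes(local_elems, elem_bytes=8):
    """Bytes each rank holds for its own piece of the variable."""
    return local_elems * elem_bytes


# Hypothetical values, chosen only for illustration.
nprocs = 256             # number of MPI processes
local_elems = 1_000_000  # 8-byte values per process for one variable

root = gather_root_bytes(nprocs, local_elems)
per_rank = local_bytes(local_elems)

print(f"gather on rank 0 : {root / 1e6:.0f} MB")
print(f"own data per rank: {per_rank / 1e6:.0f} MB")
```

With a gather, the buffer on rank 0 grows linearly with the number of
processes; the data each rank hands to pnetcdf does not.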

We'll need to think some more about how to improve record variable
I/O, as that's commonly used by several important applications.  We
think it will require some improvements at both the pnetcdf and MPI-IO
layers.  

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA