Unchecked memory allocation and potential performance problem
William Gropp
gropp at mcs.anl.gov
Wed Dec 6 10:40:20 CST 2006
By aligned, I meant on file block boundaries. Just as data not on
"word size" boundaries can be slow in the processor, data not on file
block boundaries, particularly when multiple threads/processes are
accessing the same file, can be slower than aligned data (see
O_DIRECT restrictions on some filesystems). Of course, those
boundaries are multiples of 16 to 256k :)
Bill
On Dec 6, 2006, at 10:18 AM, Russ Rew wrote:
> On Wed, Dec 06, 2006 at 09:53:18 -0600, Rob Latham wrote:
>> Off the top of my head there are two not-too-hard ways we can do
>> this:
>>
>> There's nothing in the CDF-1 or CDF-2 file format spec that prevents
>> us from using an arbitrarily large header to describe the data. If we
>> know the right parameters for alignment and blocksize, we can pad the
>> header out to a useful point (which might somewhat reduce the chance a
>> re-definition would trigger a costly data shuffle).
>>
>> Same thing for variables. We don't *have* to place variables butting
>> up against each other. They could also be padded out to beneficial
>> points in the file. This change would be more invasive than padding
>> the header.
>
> There is a documented serial netCDF-3 interface for reserving extra
> space in the header and for controlling alignment of the data sections
> for fixed-size and record variables, using the function nc__enddef
> (note the two underscores in the name):
>
> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#nc_005f_005fenddef
>
> Also by default, data for variables starts on four-byte boundaries, so
> badly aligned accesses should not occur except possibly when getting
> subsets of byte or short variables.
>
> --Russ
>
>
>
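For readers following the thread, the padding idea discussed above (pushing the end of the header, or the start of a variable, out to the next file block boundary) comes down to rounding an offset up to a multiple of the block size. The following is only a minimal sketch of that arithmetic in C. The helper names align_up and pad_bytes are hypothetical and are not part of netCDF or parallel-netcdf, and the right block size depends on the filesystem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round `offset` up to the next multiple of `align`.
 * `align` stands in for a file block size and must be non-zero. */
static size_t align_up(size_t offset, size_t align)
{
    assert(align > 0);
    size_t rem = offset % align;
    return rem == 0 ? offset : offset + (align - rem);
}

/* Hypothetical helper: number of padding bytes needed after `offset`
 * so that whatever follows starts on an `align` boundary. */
static size_t pad_bytes(size_t offset, size_t align)
{
    return align_up(offset, align) - offset;
}
```

As an illustration with made-up numbers, a header that ends at byte 5000 in a file with an assumed 65536-byte block size would need pad_bytes(5000, 65536) = 60536 bytes of padding for the data section to start at offset 65536. An offset that already sits on a boundary needs no padding.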