Big file support on BG with 32-bit off_t type

Robert Latham robl at mcs.anl.gov
Tue Jan 9 15:12:40 CST 2007


On Mon, Jan 08, 2007 at 01:54:23PM -0700, John Michalakes wrote:
> Working on Blue Gene, we have a code that uses pnetcdf to
> successfully write a large file (> 2GB, file type "2"). The pnetcdf
> library and code are compiled with 32-bit addressing so that
> sizeof(off_t) is 4. The file appears to be correct based on perusal
> with ncdump.

Great.  Glad to hear at least the file creation part of your code is
working better now.

> We are unable to read the file back in, however, using pnetcdf with
> another program also compiled with 32-bit addressing.  The error
> return is:  NC_ESMALL.
> 
> I believe the section of pnetcdf code returning the error is
> ncmpii_hdr_get_NC, in header.c:
> 
>   /* check version number in last byte of magic */
>   if (magic[sizeof(ncmagic)-1] == 0x1) {
>           getbuf.version = 1;
>   } else if (magic[sizeof(ncmagic)-1] == 0x2) {
>           getbuf.version = 2;
>           fSet(ncp->flags, NC_64BIT_OFFSET);
>           if (sizeof(off_t) != 8) {
>                   /* take the easy way out: if we can't support all CDF-2
>                    * files, return immediately */
>                   free(getbuf.base);
>                   return NC_ESMALL;
> 
> This is around line 1142.
> 
> Question: is this error condition correct and necessary? How does
> 32-bit pnetcdf manage to *write* the large file (apparently)
> correctly?

Hm, that is indeed probably a bug: we should check
sizeof(MPI_Offset), not sizeof(off_t).  Do you still have the
src/lib/ncconfig.h file handy?  Does it have lines like this?

/* The number of bytes in an MPI_Offset */
#define SIZEOF_MPI_OFFSET 8
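
To illustrate why the width of the offset type matters here, below is a
minimal sketch in plain C with no MPI dependency.  The helper names
max_signed_offset and offset_fits are hypothetical and are not part of
pnetcdf; the sketch only assumes that file offsets are stored in a
signed integer type of the given size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not pnetcdf API): largest file offset that a
 * signed integer type of `type_size` bytes can represent,
 * e.g. 4 -> 2^31 - 1 and 8 -> 2^63 - 1. */
static int64_t max_signed_offset(size_t type_size)
{
    if (type_size == 0)
        return 0;
    if (type_size >= 8)
        return INT64_MAX;
    return ((int64_t)1 << (type_size * 8 - 1)) - 1;
}

/* Hypothetical helper: returns 1 if `offset` fits in a signed type of
 * `type_size` bytes, 0 otherwise. */
static int offset_fits(int64_t offset, size_t type_size)
{
    return offset >= 0 && offset <= max_signed_offset(type_size);
}
```

With these helpers, an offset of 3000000000 bytes (past the 2 GB mark)
does not fit in a 4-byte type but does fit in an 8-byte one, which is
why an 8-byte MPI_Offset can address a large file even when off_t is
only 4 bytes.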

> Will pnetcdf compiled for OBJECT_MODE 64 work properly on Blue Gene?

I think that it would, but I also think it should work ok as-is.  
Here's a quick fix.  If ncconfig.h already says SIZEOF_MPI_OFFSET is
8, then can you try changing this line: 

if (sizeof(off_t) != 8)  {

to this

if (sizeof(MPI_Offset) != 8)  {

If that works, then I'll clean up the approach a bit and add it to the
next release.
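
The one-line change above can be sketched in isolation as follows.
This is not the actual header.c code: the function name
check_magic_version and the SKETCH_* return codes are placeholders
invented for illustration, and the offset size is passed in as a
parameter so the sketch builds without MPI.

```c
#include <stddef.h>

/* Placeholder return codes; these are NOT the library's real values. */
#define SKETCH_NOERR     0
#define SKETCH_ESMALL  (-1)  /* stands in for NC_ESMALL */
#define SKETCH_EBADVER (-2)  /* unrecognized version byte */

/* Hypothetical helper: decide whether a file with the given version
 * byte (last byte of the magic string) can be read, given the size in
 * bytes of the type used for file offsets. */
static int check_magic_version(unsigned char version_byte,
                               size_t offset_size, int *version)
{
    if (version_byte == 0x1) {
        *version = 1;            /* CDF-1: 32-bit offsets suffice */
        return SKETCH_NOERR;
    }
    if (version_byte == 0x2) {
        *version = 2;            /* CDF-2: needs 64-bit offsets */
        if (offset_size != 8)
            return SKETCH_ESMALL;
        return SKETCH_NOERR;
    }
    return SKETCH_EBADVER;
}
```

In an MPI build the caller would pass sizeof(MPI_Offset) as
offset_size.  Passing sizeof(off_t) instead reproduces the reported
failure on a 32-bit build, where that value is 4.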

Thanks for the report.  It's great to know that the CDF-2 format is
getting used. 

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
