writing large variables
John Clyne
clyne at ucar.edu
Wed Jan 16 12:25:07 CST 2013
On Jan 16, 2013, at 11:07 AM, Wei-keng Liao wrote:
> In your case, NC_64BIT_DATA is indeed required.
>
> In netcdf, if you define a variable with > 2^31 elements and it is the last
> variable defined in the file, then you probably can still use CDF-2.
> Below is the netcdf code I tested (using netcdf library version 4.2.1.1).
> var2 is the variable with 8B elements.
>
Thanks for clarifying that, Wei-keng. We are indeed able to write and read "large" variables with netCDF in CDF-2 format, with the restrictions noted. The problem is that we need to write the data in parallel, and the last time we ran the experiment, which was admittedly a while ago, pnetcdf performed significantly better than Unidata's netCDF-4.
jc
> #include <stdio.h>
> #include <stdlib.h>   /* exit() */
> #include <netcdf.h>
>
> #define NZ 2
> #define NY 1048576
> #define NX 8192
>
> /* Print the netCDF error message and exit on any status other than NC_NOERR. */
> #define ERR(e) do { if ((e) != NC_NOERR) { printf("Error: %s\n", nc_strerror(e)); exit(-1); } } while (0)
>
> int main(int argc, char* argv[])
> {
>     int ncid, varid1, varid2, old_modep, cmode, err;
>     int dimids[3];
>     size_t start[3], count[2];
>     double buf = 0.0;   /* value written to the last element of each variable */
>
>     /* CDF-2 (64-bit offset) format */
>     cmode = NC_CLOBBER | NC_64BIT_OFFSET;
>
>     err = nc_create("testfile.nc", cmode, &ncid);      ERR(err);
>     err = nc_def_dim(ncid, "z", NZ, &dimids[0]);       ERR(err);
>     err = nc_def_dim(ncid, "y", NY, &dimids[1]);       ERR(err);
>     err = nc_def_dim(ncid, "x", NX, &dimids[2]);       ERR(err);
>
>     /* var1(z, y): small variable defined first */
>     err = nc_def_var(ncid, "var1", NC_DOUBLE, 2, dimids, &varid1);
>     ERR(err);
>     /* var2(y, x): 2^33 elements, defined last in the file */
>     err = nc_def_var(ncid, "var2", NC_DOUBLE, 2, dimids+1, &varid2);
>     ERR(err);
>
>     /* no-fill mode: do not prefill the variables with fill values */
>     err = nc_set_fill(ncid, NC_NOFILL, &old_modep);    ERR(err);
>     err = nc_enddef(ncid);                             ERR(err);
>
>     /* write the last element */
>     start[0] = NZ-1;
>     start[1] = NY-1;
>     start[2] = NX-1;
>     count[0] = count[1] = 1;
>     /* var1 is indexed by (z, y), so it uses start[0..1] */
>     err = nc_put_vara_double(ncid, varid1, start, count, &buf);
>     ERR(err);
>     /* var2 is indexed by (y, x), so it uses start[1..2] */
>     err = nc_put_vara_double(ncid, varid2, start+1, count, &buf);
>     ERR(err);
>
>     err = nc_close(ncid);                              ERR(err);
>
>     return 0;
> }
>
> % ls -l testfile.nc
> -rw------- 1 wkliao users 68736254100 Jan 16 11:58 testfile.nc
>
> % ncdump -h testfile.nc
> netcdf testfile {
> dimensions:
> z = 2 ;
> y = 1048576 ;
> x = 8192 ;
> variables:
> double var1(z, y) ;
> double var2(y, x) ;
> }
>
> % ncdump -k testfile.nc
> 64-bit offset
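
The size reported by ls above can be cross-checked against the variable shapes. The following is only a sketch in plain C with no netCDF calls; it assumes 8-byte doubles for NC_DOUBLE, and the names BYTES_PER_DOUBLE, var_bytes and data_bytes are made up for illustration.

```c
#include <stdint.h>

#define NZ 2
#define NY 1048576
#define NX 8192

/* Assumption: NC_DOUBLE values occupy 8 bytes each in the file. */
#define BYTES_PER_DOUBLE 8

/* Bytes occupied by a 2-D variable of doubles with shape d0 x d1. */
uint64_t var_bytes(uint64_t d0, uint64_t d1)
{
    return d0 * d1 * BYTES_PER_DOUBLE;
}

/* Raw data bytes for var1(z, y) plus var2(y, x), header not included. */
uint64_t data_bytes(void)
{
    return var_bytes(NZ, NY) + var_bytes(NY, NX);
}
```

With these shapes, var1 accounts for 16777216 bytes and var2 for 68719476736 bytes, 68736253952 in total. The listed file size of 68736254100 is 148 bytes larger, which would be consistent with a small file header preceding the data.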
>
>
> Wei-keng
>
> On Jan 16, 2013, at 11:49 AM, John Clyne wrote:
>
>> Hi Wei-Keng,
>>
>> I should have been more clear. The array has more than 2^31 elements. Our test case presently has on the order of 2^33 elements, and soon we'll need to support 2^36 elements or more.
>>
>> It sounds like the NC_64BIT_DATA flag is required in our case?
>>
>> thanks - jc
>>
>> On Jan 16, 2013, at 9:27 AM, Wei-keng Liao wrote:
>>
>>> Hi, John,
>>>
>>> The mode NC_64BIT_DATA (CDF-5 format) allows you to define an array variable
>>> that has more than 2^31 elements. Note this is about the number of "elements"
>>> not the size of an array.
>>>
>>> If your array has fewer elements but the size is more than 4GB, then
>>> NC_64BIT_OFFSET can still be used. For example, double foo[Z][Y][X] has
>>> Z*Y*X elements. If Z*Y*X < 2^31 and Z*Y*X*sizeof(double) > 2^32, then you
>>> can still use NC_64BIT_OFFSET.
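
The distinction between element count and byte size described above can be checked with a few lines of arithmetic. This is a rough sketch, not library code: the enum labels, ELEM_LIMIT and the function names are hypothetical, and it only encodes the 2^31-element threshold quoted in this message. It does not cover the special case of a large variable defined last in a CDF-2 file.

```c
#include <stdint.h>

/* Hypothetical labels for this sketch; not netCDF or pnetcdf constants. */
typedef enum { FMT_CDF2_OK, FMT_CDF5_NEEDED } format_hint;

/* Element-count threshold quoted in the message above. */
#define ELEM_LIMIT ((uint64_t)1 << 31)

/* Number of elements in a Z x Y x X array. */
uint64_t num_elements(uint64_t z, uint64_t y, uint64_t x)
{
    return z * y * x;
}

/* Size in bytes of an array with nelems elements of elem_size bytes each. */
uint64_t array_bytes(uint64_t nelems, uint64_t elem_size)
{
    return nelems * elem_size;
}

/* The decision depends on the element count only, not on the byte size. */
format_hint suggest_format(uint64_t nelems)
{
    return (nelems < ELEM_LIMIT) ? FMT_CDF2_OK : FMT_CDF5_NEEDED;
}
```

For instance, double foo[1000][1000][1000] has 10^9 elements (below 2^31) but occupies 8*10^9 bytes (above 4GB), so suggest_format() returns FMT_CDF2_OK. A 1048576 x 8192 array has 2^33 elements, so it returns FMT_CDF5_NEEDED.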
>>>
>>> Is this your case?
>>>
>>> Wei-keng
>>>
>>> On Jan 16, 2013, at 9:58 AM, John Clyne wrote:
>>>
>>>> Thanks for the quick response, Rob. I've poked the Unidata folks as well to see if they have any updates on their CDF-5 support plans. One follow-up question: Is it possible to output large variables from pnetcdf without using CDF-5? netCDF seems to support this in a CDF-2 format, albeit with restrictions. For our application we can live with those restrictions.
>>>>
>>>> Thanks again for your help.
>>>>
>>>> Best,
>>>>
>>>> jc
>>>>
>>>> On Jan 16, 2013, at 7:54 AM, Rob Latham wrote:
>>>>
>>>>> On Tue, Jan 15, 2013 at 05:09:05PM -0700, John Clyne wrote:
>>>>>> Is it possible to write a large variable (>4GB) to a file with pnetcdf and read back the variable from the resulting file with netCDF? Outputting a large variable with pnetcdf appears to require passing the NC_64BIT_DATA flag (not NC_64BIT_OFFSET) to nc_create_par() - without this flag an error is generated. The file is written successfully, but when NC_64BIT_DATA is used the file is unrecognized by netcdf. For example:
>>>>>>
>>>>>> yslogin2[43] ncdump -h vx.0000.nc0
>>>>>> ncdump: vx.0000.nc0: NetCDF: Unknown file format
>>>>>>
>>>>>> From what I can gather from the web, the NC_64BIT_DATA flag results in the generation of a CDF-5 formatted file. Is there support for CDF-5 in netCDF, or any other options for mixing pnetcdf and netCDF?
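
When ncdump reports an unknown format, the first bytes of the file show which format was actually written. The sketch below is plain C with no netCDF dependency. It relies on the magic numbers "CDF" followed by a version byte of 1, 2 or 5, and on the 8-byte HDF5 signature used by netCDF-4 files. The function names and returned strings are made up for illustration, and only offset 0 is checked for the HDF5 signature.

```c
#include <stdio.h>
#include <string.h>

/* Describe a file format from its first n bytes (n may be less than 8). */
const char *classify_magic(const unsigned char *b, size_t n)
{
    static const unsigned char hdf5_sig[8] =
        {0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'};

    if (n >= 8 && memcmp(b, hdf5_sig, 8) == 0)
        return "HDF5 (netCDF-4)";
    if (n >= 4 && memcmp(b, "CDF", 3) == 0) {
        switch (b[3]) {
        case 1: return "CDF-1 (classic)";
        case 2: return "CDF-2 (64-bit offset)";
        case 5: return "CDF-5 (64-bit data)";
        }
    }
    return "unknown";
}

/* Read up to 8 leading bytes of a file and classify them. */
const char *classify_file(const char *path)
{
    unsigned char buf[8];
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return "unreadable";
    n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return classify_magic(buf, n);
}
```

Calling classify_file("vx.0000.nc0") on a file written with NC_64BIT_DATA would be expected to return the CDF-5 string, which a netCDF build without CDF-5 support cannot open.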
>>>>>
>>>>> Hi John: the short answer is there is no "unidata netCDF" support for
>>>>> pnetcdf's CDF-5 (giant variables) file format.
>>>>>
>>>>> I've been working with Unidata on and off over the last few years to
>>>>> find a way that we could use NetCDF-4's "netcdf on pnetcdf" feature to
>>>>> support CDF-5, but that support right now only exists as a series of
>>>>> patches yet to be incorporated into Unidata's tree.
>>>>>
>>>>> ==rob
>>>>>
>>>>> --
>>>>> Rob Latham
>>>>> Mathematics and Computer Science Division
>>>>> Argonne National Lab, IL USA
>>>>
>>>> John Clyne
>>>> National Center for Atmospheric Research
>>>> 303.497.1236 (w), 303.809.1922 (c)
>>>> clyne at ucar.edu