nfmpi_put_var_char expected behavior?

Rob Latham robl at mcs.anl.gov
Mon Aug 6 11:33:29 CDT 2012


On Mon, Aug 06, 2012 at 11:18:40AM -0500, Wei-keng Liao wrote:
> 
> NetCDF F90 user guide says "If prefill is not on, the data writer must explicitly provide a null terminating byte."
> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-f90.html#Reading-and-Writing-Character-String-Values
> 
> See the example in the NetCDF F77 user guide (link below), which adds a null byte:
>      TXVAL(TXLEN:TXLEN) = CHAR(0)   ! null terminate
>  
> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-f77.html#Reading-and-Writing-Character-String-Values

Wei-keng, it sounds like you are saying pnetcdf is more
netcdf-compliant than serial netcdf.  Good job, us!  But
sometimes groups need to interoperate based on actual implementation
behavior, not on what's documented in the spec.

Let's see if we can find an easy way to accommodate Jim.
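
To make the disagreement concrete, the two ways of deciding where a
character variable "ends" can be put side by side.  This is only a
minimal sketch: the module and function names are made up for
illustration and are not part of PnetCDF or netCDF, and it makes no
claim about how either library or either dump tool is implemented
beyond what is described in the quoted messages below.

```fortran
! Sketch only: hypothetical names, not PnetCDF or netCDF API.
module char_len_sketch
    implicit none
contains

    ! Length after dropping trailing char(0) bytes, scanning from the
    ! end of the array toward the beginning (the dump-style check
    ! described in the quoted messages below).
    integer function len_trailing_nul_trim(s) result(n)
        character(len=*), intent(in) :: s
        n = len(s)
        do while (n > 0)
            if (s(n:n) /= char(0)) exit
            n = n - 1
        end do
    end function len_trailing_nul_trim

    ! Length up to, but not including, the first char(0), i.e. the
    ! C-string convention.  Returns len(s) when no char(0) is present.
    integer function len_first_nul(s) result(n)
        character(len=*), intent(in) :: s
        n = index(s, char(0)) - 1
        if (n < 0) n = len(s)
    end function len_first_nul

end module char_len_sketch
```

For the 80-character variable in this thread, if position 18 holds
char(0) and some later positions hold non-null leftovers,
len_first_nul returns 17, while len_trailing_nul_trim returns the
position of the last non-null leftover byte.  A single terminator is
enough for a reader that stops at the first null, but not for a tool
that only strips nulls from the end.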

==rob

> Wei-keng
> 
> On Aug 6, 2012, at 8:00 AM, Jim Edwards wrote:
> 
> > As in the example, netcdf4 NC_FILL is OFF.   The issue isn't what ncdump does, it's what get_var_text does.   When a string is written with put_var_text as in the example, get_var_text cannot properly read it.   I should not have to modify my code to have this work correctly.
> > 
> > On Sun, Aug 5, 2012 at 9:44 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > Hi, Jim,
> > 
> > Please understand that pnetcdf does correctly follow the semantics of
> > "put_var_text" and that those garbage characters are expected. NetCDF's
> > strategy of buffering the write data until nc_sync or nc_close may cause
> > slow performance for large variables.
> > 
> > Even if I added a char(0) at the end of the string, ncmpidump/ncdump will still
> > print the trailing garbage characters, because both dumps check all trailing
> > characters against char(0), starting from the end of the array toward the beginning.
> > So, in order to get the same output as netcdf4, pnetcdf has to fill all the
> > non-written part of the variable with char(0), which is equivalent to implementing
> > the NF_FILL mode.
> > 
> > In order to get the same results as netcdf, you can add the following to your code.
> > 
> >     do i=LEN_TRIM(buf)+1, 80
> >         buf(i:i) = char(0)
> >     enddo
> >     status = nfmpi_put_var_text(ncid, varid, buf)
> > 
> > Hope this helps.
> > 
> > Wei-keng
> > 
> > On Aug 4, 2012, at 8:01 AM, Jim Edwards wrote:
> > 
> > > Hi Wei-keng,
> > >
> > > The trailing junk may be there in the netcdf output but it is preceded by a char(0) null string terminator, so the variable written using netcdf is read correctly.  The pnetcdf variable cannot be read correctly as written.   I don't think that you need to copy to another buffer; you just need to add a terminator to the string.
> > >
> > > Jim
> > >
> > > On Fri, Aug 3, 2012 at 3:55 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > Hi, Jim
> > >
> > > I can reproduce the results you got when using ncmpidump and ncdump.
> > >
> > > When you use trim(buf) in nf90_put_var(), you are telling netcdf4
> > > to write 17 characters (implicitly count(1)==17 will be used internally
> > > due to function overloading in F90). So, in theory, those trailing garbage
> > > characters are expected.
> > >
> > > After checking the netcdf 4.1.3 source code, I found that the data to
> > > be written will be first copied to an internal buffer (of chunk size
> > > 8192 by default) and later flushed to the file. The internal buffer
> > > appears to be initialized to all 0s. NetCDF developers can confirm this.
> > >
> > > So, in your case, the call to nf90_put_var() is actually copying 17
> > > characters to that buffer. Since ncdump/ncmpidump skips the trailing
> > > '\0' characters, the output gives no trailing garbage.
> > >
> > > In Pnetcdf, we do not copy write data to a temporary buffer, so the
> > > unwritten part of the variable contains undefined contents.
> > >
> > > Wei-keng
> > >
> > > On Aug 3, 2012, at 9:09 AM, Jim Edwards wrote:
> > >
> > > > Hi Wei-keng,
> > > >
> > > > Here is a program that shows the problem.   Running with pnetcdf I get
> > > >  var = "./none/foo.009.ncuth\000\000\000\000 |\215\000\000\000\000\000p\257\377\377\377\177\000\000\250\260\377\377\377\177\000\000\230\260\377\377\377\177\000\000\001\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000v\377a" ;
> > > >
> > > > while netcdf results in:
> > > >
> > > >  var = "./none/foo.009.nc" ;
> > > >
> > > >
> > > > no garbage, no extra space.
> > > >
> > > >
> > > > #ifdef PNETCDF
> > > > program main
> > > >     use pnetcdf
> > > >     implicit none
> > > >     include 'mpif.h'
> > > >
> > > >     integer i, status, ncid, varid, dimid(1)
> > > >     integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> > > >     character*(80) buf, rbuf
> > > >
> > > >     call MPI_Init(status)
> > > >     status = nfmpi_create(MPI_COMM_WORLD, 'ptestfile.nc', &
> > > >                           NF_CLOBBER, MPI_INFO_NULL, ncid)
> > > >     len = 80
> > > >     status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> > > >     status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> > > >     status = nfmpi_enddef(ncid)
> > > >     status = nfmpi_begin_indep_data(ncid)
> > > >
> > > >     buf = "./none/foo.009.nc"
> > > >     status = nfmpi_put_var_text(ncid, varid,trim(buf))
> > > >     status = nfmpi_end_indep_data(ncid)
> > > >
> > > >     status = nfmpi_close(ncid)
> > > >     call MPI_Finalize(status)
> > > > end program
> > > > #else
> > > > program main
> > > >     use netcdf
> > > >     implicit none
> > > >
> > > >     integer i, status, ncid, varid, dimid(1)
> > > >     integer :: len, count(1), offset(1)
> > > >     character*(80) buf, rbuf
> > > >
> > > >     status = nf90_create('ntestfile.nc', &
> > > >                           NF90_CLOBBER, ncid)
> > > >     len = 80
> > > >     status = nf90_set_fill(ncid, NF90_NOFILL, i)
> > > >     status = nf90_def_dim(ncid, "dim", len, dimid(1))
> > > >     status = nf90_def_var(ncid, "var", NF90_CHAR, dimid, varid)
> > > >     status = nf90_enddef(ncid)
> > > >
> > > >     buf = "./none/foo.009.nc"
> > > >     status = nf90_put_var(ncid, varid, trim(buf))
> > > >
> > > >     status = nf90_close(ncid)
> > > > end program
> > > > #endif
> > > >
> > > >
> > > > On Thu, Aug 2, 2012 at 9:19 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > >
> > > > I assume you are calling nfmpi_put_var_text. The "var" APIs are
> > > > intended for writing the entire variable, which in your case means
> > > > writing 80 characters. PnetCDF (and netCDF) will not stop writing at the
> > > > end of the string (string length == 17 in your case). Instead, it writes
> > > > the whole 80 characters from the I/O buffer into the file.
> > > >
> > > > Are you seeing those 0s from running ncmpidump, but not from ncdump?
> > > > Strangely, I got the same output from both ncmpidump and ncdump (4.1.3).
> > > > See below. Let me know if this program does the same thing as yours.
> > > >
> > > > program main
> > > >     use pnetcdf
> > > >     implicit none
> > > >     include 'mpif.h'
> > > >
> > > >     integer i, status, ncid, varid, dimid(1)
> > > >     integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> > > >     character*(80) buf, rbuf
> > > >
> > > >     call MPI_Init(status)
> > > >     status = nfmpi_create(MPI_COMM_WORLD, 'testfile.nc', &
> > > >                           NF_CLOBBER, MPI_INFO_NULL, ncid)
> > > >     len = 80
> > > >     status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> > > >     status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> > > >     status = nfmpi_enddef(ncid)
> > > >     status = nfmpi_begin_indep_data(ncid)
> > > >
> > > >     buf = "./none/foo.009.nc"
> > > >     status = nfmpi_put_var_text(ncid, varid, buf)
> > > >     status = nfmpi_end_indep_data(ncid)
> > > >
> > > >     status = nfmpi_close(ncid)
> > > >     call MPI_Finalize(status)
> > > > end program
> > > >
> > > > % ncmpidump testfile.nc
> > > > netcdf testfile {
> > > > // file format: CDF-1
> > > > dimensions:
> > > >         dim = 80 ;
> > > > variables:
> > > >         char var(dim) ;
> > > > data:
> > > >
> > > >  var = "./none/foo.009.nc                                                               " ;
> > > > }
> > > >
> > > > % 4.1.3/bin/ncdump testfile.nc
> > > > netcdf testfile {
> > > > dimensions:
> > > >         dim = 80 ;
> > > > variables:
> > > >         char var(dim) ;
> > > > data:
> > > >
> > > >  var = "./none/foo.009.nc                                                               " ;
> > > > }
> > > >
> > > >
> > > > Wei-keng
> > > >
> > > > On Aug 2, 2012, at 9:26 AM, Jim Edwards wrote:
> > > >
> > > > > Hi Wei-keng,
> > > > >
> > > > > In standard netcdf I set NC_NO_FILL when I open the file and I still get this behavior.   I think that you just need to null terminate the string when you pass it from fortran to c.
> > > > >
> > > > > Jim
> > > > >
> > > > > On Wed, Aug 1, 2012 at 7:58 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > > > Hi, Jim
> > > > >
> > > > > NetCDF's default for fill mode is NC_FILL and the default fill value for char type
> > > > > is NC_FILL_CHAR == (char)0. See netCDF user guide below.
> > > > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#nc_005fset_005ffill
> > > > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#Fill-Values
> > > > >
> > > > > In PnetCDF, only NC_NOFILL is implemented. So, for those spaces that were not
> > > > > written by the application, their contents are undefined.
> > > > >
> > > > >
> > > > > Wei-keng
> > > > >
> > > > > On Aug 1, 2012, at 11:27 AM, Jim Edwards wrote:
> > > > >
> > > > > > If I declare a character string variable with a length x and then write a string of length y<x using nfmpi_put_var_char
> > > > > > what is the expected behavior?     I think that what I am getting is incorrect (a bunch of garbage in the string from y:x )
> > > > > >
> > > > > > So for example I want to write string './none/foo.009.nc' into a variable of length 80.    In the file I am getting:
> > > > > >
> > > > > > './none/foo.009.nc 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0'
> > > > > >
> > > > > >
> > > > > > I think that this is a bug.
> > > > > >
> > > > > >
> > > > > > This is happening with the latest pnetcdf svn trunk code on jaguarpf compiled using:
> > > > > >
> > > > > > Currently Loaded Modulefiles:
> > > > > >   1) modules/3.2.6.6                      10) xpmem/0.1-2.0400.31280.3.1.gem       19) DefApps
> > > > > >   2) xtpe-network-gemini                  11) xe-sysroot/4.0.46                    20) altd/1.0
> > > > > >   3) pgi/12.5.0                           12) xt-asyncpe/5.11                      21) subversion/1.6.17
> > > > > >   4) xt-libsci/11.1.00                    13) atp/1.4.1                            22) szip/2.1
> > > > > >   5) udreg/2.3.2-1.0400.5038.0.0.gem      14) PrgEnv-pgi/4.0.46                    23) hdf5/1.8.7
> > > > > >   6) ugni/2.3-1.0400.4374.4.88.gem        15) xt-mpich2/5.5.0                      24) netcdf/4.1.3
> > > > > >   7) pmi/3.0.0-1.0000.8661.28.2807.gem    16) xtpe-interlagos                      25) esmf/5.2.0rp1
> > > > > >   8) dmapp/3.2.1-1.0400.4782.3.1.gem      17) eswrap/1.0.9
> > > > > >   9) gni-headers/2.1-1.0400.4351.3.1.gem  18) lustredu/1.0
> > > > > >
> > > > > > --
> > > > > > Jim Edwards
> > > > > >
> > > > > > CESM Software Engineering Group
> > > > > > National Center for Atmospheric Research
> > > > > > Boulder, CO
> > > > > > 303-497-1842
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Jim Edwards
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Jim Edwards
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > > --
> > > Jim Edwards
> > >
> > >
> > >
> > 
> > 
> > 
> > 
> > -- 
> > Jim Edwards
> > 
> > 
> > 
> 

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA