nfmpi_put_var_char expected behavior?

Wei-keng Liao wkliao at ece.northwestern.edu
Mon Aug 6 11:18:40 CDT 2012


NetCDF F90 user guide says "If prefill is not on, the data writer must explicitly provide a null terminating byte."
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-f90.html#Reading-and-Writing-Character-String-Values
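
To show why the terminator matters on the read side, here is a minimal sketch. It assumes the character buffer has already been filled by a get_var_text-style call and that the reader treats the first null byte as the end of the string. The module and function names (text_reader_sketch, text_length) are made up for illustration and are not part of netCDF or PnetCDF.

```fortran
module text_reader_sketch
    implicit none
contains
    ! Number of characters that precede the first null byte in buf.
    ! If buf holds no null byte at all, fall back to the length
    ! without trailing blanks, as for a blank-padded Fortran string.
    pure function text_length(buf) result(n)
        character(len=*), intent(in) :: buf
        integer :: n

        n = index(buf, char(0))   ! position of the first null, or 0
        if (n > 0) then
            n = n - 1             ! keep only what comes before the null
        else
            n = len_trim(buf)     ! no terminator: drop trailing blanks
        end if
    end function text_length
end module text_reader_sketch
```

With this helper, a reader would use rbuf(1:text_length(rbuf)) instead of trim(rbuf). Without a terminator in the file, the fallback branch can only strip blanks, so any undefined bytes after the text stay in the result.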

See the example in the NetCDF F77 user guide (link below), which adds a null byte:
     TXVAL(TXLEN:TXLEN) = CHAR(0)   ! null terminate
 
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-f77.html#Reading-and-Writing-Character-String-Values
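
For the write side, a minimal sketch of one way to do this for a buffer of any length: copy the text and set every unused position to char(0), the same idea as the do-loop quoted further down this thread. The names text_writer_sketch and nul_pad are hypothetical helpers, not part of netCDF or PnetCDF, and the truncation behavior is my own assumption.

```fortran
module text_writer_sketch
    implicit none
contains
    ! Return text (ignoring its trailing blanks) in a buffer of length
    ! width, with every remaining position set to the null byte.
    ! Assumption: text longer than width is truncated, in which case
    ! no terminator fits in the buffer.
    pure function nul_pad(text, width) result(buf)
        character(len=*), intent(in) :: text
        integer, intent(in) :: width
        character(len=width) :: buf
        integer :: n

        n = min(len_trim(text), width)   ! characters actually copied
        buf = repeat(char(0), width)     ! start from an all-null buffer
        buf(1:n) = text(1:n)             ! overwrite the front with the text
    end function nul_pad
end module text_writer_sketch
```

A caller would then write, for example, buf = nul_pad("./none/foo.009.nc", 80) before passing buf to the put_var_text call, so that all 80 characters handed to the library are defined.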

Wei-keng

On Aug 6, 2012, at 8:00 AM, Jim Edwards wrote:

> As in the example, NC_FILL is off in netcdf4.   The issue isn't what ncdump does, it's what get_var_text does.   When a string is written with put_var_text as in the example, get_var_text cannot read it properly.   I should not have to modify my code to make this work correctly.
> 
> On Sun, Aug 5, 2012 at 9:44 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> Hi, Jim,
> 
> Please understand that pnetcdf does follow the semantics of "put_var_text"
> correctly, and those garbage characters are expected. NetCDF's strategy of
> buffering the write data until nc_sync or nc_close may slow performance for
> large variables.
> 
> Even if I added a char(0) at the end of the string, ncmpidump/ncdump would still
> print the trailing garbage characters, because both dumps check all trailing
> characters against char(0), starting from the end of the array and moving toward
> the beginning. So, in order to get the same output as netcdf4, pnetcdf would have
> to fill the entire unwritten part of the variable with char(0), which is
> equivalent to implementing the NF_FILL mode.
> 
> In order to get the same results as netcdf, you can add the following to your code.
> 
>     do i=LEN_TRIM(buf)+1, 80
>         buf(i:i) = char(0)
>     enddo
>     status = nfmpi_put_var_text(ncid, varid, buf)
> 
> Hope this helps.
> 
> Wei-keng
> 
> On Aug 4, 2012, at 8:01 AM, Jim Edwards wrote:
> 
> > Hi Wei-keng,
> >
> > The trailing junk may be there in the netcdf output, but it is preceded by a char(0) null string terminator, so the variable written using netcdf is read correctly.  The pnetcdf variable cannot be read correctly as written.   I don't think that you need to copy to another buffer; you just need to add a terminator to the string.
> >
> > Jim
> >
> > On Fri, Aug 3, 2012 at 3:55 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > Hi, Jim
> >
> > I can reproduce the results you got when using ncmpidump and ncdump.
> >
> > When you use trim(buf) in nf90_put_var(), you are telling netcdf4
> > to write 17 characters (implicitly, count(1)==17 will be used internally
> > due to function overloading in F90). So, in theory, those trailing garbage
> > characters are expected.
> >
> > After checking the netcdf 4.1.3 source code, I found that the data to
> > be written is first copied to an internal buffer (of chunk size
> > 8192 by default) and later flushed to the file. The internal buffer
> > appears to be initialized to all 0s. NetCDF developers can confirm this.
> >
> > So, in your case, the call to nf90_put_var() is actually copying 17
> > characters to that buffer. Since ncdump/ncmpidump skips the trailing
> > '\0' characters, the output shows no trailing garbage.
> >
> > In Pnetcdf, we do not copy write data to a temporary buffer, so the
> > unwritten part of the variable contains undefined contents.
> >
> > Wei-keng
> >
> > On Aug 3, 2012, at 9:09 AM, Jim Edwards wrote:
> >
> > > Hi Wei-keng,
> > >
> > > Here is a program that shows the problem.   Running with pnetcdf I get
> > >  var = "./none/foo.009.ncuth\000\000\000\000 |\215\000\000\000\000\000p\257\377\377\377\177\000\000\250\260\377\377\377\177\000\000\230\260\377\377\377\177\000\000\001\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000v\377a" ;
> > >
> > > while netcdf results in:
> > >
> > >  var = "./none/foo.009.nc" ;
> > >
> > >
> > > no garbage, no extra space.
> > >
> > >
> > > #ifdef PNETCDF
> > > program main
> > >     use pnetcdf
> > >     implicit none
> > >     include 'mpif.h'
> > >
> > >     integer i, status, ncid, varid, dimid(1)
> > >     integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> > >     character*(80) buf, rbuf
> > >
> > >     call MPI_Init(status)
> > >     status = nfmpi_create(MPI_COMM_WORLD, 'ptestfile.nc', &
> > >                           NF_CLOBBER, MPI_INFO_NULL, ncid)
> > >     len = 80
> > >     status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> > >     status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> > >     status = nfmpi_enddef(ncid)
> > >     status = nfmpi_begin_indep_data(ncid)
> > >
> > >     buf = "./none/foo.009.nc"
> > >     status = nfmpi_put_var_text(ncid, varid,trim(buf))
> > >     status = nfmpi_end_indep_data(ncid)
> > >
> > >     status = nfmpi_close(ncid)
> > >     call MPI_Finalize(status)
> > > end program
> > > #else
> > > program main
> > >     use netcdf
> > >     implicit none
> > >
> > >     integer i, status, ncid, varid, dimid(1)
> > >     integer :: len, count(1), offset(1)
> > >     character*(80) buf, rbuf
> > >
> > >     status = nf90_create('ntestfile.nc', &
> > >                           NF90_CLOBBER, ncid)
> > >     len = 80
> > >     status = nf90_set_fill(ncid, NF90_NOFILL, i)
> > >     status = nf90_def_dim(ncid, "dim", len, dimid(1))
> > >     status = nf90_def_var(ncid, "var", NF90_CHAR, dimid, varid)
> > >     status = nf90_enddef(ncid)
> > >
> > >     buf = "./none/foo.009.nc"
> > >     status = nf90_put_var(ncid, varid, trim(buf))
> > >
> > >     status = nf90_close(ncid)
> > > end program
> > > #endif
> > >
> > >
> > > On Thu, Aug 2, 2012 at 9:19 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > >
> > > I assume you are calling nfmpi_put_var_text. The "var" APIs are
> > > intended for writing the entire variable, which in your case means
> > > writing 80 characters. PnetCDF (and netCDF) will not stop writing at the
> > > end of the string (string length == 17 in your case). Instead, it writes
> > > all 80 characters from the I/O buffer into the file.
> > >
> > > Are you seeing those 0s from running ncmpidump, but not from ncdump?
> > > Strangely, I got the same output from both ncmpidump and ncdump (4.1.3).
> > > See below. Let me know if this program does the same thing as yours.
> > >
> > > program main
> > >     use pnetcdf
> > >     implicit none
> > >     include 'mpif.h'
> > >
> > >     integer i, status, ncid, varid, dimid(1)
> > >     integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> > >     character*(80) buf, rbuf
> > >
> > >     call MPI_Init(status)
> > >     status = nfmpi_create(MPI_COMM_WORLD, 'testfile.nc', &
> > >                           NF_CLOBBER, MPI_INFO_NULL, ncid)
> > >     len = 80
> > >     status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> > >     status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> > >     status = nfmpi_enddef(ncid)
> > >     status = nfmpi_begin_indep_data(ncid)
> > >
> > >     buf = "./none/foo.009.nc"
> > >     status = nfmpi_put_var_text(ncid, varid, buf)
> > >     status = nfmpi_end_indep_data(ncid)
> > >
> > >     status = nfmpi_close(ncid)
> > >     call MPI_Finalize(status)
> > > end program
> > >
> > > % ncmpidump testfile.nc
> > > netcdf testfile {
> > > // file format: CDF-1
> > > dimensions:
> > >         dim = 80 ;
> > > variables:
> > >         char var(dim) ;
> > > data:
> > >
> > >  var = "./none/foo.009.nc                                                               " ;
> > > }
> > >
> > > % 4.1.3/bin/ncdump testfile.nc
> > > netcdf testfile {
> > > dimensions:
> > >         dim = 80 ;
> > > variables:
> > >         char var(dim) ;
> > > data:
> > >
> > >  var = "./none/foo.009.nc                                                               " ;
> > > }
> > >
> > >
> > > Wei-keng
> > >
> > > On Aug 2, 2012, at 9:26 AM, Jim Edwards wrote:
> > >
> > > > Hi Wei-keng,
> > > >
> > > > In standard netcdf I set NC_NOFILL when I open the file and I still get this behavior.   I think that you just need to null terminate the string when you pass it from Fortran to C.
> > > >
> > > > Jim
> > > >
> > > > On Wed, Aug 1, 2012 at 7:58 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > > Hi, Jim
> > > >
> > > > NetCDF's default for fill mode is NC_FILL and the default fill value for char type
> > > > is NC_FILL_CHAR == (char)0. See netCDF user guide below.
> > > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#nc_005fset_005ffill
> > > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#Fill-Values
> > > >
> > > > In PnetCDF, only NC_NOFILL is implemented. So the contents of any space
> > > > that was not written by the application are undefined.
> > > >
> > > >
> > > > Wei-keng
> > > >
> > > > On Aug 1, 2012, at 11:27 AM, Jim Edwards wrote:
> > > >
> > > > > If I declare a character string variable with a length x and then write a string of length y<x using nfmpi_put_var_char
> > > > > what is the expected behavior?     I think that what I am getting is incorrect (a bunch of garbage in the string from y:x )
> > > > >
> > > > > So for example I want to write string './none/foo.009.nc' into a variable of length 80.    In the file I am getting:
> > > > >
> > > > > './none/foo.009.nc 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0'
> > > > >
> > > > >
> > > > > I think that this is a bug.
> > > > >
> > > > >
> > > > > This is happening with the latest pnetcdf svn trunk code on jaguarpf compiled using:
> > > > >
> > > > > Currently Loaded Modulefiles:
> > > > >   1) modules/3.2.6.6                      10) xpmem/0.1-2.0400.31280.3.1.gem       19) DefApps
> > > > >   2) xtpe-network-gemini                  11) xe-sysroot/4.0.46                    20) altd/1.0
> > > > >   3) pgi/12.5.0                           12) xt-asyncpe/5.11                      21) subversion/1.6.17
> > > > >   4) xt-libsci/11.1.00                    13) atp/1.4.1                            22) szip/2.1
> > > > >   5) udreg/2.3.2-1.0400.5038.0.0.gem      14) PrgEnv-pgi/4.0.46                    23) hdf5/1.8.7
> > > > >   6) ugni/2.3-1.0400.4374.4.88.gem        15) xt-mpich2/5.5.0                      24) netcdf/4.1.3
> > > > >   7) pmi/3.0.0-1.0000.8661.28.2807.gem    16) xtpe-interlagos                      25) esmf/5.2.0rp1
> > > > >   8) dmapp/3.2.1-1.0400.4782.3.1.gem      17) eswrap/1.0.9
> > > > >   9) gni-headers/2.1-1.0400.4351.3.1.gem  18) lustredu/1.0
> > > > >
> > > > > --
> > > > > Jim Edwards
> > > > >
> > > > > CESM Software Engineering Group
> > > > > National Center for Atmospheric Research
> > > > > Boulder, CO
> > > > > 303-497-1842
> > > > >