nfmpi_put_var_char expected behavior?
Rob Latham
robl at mcs.anl.gov
Mon Aug 6 11:33:29 CDT 2012
On Mon, Aug 06, 2012 at 11:18:40AM -0500, Wei-keng Liao wrote:
>
> NetCDF F90 user guide says "If prefill is not on, the data writer must explicitly provide a null terminating byte."
> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-f90.html#Reading-and-Writing-Character-String-Values
>
> So check the example from the NetCDF F77 user guide from the link below that adds a null byte:
> TXVAL(TXLEN:TXLEN) = CHAR(0) ! null terminate
>
> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-f77.html#Reading-and-Writing-Character-String-Values
Wei-keng, it sounds like you are saying pnetcdf is more
netcdf-compliant than serial netcdf. Good job, us! But
sometimes groups need to interoperate based on actual implementation
behavior, not on what's documented in the spec.
Let's see if we can find an easy way to accommodate Jim.
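One possible shape for such an accommodation, as a rough sketch only: fill the unwritten tail of the variable with char(0) before the buffer reaches the library, so the file holds the text followed by NULs. The helper name nul_padded and the length of 80 are assumptions made for illustration; this is not an existing PnetCDF routine.

```fortran
program pad_demo
  implicit none
  integer, parameter :: varlen = 80      ! assumed length of the char variable
  character(len=varlen) :: padded
  integer :: i, nnul

  padded = nul_padded("./none/foo.009.nc", varlen)

  ! Count the NUL bytes that now follow the text.
  nnul = 0
  do i = 1, varlen
    if (padded(i:i) == char(0)) nnul = nnul + 1
  end do
  print *, "NUL bytes in buffer:", nnul

contains

  ! Hypothetical helper: return text in a buffer of length n whose
  ! remaining bytes are char(0) instead of Fortran blank padding.
  pure function nul_padded(text, n) result(out)
    character(len=*), intent(in) :: text
    integer, intent(in) :: n
    character(len=n) :: out
    integer :: m

    m = min(len(text), n)
    out(1:m) = text(1:m)
    if (m < n) out(m+1:n) = repeat(char(0), n - m)
  end function nul_padded

end program pad_demo
```

The padded buffer could then be handed to nfmpi_put_var_text in place of trim(buf), so that all 80 bytes of the variable are defined.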
==rob
> Wei-keng
>
> On Aug 6, 2012, at 8:00 AM, Jim Edwards wrote:
>
> > As in the example, netcdf4 NC_FILL is OFF. The issue isn't what ncdump does, it's what get_var_text does. When a string is written with put_var_text as in the example, get_var_text cannot properly read it. I should not have to modify my code to have this work correctly.
> >
> > On Sun, Aug 5, 2012 at 9:44 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > Hi, Jim,
> >
> > Please understand pnetcdf does correctly follow the semantics of "put_var_text"
> > and those garbage characters are expected. The strategy of netCDF's buffering
> > the write data until nc_sync or nc_close may cause slow performance for large
> > variables.
> >
> > Even if I added a char(0) at the end of the string, ncmpidump/ncdump will still
> > print the garbage trailing characters, because both dumps check all trailing
> > characters against char(0), starting from the end of the array toward the beginning.
> > So, in order to get the same output as netcdf4, pnetcdf has to fill all the
> > non-written part of the variable with char(0), which is equivalent to implementing
> > the NF_FILL mode.
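The trailing scan described above can be sketched as follows. This is an illustration of the described behavior only, not the actual ncdump or ncmpidump source; the function name printed_length and the 8-byte buffers are assumptions.

```fortran
program trailing_nul_scan
  implicit none
  character(len=8) :: one_nul, all_nul

  ! "abc", a single terminator, then stand-in garbage bytes.
  one_nul = "abc" // char(0) // "?#@!"
  ! "abc" followed only by NULs, as a fill mode would leave it.
  all_nul = "abc" // repeat(char(0), 5)

  print *, printed_length(one_nul)   ! 8: the garbage is still printed
  print *, printed_length(all_nul)   ! 3: the trailing NULs are skipped

contains

  ! Walk back from the end of the array while the bytes are char(0)
  ! and return the number of characters that would remain.
  pure integer function printed_length(s) result(n)
    character(len=*), intent(in) :: s

    n = len(s)
    do while (n > 0)
      if (s(n:n) /= char(0)) exit
      n = n - 1
    end do
  end function printed_length

end program trailing_nul_scan
```

A scan of this kind never sees a lone terminator that sits in front of non-NUL bytes, which is why only a fully NUL-filled tail gives the clean dump.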
> >
> > To get the same results as netcdf, you can add the following to your code.
> >
> > ! replace the blank padding after the text with NULs
> > do i=LEN_TRIM(buf)+1, 80
> >    buf(i:i) = char(0)
> > enddo
> > status = nfmpi_put_var_text(ncid, varid, buf)
> >
> > Hope this helps.
> >
> > Wei-keng
> >
> > On Aug 4, 2012, at 8:01 AM, Jim Edwards wrote:
> >
> > > Hi Wei-keng,
> > >
> > > The trailing junk may be there in the netcdf output, but it is preceded by a char(0) null string terminator, so the variable written using netcdf is read correctly. The pnetcdf variable cannot be read correctly as written. I don't think that you need to copy to another buffer; you just need to add a terminator to the string.
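For comparison, a reader that honors the terminator could look like the sketch below. It assumes the reader treats the first char(0) as the end of the text; whether get_var_text itself does this is not confirmed here, and text_before_nul is a hypothetical helper.

```fortran
program first_nul_read
  implicit none
  character(len=8) :: raw

  ! "abc", a terminator, then stand-in garbage bytes.
  raw = "abc" // char(0) // "?#@!"
  print *, "[" // text_before_nul(raw) // "]"   ! shows [abc]

contains

  ! Return the characters that precede the first char(0); if there is
  ! no terminator, fall back to trimming trailing blanks.
  pure function text_before_nul(s) result(out)
    character(len=*), intent(in) :: s
    character(len=:), allocatable :: out
    integer :: k

    k = index(s, char(0))
    if (k > 0) then
      out = s(1:k-1)
    else
      out = trim(s)
    end if
  end function text_before_nul

end program first_nul_read
```

With this kind of reader a single terminator is enough, whatever bytes follow it in the file.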
> > >
> > > Jim
> > >
> > > On Fri, Aug 3, 2012 at 3:55 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > Hi, Jim
> > >
> > > I can reproduce the results you got when using ncmpidump and ncdump.
> > >
> > > When you used trim(buf) in nf90_put_var(), you are telling netcdf4
> > > to write 17 characters (implicitly count(1)==17 will be used internally
> > > due to function overloading in F90). So, in theory, those trailing garbage
> > > characters are expected.
> > >
> > > After checking the netcdf 4.1.3 source code, I found that the data to
> > > be written will be first copied to an internal buffer (of chunk size
> > > 8192 by default) and later flushed to the file. The internal buffer
> > > appears to be initialized to all 0s. NetCDF developers can confirm this.
> > >
> > > So, in your case, the call to nf90_put_var() is actually copying 17
> > > characters to that buffer. Since ncdump/ncmpidump skip the trailing
> > > '\0' characters, the output shows no trailing garbage.
> > >
> > > In Pnetcdf, we do not copy write data to a temporary buffer, so the
> > > unwritten part of the variable contains undefined contents.
> > >
> > > Wei-keng
> > >
> > > On Aug 3, 2012, at 9:09 AM, Jim Edwards wrote:
> > >
> > > > Hi Wei-keng,
> > > >
> > > > Here is a program that shows the problem. Running with pnetcdf I get
> > > > var = "./none/foo.009.ncuth\000\000\000\000 |\215\000\000\000\000\000p\257\377\377\377\177\000\000\250\260\377\377\377\177\000\000\230\260\377\377\377\177\000\000\001\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000v\377a" ;
> > > >
> > > > while netcdf results in:
> > > >
> > > > var = "./none/foo.009.nc" ;
> > > >
> > > >
> > > > no garbage, no extra space.
> > > >
> > > >
> > > > #ifdef PNETCDF
> > > > program main
> > > > use pnetcdf
> > > > implicit none
> > > > include 'mpif.h'
> > > >
> > > > integer i, status, ncid, varid, dimid(1)
> > > > integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> > > > character*(80) buf, rbuf
> > > >
> > > > call MPI_Init(status)
> > > > status = nfmpi_create(MPI_COMM_WORLD, 'ptestfile.nc', &
> > > > NF_CLOBBER, MPI_INFO_NULL, ncid)
> > > > len = 80
> > > > status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> > > > status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> > > > status = nfmpi_enddef(ncid)
> > > > status = nfmpi_begin_indep_data(ncid)
> > > >
> > > > buf = "./none/foo.009.nc"
> > > > status = nfmpi_put_var_text(ncid, varid,trim(buf))
> > > > status = nfmpi_end_indep_data(ncid)
> > > >
> > > > status = nfmpi_close(ncid)
> > > > call MPI_Finalize(status)
> > > > end program
> > > > #else
> > > > program main
> > > > use netcdf
> > > > implicit none
> > > >
> > > > integer i, status, ncid, varid, dimid(1)
> > > > integer :: len, count(1), offset(1)
> > > > character*(80) buf, rbuf
> > > >
> > > > status = nf90_create('ntestfile.nc', &
> > > > NF90_CLOBBER, ncid)
> > > > len = 80
> > > > status = nf90_set_fill(ncid, NF90_NOFILL, i)
> > > > status = nf90_def_dim(ncid, "dim", len, dimid(1))
> > > > status = nf90_def_var(ncid, "var", NF90_CHAR, dimid, varid)
> > > > status = nf90_enddef(ncid)
> > > >
> > > > buf = "./none/foo.009.nc"
> > > > status = nf90_put_var(ncid, varid, trim(buf))
> > > >
> > > > status = nf90_close(ncid)
> > > > end program
> > > > #endif
> > > >
> > > >
> > > > On Thu, Aug 2, 2012 at 9:19 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > >
> > > > I assume you are calling nfmpi_put_var_text. The "var" APIs are
> > > > intended for writing the entire variable, which in your case means writing
> > > > 80 characters. PnetCDF (and netCDF) will not stop writing at the end
> > > > of the string (string length == 17 in your case). Instead, it writes
> > > > the whole 80 characters from the I/O buffer into the file.
> > > >
> > > > Are you seeing those 0s from running ncmpidump, but not from ncdump?
> > > > Strangely, I got the same output from both ncmpidump and ncdump (4.1.3).
> > > > See below. Let me know if this program does the same thing as yours.
> > > >
> > > > program main
> > > > use pnetcdf
> > > > implicit none
> > > > include 'mpif.h'
> > > >
> > > > integer i, status, ncid, varid, dimid(1)
> > > > integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> > > > character*(80) buf, rbuf
> > > >
> > > > call MPI_Init(status)
> > > > status = nfmpi_create(MPI_COMM_WORLD, 'testfile.nc', &
> > > > NF_CLOBBER, MPI_INFO_NULL, ncid)
> > > > len = 80
> > > > status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> > > > status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> > > > status = nfmpi_enddef(ncid)
> > > > status = nfmpi_begin_indep_data(ncid)
> > > >
> > > > buf = "./none/foo.009.nc"
> > > > status = nfmpi_put_var_text(ncid, varid, buf)
> > > > status = nfmpi_end_indep_data(ncid)
> > > >
> > > > status = nfmpi_close(ncid)
> > > > call MPI_Finalize(status)
> > > > end program
> > > >
> > > > % ncmpidump testfile.nc
> > > > netcdf testfile {
> > > > // file format: CDF-1
> > > > dimensions:
> > > > dim = 80 ;
> > > > variables:
> > > > char var(dim) ;
> > > > data:
> > > >
> > > > var = "./none/foo.009.nc " ;
> > > > }
> > > >
> > > > % 4.1.3/bin/ncdump testfile.nc
> > > > netcdf testfile {
> > > > dimensions:
> > > > dim = 80 ;
> > > > variables:
> > > > char var(dim) ;
> > > > data:
> > > >
> > > > var = "./none/foo.009.nc " ;
> > > > }
> > > >
> > > >
> > > > Wei-keng
> > > >
> > > > On Aug 2, 2012, at 9:26 AM, Jim Edwards wrote:
> > > >
> > > > > Hi Wei-keng,
> > > > >
> > > > > In standard netcdf I set NC_NOFILL when I open the file and I still get this behavior. I think that you just need to null-terminate the string when you pass it from Fortran to C.
> > > > >
> > > > > Jim
> > > > >
> > > > > On Wed, Aug 1, 2012 at 7:58 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > > > Hi, Jim
> > > > >
> > > > > NetCDF's default for fill mode is NC_FILL and the default fill value for char type
> > > > > is NC_FILL_CHAR == (char)0. See netCDF user guide below.
> > > > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#nc_005fset_005ffill
> > > > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#Fill-Values
> > > > >
> > > > > In PnetCDF, only NC_NOFILL is implemented. So, for those spaces that were not
> > > > > written by the application, their contents are undefined.
> > > > >
> > > > >
> > > > > Wei-keng
> > > > >
> > > > > On Aug 1, 2012, at 11:27 AM, Jim Edwards wrote:
> > > > >
> > > > > > If I declare a character string variable with a length x and then write a string of length y<x using nfmpi_put_var_char
> > > > > > what is the expected behavior? I think that what I am getting is incorrect (a bunch of garbage in the string from y:x )
> > > > > >
> > > > > > So for example I want to write string './none/foo.009.nc' into a variable of length 80. In the file I am getting:
> > > > > >
> > > > > > './none/foo.009.nc 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0'
> > > > > >
> > > > > >
> > > > > > I think that this is a bug.
> > > > > >
> > > > > >
> > > > > > This is happening with the latest pnetcdf svn trunk code on jaguarpf compiled using:
> > > > > >
> > > > > > Currently Loaded Modulefiles:
> > > > > >  1) modules/3.2.6.6
> > > > > >  2) xtpe-network-gemini
> > > > > >  3) pgi/12.5.0
> > > > > >  4) xt-libsci/11.1.00
> > > > > >  5) udreg/2.3.2-1.0400.5038.0.0.gem
> > > > > >  6) ugni/2.3-1.0400.4374.4.88.gem
> > > > > >  7) pmi/3.0.0-1.0000.8661.28.2807.gem
> > > > > >  8) dmapp/3.2.1-1.0400.4782.3.1.gem
> > > > > >  9) gni-headers/2.1-1.0400.4351.3.1.gem
> > > > > > 10) xpmem/0.1-2.0400.31280.3.1.gem
> > > > > > 11) xe-sysroot/4.0.46
> > > > > > 12) xt-asyncpe/5.11
> > > > > > 13) atp/1.4.1
> > > > > > 14) PrgEnv-pgi/4.0.46
> > > > > > 15) xt-mpich2/5.5.0
> > > > > > 16) xtpe-interlagos
> > > > > > 17) eswrap/1.0.9
> > > > > > 18) lustredu/1.0
> > > > > > 19) DefApps
> > > > > > 20) altd/1.0
> > > > > > 21) subversion/1.6.17
> > > > > > 22) szip/2.1
> > > > > > 23) hdf5/1.8.7
> > > > > > 24) netcdf/4.1.3
> > > > > > 25) esmf/5.2.0rp1
> > > > > >
> > > > > > --
> > > > > > Jim Edwards
> > > > > >
> > > > > > CESM Software Engineering Group
> > > > > > National Center for Atmospheric Research
> > > > > > Boulder, CO
> > > > > > 303-497-1842
> > > > > >
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA