nfmpi_put_var_char expected behavior?

Wei-keng Liao wkliao at ece.northwestern.edu
Sun Aug 5 22:44:51 CDT 2012


Hi, Jim,

pnetcdf does follow the semantics of "put_var_text" correctly, so those
garbage characters are expected. NetCDF's strategy of buffering the write
data until nc_sync or nc_close may hurt performance for large variables.

Even if I added a char(0) at the end of the string, ncmpidump/ncdump would still
print the trailing garbage characters, because both dump tools check the trailing
characters against char(0), starting from the end of the array and working toward
the beginning. So, to get the same output as netcdf4, pnetcdf would have to fill
all of the unwritten part of the variable with char(0), which is equivalent to
implementing the NF_FILL mode.
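
To make the trimming rule concrete, here is a minimal C sketch of the
end-to-start scan described above. It is an illustration only, not code taken
from ncdump or ncmpidump, and the function name visible_length is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the ncdump source): return how many
 * leading characters of data[0..n-1] remain once the run of NUL bytes
 * at the very end is dropped. The scan starts at the last element and
 * stops at the first non-NUL byte it meets, so a NUL in the middle of
 * the array does not shorten the result. */
size_t visible_length(const char *data, size_t n)
{
    assert(data != NULL);
    while (n > 0 && data[n - 1] == '\0') {
        n--;
    }
    return n;
}
```

For an 8-character variable holding 'a','b',NUL,'x','y',NUL,NUL,NUL this
returns 5, so the embedded NUL and the "xy" after it would still be printed.
Only when every byte after "ab" is NUL does it return 2.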

To get the same results as netcdf, you can add the following to your code.

    do i=LEN_TRIM(buf)+1, 80
        buf(i:i) = char(0)
    enddo
    status = nfmpi_put_var_text(ncid, varid, buf)
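
For readers working from C, the same padding step might look like the sketch
below. The helper name nul_pad_blank_tail is hypothetical and it assumes the
buffer is blank-padded the way a Fortran character variable is; it is not part
of the pnetcdf API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the Fortran loop: locate the last
 * non-blank character (the role LEN_TRIM plays in Fortran) and
 * overwrite every character after it with NUL. Returns the trimmed
 * length. The caller still writes all n characters to the file. */
size_t nul_pad_blank_tail(char *buf, size_t n)
{
    size_t used = n;

    assert(buf != NULL);
    while (used > 0 && buf[used - 1] == ' ') {
        used--;
    }
    memset(buf + used, 0, n - used);
    return used;
}
```

After this call the whole buffer can be passed to the "var" put routine, and
the unused tail consists of NUL bytes only, which the dump tools skip.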

Hope this helps.

Wei-keng

On Aug 4, 2012, at 8:01 AM, Jim Edwards wrote:

> Hi Wei-keng, 
> 
> The trailing junk may be there in the netcdf output, but it is preceded by a char(0) null string terminator, so the variable written using netcdf is read correctly.  The pnetcdf variable cannot be read correctly as written.  I don't think that you need to copy to another buffer; you just need to add a terminator to the string.
> 
> Jim
> 
> On Fri, Aug 3, 2012 at 3:55 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> Hi, Jim
> 
> I can reproduce the results you got when using ncmpidump and ncdump.
> 
> When you used trim(buf) in nf90_put_var(), you were telling netcdf4
> to write 17 characters (count(1)==17 is used implicitly, due to
> function overloading in F90). So, in theory, those trailing garbage
> characters are expected.
> 
> After checking the netcdf 4.1.3 source code, I found that the data to
> be written is first copied to an internal buffer (of chunk size
> 8192 by default) and later flushed to the file. The internal buffer
> appears to be initialized to all 0s. NetCDF developers can confirm this.
> 
> So, in your case, the call to nf90_put_var() is actually copying 17
> characters to that buffer. Since ncdump/ncmpidump skips the trailing
> '\0' characters, the output shows no trailing garbage.
> 
> In Pnetcdf, we do not copy write data to a temporary buffer, so the
> unwritten part of the variable contains undefined contents.
> 
> Wei-keng
> 
> On Aug 3, 2012, at 9:09 AM, Jim Edwards wrote:
> 
> > Hi Wei-keng,
> >
> > Here is a program that shows the problem.   Running with pnetcdf I get
> >  var = "./none/foo.009.ncuth\000\000\000\000 |\215\000\000\000\000\000p\257\377\377\377\177\000\000\250\260\377\377\377\177\000\000\230\260\377\377\377\177\000\000\001\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000v\377a" ;
> >
> > while netcdf results in:
> >
> >  var = "./none/foo.009.nc" ;
> >
> >
> > no garbage, no extra space.
> >
> >
> > #ifdef PNETCDF
> > program main
> >     use pnetcdf
> >     implicit none
> >     include 'mpif.h'
> >
> >     integer i, status, ncid, varid, dimid(1)
> >     integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> >     character*(80) buf, rbuf
> >
> >     call MPI_Init(status)
> >     status = nfmpi_create(MPI_COMM_WORLD, 'ptestfile.nc', &
> >                           NF_CLOBBER, MPI_INFO_NULL, ncid)
> >     len = 80
> >     status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> >     status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> >     status = nfmpi_enddef(ncid)
> >     status = nfmpi_begin_indep_data(ncid)
> >
> >     buf = "./none/foo.009.nc"
> >     status = nfmpi_put_var_text(ncid, varid,trim(buf))
> >     status = nfmpi_end_indep_data(ncid)
> >
> >     status = nfmpi_close(ncid)
> >     call MPI_Finalize(status)
> > end program
> > #else
> > program main
> >     use netcdf
> >     implicit none
> >
> >     integer i, status, ncid, varid, dimid(1)
> >     integer :: len, count(1), offset(1)
> >     character*(80) buf, rbuf
> >
> >     status = nf90_create('ntestfile.nc', &
> >                           NF90_CLOBBER, ncid)
> >     len = 80
> >     status = nf90_set_fill(ncid, NF90_NOFILL, i)
> >     status = nf90_def_dim(ncid, "dim", len, dimid(1))
> >     status = nf90_def_var(ncid, "var", NF90_CHAR, dimid, varid)
> >     status = nf90_enddef(ncid)
> >
> >     buf = "./none/foo.009.nc"
> >     status = nf90_put_var(ncid, varid, trim(buf))
> >
> >     status = nf90_close(ncid)
> > end program
> > #endif
> >
> >
> > On Thu, Aug 2, 2012 at 9:19 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> >
> > I assume you are calling nfmpi_put_var_text. The "var" APIs are
> > intended for writing the entire variable, which in your case means
> > 80 characters. PnetCDF (and netCDF) will not stop writing at the end
> > of the string (string length == 17 in your case). Instead, it writes
> > the whole 80 characters from the I/O buffer into the file.
> >
> > Are you seeing those 0s from running ncmpidump, but not from ncdump?
> > Strangely, I got the same output from both ncmpidump and ncdump (4.1.3).
> > See below. Let me know if this program does the same thing as yours.
> >
> > program main
> >     use pnetcdf
> >     implicit none
> >     include 'mpif.h'
> >
> >     integer i, status, ncid, varid, dimid(1)
> >     integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> >     character*(80) buf, rbuf
> >
> >     call MPI_Init(status)
> >     status = nfmpi_create(MPI_COMM_WORLD, 'testfile.nc', &
> >                           NF_CLOBBER, MPI_INFO_NULL, ncid)
> >     len = 80
> >     status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> >     status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> >     status = nfmpi_enddef(ncid)
> >     status = nfmpi_begin_indep_data(ncid)
> >
> >     buf = "./none/foo.009.nc"
> >     status = nfmpi_put_var_text(ncid, varid, buf)
> >     status = nfmpi_end_indep_data(ncid)
> >
> >     status = nfmpi_close(ncid)
> >     call MPI_Finalize(status)
> > end program
> >
> > % ncmpidump testfile.nc
> > netcdf testfile {
> > // file format: CDF-1
> > dimensions:
> >         dim = 80 ;
> > variables:
> >         char var(dim) ;
> > data:
> >
> >  var = "./none/foo.009.nc                                                               " ;
> > }
> >
> > % 4.1.3/bin/ncdump testfile.nc
> > netcdf testfile {
> > dimensions:
> >         dim = 80 ;
> > variables:
> >         char var(dim) ;
> > data:
> >
> >  var = "./none/foo.009.nc                                                               " ;
> > }
> >
> >
> > Wei-keng
> >
> > On Aug 2, 2012, at 9:26 AM, Jim Edwards wrote:
> >
> > > Hi Wei-keng,
> > >
> > > In standard netcdf I set NC_NOFILL when I open the file and I still get this behavior.   I think that you just need to null-terminate the string when you pass it from Fortran to C.
> > >
> > > Jim
> > >
> > > On Wed, Aug 1, 2012 at 7:58 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > Hi, Jim
> > >
> > > NetCDF's default for fill mode is NC_FILL and the default fill value for char type
> > > is NC_FILL_CHAR == (char)0. See netCDF user guide below.
> > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#nc_005fset_005ffill
> > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#Fill-Values
> > >
> > > In PnetCDF, only NC_NOFILL is implemented. So, for those spaces that were not
> > > written by the application, their contents are undefined.
> > >
> > >
> > > Wei-keng
> > >
> > > On Aug 1, 2012, at 11:27 AM, Jim Edwards wrote:
> > >
> > > > If I declare a character string variable with a length x and then write a string of length y<x using nfmpi_put_var_char
> > > > what is the expected behavior?     I think that what I am getting is incorrect (a bunch of garbage in the string from y:x )
> > > >
> > > > So for example I want to write string './none/foo.009.nc' into a variable of length 80.    In the file I am getting:
> > > >
> > > > './none/foo.009.nc 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0'
> > > >
> > > >
> > > > I think that this is a bug.
> > > >
> > > >
> > > > This is happening with the latest pnetcdf svn trunk code on jaguarpf compiled using:
> > > >
> > > > Currently Loaded Modulefiles:
> > > >   1) modules/3.2.6.6                      10) xpmem/0.1-2.0400.31280.3.1.gem       19) DefApps
> > > >   2) xtpe-network-gemini                  11) xe-sysroot/4.0.46                    20) altd/1.0
> > > >   3) pgi/12.5.0                           12) xt-asyncpe/5.11                      21) subversion/1.6.17
> > > >   4) xt-libsci/11.1.00                    13) atp/1.4.1                            22) szip/2.1
> > > >   5) udreg/2.3.2-1.0400.5038.0.0.gem      14) PrgEnv-pgi/4.0.46                    23) hdf5/1.8.7
> > > >   6) ugni/2.3-1.0400.4374.4.88.gem        15) xt-mpich2/5.5.0                      24) netcdf/4.1.3
> > > >   7) pmi/3.0.0-1.0000.8661.28.2807.gem    16) xtpe-interlagos                      25) esmf/5.2.0rp1
> > > >   8) dmapp/3.2.1-1.0400.4782.3.1.gem      17) eswrap/1.0.9
> > > >   9) gni-headers/2.1-1.0400.4351.3.1.gem  18) lustredu/1.0
> > > >
> > > > --
> > > > Jim Edwards
> > > >
> > > > CESM Software Engineering Group
> > > > National Center for Atmospheric Research
> > > > Boulder, CO
> > > > 303-497-1842
> > > >


