nfmpi_put_var_char expected behavior?

Jim Edwards edwards.jim at gmail.com
Wed Aug 8 15:28:52 CDT 2012


Okay, I give up :-)  We will make the code work with pnetcdf; at least we
know that solution will be backward compatible with netcdf.  Thanks to
both of you for looking into this.
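
For reference, the application-side fix amounts to handing the library a
buffer that is exactly as long as the variable, with everything after the
string set to '\0'. Below is a minimal C sketch of that idea, using only
the standard library. The helper name pad_text is hypothetical and is not
part of PnetCDF or netCDF; it only illustrates the copy-and-pad step.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a PnetCDF or netCDF function).
 * Returns a newly allocated buffer of var_len bytes that holds the first
 * src_len bytes of src, with every remaining byte set to '\0'.
 * Returns NULL if var_len is 0 or the allocation fails.
 * The caller must free() the result. */
char *pad_text(const char *src, size_t src_len, size_t var_len)
{
    char *out;

    if (var_len == 0)
        return NULL;

    out = calloc(var_len, 1);      /* zero-filled, so the tail is all '\0' */
    if (out == NULL)
        return NULL;

    if (src_len > var_len)
        src_len = var_len;         /* truncate rather than overrun */

    memcpy(out, src, src_len);
    return out;
}
```

With var_len = 80 and the 17-character path from this thread, the buffer
holds the path followed by 63 NUL bytes. The equivalent in the Fortran
reproducer would be to fill buf past len_trim(buf) with char(0) and pass
the full 80-character buf instead of trim(buf).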

On Wed, Aug 8, 2012 at 2:25 PM, Rob Latham <robl at mcs.anl.gov> wrote:

> On Sat, Aug 04, 2012 at 07:01:39AM -0600, Jim Edwards wrote:
> > Hi Wei-keng,
> >
> > The trailing junk may be there in the netcdf output, but it is preceded
> > by a char(0) null string terminator, so the variable written using
> > netcdf is read correctly.  The pnetcdf variable cannot be read correctly
> > as written.  I don't think that you need to copy to another buffer; you
> > just need to add a terminator to the string.
>
> Hi Jim:  Here's your code:
>
>   buf = "./none/foo.009.nc"
>   status = nfmpi_put_var_text(ncid, varid,trim(buf))
>   status = nfmpi_end_indep_data(ncid)
>
> trim() turns your 80 char string into a 17 char string.   In our
> Fortran to C wrappers it's true we are given the string and a length.
>
> But we can't just blat a terminator onto that passed-in Fortran string.
> Well, we *can* and will likely get away with it in many cases, but
> either we truncate the string by one char to make room for the '\0' or
> we overrun the array by one character.  Neither of those is a great
> option, but let's go ahead and do that:
>
> netcdf ptestfile {
> // file format: CDF-1
> dimensions:
>         dim = 80 ;
> variables:
>         char var(dim) ;
> data:
>
>  var = "./none/foo.009.nc\000\374\374\374\374\374\374A\000\000\000\000\000\000\000\2507\030\277{\177\000\000\2507\030\277{\177\000\000
> \000\000\000\000\000\000\000
> \000\000\000\000\000\000\000\000\000\000\000\000\000\000\000r_size\000\003"
> ;
> }
>
>
> We could make a copy of the users data:
>
> http://trac.mcs.anl.gov/projects/parallel-netcdf/browser/trunk/src/libf/put_var_textf.c
> Just throw a calloc() before line 27 and a free after it.
>
> Problem is, this approach will leak memory in bput_var_textf and
> iput_var_textf, and still doesn't fix your presentation problem:
>
> netcdf ptestfile {
> // file format: CDF-1
> dimensions:
>         dim = 80 ;
> variables:
>         char var(dim) ;
> data:
>
>  var = "./none/foo.009.nc\000\000\000\000\000\000\000!\000\000\000\000\000\000\000\000\000\000\000\374\374\374\374\374\374\374\374\374\374\374\374\374\374\374\374\374\374\374\374!\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000P"
> ;
> }
>
> As Wei-keng says, we are fairly limited in our options in pnetcdf land in
> that we are re-using the user buffer.  There are buffer copies in some
> instances, but if the data is in big-endian format and the layout in
> memory is contiguous, we can avoid those copies.  I'm hesitant to
> introduce a buffer copy in every I/O operation (we service
> text/int/float/double functions with one generic I/O routine).
>
> ==rob
>
>
> > Jim
> >
> > On Fri, Aug 3, 2012 at 3:55 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> >
> > > Hi, Jim
> > >
> > > I can reproduce the results you got when using ncmpidump and ncdump.
> > >
> > > When you used trim(buf) in nf90_put_var(), you told netcdf4 to write
> > > 17 characters (implicitly, count(1)==17 is used internally due to
> > > function overloading in F90). So, in theory, those trailing garbage
> > > characters are expected.
> > >
> > > After checking the netcdf 4.1.3 source code, I found that the data to
> > > be written is first copied to an internal buffer (of chunk size
> > > 8192 by default) and later flushed to the file. The internal buffer
> > > appears to be initialized to all 0s. NetCDF developers can confirm this.
> > >
> > > So, in your case, the call to nf90_put_var() is actually copying 17
> > > characters to that buffer. Since ncdump/ncmpidump skips the trailing
> > > '\0' characters, the output shows no trailing garbage.
> > >
> > > In Pnetcdf, we do not copy write data to a temporary buffer, so the
> > > unwritten part of the variable contains undefined contents.
> > >
> > > Wei-keng
> > >
> > > On Aug 3, 2012, at 9:09 AM, Jim Edwards wrote:
> > >
> > > > Hi Wei-keng,
> > > >
> > > > > Here is a program that shows the problem.   Running with pnetcdf I get
> > > >  var = "./none/foo.009.ncuth\000\000\000\000
> > >
> |\215\000\000\000\000\000p\257\377\377\377\177\000\000\250\260\377\377\377\177\000\000\230\260\377\377\377\177\000\000\001\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000v\377a"
> > > ;
> > > >
> > > > while netcdf results in:
> > > >
> > > >  var = "./none/foo.009.nc" ;
> > > >
> > > >
> > > > no garbage, no extra space.
> > > >
> > > >
> > > > #ifdef PNETCDF
> > > > program main
> > > >     use pnetcdf
> > > >     implicit none
> > > >     include 'mpif.h'
> > > >
> > > >     integer i, status, ncid, varid, dimid(1)
> > > >     integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> > > >     character*(80) buf, rbuf
> > > >
> > > >     call MPI_Init(status)
> > > >     status = nfmpi_create(MPI_COMM_WORLD, 'ptestfile.nc', &
> > > >                           NF_CLOBBER, MPI_INFO_NULL, ncid)
> > > >     len = 80
> > > >     status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> > > >     status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> > > >     status = nfmpi_enddef(ncid)
> > > >     status = nfmpi_begin_indep_data(ncid)
> > > >
> > > >     buf = "./none/foo.009.nc"
> > > >     status = nfmpi_put_var_text(ncid, varid,trim(buf))
> > > >     status = nfmpi_end_indep_data(ncid)
> > > >
> > > >     status = nfmpi_close(ncid)
> > > >     call MPI_Finalize(status)
> > > > end program
> > > > #else
> > > > program main
> > > >     use netcdf
> > > >     implicit none
> > > >
> > > >     integer i, status, ncid, varid, dimid(1)
> > > >     integer :: len, count(1), offset(1)
> > > >     character*(80) buf, rbuf
> > > >
> > > >     status = nf90_create('ntestfile.nc', &
> > > >                           NF90_CLOBBER, ncid)
> > > >     len = 80
> > > >     status = nf90_set_fill(ncid, NF90_NOFILL, i)
> > > >     status = nf90_def_dim(ncid, "dim", len, dimid(1))
> > > >     status = nf90_def_var(ncid, "var", NF90_CHAR, dimid, varid)
> > > >     status = nf90_enddef(ncid)
> > > >
> > > >     buf = "./none/foo.009.nc"
> > > >     status = nf90_put_var(ncid, varid, trim(buf))
> > > >
> > > >     status = nf90_close(ncid)
> > > > end program
> > > > #endif
> > > >
> > > >
> > > > > On Thu, Aug 2, 2012 at 9:19 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > >
> > > > > I assume you are calling nfmpi_put_var_text. The "var" APIs are
> > > > > intended for writing the entire variable, which in your case means
> > > > > writing 80 characters. PnetCDF (and netCDF) will not stop writing at
> > > > > the end of the string (string length == 17 in your case). Instead, it
> > > > > writes the whole 80 characters from the I/O buffer into the file.
> > > > >
> > > > > Are you seeing those 0s from running ncmpidump, but not from ncdump?
> > > > > Strangely, I got the same output from both ncmpidump and ncdump (4.1.3).
> > > > > See below. Let me know if this program does the same thing as yours.
> > > >
> > > > program main
> > > >     use pnetcdf
> > > >     implicit none
> > > >     include 'mpif.h'
> > > >
> > > >     integer i, status, ncid, varid, dimid(1)
> > > >     integer(kind=MPI_OFFSET_KIND) :: len, count(1), offset(1)
> > > >     character*(80) buf, rbuf
> > > >
> > > >     call MPI_Init(status)
> > > >     status = nfmpi_create(MPI_COMM_WORLD, 'testfile.nc', &
> > > >                           NF_CLOBBER, MPI_INFO_NULL, ncid)
> > > >     len = 80
> > > >     status = nfmpi_def_dim(ncid, "dim", len, dimid(1))
> > > >     status = nfmpi_def_var(ncid, "var", NF_CHAR, 1, dimid, varid)
> > > >     status = nfmpi_enddef(ncid)
> > > >     status = nfmpi_begin_indep_data(ncid)
> > > >
> > > >     buf = "./none/foo.009.nc"
> > > >     status = nfmpi_put_var_text(ncid, varid, buf)
> > > >     status = nfmpi_end_indep_data(ncid)
> > > >
> > > >     status = nfmpi_close(ncid)
> > > >     call MPI_Finalize(status)
> > > > end program
> > > >
> > > > % ncmpidump testfile.nc
> > > > netcdf testfile {
> > > > // file format: CDF-1
> > > > dimensions:
> > > >         dim = 80 ;
> > > > variables:
> > > >         char var(dim) ;
> > > > data:
> > > >
> > > >  var = "./none/foo.009.nc
> > >                 " ;
> > > > }
> > > >
> > > > % 4.1.3/bin/ncdump testfile.nc
> > > > netcdf testfile {
> > > > dimensions:
> > > >         dim = 80 ;
> > > > variables:
> > > >         char var(dim) ;
> > > > data:
> > > >
> > > >  var = "./none/foo.009.nc
> > >                 " ;
> > > > }
> > > >
> > > >
> > > > Wei-keng
> > > >
> > > > On Aug 2, 2012, at 9:26 AM, Jim Edwards wrote:
> > > >
> > > > > Hi Wei-keng,
> > > > >
> > > > > > In standard netcdf I set NC_NOFILL when I open the file and I still
> > > > > > get this behavior.   I think that you just need to null terminate
> > > > > > the string when you pass it from Fortran to C.
> > > > >
> > > > > Jim
> > > > >
> > > > > > On Wed, Aug 1, 2012 at 7:58 PM, Wei-keng Liao <wkliao at ece.northwestern.edu> wrote:
> > > > > > Hi, Jim
> > > > > >
> > > > > > NetCDF's default for fill mode is NC_FILL and the default fill value
> > > > > > for char type is NC_FILL_CHAR == (char)0. See netCDF user guide below.
> > > > > >
> > > > > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#nc_005fset_005ffill
> > > > > >
> > > > > > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c.html#Fill-Values
> > > > > >
> > > > > > In PnetCDF, only NC_NOFILL is implemented. So, for those spaces that
> > > > > > were not written by the application, their contents are undefined.
> > > > >
> > > > >
> > > > > Wei-keng
> > > > >
> > > > > On Aug 1, 2012, at 11:27 AM, Jim Edwards wrote:
> > > > >
> > > > > > > If I declare a character string variable with a length x and then
> > > > > > > write a string of length y<x using nfmpi_put_var_char
> > > > > > > what is the expected behavior?     I think that what I am getting is
> > > > > > > incorrect (a bunch of garbage in the string from y:x )
> > > > > > >
> > > > > > > So for example I want to write string './none/foo.009.nc' into a
> > > > > > > variable of length 80.    In the file I am getting:
> > > > > > >
> > > > > > > './none/foo.009.nc 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > 0 0 0 0 0 0 0 0 0'
> > > > > >
> > > > > >
> > > > > > I think that this is a bug.
> > > > > >
> > > > > >
> > > > > > > This is happening with the latest pnetcdf svn trunk code on jaguarpf
> > > > > > > compiled using:
> > > > > >
> > > > > > Currently Loaded Modulefiles:
> > > > > > >   1) modules/3.2.6.6                      10) xpmem/0.1-2.0400.31280.3.1.gem       19) DefApps
> > > > > > >   2) xtpe-network-gemini                  11) xe-sysroot/4.0.46                    20) altd/1.0
> > > > > > >   3) pgi/12.5.0                           12) xt-asyncpe/5.11                      21) subversion/1.6.17
> > > > > > >   4) xt-libsci/11.1.00                    13) atp/1.4.1                            22) szip/2.1
> > > > > > >   5) udreg/2.3.2-1.0400.5038.0.0.gem      14) PrgEnv-pgi/4.0.46                    23) hdf5/1.8.7
> > > > > > >   6) ugni/2.3-1.0400.4374.4.88.gem        15) xt-mpich2/5.5.0                      24) netcdf/4.1.3
> > > > > > >   7) pmi/3.0.0-1.0000.8661.28.2807.gem    16) xtpe-interlagos                      25) esmf/5.2.0rp1
> > > > > > >   8) dmapp/3.2.1-1.0400.4782.3.1.gem      17) eswrap/1.0.9
> > > > > > >   9) gni-headers/2.1-1.0400.4351.3.1.gem  18) lustredu/1.0
> > > > > >
> > > > > > --
> > > > > > Jim Edwards
> > > > > >
> > > > > > CESM Software Engineering Group
> > > > > > National Center for Atmospheric Research
> > > > > > Boulder, CO
> > > > > > 303-497-1842
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Jim Edwards
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Jim Edwards
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>



-- 

Jim Edwards