[Darshan-users] my pnetcdf changes

Rob Latham robl at mcs.anl.gov
Mon Jul 19 11:29:58 CDT 2010


I wanted to measure pnetcdf overhead so I made some changes to darshan
in the 'more-pnetcdf' branch.  

- I added a few more wrapped functions.  The particular app I wanted
  to examine only used a few I/O routines, so I have not done a full
  wrap of the pnetcdf API.

- (and this is why I'm on a branch) I added new records to the log
  file format to record cumulative time spent in the pnetcdf I/O
  routines.

- I modified the summary generator to add a third column reporting
  percent of time spent in pnetcdf.  
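
The cumulative-time idea in the list above can be sketched roughly as
follows.  This is a minimal illustration, not the code on the branch:
the struct, the function names, and the stand-in I/O routine
(fake_put_vara) are all hypothetical, and how the wrapper gets
interposed in front of the real library call is left out.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-process counters; these are not Darshan's names. */
struct lib_timer {
    double cumulative_sec;  /* total wall time spent inside wrapped calls */
    long   calls;           /* number of wrapped calls observed */
};

static struct lib_timer pnetcdf_timer;

/* Monotonic wall-clock time in seconds. */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Stand-in for a real library I/O routine (assumption for this sketch). */
static int fake_put_vara(const void *buf, size_t nbytes)
{
    (void)buf;
    return nbytes > 0 ? 0 : -1;
}

/* Wrapper: time the underlying call and add it to the running total. */
int wrapped_put_vara(const void *buf, size_t nbytes)
{
    double t0 = now_sec();
    int ret = fake_put_vara(buf, nbytes);
    pnetcdf_timer.cumulative_sec += now_sec() - t0;
    pnetcdf_timer.calls++;
    return ret;
}

/* Percentage of total run time spent in the wrapped library. */
double percent_of_runtime(const struct lib_timer *t, double runtime_sec)
{
    assert(t != NULL);
    if (runtime_sec <= 0.0)
        return 0.0;
    return 100.0 * t->cumulative_sec / runtime_sec;
}
```

The summary column would then be something like
percent_of_runtime(&pnetcdf_timer, total_runtime), computed once when
the log is written.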

I confirmed that very little time is actually spent in pnetcdf, even
when the more complicated non-blocking API is used.  Thus, that third
stacked bar graph in the summary plot ends up being pretty useless: at
that resolution, "% of time in MPI" and "% of time in Pnetcdf" are the
same number of pixels.

Most of this application's I/O time is spent in the MPI-IO library,
which I suspected and can now confirm.

What I really want in the end is a pnetcdf profile: time in "define
mode", time in metadata, time in I/O, number of calls, etc.
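
As a rough sketch of what such a profile could hold, assuming one
accumulator per category: the phase names, struct layout, and function
names here are hypothetical and only mirror the categories listed
above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical categories mirroring the ones named in the text. */
enum prof_phase { PROF_DEFINE_MODE, PROF_METADATA, PROF_IO, PROF_NPHASES };

struct pnetcdf_profile {
    double seconds[PROF_NPHASES];  /* cumulative time per category */
    long   calls[PROF_NPHASES];    /* number of calls per category */
};

/* Charge one completed call to a category. */
void prof_record(struct pnetcdf_profile *p, enum prof_phase phase,
                 double elapsed_sec)
{
    assert(p != NULL);
    assert((int)phase >= 0 && (int)phase < PROF_NPHASES);
    p->seconds[phase] += elapsed_sec;
    p->calls[phase]++;
}

/* Total time across all categories. */
double prof_total_seconds(const struct pnetcdf_profile *p)
{
    double total = 0.0;
    for (int i = 0; i < PROF_NPHASES; i++)
        total += p->seconds[i];
    return total;
}

/* Print one line per category: name, seconds, call count. */
void prof_print(const struct pnetcdf_profile *p, FILE *out)
{
    static const char *names[PROF_NPHASES] = {
        "define_mode", "metadata", "io"
    };
    for (int i = 0; i < PROF_NPHASES; i++)
        fprintf(out, "%-12s %10.6f s %8ld calls\n",
                names[i], p->seconds[i], p->calls[i]);
}
```

Each wrapper would call prof_record() with the category its routine
belongs to, and prof_print() would run once at close.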

At this point I can go in two directions: I can make a standalone
'pnetcdf profile' package.  It would conflict with darshan, since
darshan already wraps ncmpi_open and ncmpi_close, so I'd have to use
one or the other, but not both.

Or, I can continue to modify darshan.  Drawbacks: increased data in
the log file; increased overhead when compiling summaries at close; and
the change in log file format becomes "official", so I suspect I'd get
one shot to get it right.

More drawbacks: including stats for one library opens the door for
more libraries.  The HDF5 folks aren't going to work on Darshan, I
don't think, and there will likely be other libraries in the
not-so-distant future.  Standalone per-library stat collectors have the
benefit of being more flexible.

Maybe the right thing is to make it easy to add additional information
to darshan without changing the file format?  Augmented data goes to
an "extended statistics" file?  
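
One way the "extended statistics" file might look, purely as a sketch:
a sidecar text file next to the main log, one tab-separated record per
line, so the existing log format is untouched.  The ".ext" suffix, the
record layout, and all function names are assumptions made up for this
example; nothing like this exists in darshan today.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the sidecar file name from the main log path.
 * The ".ext" suffix is an assumption for this sketch. */
int ext_stats_path(char *out, size_t out_len, const char *log_path)
{
    int n = snprintf(out, out_len, "%s.ext", log_path);
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}

/* Format one record as "module<TAB>counter<TAB>value\n". */
int ext_stats_format(char *out, size_t out_len, const char *module,
                     const char *counter, double value)
{
    int n = snprintf(out, out_len, "%s\t%s\t%.6f\n", module, counter, value);
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}

/* Append one record to the sidecar file; returns 0 on success. */
int ext_stats_append(const char *path, const char *module,
                     const char *counter, double value)
{
    char line[256];
    if (ext_stats_format(line, sizeof line, module, counter, value) != 0)
        return -1;

    FILE *f = fopen(path, "a");
    if (f == NULL)
        return -1;
    int ok = (fputs(line, f) != EOF);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A library-specific collector could then add its own counters without
the core log format ever having to know about them; the summary
generator would read the sidecar file only if it is present.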

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

