[Darshan-users] my pnetcdf changes

Phil Carns carns at mcs.anl.gov
Mon Jul 19 12:56:59 CDT 2010


On 07/19/2010 12:29 PM, Rob Latham wrote:
> I wanted to measure pnetcdf overhead so I made some changes to darshan
> in the 'more-pnetcdf' branch.
>
> - I added a few more wrapped functions.  The particular app I wanted
>    to examine only used a few I/O routines, so I have not done a full
>    wrap of the pnetcdf API.
>
> - (and this is why I'm on a branch) I added new records to the log
>    file format to record cumulative time spent in the pnetcdf I/O
>    routines.
>
> - I modified the summary generator to add a third column reporting
>    percent of time spent in pnetcdf.
>
> I confirmed that very little time is actually spent in pnetcdf, even
> when the more complicated non-blocking API is used.  Thus, that third
> stacked bar graph in the summary plot ends up being pretty useless: at
> that resolution, "% of time in MPI" and "% of time in Pnetcdf" are the
> same number of pixels.
>
> Most of the I/O costs for this application are spent in the MPI-IO
> library, which I also suspected but now have confirmation.
>    
Great!
> What I really want at the end is a pnetcdf profile: time in "define
> mode", time in metadata, time in I/O, number of calls, etc.
>
> At this point I can go in two directions: I can make a standalone
> 'pnetcdf profile' package.  It would conflict with darshan, since
> darshan already wraps ncmpi_open and ncmpi_close, so I'd have to use
> one or the other, but not both.
>
> Or, I can continue to modify darshan.  Drawbacks: increased data in
> the log file; increased overhead when compiling summaries at close; a
> change in log file format becomes "official", so I suspect I'd get
> one shot to get it right.
>
> More drawbacks: including stats for one library opens the door for
> more libraries.  The HDF5 folks aren't going to work on Darshan, I
> don't think, and there will likely be other libraries in the
> not-so-distant future.  Standalone library stat collections have the
> benefit of being more flexible.
>
> Maybe the right thing is to make it easy to add additional information
> to darshan without changing the file format?  Augmented data goes to
> an "extended statistics" file?

That's an interesting idea.  I haven't thought about this much, but 
I'll throw some ideas out.  Maybe we can compartmentalize the 
different APIs better.

Right now we lump all of the statistics into two big arrays (one for 
integers and one for floats), whether the information is about normal 
POSIX, fstreams, MPI-IO, HDF, or pnetcdf.  One thing we could do is 
split the netcdf counters into their own arrays.  At shutdown time, we 
could do an allreduce, OR'ing together a flag to see if any ranks 
detected netcdf activity.  If any rank did, then we set a flag in the 
header and all ranks write netcdf counters into the log file.  If no 
one saw netcdf, then no one writes the counters.  We could also toggle 
what the reduction operator does for file records depending on whether 
netcdf was present.
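
To make that concrete, here is a minimal sketch of what the 
shutdown-time check could look like (the function and flag names are 
made up for illustration; they aren't existing Darshan symbols):

#include <mpi.h>

/* Hypothetical shutdown-time check: each rank sets local_saw_ncdf if it
 * observed any netcdf activity, and a logical-OR allreduce tells every
 * rank whether the netcdf counter section needs to be written at all. */
static void write_ncdf_counters_if_needed(MPI_Comm comm, int local_saw_ncdf)
{
    int any_saw_ncdf = 0;

    MPI_Allreduce(&local_saw_ncdf, &any_saw_ncdf, 1, MPI_INT,
                  MPI_LOR, comm);

    if (any_saw_ncdf)
    {
        /* set a flag in the log header and have all ranks write their
         * netcdf counters (placeholder for the real write path) */
    }
    /* otherwise the netcdf section is simply omitted from the log */
}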

If we were clever enough it could be like a plug-in API, where 
developers can tack on new counters without breaking the file format 
(darshan-parser could just ignore whatever section(s) of the file 
contain plug-in statistics).  The plug-in API would tell darshan how 
big the data is per file record, provide a reduction function for 
shared files, provide a function to detect whether the plugin was 
activated in a given run, provide some unique value and/or version 
number that can be set in the header to identify which plugin 
generated the data, etc.
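
As a rough illustration (not existing Darshan code; every name below 
is hypothetical), the per-plugin hooks could be collected into a 
struct of callbacks along these lines:

#include <stddef.h>

/* Hypothetical plug-in descriptor. */
struct darshan_plugin_ops
{
    int module_id;        /* unique value recorded in the log header    */
    int module_version;   /* so darshan-parser can recognize or skip it */
    size_t record_size;   /* bytes of counter data per file record      */

    /* returns nonzero if this plugin saw any activity during the run */
    int (*activated)(void);

    /* combines per-rank records for files shared across ranks */
    void (*reduce_records)(void *rank_local, void *combined, int count);
};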

Maybe that's over-engineering, though :)  We need to think about how 
likely we are to use this for other APIs.  In the short term there 
isn't that much reason not to just keep adding netcdf counters.  It's 
not likely to make much difference in the output file size (gzip is 
great at compressing zeroed-out counters), so it's mainly a question 
of whether there is some maintenance value in splitting some things 
up...

I'm cc'ing Jason too, because he has done some work in hooking darshan 
up to zoidfs.  Maybe that's another example where it might be helpful to 
be able to add in wrappers and counters without perturbing the base file 
format?

A related side issue is that I imagine it's hard to maintain extra 
counters outside of trunk when Kevin and I keep inserting new base 
counters here and there in the main arrays.

-Phil
