[Darshan-commits] [Git][darshan/darshan][master] 7 commits: fix uninitialized variable bug in darshan-diff

Shane Snyder xgitlab at cels.anl.gov
Fri Sep 30 11:47:17 CDT 2016


Shane Snyder pushed to branch master at darshan / darshan


Commits:
200d21f0 by Shane Snyder at 2016-09-26T15:30:21-05:00
fix uninitialized variable bug in darshan-diff

- - - - -
8826d235 by Shane Snyder at 2016-09-26T16:58:09-05:00
fix summary-per-file for default stdio streams

The stdio module stores names for the default streams (stdin,
stderr, stdout) in the following format: <STDERR>. The '<' and
'>' symbols require proper escape characters in order to be
properly interpreted by the darshan-job-summary (perl) script.

- - - - -
941b1ec5 by Shane Snyder at 2016-09-27T09:04:21-05:00
update darshan-util docs with new changes

- - - - -
64457805 by Shane Snyder at 2016-09-30T11:31:09-05:00
add ChangeLog updates for 3.1.0

- - - - -
89437264 by Shane Snyder at 2016-09-30T11:31:29-05:00
update documentation on instrumentation modules

- - - - -
8aba70e7 by Shane Snyder at 2016-09-30T11:32:08-05:00
couple of small fixes for comments

- - - - -
d08cb98e by Shane Snyder at 2016-09-30T11:37:07-05:00
bump library version numbers in configure.in

- - - - -


12 changed files:

- ChangeLog
- darshan-runtime/configure
- darshan-runtime/configure.in
- darshan-runtime/darshan.h
- darshan-util/configure
- darshan-util/configure.in
- darshan-util/darshan-diff.c
- darshan-util/darshan-logutils.h
- darshan-util/darshan-lustre-logutils.c
- darshan-util/darshan-summary-per-file.sh
- darshan-util/doc/darshan-util.txt
- doc/darshan-modularization.txt


Changes:

=====================================
ChangeLog
=====================================
--- a/ChangeLog
+++ b/ChangeLog
@@ -2,6 +2,29 @@
 Darshan Release Change Log
 --------------------------
 
+Darshan-3.1.0
+=============
+* add stdio I/O library instrumentation module (Philip Carns)
+    - this handles instrumentation of file stream I/O functions
+      like fopen(), fprintf(), fscanf(), etc.
+    - this module also captures stats on the standard streams (stdin,
+      stdout, & stderr)
+* add Lustre instrumentation module (Glenn Lockwood)
+    - this module provides Lustre striping details (e.g., stripe
+      width, stripe size, list of OSTs a file is striped over)
+* add new mmap-based logging mechanism that allows Darshan to
+  generate output logs even in cases where applications don't
+  call MPI_Finalize()
+    - these logs are uncompressed and are per-process rather
+      than per-job
+* add the darshan-merge utility to darshan-util to allow per-process
+  logs generated by the mmap-based logging mechanism to be converted
+  into Darshan's traditional compressed per-job log files
+* augment the POSIX module timestamp counters to also include
+  LAST_OPEN & FIRST_CLOSE counters to give more details on application
+  I/O intervals
+* avoid saving duplicate mount point entries in Darshan log files
+
 Darshan-3.0.1
 =============
 * bug fix in darshan logutil mount parsing code that was


=====================================
darshan-runtime/configure
=====================================
--- a/darshan-runtime/configure
+++ b/darshan-runtime/configure
@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.69 for darshan-runtime 3.0.1.
+# Generated by GNU Autoconf 2.69 for darshan-runtime 3.1.0.
 #
 #
 # Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
@@ -577,8 +577,8 @@ MAKEFLAGS=
 # Identity of this package.
 PACKAGE_NAME='darshan-runtime'
 PACKAGE_TARNAME='darshan-runtime'
-PACKAGE_VERSION='3.0.1'
-PACKAGE_STRING='darshan-runtime 3.0.1'
+PACKAGE_VERSION='3.1.0'
+PACKAGE_STRING='darshan-runtime 3.1.0'
 PACKAGE_BUGREPORT=''
 PACKAGE_URL=''
 
@@ -1248,7 +1248,7 @@ if test "$ac_init_help" = "long"; then
   # Omit some internal or obsolete options to make the list less imposing.
   # This message is too long to be a string in the A/UX 3.1 sh.
   cat <<_ACEOF
-\`configure' configures darshan-runtime 3.0.1 to adapt to many kinds of systems.
+\`configure' configures darshan-runtime 3.1.0 to adapt to many kinds of systems.
 
 Usage: $0 [OPTION]... [VAR=VALUE]...
 
@@ -1309,7 +1309,7 @@ fi
 
 if test -n "$ac_init_help"; then
   case $ac_init_help in
-     short | recursive ) echo "Configuration of darshan-runtime 3.0.1:";;
+     short | recursive ) echo "Configuration of darshan-runtime 3.1.0:";;
    esac
   cat <<\_ACEOF
 
@@ -1419,7 +1419,7 @@ fi
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
   cat <<\_ACEOF
-darshan-runtime configure 3.0.1
+darshan-runtime configure 3.1.0
 generated by GNU Autoconf 2.69
 
 Copyright (C) 2012 Free Software Foundation, Inc.
@@ -1771,7 +1771,7 @@ cat >config.log <<_ACEOF
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.
 
-It was created by darshan-runtime $as_me 3.0.1, which was
+It was created by darshan-runtime $as_me 3.1.0, which was
 generated by GNU Autoconf 2.69.  Invocation command line was
 
   $ $0 $@
@@ -4348,7 +4348,7 @@ else
   MPICH_LIB_OLD=0
 fi
 
-DARSHAN_VERSION="3.0.1"
+DARSHAN_VERSION="3.1.0"
 
 
 
@@ -4868,7 +4868,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by darshan-runtime $as_me 3.0.1, which was
+This file was extended by darshan-runtime $as_me 3.1.0, which was
 generated by GNU Autoconf 2.69.  Invocation command line was
 
   CONFIG_FILES    = $CONFIG_FILES
@@ -4930,7 +4930,7 @@ _ACEOF
 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
 ac_cs_version="\\
-darshan-runtime config.status 3.0.1
+darshan-runtime config.status 3.1.0
 configured by $0, generated by GNU Autoconf 2.69,
   with options \\"\$ac_cs_config\\"
 


=====================================
darshan-runtime/configure.in
=====================================
--- a/darshan-runtime/configure.in
+++ b/darshan-runtime/configure.in
@@ -5,7 +5,7 @@ dnl Process this file with autoconf to produce a configure script.
 dnl You may need to use autoheader as well if changing any DEFINEs
 
 dnl sanity checks, output header, location of scripts used here
-AC_INIT([darshan-runtime], [3.0.1])
+AC_INIT([darshan-runtime], [3.1.0])
 AC_CONFIG_SRCDIR([darshan.h])
 AC_CONFIG_AUX_DIR(../maint/config)
 AC_CONFIG_HEADER(darshan-runtime-config.h)


=====================================
darshan-runtime/darshan.h
=====================================
--- a/darshan-runtime/darshan.h
+++ b/darshan-runtime/darshan.h
@@ -82,6 +82,20 @@ struct darshan_fs_info
     int mdt_count;
 };
 
+/* darshan_instrument_fs_data()
+ *
+ * Allow file system-specific modules to instrument data for the file
+ * stored at 'path'. 'fs_type' is checked to determine the underlying
+ * filesystem and calls into the corresponding file system instrumentation
+ * module, if defined -- currently we only have a Lustre module. 'fd' is
+ * the file descriptor corresponding to the file, which may be needed by
+ * the file system to retrieve specific parameters.
+ */
+void darshan_instrument_fs_data(
+    int fs_type,
+    const char *path,
+    int fd);
+
 /*****************************************************
 * darshan-core functions exported to darshan modules *
 *****************************************************/
@@ -144,20 +158,6 @@ void *darshan_core_register_record(
     int rec_len,
     struct darshan_fs_info *fs_info);
 
-/* darshan_instrument_fs_data()
- *
- * Allow file system-specific modules to instrument data for the file
- * stored at 'path'. 'fs_type' is checked to determine the underlying
- * filesystem and calls into the corresponding file system instrumentation
- * module, if defined -- currently we only have a Lustre module. 'fd' is
- * the file descriptor corresponding to the file, which may be needed by
- * the file system to retrieve specific parameters.
- */
-void darshan_instrument_fs_data(
-    int fs_type,
-    const char *path,
-    int fd);
-
 /* darshan_core_wtime()
  *
  * Returns the elapsed time relative to (roughly) the start of


=====================================
darshan-util/configure
=====================================
--- a/darshan-util/configure
+++ b/darshan-util/configure
@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.69 for darshan-util 3.0.1.
+# Generated by GNU Autoconf 2.69 for darshan-util 3.1.0.
 #
 #
 # Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
@@ -577,8 +577,8 @@ MAKEFLAGS=
 # Identity of this package.
 PACKAGE_NAME='darshan-util'
 PACKAGE_TARNAME='darshan-util'
-PACKAGE_VERSION='3.0.1'
-PACKAGE_STRING='darshan-util 3.0.1'
+PACKAGE_VERSION='3.1.0'
+PACKAGE_STRING='darshan-util 3.1.0'
 PACKAGE_BUGREPORT=''
 PACKAGE_URL=''
 
@@ -1236,7 +1236,7 @@ if test "$ac_init_help" = "long"; then
   # Omit some internal or obsolete options to make the list less imposing.
   # This message is too long to be a string in the A/UX 3.1 sh.
   cat <<_ACEOF
-\`configure' configures darshan-util 3.0.1 to adapt to many kinds of systems.
+\`configure' configures darshan-util 3.1.0 to adapt to many kinds of systems.
 
 Usage: $0 [OPTION]... [VAR=VALUE]...
 
@@ -1297,7 +1297,7 @@ fi
 
 if test -n "$ac_init_help"; then
   case $ac_init_help in
-     short | recursive ) echo "Configuration of darshan-util 3.0.1:";;
+     short | recursive ) echo "Configuration of darshan-util 3.1.0:";;
    esac
   cat <<\_ACEOF
 
@@ -1393,7 +1393,7 @@ fi
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
   cat <<\_ACEOF
-darshan-util configure 3.0.1
+darshan-util configure 3.1.0
 generated by GNU Autoconf 2.69
 
 Copyright (C) 2012 Free Software Foundation, Inc.
@@ -1758,7 +1758,7 @@ cat >config.log <<_ACEOF
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.
 
-It was created by darshan-util $as_me 3.0.1, which was
+It was created by darshan-util $as_me 3.1.0, which was
 generated by GNU Autoconf 2.69.  Invocation command line was
 
   $ $0 $@
@@ -4105,7 +4105,7 @@ fi
 done
 
 
-DARSHAN_UTIL_VERSION="3.0.1"
+DARSHAN_UTIL_VERSION="3.1.0"
 
 
 
@@ -4621,7 +4621,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by darshan-util $as_me 3.0.1, which was
+This file was extended by darshan-util $as_me 3.1.0, which was
 generated by GNU Autoconf 2.69.  Invocation command line was
 
   CONFIG_FILES    = $CONFIG_FILES
@@ -4683,7 +4683,7 @@ _ACEOF
 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
 ac_cs_version="\\
-darshan-util config.status 3.0.1
+darshan-util config.status 3.1.0
 configured by $0, generated by GNU Autoconf 2.69,
   with options \\"\$ac_cs_config\\"
 


=====================================
darshan-util/configure.in
=====================================
--- a/darshan-util/configure.in
+++ b/darshan-util/configure.in
@@ -5,7 +5,7 @@ dnl Process this file with autoconf to produce a configure script.
 dnl You may need to use autoheader as well if changing any DEFINEs
 
 dnl sanity checks, output header, location of scripts used here
-AC_INIT([darshan-util], [3.0.1])
+AC_INIT([darshan-util], [3.1.0])
 AC_CONFIG_SRCDIR([darshan-logutils.h])
 AC_CONFIG_AUX_DIR(../maint/config)
 AC_CONFIG_HEADER(darshan-util-config.h)
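
The version string is declared separately in each component's configure.in, so a release bump has to touch both files. A minimal sketch of a consistency check follows. The function names are hypothetical (they are not part of the Darshan tree), and the sed pattern assumes the `AC_INIT([name], [version])` layout shown in the two configure.in hunks above.

```shell
#!/bin/sh
# Hypothetical helper: print the version argument of the AC_INIT line in
# the given configure.in, assuming the form AC_INIT([name], [version]).
ac_init_version() {
    sed -n 's/^AC_INIT(\[[^]]*], *\[\([^]]*\)]).*/\1/p' "$1"
}

# Hypothetical check: succeed only if both files declare the same,
# non-empty version string.
same_ac_init_version() {
    v1=$(ac_init_version "$1")
    v2=$(ac_init_version "$2")
    [ -n "$v1" ] && [ "$v1" = "$v2" ]
}
```

From the top of a source checkout this could be used as `same_ac_init_version darshan-runtime/configure.in darshan-util/configure.in && echo "versions match"`.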


=====================================
darshan-util/darshan-diff.c
=====================================
--- a/darshan-util/darshan-diff.c
+++ b/darshan-util/darshan-diff.c
@@ -362,7 +362,6 @@ static int darshan_build_global_record_hash(
 {
     struct darshan_mod_record_ref *mod_rec;
     struct darshan_file_record_ref *file_rec;
-    darshan_record_id tmp_rec_id;
     struct darshan_base_record *base_rec;
     int i;
     int ret;
@@ -372,10 +371,10 @@ static int darshan_build_global_record_hash(
      */
     for(i = 0; i < DARSHAN_MAX_MODS; i++)
     {
+        if(!mod_logutils[i]) break;
+
         while(1)
         {
-            if(!mod_logutils[i]) break;
-
             mod_rec = malloc(sizeof(struct darshan_mod_record_ref));
             assert(mod_rec);
             memset(mod_rec, 0, sizeof(struct darshan_mod_record_ref));
@@ -400,7 +399,7 @@ static int darshan_build_global_record_hash(
                 base_rec = (struct darshan_base_record *)mod_rec->mod_dat;
                 mod_rec->rank = base_rec->rank;
 
-                HASH_FIND(hlink, *rec_hash, &tmp_rec_id, sizeof(darshan_record_id), file_rec);
+                HASH_FIND(hlink, *rec_hash, &(base_rec->id), sizeof(darshan_record_id), file_rec);
                 if(!file_rec)
                 {
                     /* there is no entry in the global hash table of darshan records
@@ -410,7 +409,7 @@ static int darshan_build_global_record_hash(
                     assert(file_rec);
 
                     memset(file_rec, 0, sizeof(struct darshan_file_record_ref));
-                    file_rec->rec_id = tmp_rec_id;
+                    file_rec->rec_id = base_rec->id;
                     HASH_ADD(hlink, *rec_hash, rec_id, sizeof(darshan_record_id), file_rec);
 
                 }


=====================================
darshan-util/darshan-logutils.h
=====================================
--- a/darshan-util/darshan-logutils.h
+++ b/darshan-util/darshan-logutils.h
@@ -87,13 +87,13 @@ struct darshan_mod_logutil_funcs
         void *buf
     );
     /* print the counters for a given log record
-     *      - 'file_rec' is the record's data buffer
-     *      - 'file_name' is the file path string for the record
-     *      - 'mnt-pt' is the file path mount point string
+     *      - 'rec' is the record's data buffer
+     *      - 'name' is the name string associated with this record (or NULL if there isn't one)
+     *      - 'mnt_pt' is the file path mount point string
      *      - 'fs_type' is the file system type string
      */
     void (*log_print_record)(
-        void *file_rec,
+        void *rec,
         char *file_name,
         char *mnt_pt,
         char *fs_type


=====================================
darshan-util/darshan-lustre-logutils.c
=====================================
--- a/darshan-util/darshan-lustre-logutils.c
+++ b/darshan-util/darshan-lustre-logutils.c
@@ -165,7 +165,7 @@ static void darshan_log_print_lustre_description(int ver)
     printf("#   LUSTRE_MDTS: number of MDTs across the entire file system.\n");
     printf("#   LUSTRE_STRIPE_OFFSET: OST ID offset specified when the file was created.\n");
     printf("#   LUSTRE_STRIPE_SIZE: stripe size for file in bytes.\n");
-    printf("#   LUSTRE_STRIPE_WIDTH: number of OSTs over which file is striped.\n");
+    printf("#   LUSTRE_STRIPE_WIDTH: number of OSTs over which the file is striped.\n");
     printf("#   LUSTRE_OST_ID_*: indices of OSTs over which the file is striped.\n");
 
     return;


=====================================
darshan-util/darshan-summary-per-file.sh
=====================================
--- a/darshan-util/darshan-summary-per-file.sh
+++ b/darshan-util/darshan-summary-per-file.sh
@@ -31,19 +31,31 @@ fi
 counter=0
 darshan-parser --file-list $1| egrep -v '^(#|$)' | cut -f 1-2 | sort -n | uniq |
 while read -r hash filepath stuff ; do
-        counter=$((counter+1))
-	file=$(basename $filepath)
-	if [ -x $file.darshan ] ; then
-		$file = $file.$hash.darshan
-	fi
-        echo Status: Generating summary for file $counter of $filecount: $file
-        echo =======================================================
-	darshan-convert --file $hash $1 $2/$file.darshan
+    counter=$((counter+1))
+    file=$(basename $filepath)
+
+    if [ -x $file.darshan ] ; then
+        $file = $file.$hash.darshan
+    fi
+
+    echo Status: Generating summary for file $counter of $filecount: $file
+    echo =======================================================
+    darshan-convert --file $hash $1 $2/$file.darshan
         rc=$?
         if [ $rc -ne 0 ]; then
            exit $rc
         fi
-	darshan-job-summary.pl $2/$file.darshan --output $2/$file.pdf
+
+    # XXX: manually escape STDIO stdin/stdout/stderr name strings before passing to perl
+    if [ $file == "<STDIN>" ] ; then
+        file="\<STDIN\>"
+    elif [ $file == "<STDOUT>" ] ; then
+        file="\<STDOUT\>"
+    elif [ $file == "<STDERR>" ] ; then
+        file="\<STDERR\>"
+    fi
+
+    darshan-job-summary.pl $2/$file.darshan --output $2/$file.pdf
         rc=$?
         if [ $rc -ne 0 ]; then
            exit $rc
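
The escaping step added in this hunk can also be expressed as a small helper. The following is a minimal sketch, not the script's actual code: the function name `escape_stream_name` is hypothetical, and it assumes that only the three default stream names need a backslash in front of `<` and `>`, as the commit message describes.

```shell
#!/bin/sh
# Hypothetical helper (not part of darshan-util): backslash-escape the
# angle brackets in the default stdio stream names so that they survive
# being handed on to the Perl summary script. Other names pass through.
escape_stream_name() {
    case "$1" in
        "<STDIN>"|"<STDOUT>"|"<STDERR>")
            # Prefix each '<' and '>' with a backslash.
            printf '%s\n' "$1" | sed 's/[<>]/\\&/g'
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

escape_stream_name "<STDERR>"    # prints \<STDERR\>
escape_stream_name "output.dat"  # prints output.dat
```

Quoting `"$1"` throughout keeps the shell from treating `<` and `>` as redirections or splitting names that contain spaces.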


=====================================
darshan-util/doc/darshan-util.txt
=====================================
--- a/darshan-util/doc/darshan-util.txt
+++ b/darshan-util/doc/darshan-util.txt
@@ -150,8 +150,8 @@ The format of this output is described in the following section.
 
 The beginning of the output from darshan-parser displays a summary of
 overall information about the job. Additional job-level summary information
-can also be produced using the `--perf`, `--file`, `--file-list`, or
-`--file-list-detailed` command line options.  See the
+can also be produced using the `--perf`, `--file`, `--file-list`,
+`--file-list-detailed`, or `--total` command line options.  See the
 <<addsummary,Additional summary output>> section for more information about
 those options.
 
@@ -235,10 +235,6 @@ otherwise noted, counters include all variants of the call in question, such as
 | POSIX_SEEKS | Count of POSIX seek operations
 | POSIX_STATS | Count of POSIX stat operations
 | POSIX_MMAPS | Count of POSIX mmap operations
-| POSIX_FOPENS | Count of POSIX stream open operations
-| POSIX_FREADS | Count of POSIX stream read operations
-| POSIX_FWRITES | Count of POSIX stream write operations
-| POSIX_FSEEKS | Count of POSIX stream seek operations
 | POSIX_FSYNCS | Count of POSIX fsync operations
 | POSIX_FDSYNCS | Count of POSIX fdatasync operations
 | POSIX_MODE | Mode that the file was last opened in
@@ -265,16 +261,12 @@ value of 1 MiB for optimal file alignment.
 | POSIX_STRIDE[1-4]_COUNT | Count of 4 most common stride patterns
 | POSIX_ACCESS[1-4]_ACCESS | 4 most common POSIX access sizes
 | POSIX_ACCESS[1-4]_COUNT | Count of 4 most common POSIX access sizes
-| POSIX_FASTEST_RANK | The MPI rank of the rank with smallest time spent in POSIX I/O
+| POSIX_FASTEST_RANK | The MPI rank with smallest time spent in POSIX I/O
 | POSIX_FASTEST_RANK_BYTES | The number of bytes transferred by the rank with smallest time spent in POSIX I/O
-| POSIX_SLOWEST_RANK | The MPI rank of the rank with largest time spent in POSIX I/O
+| POSIX_SLOWEST_RANK | The MPI rank with largest time spent in POSIX I/O
 | POSIX_SLOWEST_RANK_BYTES | The number of bytes transferred by the rank with the largest time spent in POSIX I/O
-| POSIX_F_OPEN_TIMESTAMP | Timestamp of first time that the file was opened
-| POSIX_F_READ_START_TIMESTAMP | Timestamp that the first POSIX read operation began
-| POSIX_F_WRITE_START_TIMESTAMP | Timestamp that the first POSIX write operation began
-| POSIX_F_READ_END_TIMESTAMP | Timestamp that the last POSIX read operation ended
-| POSIX_F_WRITE_END_TIMESTAMP | Timestamp that the last POSIX write operation ended
-| POSIX_F_CLOSE_TIMESTAMP | Timestamp of the last time that the file was closed
+| POSIX_F_*_START_TIMESTAMP | Timestamp that the first POSIX file open/read/write/close operation began
+| POSIX_F_*_END_TIMESTAMP | Timestamp that the last POSIX file open/read/write/close operation ended
 | POSIX_F_READ_TIME | Cumulative time spent reading at the POSIX level
 | POSIX_F_WRITE_TIME | Cumulative time spent in write, fsync, and fdatasync at the POSIX level
 | POSIX_F_META_TIME | Cumulative time spent in open, close, stat, and seek at the POSIX level
@@ -313,9 +305,9 @@ value of 1 MiB for optimal file alignment.
 | MPIIO_SIZE_WRITE_AGG_* | Histogram of total size of write accesses at MPI level, even if access is noncontiguous
 | MPIIO_ACCESS[1-4]_ACCESS | 4 most common MPI aggregate access sizes
 | MPIIO_ACCESS[1-4]_COUNT | Count of 4 most common MPI aggregate access sizes
-| MPIIO_FASTEST_RANK | The MPI rank of the rank with smallest time spent in MPI I/O
+| MPIIO_FASTEST_RANK | The MPI rank with smallest time spent in MPI I/O
 | MPIIO_FASTEST_RANK_BYTES | The number of bytes transferred by the rank with smallest time spent in MPI I/O
-| MPIIO_SLOWEST_RANK | The MPI rank of the rank with largest time spent in MPI I/O
+| MPIIO_SLOWEST_RANK | The MPI rank with largest time spent in MPI I/O
 | MPIIO_SLOWEST_RANK_BYTES | The number of bytes transferred by the rank with the largest time spent in MPI I/O
 | MPIIO_F_OPEN_TIMESTAMP | Timestamp of first time that the file was opened at MPI level
 | MPIIO_F_READ_START_TIMESTAMP | Timestamp that the first MPI read operation began
@@ -328,10 +320,39 @@ value of 1 MiB for optimal file alignment.
 | MPIIO_META_TIME | Cumulative time spent in open and close at MPI level
 | MPIIO_F_MAX_READ_TIME | Duration of the slowest individual MPI read operation
 | MPIIO_F_MAX_WRITE_TIME | Duration of the slowest individual MPI write operation
-| CP_F_FASTEST_RANK_TIME | The time of the rank which had the smallest amount of time spent in MPI I/O (cumulative read, write, and meta times)
-| CP_F_SLOWEST_RANK_TIME | The time of the rank which had the largest amount of time spent in MPI I/O
-| CP_F_VARIANCE_RANK_TIME | The population variance for MPI I/O time of all the ranks
-| CP_F_VARIANCE_RANK_BYTES | The population variance for bytes transferred of all the ranks at MPI level
+| MPIIO_F_FASTEST_RANK_TIME | The time of the rank which had the smallest amount of time spent in MPI I/O (cumulative read, write, and meta times)
+| MPIIO_F_SLOWEST_RANK_TIME | The time of the rank which had the largest amount of time spent in MPI I/O
+| MPIIO_F_VARIANCE_RANK_TIME | The population variance for MPI I/O time of all the ranks
+| MPIIO_F_VARIANCE_RANK_BYTES | The population variance for bytes transferred of all the ranks at MPI level
+|====
+
+
+.STDIO module
+[cols="40%,60%",options="header"]
+|====
+| counter name | description
+| STDIO_OPENS | Count of how many times the file was opened using the stdio interface (e.g., `fopen()`)
+| STDIO_READS | Count of stdio read operations
+| STDIO_WRITES | Count of stdio write operations
+| STDIO_SEEKS | Count of stdio seek operations
+| STDIO_FLUSHES | Count of stdio flush operations
+| STDIO_BYTES_WRITTEN | Total number of bytes written to the file using stdio operations
+| STDIO_BYTES_READ | Total number of bytes read from the file using stdio operations
+| STDIO_MAX_BYTE_READ | Highest offset in the file that was read
+| STDIO_MAX_BYTE_WRITTEN | Highest offset in the file that was written
+| STDIO_FASTEST_RANK | The MPI rank with the smallest time spent in stdio operations
+| STDIO_FASTEST_RANK_BYTES | The number of bytes transferred by the rank with the smallest time spent in stdio operations
+| STDIO_SLOWEST_RANK | The MPI rank with the largest time spent in stdio operations
+| STDIO_SLOWEST_RANK_BYTES | The number of bytes transferred by the rank with the largest time spent in stdio operations
+| STDIO_META_TIME | Cumulative time spent in stdio open/close/seek operations
+| STDIO_WRITE_TIME | Cumulative time spent in stdio write operations
+| STDIO_READ_TIME | Cumulative time spent in stdio read operations
+| STDIO_*_START_TIMESTAMP | Timestamp that the first stdio file open/read/write/close operation began
+| STDIO_*_END_TIMESTAMP | Timestamp that the last stdio file open/read/write/close operation ended
+| STDIO_F_FASTEST_RANK_TIME | The time of the rank which had the smallest time spent in stdio I/O (cumulative read, write, and meta times)
+| STDIO_F_SLOWEST_RANK_TIME | The time of the rank which had the largest time spent in stdio I/O
+| STDIO_F_VARIANCE_RANK_TIME | The population variance for stdio I/O time of all the ranks
+| STDIO_F_VARIANCE_RANK_BYTES | The population variance for bytes transferred of all the ranks
 |====
 
 .HDF5 module
@@ -373,13 +394,26 @@ value of 1 MiB for optimal file alignment.
 | BGQ_F_TIMESTAMP | Timestamp of when BG/Q data was collected
 |====
 
+
+.Lustre module (if enabled, for Lustre file systems)
+[cols="40%,60%",options="header"]
+|====
+| counter name | description
+| LUSTRE_OSTS | number of OSTs (object storage targets) for the file system
+| LUSTRE_MDTS | number of MDTs (metadata targets) for the file system
+| LUSTRE_STRIPE_OFFSET | OST id offset specified at file creation time
+| LUSTRE_STRIPE_SIZE | stripe size for the file in bytes
+| LUSTRE_STRIPE_WIDTH | number of OSTs over which the file is striped
+| LUSTRE_OST_ID_* | indices of OSTs over which the file is striped
+|====
+
 ==== Additional summary output
 [[addsummary]]
 
 The following sections describe additional parser options that provide
 summary I/O characterization data for the given log.
 
-*NOTE*: These options are currently only supported by the POSIX and MPI-IO modules.
+*NOTE*: These options are currently only supported by the POSIX, MPI-IO, and stdio modules.
 
 ===== Performance
 
@@ -475,19 +509,18 @@ column is the maximum offset accessed.
 * write_only: Files that were only written to
 * read_write: Files that were both read and written
 * unique: Files that were opened on only one rank
-* shared: File that were opened by more than one rank
+* shared: Files that were opened by more than one rank
 
 
 .Example output
 ----
-# files
-# -----
-# total: 1542 236572244952 154157611
-# read_only: 3 133998651 122805519
-# write_only: 1539 236438246301 154157611
+# <file_type> <file_count> <total_bytes> <max_byte_offset>
+# total: 5 4371499438884 4364699616485
+# read_only: 2 4370100334589 4364699616485
+# write_only: 1 1399104295 1399104295
 # read_write: 0 0 0
-# unique: 2 11193132 11193063
-# shared: 1540 236561051820 154157611
+# unique: 0 0 0
+# shared: 5 4371499438884 4364699616485
 ----
 
 ===== Totals
@@ -525,6 +558,7 @@ file.
 .Example output
 ----
 # Per-file summary of I/O activity.
+# -----
 # <record_id>: darshan record id for this file
 # <file_name>: full file name
 # <nprocs>: number of processes that opened the file
@@ -558,6 +592,8 @@ If the `--bzip2` flag is given, then the output file will be re-compressed in
 bzip2 format rather than libz format.  It also has command line options for
 anonymizing personal data, adding metadata annotation to the log header, and
 restricting the output to a specific instrumented file.
+* darshan-diff: provides a text diff of two Darshan log files, comparing both
+job-level metadata and module data records between the files.
 * darshan-analyzer: walks an entire directory tree of Darshan log files and
 produces a summary of the types of access methods used in those log files.
 * darshan-logutils*: this is a library rather than an executable, but it
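
The file-count summary in the example output earlier in this hunk uses one line per file type, with the columns named in its new header: file count, total bytes, and maximum byte offset. As an illustration, the sketch below pulls one value out of such text with awk. The function name `file_summary_field` is hypothetical and not part of darshan-util; it reads previously saved summary text from standard input.

```shell
#!/bin/sh
# Hypothetical helper: read summary lines of the form
#   # <file_type>: <file_count> <total_bytes> <max_byte_offset>
# from stdin and print the requested column for one file type.
#   $1 = file type (total, read_only, write_only, read_write, unique, shared)
#   $2 = column: count, bytes, or max_offset
file_summary_field() {
    awk -v ftype="$1" -v col="$2" '
        # Fields: $1 is "#", $2 is "<type>:", $3..$5 are the numbers.
        $1 == "#" && $2 == (ftype ":") {
            if (col == "count")           print $3
            else if (col == "bytes")      print $4
            else if (col == "max_offset") print $5
        }
    '
}

# Sample lines copied from the example output in the hunk above.
printf '%s\n' \
    '# total: 5 4371499438884 4364699616485' \
    '# read_only: 2 4370100334589 4364699616485' \
    '# write_only: 1 1399104295 1399104295' |
    file_summary_field read_only bytes    # prints 4370100334589
```

The values are printed as the original field text, so large byte counts are not reformatted by awk's numeric output conversion.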


=====================================
doc/darshan-modularization.txt
=====================================
--- a/doc/darshan-modularization.txt
+++ b/doc/darshan-modularization.txt
@@ -1,7 +1,7 @@
 :data-uri:
 
-Darshan modularization branch development notes
-===============================================
+Modularized I/O characterization using Darshan 3.x
+==================================================
 
 == Introduction
 
@@ -16,30 +16,7 @@ modules, which are responsible for gathering I/O data from a specific system com
 manage these modules at runtime and create a valid Darshan log regardless of how many
 or what types of modules are used.
 
-== Checking out and building the modularization branch
-
-The Darshan source code is available at the following GitLab project page:
-https://xgitlab.cels.anl.gov/darshan/darshan. It is worth noting that this page
-also provides issue tracking to provide users the ability to browse known issues
-with the code or to report new issues.
-
-The following commands can be used to clone the Darshan source code and checkout
-the modularization branch:
-
-----
-git clone git@xgitlab.cels.anl.gov:darshan/darshan.git
-cd darshan
-git checkout dev-modular
-----
-
-For details on configuring, building, and using the Darshan runtime and utility
-repositories, consult the documentation from previous versions
-(http://www.mcs.anl.gov/research/projects/darshan/docs/darshan-runtime.html[darshan-runtime] and
-http://www.mcs.anl.gov/research/projects/darshan/docs/darshan-util.html[darshan-util]) -- the
-necessary steps for building these repositories should not have changed in the new version of
-Darshan.
-
-== Darshan dev-modular overview
+== Overview of Darshan's modularized architecture
 
 The Darshan source tree is organized into two primary components:
 
@@ -121,7 +98,7 @@ component so it is included in the output I/O characterization.
 The static initialization approach is useful for modules that do not have function calls
 that can be intercepted and instead can just grab all I/O characterization data at Darshan
 startup or shutdown time. A module can be statically initialized at Darshan startup time
-by adding its initialization routine to the `mod_static_init_fns` list at the top of the
+by adding its initialization routine to the `mod_static_init_fns` array at the top of the
 `lib/darshan-core.c` source file.
 
 *NOTE*: Modules may wish to add a corresponding configure option to disable the module
@@ -131,7 +108,7 @@ used on other systems.
 
 Most instrumentation modules can just bootstrap themselves within wrapper functions during
 normal application execution. Each of Darshan's current I/O library instrumentation modules
-(POSIX, MPI-IO, HDF5, PnetCDF) follow this approach. Each wrapper function should just include
+(POSIX, MPI-IO, stdio, HDF5, PnetCDF) follow this approach. Each wrapper function should just include
 logic to initialize data structures and register with `darshan-core` if this initialization
 has not already occurred. Darshan intercepts function calls of interest by inserting these
 wrappers at compile time for statically linked executables (e.g., using the linkers
@@ -144,36 +121,23 @@ minimizing Darshan's impact on application I/O performance.
 
 When the instrumented application terminates and Darshan begins its shutdown procedure, it requires
 a way to interface with any active modules that have data to contribute to the output I/O characterization.
-Darshan requires that module developers implement the following functions to allow the Darshan runtime
-environment to coordinate with modules while shutting down:
+The following function is implemented by each module to finalize (and perhaps reorganize) module records
+before returning the record memory back to darshan-core to be compressed and written to file.
 
 [source,c]
-struct darshan_module_funcs
-{
-    void (*begin_shutdown)(void);
-    void (*get_output_data)(
-        MPI_Comm mod_comm,
-        darshan_record_id *shared_recs,
-        int shared_rec_count,
-        void** mod_buf,
-        int* mod_buf_sz
-    );
-    void (*shutdown)(void);
-};
-
-`begin_shutdown()`
-
-This function informs the module that Darshan is about to begin shutting down. It should disable
-all wrappers to prevent the module from making future updates to internal data structures, primarily
-to ensure data consistency and avoid other race conditions.
-
-`get_output_data()`
-
-This function is responsible for packing all module I/O data into a single buffer to be written
-to the output I/O characterization. This function can be used to run collective MPI operations on
-module data; for instance, Darshan typically tries to reduce file records which are shared across
-all application processes into a single data record (more details on the shared file reduction
-mechanism are given in link:darshan-modularization.html#_shared_record_reductions[Section 5]).
+typedef void (*darshan_module_shutdown)(
+    MPI_Comm mod_comm,
+    darshan_record_id *shared_recs,
+    int shared_rec_count,
+    void** mod_buf,
+    int* mod_buf_sz
+);
+
+This function can be used to run collective MPI operations on module data; for instance, Darshan
+typically tries to reduce file records which are shared across all application processes into a
+single data record (more details on the shared file reduction mechanism are given in
+link:darshan-modularization.html#_shared_record_reductions[Section 5]). This function also serves
+as a final opportunity for modules to clean up and free any allocated data structures.
 
 * _mod_comm_ is the MPI communicator to use for collective communication
 
@@ -182,14 +146,11 @@ processes
 
 * _shared_rec_count_ is the size of the shared record list
 
-* _mod_buf_ is a pointer to the buffer of this module's I/O characterization data
-
-* _mod_buf_sz_ is the size of the module's output buffer
+* _mod_buf_ is a pointer to the buffer address of the module's contiguous set of data records
 
-`shutdown()`
-
-This function is a signal from Darshan that it is safe to shutdown. It should clean up and free
-all internal data structures.
+* _mod_buf_sz_ is a pointer to a variable storing the aggregate size of the module's records. On
+input, the pointed-to value indicates the aggregate size of the module's registered records; on
+output, the value may be updated if, for instance, certain records are discarded.
 
 ==== darshan-core
 
@@ -206,9 +167,9 @@ described in detail below.
 [source,c]
 void darshan_core_register_module(
     darshan_module_id mod_id,
-    struct darshan_module_funcs *funcs,
-    int *my_rank,
+    darshan_module_shutdown mod_shutdown_func,
     int *mod_mem_limit,
+    int *rank,
     int *sys_mem_alignment);
 
 The `darshan_core_register_module` function registers Darshan instrumentation modules with the
@@ -218,20 +179,18 @@ will contribute data to Darshan's final I/O characterization.
 * _mod_id_ is a unique identifier for the given module, which is defined in the Darshan log
 format header file (`darshan-log-format.h`).
 
-* _funcs_ is the structure of function pointers (as described above in the previous section) that
-a module developer must provide to interface with the darshan-core runtime. 
+* _mod_shutdown_func_ is the function pointer to the module shutdown function described in the
+previous section.
 
-* _my_rank_ is a pointer to an integer to store the calling process's application MPI rank in
+* _inout_mod_buf_size_ is an input/output argument that stores the amount of module memory
+being requested when calling the function and the amount of memory actually reserved by
+darshan-core when returning.
 
-* _mod_mem_limit_ is a pointer to an integer which will store the amount of memory Darshan
-allows this module to use at runtime. Darshan's default module memory limit is currently set to
-2 MiB, but the user can choose a different value at configure time (using the `--with-mod-mem`
-configure option) or at runtime (using the DARSHAN_MODMEM environment variable). Note that Darshan
-does not allocate any memory for modules; it just informs a module how much memory it can use.
+* _rank_ is a pointer to an integer to store the calling process's application MPI rank in.
+`NULL` may be passed in to ignore this value.
 
 * _sys_mem_alignment_ is a pointer to an integer which will store the system memory alignment value
-Darshan was configured with. This parameter may be set to `NULL` if a module is not concerned with the
-memory alignment value.
+Darshan was configured with. `NULL` may be passed in to ignore this value.
 
 [source,c]
 void darshan_core_unregister_module(
@@ -241,64 +200,56 @@ The `darshan_core_unregister_module` function disassociates the given module fro
 `darshan-core` runtime. Consequently, Darshan does not interface with the given module at
 shutdown time and will not log any I/O data from the module. This function should only be used
 if a module registers itself with darshan-core but later decides it does not want to contribute
-any I/O data.
+any I/O data. Note that, in the current implementation, Darshan does not have the ability to
+reclaim the record memory allocated to the calling module to assign to other modules.
 
 * _mod_id_ is the unique identifier for the module being unregistered.
 
 [source,c]
-void darshan_core_register_record(
-    void *name,
-    int len,
-    darshan_module_id mod_id,
-    int printable_flag,
-    int mod_limit_flag,
-    darshan_record_id *rec_id,
-    int *file_alignment);
-
-The `darshan_core_register_record` function registers some data record with the darshan-core
-runtime. This record could reference a POSIX file or perhaps an object identifier for an
-object storage system, for instance.  A unique identifier for the given record name is
-generated by Darshan, which should then be used by the module for referencing the corresponding
-record.  This allows multiple modules to refer to a specific data record in a consistent manner
-and also provides a mechanism for mapping these records back to important metadata stored by
-darshan-core. It is safe (and likely necessary) to call this function many times for the same
-record -- darshan-core will just set the corresponding record identifier if the record has
-been previously registered.
-
-* _name_ is just the name of the data record, which could be a file path, object ID, etc.
-
-* _len_ is the size of the input record name. For string record names, this would just be the
-string length, but for nonprintable record names (e.g., an integer object identifier), this
-is the size of the record name type.
+darshan_record_id darshan_core_gen_record_id(
+    const char *name);
 
-* _mod_id_ is the identifier for the module attempting to register this record.
+The `darshan_core_gen_record_id` function simply generates a unique record identifier for a
+given record name. This function is generally called to convert a name string to a unique record
+identifier that is needed to register a data record with darshan-core. The generation of IDs
+is consistent, such that modules which reference records with the same names will store these
+records using the same unique IDs, simplifying the correlation of these records for analysis.
 
-* _printable_flag_ indicates whether the input record name is a printable ASCII string.
+* _name_ is the name of the corresponding data record (often just a file name).
 
-* _mod_limit_flag_ indicates whether the calling module is out of memory to instrument new
-records or not. If this flag is set, darshan-core will not create new records and instead just
-search existing records for one corresponding to input _name_. 
+[source,c]
+void *darshan_core_register_record(
+    darshan_record_id rec_id,
+    const char *name,
+    darshan_module_id mod_id,
+    int rec_len,
+    int *fs_info);
 
-* _rec_id_ is a pointer to a variable which will store the unique record identifier generated
-by Darshan.
+The `darshan_core_register_record` function registers a data record with the darshan-core
+runtime, allocating memory for the record so that it is persisted in the output log file.
+This record could reference a POSIX file or perhaps an object identifier for an
+object storage system, for instance. This function should only be called once for each
+record being tracked by a module to avoid duplicating record memory. This function returns
+the address which the record should be stored at or `NULL` if there is insufficient
+memory for storing the record.
 
-* _file_alignment_ is a pointer to an integer which will store the the file alignment (block size)
-of the underlying storage system. This parameter may be set to `NULL` if it is not applicable to a
-given module.
+* _rec_id_ is a unique integer identifier for this record (generally generated using the
+`darshan_core_gen_record_id` function).
 
-[source,c]
-void darshan_core_unregister_record(
-    darshan_record_id rec_id,
-    darshan_module_id mod_id);
+* _name_ is the string name of the data record, which could be a file path, object ID, etc.
+If given, darshan-core will associate the given name with the record identifier and store
+this mapping in the log file so it can be retrieved for analysis. `NULL` may be passed in
+to generate an anonymous (unnamed) record.
 
-The `darshan_core_unregister_record` function disassociates the given module identifier from the
-given record identifier. If no other modules are associated with the given record identifier, then
-Darshan removes all internal references to the record. This function should only be used if a
-module registers a record with darshan-core, but later decides not to store the record internally.
+* _mod_id_ is the identifier for the module attempting to register this record.
 
-* _rec_id_ is the record identifier we want to unregister.
+* _rec_len_ is the length of the record.
 
-* _mod_id_ is the module identifier that is unregistering _rec_id_.
+* _fs_info_ is a pointer to a structure of relevant info for the file system associated
+with the given record -- this structure is defined in the `darshan.h` header. Note that this
+functionality only works for record names that are absolute file paths, since we determine
+the file system by matching the file path to the list of mount points Darshan is aware of.
+`NULL` may be passed in to ignore this value.
 
 [source,c]
 double darshan_core_wtime(void);
@@ -307,6 +258,16 @@ The `darshan_core_wtime` function simply returns a floating point number of seco
 Darshan was initialized. This functionality can be used to time the duration of application
 I/O calls or to store timestamps of when functions of interest were called.
 
+[source,c]
+int darshan_core_excluded_path(
+    const char *path);
+
+The `darshan_core_excluded_path` function checks to see if a given file path is in Darshan's
+list of excluded file paths (i.e., paths that we don't instrument I/O to/from, such as /etc,
+/dev, /usr, etc.).
+
+* _path_ is the absolute file path we are checking.
+
 ==== darshan-common
 
 `darshan-common` is a utility component of darshan-runtime, providing module developers with
@@ -333,17 +294,20 @@ simplifying maintenance.
 
 === Darshan-util
 
-The darshan-util component is composed of a log parsing library (libdarshan-util) and a
-corresponding set of utility programs that can parse and analyze Darshan I/O characterization
-logs using this library. The log parsing library includes a generic interface (see
-`darshan-logutils.h`) for retrieving specific portions of a given log file. Specifically,
-this interface allows utilities to retrieve a log's header metadata, job details, record
-identifier mapping, and any module-specific data contained within the log.
-
-Module developers may wish to define additional interfaces for parsing module-specific data
-that can then be integrated into the log parsing library. This extended functionality can be
-implemented in terms of the generic functions offered by darshan-logutils and by module-specific
-formatting information.
+The darshan-util component is composed of a helper library for accessing log file data
+records (`libdarshan-util`) and a set of utilities that use this library to analyze
+application I/O behavior. `libdarshan-util` includes a generic interface (`darshan-logutils`)
+for retrieving specific components of a given log file. Specifically, this interface allows
+utilities to retrieve a log's header metadata, job details, record ID to name mapping, and
+any module-specific data contained within the log.
+
+`libdarshan-util` additionally includes the definition of a generic module interface (`darshan-mod-logutils`)
+that may be implemented by modules to provide a consistent way for Darshan utilities to interact
+with module data stored in log files. This interface is necessary since each module has records
+of varying size and format, so module-specific code is needed to interact with the records in a
+generic manner. This interface is used by the `darshan-parser` utility, for instance, to extract
+data records from all modules contained in a log file and to print these records in a consistent
+format that is amenable to further analysis by other tools.
 
 ==== darshan-logutils
 
@@ -366,22 +330,22 @@ denotes whether the log is storing partial data (that is, all possible applicati
 were not tracked by darshan). Returns a Darshan file descriptor on success or `NULL` on error.
 
 [source,c]
-int darshan_log_getjob(darshan_fd fd, struct darshan_job *job);
-int darshan_log_putjob(darshan_fd fd, struct darshan_job *job);
+int darshan_log_get_job(darshan_fd fd, struct darshan_job *job);
+int darshan_log_put_job(darshan_fd fd, struct darshan_job *job);
 
 Reads/writes `job` structure from/to the log file referenced by descriptor `fd`. The `darshan_job`
 structure is defined in `darshan-log-format.h`. Returns `0` on success, `-1` on failure.
 
 [source,c]
-int darshan_log_getexe(darshan_fd fd, char *buf);
-int darshan_log_putexe(darshan_fd fd, char *buf);
+int darshan_log_get_exe(darshan_fd fd, char *buf);
+int darshan_log_put_exe(darshan_fd fd, char *buf);
 
 Reads/writes the corresponding executable string (exe name and command line arguments)
 from/to the Darshan log referenced by `fd`. Returns `0` on success, `-1` on failure.
 
 [source,c]
-int darshan_log_getmounts(darshan_fd fd, char*** mnt_pts, char*** fs_types, int* count);
-int darshan_log_putmounts(darshan_fd fd, char** mnt_pts, char** fs_types, int count);
+int darshan_log_get_mounts(darshan_fd fd, char*** mnt_pts, char*** fs_types, int* count);
+int darshan_log_put_mounts(darshan_fd fd, char** mnt_pts, char** fs_types, int count);
 
 Reads/writes mounted file system information for the Darshan log referenced by `fd`. `mnt_pts` points
 to an array of strings storing mount points, `fs_types` points to an array of strings storing file
@@ -389,12 +353,12 @@ system types (e.g., ext4, nfs, etc.), and `count` points to an integer storing t
 of mounted file systems recorded by Darshan. Returns `0` on success, `-1` on failure.
 
 [source,c]
-int darshan_log_gethash(darshan_fd fd, struct darshan_record_ref **hash);
-int darshan_log_puthash(darshan_fd fd, struct darshan_record_ref *hash);
+int darshan_log_get_namehash(darshan_fd fd, struct darshan_name_record_ref **hash);
+int darshan_log_put_namehash(darshan_fd fd, struct darshan_name_record_ref *hash);
 
 Reads/writes the hash table of Darshan record identifiers to full names for all records
 contained in the Darshan log referenced by `fd`. `hash` is a pointer to the hash table (of type
-struct darshan_record_ref *, which should be initialized to `NULL` for reading). This hash table
+struct darshan_name_record_ref *), which should be initialized to `NULL` for reading. This hash table
 is defined by the `uthash` hash table implementation and includes corresponding macros for
 searching, iterating, and deleting records from the hash. For detailed documentation on using this
 hash table, consult `uthash` documentation in `darshan-util/uthash-1.9.2/doc/txt/userguide.txt`.
@@ -402,18 +366,19 @@ The `darshan-parser` utility (for parsing module information out of a Darshan lo
 example of how this hash table may be used. Returns `0` on success, `-1` on failure.
 
 [source,c]
-int darshan_log_getmod(darshan_fd fd, darshan_module_id mod_id, void *mod_buf, int mod_buf_sz);
-int darshan_log_putmod(darshan_fd fd, darshan_module_id mod_id, void *mod_buf, int mod_buf_sz);
+int darshan_log_get_mod(darshan_fd fd, darshan_module_id mod_id, void *mod_buf, int mod_buf_sz);
+int darshan_log_put_mod(darshan_fd fd, darshan_module_id mod_id, void *mod_buf, int mod_buf_sz, int ver);
 
 Reads/writes a chunk of (uncompressed) module data for the module identified by `mod_id` from/to
-the Darshan log referenced by `fd`. `mod_buf_sz` specifies the number of uncompressed bytes to
-read/write from/to the file and store in `mod_buf`. The `darshan_log_getmod` routine can be
+the Darshan log referenced by `fd`. `mod_buf` is the buffer to read data into or write data from,
+and `mod_buf_sz` is the corresponding size of the buffer. The `darshan_log_get_mod` routine can be
 repeatedly called to retrieve chunks of uncompressed data from a specific module region of the
 log file given by `fd`. The `darshan_log_put_mod` routine just continually appends data to a
-specific module region in the log file given by `fd`. This function returns the number of bytes
-read/written on success, `-1` on failure.
+specific module region in the log file given by `fd` and accepts an additional `ver` parameter
+indicating the version number for the module data records being written. These functions return
+the number of bytes read/written on success, `-1` on failure.
 
-*NOTE*: Darshan use a reader makes right conversion strategy to rectify endianness issues
+*NOTE*: Darshan uses a "reader makes right" conversion strategy to rectify endianness issues
 between the machine a log was generated on and a machine analyzing the log. Accordingly,
 module-specific log utility functions will need to check the `swap_flag` variable of the Darshan
 file descriptor to determine if byte swapping is necessary. 32-bit and 64-bit byte swapping
@@ -431,6 +396,42 @@ The correct order for writing all log file data to file is: (1) job data, (2) ex
 mount data, (4) record id -> file name map, (5) each module's data, in increasing order of
 module identifiers.
 
+==== darshan-mod-logutils
+
+The `darshan-mod-logutils` interface provides a convenient way to implement new log functionality
+across all Darshan instrumentation modules, which can potentially greatly simplify the development
+of new Darshan log utilities. These functions are defined in the `darshan_mod_logutil_funcs` structure
+in `darshan-logutils.h` -- instrumentation modules simply provide their own implementation of each
+function, then utilities can leverage this functionality using the `mod_logutils` array defined in
+`darshan-logutils.c`. Descriptions of some of the currently implemented functions are provided below.
+
+[source,c]
+int log_get_record(darshan_fd fd, void **buf);
+int log_put_record(darshan_fd fd, void *buf);
+
+Reads/writes the module record in `buf` from/to the log referenced by `fd`. Notice that a
+size parameter is not needed since the utilities calling this interface will likely not know
+the record size -- the module-specific log utility code can determine the corresponding size
+before reading/writing the record from/to file.
+
+*NOTE*: `log_get_record` takes a pointer to a buffer address rather than just the buffer address.
+If the pointed-to address is `NULL`, then the function should allocate the record memory itself. This
+functionality helps optimize memory usage, since utilities often don't know the size of records
+being accessed but still must provide a buffer to read them into.
+
+[source,c]
+void log_print_record(void *rec, char *name, char *mnt_pt, char *fs_type);
+
+Prints all data associated with the record pointed to by `rec`. `name` holds the corresponding name
+string for this record. `mnt_pt` and `fs_type` hold the corresponding mount point path and file
+system type strings associated with the record (only valid for records with names that are absolute
+file paths).
+
+[source,c]
+void log_print_description(int ver);
+
+Prints a description of the data stored within records for this module (with version number `ver`).
+
 == Adding new instrumentation modules
 
 In this section we outline each step necessary for adding a module to Darshan. To assist module
@@ -487,11 +488,10 @@ provide the following notes to assist module developers:
 
 * Modules only need to include the `darshan.h` header to interface with darshan-core.
 
-* The file record identifier given when registering a record with darshan-core can be used
+* The file record identifier given when registering a record with darshan-core should be used
 to store the record structure in a hash table or some other structure.
-    - The `darshan_core_register_record` function is really more like a lookup function. It
-    may be called multiple times for the same record -- if the record already exists, the function
-    simply returns its record ID.
+    - Subsequent calls that need to modify this record can then use the corresponding record
+    identifier to look up the record in this local hash table.
     - It may be necessary to maintain a separate hash table for other handles which the module
     may use to refer to a given record. For instance, the POSIX module may need to look up a
     file record based on a given file descriptor, rather than a path name.
@@ -527,8 +527,8 @@ data record, module developers should consider implementing this functionality e
 is not strictly required. 
 
 Module developers should implement the shared record reduction mechanism within the module's
-`get_output_data()` function, as it provides an MPI communicator for the module to use for
-collective communication and a list of record identifiers which are shared globally by the
+`darshan_module_shutdown()` function, as it provides an MPI communicator for the module to use
+for collective communication and a list of record identifiers which are shared globally by the
 module (as described in link:darshan-modularization.html#_darshan_runtime[Section 3.1]).
 
 In general, implementing a shared record reduction involves the following steps:

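As an illustration of the `darshan_module_shutdown` contract documented in the diff above (`mod_buf_sz` carries the registered size on input and may shrink on output), here is a minimal sketch of a shutdown function that discards inactive records. It is not taken from any Darshan module: `struct example_record`, `example_shutdown`, and the `op_count` field are hypothetical, and `MPI_Comm` and `darshan_record_id` are replaced by stand-in typedefs so the sketch compiles without MPI or Darshan headers.

```c
#include <stdint.h>

/* Stand-in typedefs (assumptions): real code would include <mpi.h> and darshan.h. */
typedef int MPI_Comm;
typedef uint64_t darshan_record_id;

/* Hypothetical fixed-size module record; real modules define their own layout. */
struct example_record {
    darshan_record_id id;
    int64_t op_count;   /* number of instrumented operations observed */
};

/* Has the shape of the darshan_module_shutdown typedef. Drops records that
 * saw no activity, compacts the survivors in place, and reports the new
 * aggregate size through mod_buf_sz. */
void example_shutdown(MPI_Comm mod_comm, darshan_record_id *shared_recs,
                      int shared_rec_count, void **mod_buf, int *mod_buf_sz)
{
    struct example_record *recs = *mod_buf;
    int in_count = *mod_buf_sz / (int)sizeof(*recs);
    int out_count = 0;

    /* This sketch performs no shared-record reduction, so these are unused. */
    (void)mod_comm;
    (void)shared_recs;
    (void)shared_rec_count;

    for (int i = 0; i < in_count; i++) {
        if (recs[i].op_count > 0) {
            if (out_count != i)
                recs[out_count] = recs[i];
            out_count++;
        }
    }

    /* On output, the size reflects only the records that were kept. */
    *mod_buf_sz = out_count * (int)sizeof(*recs);
}
```

A real module would typically also run its shared-record reduction over `mod_comm` here and free any private bookkeeping structures before returning.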

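The allocate-on-`NULL` behavior described for `log_get_record` in the diff above can be sketched as follows. This is only an illustration under stated assumptions: a standard `FILE *` stands in for `darshan_fd`, `struct example_record` and `example_log_get_record` are hypothetical names, and the `0` / `-1` return convention is borrowed from the other `darshan-logutils` functions rather than confirmed for this interface.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fixed-size module record. */
struct example_record {
    uint64_t id;
    int64_t op_count;
};

/* Reads one record from fd (a FILE* standing in for darshan_fd).
 * If *buf is NULL, memory for the record is allocated here and handed
 * back to the caller through *buf; otherwise the caller's buffer is used.
 * Returns 0 on success, -1 on failure (assumed convention). */
int example_log_get_record(FILE *fd, void **buf)
{
    struct example_record *rec = *buf;
    int allocated = 0;

    if (rec == NULL) {
        rec = malloc(sizeof(*rec));
        if (rec == NULL)
            return -1;
        allocated = 1;
    }

    if (fread(rec, sizeof(*rec), 1, fd) != 1) {
        if (allocated)
            free(rec);   /* do not leak memory we allocated ourselves */
        return -1;
    }

    *buf = rec;
    return 0;
}
```

The caller does not need to know the record size: it can pass a pointer to a `NULL` buffer, let the module-specific code size the allocation, and `free()` the result afterwards.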

View it on GitLab: https://xgitlab.cels.anl.gov/darshan/darshan/compare/1419f48eb621fbb30a8dcd50cab48de4de8aaffa...d08cb98ec4cb35f6be67a94bb6b50925194f04ce