[Darshan-commits] [Git][darshan/darshan][dev-modular] 15 commits: fix for gnuplot version number check
Shane Snyder
xgitlab at cels.anl.gov
Thu Oct 1 11:49:02 CDT 2015
Shane Snyder pushed to branch dev-modular at darshan / darshan
Commits:
72726387 by Phil Carns at 2015-04-29T10:59:18Z
fix for gnuplot version number check
- contributed by Kay Thust
- - - - -
1d202464 by Shane Snyder at 2015-06-30T22:31:31Z
fix map_or_fail typo in lio_listio wrapper
- - - - -
341bf09f by Shane Snyder at 2015-06-30T22:36:16Z
bug fix for aio_return wrapper
- - - - -
14ad6274 by Shane Snyder at 2015-07-01T09:16:47Z
update Changelog for last 2 bug fixes
- - - - -
6be9df5b by Shane Snyder at 2015-08-06T18:01:41Z
bug fix in common access counter logic
- - - - -
fbce8030 by Shane Snyder at 2015-08-07T08:47:18Z
update changelog for common access bug
- - - - -
f68228a5 by Phil Carns at 2015-08-19T11:55:50Z
integrate expanded darshan-parser documentation
- provided by Huong Luu
- - - - -
c5a27637 by Phil Carns at 2015-09-08T12:30:55Z
trivial stat collection script
- for each job, count number of files that used collectives, independent
operations, or posix
- - - - -
0ffb9547 by Phil Carns at 2015-09-08T16:41:44Z
friendlier output
- - - - -
676030f1 by Phil Carns at 2015-09-08T16:45:43Z
bug fix
- - - - -
7eecf5d0 by Phil Carns at 2015-09-25T13:05:00Z
markdown version of readme for gitlab
- - - - -
3d3c4257 by Phil Carns at 2015-09-25T13:06:48Z
trying to figure out why markdown link is grumpy
- - - - -
83f12d5f by Phil Carns at 2015-09-25T13:16:52Z
whitespace change to test commit hooks
- - - - -
f6fbcec4 by Shane Snyder at 2015-10-01T11:19:41Z
Merge remote-tracking branch 'origin/master' into dev-modular
Conflicts:
ChangeLog
darshan-runtime/configure
darshan-runtime/configure.in
darshan-runtime/darshan.h
darshan-runtime/lib/darshan-posix.c
darshan-util/configure
darshan-util/configure.in
darshan-util/doc/darshan-util.txt
- - - - -
41575785 by Shane Snyder at 2015-10-01T11:46:02Z
small text changes in darshan-util docs
- - - - -
4 changed files:
- ChangeLog
- README → README.md
- + darshan-test/darshan-gather-mpi-posix-usage.pl
- darshan-util/doc/darshan-util.txt
Changes:
=====================================
ChangeLog
=====================================
--- a/ChangeLog
+++ b/ChangeLog
@@ -27,6 +27,16 @@ Darshan-3.0.0-pre1
darshan-util components are mostly the same and still located in
their respective directories ('darshan-runtime/doc' and 'darshan-util/doc')
+darshan-2.3.2-pre1
+=============
+* Fix gnuplot version number check to allow darshan-job-summary.pl to work
+ with gnuplot 5.0 (Kay Thust)
+* Fix function pointer mapping typo in lio_listio64 wrapper (Shane Snyder)
+* Fix faulty logic in extracting I/O data from the aio_return
+ wrapper (Shane Snyder)
+* Fix bug in common access counter logic (Shane Snyder)
+* Expand and clarify darshan-parser documentation (Huong Luu)
+
darshan-2.3.1
=============
* added documentation and example configuration files for using the -profile
=====================================
README → README.md
=====================================
--- a/README
+++ b/README.md
@@ -1,4 +1,14 @@
-The Darshan source tree is divided into two parts:
+Darshan is a lightweight I/O characterization tool that transparently
+captures I/O access pattern information from HPC applications.
+Darshan can be used to tune applications for increased scientific
+productivity or to gain insight into trends in large-scale computing
+systems.
+
+Please see the
+[Darshan web page](http://www.mcs.anl.gov/research/projects/darshan)
+for more in-depth news and documentation.
+
+The Darshan source tree is divided into two main parts:
- darshan-runtime: to be installed on systems where you intend to
instrument MPI applications. See darshan-runtime/doc/darshan-runtime.txt
@@ -8,9 +18,7 @@ The Darshan source tree is divided into two parts:
log files produced by darshan-runtime. See
darshan-util/doc/darshan-util.txt for installation instructions.
-General documentation can be found on the Darshan documentation web page:
-http://www.mcs.anl.gov/darshan/documentation/
-
The darshan-test directory contains various test harnesses, benchmarks,
patches, and unsupported utilities that are mainly of interest to Darshan
developers.
+
=====================================
darshan-test/darshan-gather-mpi-posix-usage.pl
=====================================
--- /dev/null
+++ b/darshan-test/darshan-gather-mpi-posix-usage.pl
@@ -0,0 +1,98 @@
+#!/usr/bin/perl -w
+
+# This script will go through all of the darshan logs in a given
+# subdirectory and count, for each job, how many files were accessed
+# using MPI collective I/O, MPI independent I/O, or POSIX I/O,
+# producing tab-separated text columns:
+
+#<jobid> <#files_using_collectives> <#files_using_indep> <#files_using_posix>
+
+use strict;
+use File::Find;
+
+sub wanted
+{
+ my $file = $_;
+ my $line;
+ my $version = 0.0;
+ my $nprocs = 0;
+ my $start = 0;
+ my $end = 0;
+ my $start_a = "";
+ my $end_a = "";
+ my $jobid = 0;
+ my $bytes_r = 0;
+ my $bytes_w = 0;
+ my $perf = 0.0;
+ my @fields;
+ my $mpi_coll_count = 0;
+ my $mpi_indep_count = 0;
+ my $posix_count = 0;
+
+ # only operate on darshan log files
+ $file =~ /\.darshan\.gz$/ or return;
+
+ # grab jobid from name, old logs don't store it in the file
+ if($file =~ /_id(\d+)_/) {
+ $jobid = $1;
+ }
+
+ if(!(open(SUMMARY, "darshan-parser --file-list-detailed $file |")))
+ {
+ print(STDERR "Failed to parse $File::Find::name\n");
+ return;
+ }
+
+ while ($line = <SUMMARY>) {
+ if($line =~ /^#/) {
+ next;
+ }
+ if($line =~ /^\s/) {
+ next;
+ }
+
+ @fields = split(/\s/, $line);
+
+ if($#fields == 34)
+ {
+ if($fields[13] > 0){
+ $mpi_coll_count ++;
+ }
+ elsif($fields[12] > 0){
+ $mpi_indep_count ++;
+ }
+ elsif($fields[14] > 0){
+ $posix_count ++;
+ }
+
+ }
+ }
+
+ print(STDOUT "$jobid\t$mpi_coll_count\t$mpi_indep_count\t$posix_count\n");
+ close(SUMMARY);
+}
+
+sub main
+{
+ my @paths;
+
+ if($#ARGV < 0) {
+ die("usage: darshan-gather-mpi-posix-usage.pl <one or more log directories>\n");
+ }
+
+ @paths = @ARGV;
+
+ print("# <jobid>\t<#files_using_collectives>\t<#files_using_indep>\t<#files_using_posix>\n");
+ print("# NOTE: a given file will only show up in one category, with preference in the order shown above (i.e. a file that used collective I/O will not show up in the indep or posix category).\n");
+
+ find(\&wanted, @paths);
+
+}
+
+main();
+
+# Local variables:
+# c-indent-level: 4
+# c-basic-offset: 4
+# End:
+#
+# vim: ts=8 sts=4 sw=4 expandtab
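As a rough illustration (not part of this commit), the script's tab-separated output could be aggregated across jobs with a short Python sketch; the sample input lines below are hypothetical, not taken from real Darshan logs:

```python
# Aggregate the per-job counts emitted by darshan-gather-mpi-posix-usage.pl.
# Input format (one line per job): <jobid>\t<coll>\t<indep>\t<posix>
# The sample lines here are made up for illustration only.
sample_output = """\
# <jobid>\t<#files_using_collectives>\t<#files_using_indep>\t<#files_using_posix>
114525\t3\t1\t7
114526\t0\t2\t4
"""

totals = {"collective": 0, "independent": 0, "posix": 0}
jobs = 0
for line in sample_output.splitlines():
    # Skip the '#'-prefixed header lines the script prints before the data.
    if line.startswith("#") or not line.strip():
        continue
    jobid, coll, indep, posix = line.split("\t")
    totals["collective"] += int(coll)
    totals["independent"] += int(indep)
    totals["posix"] += int(posix)
    jobs += 1

print(jobs, totals)  # → 2 {'collective': 3, 'independent': 3, 'posix': 11}
```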
=====================================
darshan-util/doc/darshan-util.txt
=====================================
--- a/darshan-util/doc/darshan-util.txt
+++ b/darshan-util/doc/darshan-util.txt
@@ -131,11 +131,10 @@ specified file.
=== darshan-parser
-In order to obtained a full, human readable dump of all information
-contained in a log file, you can use the `darshan-parser` command
-line utility. It does not require any additional command line tools.
-The following example essentially converts the contents of the log file
-into a fully expanded text file:
+You can use the `darshan-parser` command line utility to obtain a
+complete, human-readable, text-format dump of all information contained
+in a log file. The following example converts the contents of the
+log file into a fully expanded text file:
----
darshan-parser carns_my-app_id114525_7-27-58921_19.darshan.gz > ~/job-characterization.txt
@@ -146,8 +145,14 @@ The format of this output is described in the following section.
=== Guide to darshan-parser output
The beginning of the output from darshan-parser displays a summary of
-overall information about the job. The following table defines the meaning
-of each line:
+overall information about the job. Additional job-level summary information
+can also be produced using the `--perf`, `--file`, `--file-list`, or
+`--file-list-detailed` command line options. See the
+<<addsummary,Additional summary output>> section for more information about
+those options.
+
+The following table defines the meaning
+of each line in the default header section of the output:
[cols="25%,75%",options="header"]
|====
@@ -365,6 +370,7 @@ value of 1 MiB for optimal file alignment.
|====
==== Additional summary output
+[[addsummary]]
The following sections describe additional parser options that provide
summary I/O characterization data for the given log.
@@ -373,8 +379,7 @@ summary I/O characterization data for the given log.
===== Performance
-Use the '--perf' option to get performance approximations using four
-different computations.
+Job performance information can be generated using the `--perf` command-line option.
.Example output
----
@@ -407,6 +412,54 @@ different computations.
# agg_perf_by_slowest: 2206.983935
----
+The `total_bytes` line shows the total number of bytes transferred
+(read/written) by the job. That is followed by three sections:
+
+.I/O timing for unique files
+
+This section reports information about any files that were *not* opened
+by every rank in the job. This includes independent files (opened by
+1 process) and partially shared files (opened by a proper subset of
+the job's processes). The I/O time for this category of file access
+is reported based on the *slowest* rank of all processes that performed this
+type of file access.
+
+* unique files: slowest_rank_io_time: total I/O time for unique files
+ (including both metadata + data transfer time)
+* unique files: slowest_rank_meta_time: metadata time for unique files
+* unique files: slowest_rank: the rank of the slowest process
+
+.I/O timing for shared files
+
+This section reports information about files that were globally shared (i.e.,
+opened by every rank in the job). Performance for globally shared files is
+estimated using four different methods. The `time_by_slowest` method is
+generally the most accurate, but it may not be available in some older Darshan
+log files.
+
+* shared files: time_by_cumul_*: adds the cumulative time across all
+ processes and divides by the number of processes (inaccurate when there is
+ high variance among processes).
+** shared files: time_by_cumul_io_only: includes metadata AND data transfer
+   time for globally shared files
+** shared files: time_by_cumul_meta_only: metadata time for globally shared
+   files
+* shared files: time_by_open: difference between timestamp of open and
+ close (inaccurate if file is left open without I/O activity)
+* shared files: time_by_open_lastio: difference between timestamp of open
+ and the timestamp of last I/O (similar to above but fixes case where file is
+ left open after I/O is complete)
+* shared files: time_by_slowest: measures time according to which rank was
+ the slowest to perform both metadata operations and data transfer for each
+ shared file. (most accurate but requires newer log version)
+
+.Aggregate performance
+
+Aggregate performance is calculated by dividing the total bytes by the I/O
+time (shared files and unique files combined), computed using each of the
+four methods described in the previous output section. Note that total bytes
+are reported in bytes, while aggregate performance is reported in MiB/s
+(1024*1024 bytes/s).
+
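The calculation above can be sketched as follows (the input values are hypothetical, not taken from a real log):

```python
# Sketch of the aggregate performance calculation described above:
# aggregate MiB/s = total bytes / I/O time, with bytes converted to MiB.
# Both input values below are hypothetical examples.
total_bytes = 134217728   # 128 MiB transferred by the job
slowest_io_time = 0.25    # seconds, per the slowest-rank method

agg_perf_mibs = (total_bytes / (1024 * 1024)) / slowest_io_time
print(agg_perf_mibs)  # → 512.0 (128 MiB / 0.25 s)
```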
===== Files
Use the `--file` option to get totals based on file usage.
The first column is the count of files for that type, the second column is
@@ -416,9 +469,14 @@ accessed.
* total: All files
* read_only: Files that were only read from
* write_only: Files that were only written to
+* read_write: Files that were both read and written
* unique: Files that were opened on only one rank
* shared: Files that were opened by more than one rank
+Each line has 3 columns. The first column is the count of files for that
+type of file, the second column is number of bytes for that type, and the third
+column is the maximum offset accessed.
+
.Example output
----
# files
@@ -433,10 +491,11 @@ accessed.
===== Totals
-Use the `--total` option to get all statistics as an aggregate total.
-Statistics that make sense to be aggregated are aggregated. Other statistics
-may be a minimum or maximum if that makes sense. Other data maybe zeroed if
-it doesn't make sense to aggregate the data.
+Use the `--total` option to get all statistics as an aggregate total rather
+than broken down per file. Each field is either summed across files and
processes (for values such as number of opens), set to global minimums and
+maximums (for values such as open time and close time), or zeroed out (for
+statistics that are nonsensical in aggregate).
.Example output
----
@@ -475,11 +534,18 @@ file.
5041708885572677970 /projects/SSSPPg/snyder/ior/ior.dat 1024 16.342061 1.705930
----
+This data could be post-processed to compute more in-depth statistics, such
+as the total number of MPI files and POSIX files used in a job, a breakdown
+of files into independent/unique/local files (opened by 1 process),
+subset/partially shared files (opened by a proper subset of processes), and
+globally shared files (opened by all processes), or a ranking of files by
+how much time was spent performing I/O in each file.
+
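The file categorization described above could be sketched in Python. The record format here is an assumption for illustration: it presumes (path, nprocs_opened) pairs have already been extracted from the parser output, and the job size and sample records are made up:

```python
# Categorize files by how many of the job's processes opened them,
# following the independent / partially shared / globally shared
# breakdown described above. All values here are hypothetical.
job_nprocs = 1024
records = [
    ("/scratch/app/checkpoint.dat", 1024),  # opened by every process
    ("/scratch/app/rank0.log", 1),          # opened by one process
    ("/scratch/app/subset.out", 64),        # opened by a proper subset
]

categories = {"independent": [], "partially_shared": [], "globally_shared": []}
for path, nprocs in records:
    if nprocs == 1:
        categories["independent"].append(path)
    elif nprocs < job_nprocs:
        categories["partially_shared"].append(path)
    else:
        categories["globally_shared"].append(path)

print({name: len(paths) for name, paths in categories.items()})
```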
===== Detailed file list
The `--file-list-detailed` option is the same as `--file-list` except that it
produces many columns of output containing statistics broken down by file.
-This option is mainly useful for automated analysis.
+This option is mainly useful for more detailed automated analysis.
=== Other darshan-util utilities
View it on GitLab: https://xgitlab.cels.anl.gov/darshan/darshan/compare/9274a0db254dad9177a80c1781259daab3b254cf...41575785a890ced023ae36e6781ab7bd63440274