[Darshan-commits] [Git][darshan/darshan][dev-modular] 15 commits: fix for gnuplot version number check
Shane Snyder
xgitlab at cels.anl.gov
Thu Oct 1 11:49:02 CDT 2015
Shane Snyder pushed to branch dev-modular at darshan / darshan
Commits:
72726387 by Phil Carns at 2015-04-29T10:59:18Z
fix for gnuplot version number check
- contributed by Kay Thust
- - - - -
1d202464 by Shane Snyder at 2015-06-30T22:31:31Z
fix map_or_fail typo in lio_listio wrapper
- - - - -
341bf09f by Shane Snyder at 2015-06-30T22:36:16Z
bug fix for aio_return wrapper
- - - - -
14ad6274 by Shane Snyder at 2015-07-01T09:16:47Z
update Changelog for last 2 bug fixes
- - - - -
6be9df5b by Shane Snyder at 2015-08-06T18:01:41Z
bug fix in common access counter logic
- - - - -
fbce8030 by Shane Snyder at 2015-08-07T08:47:18Z
update changelog for common access bug
- - - - -
f68228a5 by Phil Carns at 2015-08-19T11:55:50Z
integrate expanded darshan-parser documentation
- provided by Huong Luu
- - - - -
c5a27637 by Phil Carns at 2015-09-08T12:30:55Z
trivial stat collection script
- for each job, count number of files that used collectives, independent
operations, or posix
- - - - -
0ffb9547 by Phil Carns at 2015-09-08T16:41:44Z
friendlier output
- - - - -
676030f1 by Phil Carns at 2015-09-08T16:45:43Z
bug fix
- - - - -
7eecf5d0 by Phil Carns at 2015-09-25T13:05:00Z
markdown version of readme for gitlab
- - - - -
3d3c4257 by Phil Carns at 2015-09-25T13:06:48Z
trying to figure out why markdown link is grumpy
- - - - -
83f12d5f by Phil Carns at 2015-09-25T13:16:52Z
whitespace change to test commit hooks
- - - - -
f6fbcec4 by Shane Snyder at 2015-10-01T11:19:41Z
Merge remote-tracking branch 'origin/master' into dev-modular
Conflicts:
ChangeLog
darshan-runtime/configure
darshan-runtime/configure.in
darshan-runtime/darshan.h
darshan-runtime/lib/darshan-posix.c
darshan-util/configure
darshan-util/configure.in
darshan-util/doc/darshan-util.txt
- - - - -
41575785 by Shane Snyder at 2015-10-01T11:46:02Z
small text changes in darshan-util docs
- - - - -
4 changed files:
- ChangeLog
- README → README.md
- + darshan-test/darshan-gather-mpi-posix-usage.pl
- darshan-util/doc/darshan-util.txt
Changes:
=====================================
ChangeLog
=====================================
--- a/ChangeLog
+++ b/ChangeLog
@@ -27,6 +27,16 @@ Darshan-3.0.0-pre1
darshan-util components are mostly the same and still located in
their respective directories ('darshan-runtime/doc' and 'darshan-util/doc')
+darshan-2.3.2-pre1
+=============
+* Fix gnuplot version number check to allow darshan-job-summary.pl to work
+ with gnuplot 5.0 (Kay Thust)
+* Fix function pointer mapping typo in lio_listio64 wrapper (Shane Snyder)
+* Fix faulty logic in extracting I/O data from the aio_return
+ wrapper (Shane Snyder)
+* Fix bug in common access counter logic (Shane Snyder)
+* Expand and clarify darshan-parser documentation (Huong Luu)
+
darshan-2.3.1
=============
* added documentation and example configuration files for using the -profile
=====================================
README → README.md
=====================================
--- a/README
+++ b/README.md
@@ -1,4 +1,14 @@
-The Darshan source tree is divided into two parts:
+Darshan is a lightweight I/O characterization tool that transparently
+captures I/O access pattern information from HPC applications.
+Darshan can be used to tune applications for increased scientific
+productivity or to gain insight into trends in large-scale computing
+systems.
+
+Please see the
+[Darshan web page](http://www.mcs.anl.gov/research/projects/darshan)
+for more in-depth news and documentation.
+
+The Darshan source tree is divided into two main parts:
- darshan-runtime: to be installed on systems where you intend to
instrument MPI applications. See darshan-runtime/doc/darshan-runtime.txt
@@ -8,9 +18,7 @@ The Darshan source tree is divided into two parts:
log files produced by darshan-runtime. See
darshan-util/doc/darshan-util.txt for installation instructions.
-General documentation can be found on the Darshan documentation web page:
-http://www.mcs.anl.gov/darshan/documentation/
-
The darshan-test directory contains various test harnesses, benchmarks,
patches, and unsupported utilities that are mainly of interest to Darshan
developers.
+
=====================================
darshan-test/darshan-gather-mpi-posix-usage.pl
=====================================
--- /dev/null
+++ b/darshan-test/darshan-gather-mpi-posix-usage.pl
@@ -0,0 +1,98 @@
+#!/usr/bin/perl -w
+
+# This script will go through all of the darshan logs in a given
+# subdirectory and count, for each job, how many files were accessed
+# using MPI collective I/O, MPI independent I/O, or POSIX I/O,
+# producing tab-separated text columns:
+
+#<jobid> <#files_using_collectives> <#files_using_indep> <#files_using_posix>
+
+use strict;
+use File::Find;
+
+sub wanted
+{
+ my $file = $_;
+ my $line;
+ my $version = 0.0;
+ my $nprocs = 0;
+ my $start = 0;
+ my $end = 0;
+ my $start_a = "";
+ my $end_a = "";
+ my $jobid = 0;
+ my $bytes_r = 0;
+ my $bytes_w = 0;
+ my $perf = 0.0;
+ my @fields;
+ my $mpi_coll_count = 0;
+ my $mpi_indep_count = 0;
+ my $posix_count = 0;
+
+ # only operate on darshan log files
+ $file =~ /\.darshan\.gz$/ or return;
+
+ # grab jobid from name, old logs don't store it in the file
+ if($file =~ /_id(\d+)_/) {
+ $jobid = $1;
+ }
+
+ if(!(open(SUMMARY, "darshan-parser --file-list-detailed $file |")))
+ {
+ print(STDERR "Failed to parse $File::Find::name\n");
+ return;
+ }
+
+ while ($line = <SUMMARY>) {
+ if($line =~ /^#/) {
+ next;
+ }
+ if($line =~ /^\s/) {
+ next;
+ }
+
+ @fields = split(/\s/, $line);
+
+ if($#fields == 34)
+ {
+ if($fields[13] > 0){
+ $mpi_coll_count ++;
+ }
+ elsif($fields[12] > 0){
+ $mpi_indep_count ++;
+ }
+ elsif($fields[14] > 0){
+ $posix_count ++;
+ }
+
+ }
+ }
+
+ print(STDOUT "$jobid\t$mpi_coll_count\t$mpi_indep_count\t$posix_count\n");
+ close(SUMMARY);
+}
+
+sub main
+{
+ my @paths;
+
+ if($#ARGV < 0) {
+ die("usage: darshan-gather-mpi-posix-usage.pl <one or more log directories>\n");
+ }
+
+ @paths = @ARGV;
+
+ print("# <jobid>\t<#files_using_collectives>\t<#files_using_indep>\t<#files_using_posix>\n");
+ print("# NOTE: a given file will only show up in one category, with preference in the order shown above (i.e. a file that used collective I/O will not show up in the indep or posix category).\n");
+
+ find(\&wanted, @paths);
+
+}
+
+main();
+
+# Local variables:
+# c-indent-level: 4
+# c-basic-offset: 4
+# End:
+#
+# vim: ts=8 sts=4 sw=4 expandtab
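As a rough illustration (not part of this commit), the script's tab-separated output could be aggregated across jobs with a short Python sketch; the sample input lines below are hypothetical, not taken from real Darshan logs:

```python
# Aggregate the per-job counts emitted by darshan-gather-mpi-posix-usage.pl.
# Input format (one line per job): <jobid>\t<coll>\t<indep>\t<posix>
# The sample lines here are made up for illustration only.
sample_output = """\
# <jobid>\t<#files_using_collectives>\t<#files_using_indep>\t<#files_using_posix>
114525\t3\t1\t7
114526\t0\t2\t4
"""

totals = {"collective": 0, "independent": 0, "posix": 0}
jobs = 0
for line in sample_output.splitlines():
    # Skip the '#'-prefixed header lines the script prints before the data.
    if line.startswith("#") or not line.strip():
        continue
    jobid, coll, indep, posix = line.split("\t")
    totals["collective"] += int(coll)
    totals["independent"] += int(indep)
    totals["posix"] += int(posix)
    jobs += 1

print(jobs, totals)  # → 2 {'collective': 3, 'independent': 3, 'posix': 11}
```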
=====================================
darshan-util/doc/darshan-util.txt
=====================================
--- a/darshan-util/doc/darshan-util.txt
+++ b/darshan-util/doc/darshan-util.txt
@@ -131,11 +131,10 @@ specified file.
=== darshan-parser
-In order to obtained a full, human readable dump of all information
-contained in a log file, you can use the `darshan-parser` command
-line utility. It does not require any additional command line tools.
-The following example essentially converts the contents of the log file
-into a fully expanded text file:
+You can use the `darshan-parser` command line utility to obtain a
+complete, human-readable, text-format dump of all information contained
+in a log file. The following example converts the contents of the
+log file into a fully expanded text file:
----
darshan-parser carns_my-app_id114525_7-27-58921_19.darshan.gz > ~/job-characterization.txt
@@ -146,8 +145,14 @@ The format of this output is described in the following section.
=== Guide to darshan-parser output
The beginning of the output from darshan-parser displays a summary of
-overall information about the job. The following table defines the meaning
-of each line:
+overall information about the job. Additional job-level summary information
+can also be produced using the `--perf`, `--file`, `--file-list`, or
+`--file-list-detailed` command line options. See the
+<<addsummary,Additional summary output>> section for more information about
+those options.
+
+The following table defines the meaning
+of each line in the default header section of the output:
[cols="25%,75%",options="header"]
|====
@@ -365,6 +370,7 @@ value of 1 MiB for optimal file alignment.
|====
==== Additional summary output
+[[addsummary]]
The following sections describe additional parser options that provide
summary I/O characterization data for the given log.
@@ -373,8 +379,7 @@ summary I/O characterization data for the given log.
===== Performance
-Use the '--perf' option to get performance approximations using four
-different computations.
+Job performance information can be generated using the `--perf` command-line option.
.Example output
----
@@ -407,6 +412,54 @@ different computations.
# agg_perf_by_slowest: 2206.983935
----
+The `total_bytes` line shows the total number of bytes transferred
+(read/written) by the job. That is followed by three sections:
+
+.I/O timing for unique files
+
+This section reports information about any files that were *not* opened
+by every rank in the job. This includes independent files (opened by
+1 process) and partially shared files (opened by a proper subset of
+the job's processes). The I/O time for this category of file access
+is reported based on the *slowest* rank of all processes that performed this
+type of file access.
+
+* unique files: slowest_rank_io_time: total I/O time for unique files
+ (including both metadata + data transfer time)
+* unique files: slowest_rank_meta_time: metadata time for unique files
+* unique files: slowest_rank: the rank of the slowest process
+
+.I/O timing for shared files
+
+This section reports information about files that were globally shared (i.e.,
+opened by every rank in the job). Performance for globally shared files is
+estimated using four different methods. The `time_by_slowest` method is
+generally the most accurate, but it may not be available in some older Darshan
+log files.
+
+* shared files: time_by_cumul_*: adds the cumulative time across all
+ processes and divides by the number of processes (inaccurate when there is
+ high variance among processes).
+** shared files: time_by_cumul_io_only: includes metadata AND data transfer
+   time for globally shared files
+** shared files: time_by_cumul_meta_only: metadata time for globally shared
+   files
+* shared files: time_by_open: difference between timestamp of open and
+ close (inaccurate if file is left open without I/O activity)
+* shared files: time_by_open_lastio: difference between timestamp of open
+ and the timestamp of last I/O (similar to above but fixes case where file is
+ left open after I/O is complete)
+* shared files: time_by_slowest: measures time according to which rank was
+ the slowest to perform both metadata operations and data transfer for each
+ shared file. (most accurate but requires newer log version)
+
+.Aggregate performance
+
+Aggregate performance is calculated by dividing the total bytes by the I/O
+time (shared files and unique files combined), computed using each of the
+four methods described in the previous output section. Note that total bytes
+are reported in bytes, while aggregate performance is reported in MiB/s
+(1024*1024 bytes/s).
+
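The calculation above can be sketched as follows (the input values are hypothetical, not taken from a real log):

```python
# Sketch of the aggregate performance calculation described above:
# aggregate MiB/s = total bytes / I/O time, with bytes converted to MiB.
# Both input values below are hypothetical examples.
total_bytes = 134217728   # 128 MiB transferred by the job
slowest_io_time = 0.25    # seconds, per the slowest-rank method

agg_perf_mibs = (total_bytes / (1024 * 1024)) / slowest_io_time
print(agg_perf_mibs)  # → 512.0 (128 MiB / 0.25 s)
```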
===== Files
Use the `--file` option to get totals based on file usage.
The first column is the count of files for that type, the second column is
@@ -416,9 +469,14 @@ accessed.
* total: All files
* read_only: Files that were only read from
* write_only: Files that were only written to
+* read_write: Files that were both read and written
* unique: Files that were opened on only one rank
* shared: Files that were opened by more than one rank
+Each line has 3 columns. The first column is the count of files for that
+type of file, the second column is number of bytes for that type, and the third
+column is the maximum offset accessed.
+
.Example output
----
# files
@@ -433,10 +491,11 @@ accessed.
===== Totals
-Use the `--total` option to get all statistics as an aggregate total.
-Statistics that make sense to be aggregated are aggregated. Other statistics
-may be a minimum or maximum if that makes sense. Other data maybe zeroed if
-it doesn't make sense to aggregate the data.
+Use the `--total` option to get all statistics as an aggregate total rather
+than broken down per file. Each field is either summed across files and
processes (for values such as number of opens), set to global minimums and
+maximums (for values such as open time and close time), or zeroed out (for
+statistics that are nonsensical in aggregate).
.Example output
----
@@ -475,11 +534,18 @@ file.
5041708885572677970 /projects/SSSPPg/snyder/ior/ior.dat 1024 16.342061 1.705930
----
+This data could be post-processed to compute more in-depth statistics, such
+as the total number of MPI files and POSIX files used in a job, a breakdown
+of files into independent/unique/local files (opened by 1 process),
+subset/partially shared files (opened by a proper subset of processes), and
+globally shared files (opened by all processes), or a ranking of files by
+how much time was spent performing I/O in each file.
+
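The file categorization described above could be sketched in Python. The record format here is an assumption for illustration: it presumes (path, nprocs_opened) pairs have already been extracted from the parser output, and the job size and sample records are made up:

```python
# Categorize files by how many of the job's processes opened them,
# following the independent / partially shared / globally shared
# breakdown described above. All values here are hypothetical.
job_nprocs = 1024
records = [
    ("/scratch/app/checkpoint.dat", 1024),  # opened by every process
    ("/scratch/app/rank0.log", 1),          # opened by one process
    ("/scratch/app/subset.out", 64),        # opened by a proper subset
]

categories = {"independent": [], "partially_shared": [], "globally_shared": []}
for path, nprocs in records:
    if nprocs == 1:
        categories["independent"].append(path)
    elif nprocs < job_nprocs:
        categories["partially_shared"].append(path)
    else:
        categories["globally_shared"].append(path)

print({name: len(paths) for name, paths in categories.items()})
```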
===== Detailed file list
The `--file-list-detailed` option is the same as `--file-list` except that it
produces many columns of output containing statistics broken down by file.
-This option is mainly useful for automated analysis.
+This option is mainly useful for more detailed automated analysis.
=== Other darshan-util utilities
View it on GitLab: https://xgitlab.cels.anl.gov/darshan/darshan/compare/9274a0db254dad9177a80c1781259daab3b254cf...41575785a890ced023ae36e6781ab7bd63440274