[Darshan-commits] [Git][darshan/darshan][carns/dev-dyn-link-updates] revise darshan-runtime documentation

Philip Carns xgitlab at cels.anl.gov
Wed Mar 25 15:43:22 CDT 2020



Philip Carns pushed to branch carns/dev-dyn-link-updates at darshan / darshan


Commits:
2fbbbd36 by Phil Carns at 2020-03-25T16:42:06-04:00
revise darshan-runtime documentation

- advocate for compile-time instrumentation, even for dynamic linked
  applications
- remove deprecated information
- make the documentation more user-oriented

- - - - -


1 changed file:

- darshan-runtime/doc/darshan-runtime.txt


Changes:

=====================================
darshan-runtime/doc/darshan-runtime.txt
=====================================
@@ -7,11 +7,10 @@ This document describes darshan-runtime, which is the instrumentation
 portion of the Darshan characterization tool.  It should be installed on the
 system where you intend to collect I/O characterization information.
 
-Darshan instruments applications via either compile time wrappers for static
-executables or dynamic library preloading for dynamic executables.  An
-application that has been instrumented with Darshan will produce a single
-log file each time it is executed.  This log summarizes the I/O access patterns
-used by the application.
+Darshan instruments applications via either compile time wrappers or
+dynamic library preloading.  An application that has been instrumented
+with Darshan will produce a single log file each time it is executed.
+This log summarizes the I/O access patterns used by the application.
 
 The darshan-runtime instrumentation has traditionally only supported MPI
 applications (specifically, those that call `MPI_Init()` and `MPI_Finalize()`),
@@ -44,7 +43,7 @@ coarse-grained instrumentation methods.
 This document provides generic installation instructions, but "recipes" for
 several common HPC systems are provided at the end of the document as well.
 
-More information about Darshan can be found at the 
+More information about Darshan can be found at the
 http://www.mcs.anl.gov/darshan[Darshan web site].
 
 == Requirements
@@ -108,24 +107,11 @@ with support for HDF5 versions prior to 1.10
 * `--enable-HDF5-post-1.10`: enables the Darshan HDF5 instrumentation module,
 with support for HDF5 versions 1.10 or higher
 
-=== Cross compilation
-
-On some systems (notably the IBM Blue Gene series), the login nodes do not
-have the same architecture or runtime environment as the compute nodes.  In
-this case, you must configure darshan-runtime to be built using a cross
-compiler.  The following configure arguments show an example for the BG/P system:
-
-----
---host=powerpc-bgp-linux CC=/bgsys/drivers/ppcfloor/comm/default/bin/mpicc 
-----
-
-== Environment preparation
+== Environment preparation (Log directory)
 
 Once darshan-runtime has been installed, you must prepare a location
 in which to store the Darshan log files and configure an instrumentation method.
 
-=== Log directory
-
 This step can be safely skipped if you configured darshan-runtime using the
 `--with-log-path-by-env` option.  A more typical configuration uses a static
 directory hierarchy for Darshan log
@@ -138,7 +124,7 @@ placed. The deepest subdirectories will have sticky permissions to enable
 multiple users to write to the same directory.  If the log directory is
 shared system-wide across many users then the following script should be run
 as root.
- 
+
 ----
 darshan-mk-log-dirs.pl
 ----
@@ -161,36 +147,75 @@ administrators group
 * recursively set the setgid bit on the log directories
 ====
 
+== Instrumenting MPI applications
+
+[NOTE]
+====
+More specific installation "recipes" are provided later in this document for
+some platforms.  This section of the documentation covers general techniques.
+====
+
+Once Darshan has been installed and a log path has been prepared, the next
+step is to actually instrument applications. The preferred method is to
+instrument applications at compile time.
+
+=== Option 1: Instrumenting MPI applications at compile time
 
-=== Instrumentation method
+This method is applicable to C, Fortran, and C++ applications
+(regardless of whether they are statically or dynamically linked) and is the
+most straightforward method to apply transparently system-wide.  It works by
+injecting additional libraries and options into the linker command line to
+intercept relevant I/O calls.
 
-The instrumentation method to use depends on whether the executables
-produced by your compiler are statically or dynamically linked.  If you
-are unsure, you can check by running `ldd <executable_name>` on an example
-executable.  Dynamically-linked executables will produce a list of shared
-libraries when this command is executed.
+On Cray platforms you can enable compile-time instrumentation by simply
+loading the darshan software module.  It can then be enabled for all users
+by placing that module in the default environment. As of Darshan 3.2.0 this
+will instrument both static and dynamic executables, while in previous
+versions of Darshan it was only sufficient for static executables.  See the
+Cray installation recipe for more details.
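+
+For example, assuming a Darshan software module is available on your system
+(see the Cray recipe later in this document), instrumentation can be enabled
+for subsequent compilations with commands like the following; the source
+file name is hypothetical:
+
+----
+module load darshan
+cc -o my-app my-app.c    # Cray compiler wrapper; Darshan is injected at link time
+----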
 
-Some compilers allow you to toggle dynamic or static linking via options
-such as `-dynamic` or `-static`.  Please check your compiler man page
-for details if you intend to force one mode or the other.
+For other MPICH-based MPI implementations (e.g., MPICH itself or MVAPICH),
+you can generate Darshan-enabled variants of the standard
+mpicc/mpicxx/mpif90/mpif77 wrappers using the following commands:
 
-== Instrumenting statically-linked MPI applications
+----
+darshan-gen-cc.pl `which mpicc` --output mpicc.darshan
+darshan-gen-cxx.pl `which mpicxx` --output mpicxx.darshan
+darshan-gen-fortran.pl `which mpif77` --output mpif77.darshan
+darshan-gen-fortran.pl `which mpif90` --output mpif90.darshan
+----
 
-Statically linked executables must be instrumented at compile time.
-The simplest methods to do this are to either generate a customized
-MPI compiler script (e.g. `mpicc`) that includes the link options and
-libraries needed by Darshan, or to use existing profiling configuration
-hooks for MPI compiler scripts.  Once this is done, Darshan
-instrumentation is transparent; you simply compile applications using
-the Darshan-enabled MPI compiler scripts.
+The resulting *.darshan wrappers will transparently inject Darshan
+instrumentation into the link step without any explicit user intervention.
+They can be renamed and placed in an appropriate PATH location to enable
+automatic instrumentation.  This method also works correctly for both static
+and dynamic executables as of Darshan 3.2.0.
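+
+For example, once a generated wrapper is in your PATH, compiling with it is
+no different from using the underlying MPI compiler script (the source file
+name below is hypothetical):
+
+----
+mpicc.darshan -o my-app my-app.c
+----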
 
-=== Using a profile configuration 
+For other systems you can enable compile-time instrumentation by either
+manually adding the appropriate link options to your command line or
+modifying your default MPI compiler script.  The `darshan-config` command
+line tool can be used to display the options that you should use:
+
+----
+# Linker options to use for dynamic linking (default on most platforms)
+#   These arguments should go *before* the MPI libraries in the underlying
+#   linker command line to ensure that Darshan can be activated.  They should
+#   also ideally go before other libraries that may issue I/O function calls.
+darshan-config --dyn-ld-flags
+
+# Linker options to use for static linking
+#   The first set of arguments should go early in the link command line
+#   (before MPI), while the second set should go at the end of the link
+#   command line.
+darshan-config --pre-ld-flags
+darshan-config --post-ld-flags
+----
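+
+As a minimal sketch of the manual approach for a dynamically linked
+application (the source file name is hypothetical, and the exact placement
+of the flags may need to be adjusted per the notes above):
+
+----
+mpicc -o my-app my-app.c $(darshan-config --dyn-ld-flags)
+----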
+
+==== Using a profile configuration
 
-[[static-prof]]
 The MPICH MPI implementation supports the specification of a profiling library
 configuration that can be used to insert Darshan instrumentation without
-modifying the existing MPI compiler script.  Example profiling configuration
-files are installed with Darshan 2.3.1 and later.  You can enable a profiling
+modifying the existing MPI compiler script. You can enable a profiling
 configuration using environment variables or command line arguments to the
 compiler scripts:
 
@@ -201,14 +226,6 @@ export MPICXX_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cxx
 export MPIFORT_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-f
 ----
 
-Example for MPICH 3.1 or earlier:
-----
-export MPICC_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cc
-export MPICXX_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cxx
-export MPIF77_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-f
-export MPIF90_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-f
-----
-
 Examples for command line use:
 ----
 mpicc -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-c <args>
@@ -217,36 +234,20 @@ mpif77 -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-f <args>
 mpif90 -profile=$DARSHAN_PREFIX/share/mpi-profile/darshan-f <args>
 ----
 
-=== Using customized compiler wrapper scripts
-
-[[static-wrapper]]
-For MPICH-based MPI libraries, such as MPICH1, MPICH2, or MVAPICH,
-custom wrapper scripts can be generated to automatically include Darshan
-instrumentation.  The following example illustrates how to produce
-wrappers for C, C++, and Fortran compilers:
-
-----
-darshan-gen-cc.pl `which mpicc` --output mpicc.darshan
-darshan-gen-cxx.pl `which mpicxx` --output mpicxx.darshan
-darshan-gen-fortran.pl `which mpif77` --output mpif77.darshan
-darshan-gen-fortran.pl `which mpif90` --output mpif90.darshan
------
-
-=== Other configurations
-
-Please see the Cray recipe in this document for instructions on
-instrumenting statically-linked applications on that platform.
+Note that unlike the previously described methods in this section, this
+method *will not* automatically adapt to static and dynamic linking options.
+The example profile configurations shown above only support dynamic linking.
 
-For other MPI Libraries you must manually modify the MPI compiler scripts to
-add the necessary link options and libraries.  Please see the
-`darshan-gen-*` scripts for examples or contact the Darshan users mailing
-list for help.
+Example profile configurations with a "-static" suffix are also provided if
+you need to instrument statically linked executables, for example:
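+
+----
+# hedged example: check $DARSHAN_PREFIX/share/mpi-profile for the exact
+# file names installed on your system
+export MPICC_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cc-static
+export MPICXX_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-cxx-static
+export MPIFORT_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-f-static
+----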
 
-== Instrumenting dynamically-linked MPI applications
+=== Option 2: Instrumenting MPI applications at run time
 
-For dynamically-linked executables, Darshan relies on the `LD_PRELOAD`
-environment variable to insert instrumentation at run time.  The executables
-should be compiled using the normal, unmodified MPI compiler.
+This method is applicable to pre-compiled, dynamically linked executables
+as well as interpreted languages such as Python.  You do not need to
+change your compile options in any way.  This method works by injecting
+instrumentation at run time.  It will not work for statically linked
+executables.
 
 To use this mechanism, set the `LD_PRELOAD` environment variable to the full
 path to the Darshan shared library. The preferred method of inserting Darshan
@@ -285,28 +286,7 @@ For SGI systems running the MPT environment, it may be necessary to set the `MPI
 environment variable equal to `true` to avoid deadlock when preloading the Darshan shared
 library.
 
-=== Instrumenting dynamically-linked Fortran applications
-
-Please follow the general steps outlined in the previous section.  For
-Fortran applications compiled with MPICH you may have to take the additional
-step of adding
-`libfmpich.so` to your `LD_PRELOAD` environment variable. For example:
-
-----
-export LD_PRELOAD=/path/to/mpi/used/by/executable/lib/libfmpich.so:/home/carns/darshan-install/lib/libdarshan.so
-----
-
-[NOTE]
-The full path to the libfmpich.so library can be omitted if the rpath
-variable points to the correct path.  Be careful to check the rpath of the
-darshan library and the executable before using this configuration, however.
-They may provide conflicting paths.  Ideally the rpath to the  MPI library
-would *not* be set by the Darshan library, but would instead be specified
-exclusively by the executable itself.  You can check the rpath of the
-darshan library by running `objdump -x
-/home/carns/darshan-install/lib/libdarshan.so |grep RPATH`.
-
-== Instrumenting dynamically-linked non-MPI applications
+=== Option 3: Instrumenting non-MPI applications at run time
 
 Similar to the process described in the previous section, Darshan relies on the
 `LD_PRELOAD` mechanism for instrumenting dynamically-linked non-MPI applications.
@@ -335,6 +315,17 @@ env LD_PRELOAD=/home/carns/darshan-install/lib/libdarshan.so io-test
 Recall that Darshan instrumentation of non-MPI applications is only possible with 
 dynamically-linked applications.
 
+=== Using other profiling tools at the same time as Darshan
+
+As of Darshan version 3.2.0, Darshan does not necessarily interfere with
+other profiling tools (particularly those that use the PMPI profiling
+interface).  Darshan itself does not use the PMPI interface; it instead
+relies on dynamic linker symbol interception, or on --wrap function
+interception for static executables.
+
+As a rule of thumb, most profiling tools should appear in the linker command
+line *before* `-ldarshan` if possible, as in the sketch below.
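+
+----
+# hypothetical example: "-lsometracer" stands in for another profiling
+# library; note that it appears before the Darshan link options
+mpicc -o my-app my-app.c -lsometracer $(darshan-config --dyn-ld-flags)
+----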
+
 == Using the Darshan eXtended Tracing (DXT) module
 
 DXT support is disabled by default in Darshan, requiring the user to either explicitly
@@ -391,59 +382,9 @@ The following recipes provide examples for prominent HPC systems.
 These are intended to be used as a starting point.  You will most likely have to adjust paths and options to
 reflect the specifics of your system.
 
-=== IBM Blue Gene (BG/P or BG/Q)
-
-IBM Blue Gene systems produces static executables by default, uses a
-different architecture for login and compute nodes, and uses an MPI
-environment based on MPICH.
-
-The following example shows how to configure Darshan on a BG/P system:
-
-----
-./configure --with-mem-align=16 \
- --with-log-path=/home/carns/working/darshan/releases/logs \
- --prefix=/home/carns/working/darshan/install --with-jobid-env=COBALT_JOBID \
- --with-zlib=/soft/apps/zlib-1.2.3/ \
- --host=powerpc-bgp-linux CC=/bgsys/drivers/ppcfloor/comm/default/bin/mpicc 
-----
-
-.Rationale
-[NOTE]
-====
-The memory alignment is set to 16 not because that is the proper alignment
-for the BG/P CPU architecture, but because that is the optimal alignment for
-the network transport used between compute nodes and I/O nodes in the
-system.  The jobid environment variable is set to `COBALT_JOBID` in this
-case for use with the Cobalt scheduler, but other BG/P systems may use
-different schedulers.  The `--with-zlib` argument is used to point to a
-version of zlib that has been compiled for use on the compute nodes rather
-than the login node.  The `--host` argument is used to force cross-compilation
-of Darshan.  The `CC` variable is set to point to a stock MPI compiler.
-====
-
-Once Darshan has been installed, you can use one of the static
-instrumentation methods described earlier in this document.  If you
-use the profiling configuration file method, then please note that the
-Darshan installation includes profiling configuration files that have been
-adapted specifically for the Blue Gene environment.  Set the following
-environment variables to enable them, and then use your normal compiler
-scripts.  This method is compatible with both GNU and IBM compilers.
-
-Blue Gene profiling configuration example:
-----
-export MPICC_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-bg-cc
-export MPICXX_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-bg-cxx
-export MPIF77_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-bg-f
-export MPIF90_PROFILE=$DARSHAN_PREFIX/share/mpi-profile/darshan-bg-f
-----
-
 === Cray platforms (XE, XC, or similar)
 
-The Cray programming environment produces static executables by default,
-which means that Darshan instrumentation must be inserted at compile
-time.  This can be accomplished by loading a software module that sets
-appropriate environment variables to modify the Cray compiler script link
-behavior.  This section describes how to compile and install Darshan,
+This section describes how to compile and install Darshan,
 as well as how to use a software module to enable and disable Darshan
 instrumentation.
 
@@ -454,13 +395,13 @@ configuring or compiling Darshan.  Although Darshan can be built with a
 variety of compilers, the GNU compilers are recommended because it will
 produce a Darshan library that is interoperable with the widest range
 of compilers and linkers.  On most Cray systems you can enable the GNU
-programming environment with a command similar to "module swap PrgEnv-pgi
+programming environment with a command similar to "module swap PrgEnv-intel
 PrgEnv-gnu".  Please see your site documentation for information about
 how to switch programming environments.
 
 The following example shows how to configure and build Darshan on a Cray
-system using the GNU programming environment.  Adjust the 
---with-log-path and --prefix arguments to point to the desired log file path 
+system using the GNU programming environment.  Adjust the
+--with-log-path and --prefix arguments to point to the desired log file path
 and installation path, respectively.
 
 ----
@@ -488,8 +429,8 @@ Darshan will typically use the LOGNAME environment variable to determine a
 userid.
 ====
 
-As in any Darshan installation, the darshan-mk-log-dirs.pl script can then be 
-used to create the appropriate directory hierarchy for storing Darshan log 
+As in any Darshan installation, the darshan-mk-log-dirs.pl script can then be
+used to create the appropriate directory hierarchy for storing Darshan log
 files in the --with-log-path directory.
 
 Note that Darshan is not currently capable of detecting the stripe size
@@ -497,7 +438,7 @@ Note that Darshan is not currently capable of detecting the stripe size
 If a Lustre file system is detected, then Darshan assumes an optimal
 file alignment of 1 MiB.
 
-==== Enabling Darshan instrumentation 
+==== Enabling Darshan instrumentation
 
 Darshan will automatically install example software module files in the
 following locations (depending on how you specified the --prefix option in
@@ -534,7 +475,7 @@ module use /soft/darshan-2.2.3/share/craype-<VERSION>/modulefiles
 From this point, Darshan instrumenation can be enabled for all future
 application compilations by running "module load darshan".
 
-=== Linux clusters using Intel MPI 
+=== Linux clusters using Intel MPI
 
 Most Intel MPI installations produce dynamic executables by default.  To
 configure Darshan in this environment you can use the following example:
@@ -551,46 +492,11 @@ the underlying GNU compilers rather than the Intel ICC compilers to compile
 Darshan itself.
 ====
 
-You can use the `LD_PRELOAD` method described earlier in this document to
-instrument executables compiled with the Intel MPI compiler scripts.  This
-method has been briefly tested using both GNU and Intel compilers.
+You can enable Darshan instrumentation at compile time by adding the options
+reported by `darshan-config --dyn-ld-flags` to your linker command line.
 
-.Caveat
-[NOTE]
-====
-Darshan is only known to work with C and C++ executables generated by the
-Intel MPI suite in versions prior to the 2017 version -- Darshan will not
-produce instrumentation for Fortran executables in these earlier versions (pre-2017).
-For more details on this issue please check this Intel forum discussion:
-
-http://software.intel.com/en-us/forums/showthread.php?t=103447&o=a&s=lr
-====
-
-=== Linux clusters using MPICH 
-
-Follow the generic instructions provided at the top of this document.  For MPICH versions 3.1 and
-later, MPICH uses shared libraries by default, so you may need to consider the dynamic linking
-instrumentation approach.  
-
-The static linking method can be used if MPICH is configured to use static
-linking by default, or if you are using a version prior to 3.1.
-The only modification is to make sure that the `CC` used for compilation is
-based on a GNU compiler.  Once Darshan has been installed, it should be
-capable of instrumenting executables built with GNU, Intel, and PGI
-compilers.
-
-[NOTE]
-Darshan is not capable of instrumenting Fortran applications build with MPICH versions 3.1.1, 3.1.2,
-or 3.1.3 due to a library symbol name compatibility issue.  Consider using a newer version of
-MPICH if you wish to instrument Fortran applications.  Please see
-http://trac.mpich.org/projects/mpich/ticket/2209 for more details.
-
-[NOTE]
-MPICH versions 3.1, 3.1.1, 3.1.2, and 3.1.3 may produce link-time errors when building static
-executables (i.e. using the -static option) if MPICH is built with shared library support.
-Please see http://trac.mpich.org/projects/mpich/ticket/2190 for more details.  The workaround if you
-wish to use static linking is to configure MPICH with `--enable-shared=no --enable-static=yes` to
-force it to use static MPI libraries with correct dependencies.
+Alternatively, you can use the `LD_PRELOAD` runtime instrumentation method to
+instrument executables that have already been compiled.
 
 === Linux clusters using Open MPI
 
@@ -598,13 +504,11 @@ Follow the generic instructions provided at the top of this document for
 compilation, and make sure that the `CC` used for compilation is based on a
 GNU compiler.
 
-Open MPI typically produces dynamically linked executables by default, which
-means that you should use the `LD_PRELOAD` method to instrument executables
-that have been built with Open MPI.  Darshan is only compatible with Open
-MPI 1.6.4 and newer.  For more details on why Darshan is not compatible with
-older versions of Open MPI, please refer to the following mailing list discussion:
+You can enable Darshan instrumentation at compile time by adding the options
+reported by `darshan-config --dyn-ld-flags` to your linker command line.
 
-http://www.open-mpi.org/community/lists/devel/2013/01/11907.php
+Alternatively, you can use the `LD_PRELOAD` runtime instrumentation method to
+instrument executables that have already been compiled.
 
 == Upgrading to Darshan 3.x from 2.x
 
@@ -678,18 +582,6 @@ For statically linked executables:
 00000000004070a0 T darshan_core_register_module
 ----
 
-* Make sure the application executable is statically linked:
-    ** In general, we encourage the use of purely statically linked executables when using the static
-instrumentation method given in link:darshan-runtime.html#_instrumenting_statically_linked_applications[Section 5]
-    ** If purely static executables are not an option, we encourage users to use the LD_PRELOAD method of
-instrumentation given in link:darshan-runtime.html#_instrumenting_dynamically_linked_applications[Section 6]
-    ** Statically linked executables are the default on Cray platforms and the IBM BG platforms; 
-statically linked executables can be explicitly requested using the `-static` compile option to most compilers
-    ** You can verify that an executable is purely statically linked by using the `file` command:
-----
-> file mpi-io-test
-mpi-io-test: ELF 64-bit LSB  executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.24, BuildID[sha1]=9893e599e7a560159ccf547b4c4ba5671f65ba32, not stripped
-----
 
 * Ensure that the linker is correctly linking in Darshan's runtime libraries:
     ** A common mistake is to explicitly link in the underlying MPI libraries (e.g., `-lmpich` or `-lmpichf90`)



View it on GitLab: https://xgitlab.cels.anl.gov/darshan/darshan/commit/2fbbbd3695d22db379a5a9c336f97a312cf17ea7




