[Darshan-users] Unexpected Darshan I/O characterization running IOR BM

Carns, Philip H. carns at mcs.anl.gov
Tue Nov 27 07:56:07 CST 2018


Thanks for being patient :)

It turns out that I can reproduce this on Ubuntu 18.10 as well.  It looks like the problem is that ior is getting to the glibc open call via a path that we aren't wrapping.  Here's the stack trace from breaking on open() in ior with darshan preloaded:

Breakpoint 1, __libc_open64 (file=file at entry=0x7fffffffc490 "test.0",
    oflag=16386) at ../sysdeps/unix/sysv/linux/open64.c:37
37 in ../sysdeps/unix/sysv/linux/open64.c
(gdb) where
#0  __libc_open64 (file=file at entry=0x7fffffffc490 "test.0", oflag=16386)
    at ../sysdeps/unix/sysv/linux/open64.c:37
#1  0x00007ffff79aac1b in __open_2 (file=file at entry=0x7fffffffc490 "test.0",
    oflag=<optimized out>) at open_2.c:28
#2  0x000055555555a5a0 in open (__oflag=<optimized out>,
    __path=0x7fffffffc490 "test.0")
    at /usr/include/x86_64-linux-gnu/bits/fcntl2.h:57
#3  POSIX_Open (testFileName=0x7fffffffc490 "test.0", param=0x555555574340)
    at ../../src/aiori-POSIX.c:386
#4  0x000055555556092e in TestIoSys (test=test at entry=0x555555574340)
    at ../../src/ior.c:1459
#5  0x00005555555621b8 in ior_main (argc=<optimized out>, argv=<optimized out>)
    at ../../src/ior.c:148
#6  0x00007ffff78c509b in __libc_start_main (main=0x5555555577e0 <main>,
    argc=19, argv=0x7fffffffcc08, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=0x7fffffffcbf8)
    at ../csu/libc-start.c:308
#7  0x000055555555781a in _start ()

Notice that the open() call is defined inline in a header (fcntl2.h), so what actually crosses into the glibc shared library is a call to __open_2() rather than open().

Here are the relevant lines from fcntl2.h on my machine:

__fortify_function int
open (const char *__path, int __oflag, ...)
{
  if (__va_arg_pack_len () > 1)
    __open_too_many_args ();
  if (__builtin_constant_p (__oflag))
    {
      if (__OPEN_NEEDS_MODE (__oflag) && __va_arg_pack_len () < 1)
        {
          __open_missing_mode ();
          return __open_2 (__path, __oflag);
        }
      return __open_alias (__path, __oflag, __va_arg_pack ());
    }
  if (__va_arg_pack_len () < 1)
    return __open_2 (__path, __oflag);
  return __open_alias (__path, __oflag, __va_arg_pack ());
}
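
Reading the header, the __open_2() path is the one taken when __oflag is not a compile-time constant and no mode argument is supplied, which matches how ior appears to compute its open flags at runtime in aiori-POSIX.c.  A toy program along these lines should land in the same path (untested sketch; on Ubuntu, plain "gcc -O2 toy.c -o toy" should be enough, since the distro gcc defines _FORTIFY_SOURCE by default when optimizing):

/* toy.c: check which glibc entry point open() actually reaches */
#include <fcntl.h>

int main(int argc, char **argv)
{
    (void) argv;
    /* flags computed at runtime, so __builtin_constant_p (__oflag) is
       false in the fortified open() above and, with no mode argument,
       the __open_2() branch is taken */
    int flags = (argc > 1) ? O_RDWR : O_RDONLY;
    return open("/dev/null", flags);
}

Breaking on __open_2 in gdb should confirm whether it takes the same route as ior.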

I haven't tracked down why we are getting here (there are a number of compile-time conditionals that can alter the open() path) or what to do about it yet.  I just wanted to confirm that it's been reproduced on another system.
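
If wrapping the alternate entry point turns out to be the answer, a minimal sketch of the shape it could take is below.  This is untested, uses a plain dlsym(RTLD_NEXT) lookup rather than Darshan's real wrapper machinery, and the instrumentation call named in the comment is hypothetical:

/* sketch: intercept glibc's fortified __open_2() entry point as well */
#define _GNU_SOURCE
#include <dlfcn.h>

int __open_2(const char *path, int oflag)
{
    static int (*real_open_2)(const char *, int) = NULL;
    int ret;

    /* resolve the real glibc __open_2() on first use */
    if (!real_open_2)
        real_open_2 = (int (*)(const char *, int))
                      dlsym(RTLD_NEXT, "__open_2");

    ret = real_open_2(path, oflag);

    /* record the open the same way the existing open() wrapper would,
       e.g. darshan_record_open(path, oflag, ret);  (hypothetical name) */

    return ret;
}

The 64-bit variant __open64_2() would presumably need the same treatment.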

thanks,
-Phil


On 2018-11-26 19:36:18-05:00 Cormac Garvey wrote:


Hi Phil,
Here is the information you requested. (See attached files)
strace_ior_write.log - Running strace on a single ior process (no mpirun) (write file)
strace_ior_read.log - Running strace on a single ior process (no mpirun) (read files generated in write step)
I do not have the original build environment for the ior binary I am using, but I am sure beegfs was not enabled in the build: I did not explicitly enable it, and I recall that during the build process it could not find the necessary beegfs header files. I have attached the output from running nm on the ior executable (nm ior >& nm_ior.log) instead.
Also, here is the ldd output:
ldd /avere/apps/ior/bin/ior
        linux-vdso.so.1 =>  (0x00007ffd26b7e000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f981c0a7000)
        libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.3.223/linux/mpi/intel64/lib/libmpi.so.12 (0x00007f981b8d7000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f981b514000)
        /lib64/ld-linux-x86-64.so.2 (0x000055a13c2f4000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f981b30c000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f981b107000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f981aef1000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f981acd5000)
Thanks for your support,
Cormac.

On Mon, Nov 26, 2018 at 2:16 PM Carns, Philip H. <carns at mcs.anl.gov> wrote:
I just glanced through the latest IOR revision in git and realized that it has some conditional logic for beegfs that will use a beegfs-specific library for some calls.

I'm not sure if that is being triggered in your case or not, but that's a possibility.  Darshan doesn't have wrappers for that API.

Could you check to see if HAVE_BEEGFS_BEEGFS_H is set in your src/config.h file?
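(Something like "grep HAVE_BEEGFS_BEEGFS_H src/config.h" in the IOR build tree should show whether it was detected, and "nm ior | grep -i beegfs" on the binary would be another quick check.)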

thanks,
-Phil




On 2018-11-26 15:50:10-05:00 Darshan-users wrote:

Thanks Cormac.  I don't see any obvious problems from your output.  We'll see if we can reproduce here.

In the meantime, if you want to debug further, you could try running IOR as a single process (no mpirun or mpiexec) through strace to see if there is anything unusual in the file open path.
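Something along the lines of "strace -f -o ior_trace.log ./ior <your usual arguments>" should capture the system calls; searching the log for opens on the test file would show how the binary is actually reaching the kernel.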

thanks,
-Phil



On 2018-11-21 17:37:50-05:00 Cormac Garvey wrote:

Hi Phil,
Thanks for looking into this for me.
Here is the information you requested.
ldd ./ior
        linux-vdso.so.1 =>  (0x00007fff0237c000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fce3e88f000)
        libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.3.223/linux/mpi/intel64/lib/libmpi.so.12 (0x00007fce3e0bf000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fce3dcfc000)
        /lib64/ld-linux-x86-64.so.2 (0x000055de86c95000)
        librt.so.1 => /lib64/librt.so.1 (0x00007fce3daf4000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fce3d8ef000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fce3d6d9000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fce3d4bd000)
[testusera at ip-0A021004 bin]$
Here is the info on testfile.0 and testfile.1 before executing the IOR read test.
[testusera at ip-0A021004 testing]$ pwd
/mnt/beegfs/testing
[testusera at ip-0A021004 testing]$ ls -lt
total 4194304
-rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:18 testfile.1
-rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:17 testfile.0
[testusera at ip-0A021004 testing]$
[testusera at ip-0A021004 testing]$ stat testfile.0
  File: ‘testfile.0’
  Size: 2147483648      Blocks: 4194304    IO Block: 524288 regular file
Device: 29h/41d Inode: 2501562826468697566  Links: 1
Access: (0644/-rw-r--r--)  Uid: (1264001104/testusera)   Gid: (1264000513/domain users)
Context: system_u:object_r:tmp_t:s0
Access: 2018-11-21 22:17:41.000000000 +0000
Modify: 2018-11-21 22:17:56.000000000 +0000
Change: 2018-11-21 22:17:56.000000000 +0000
 Birth: -
Here is the result from running the IOR read test:
IOR-3.2.0: MPI Coordinated Test of Parallel I/O
Began               : Wed Nov 21 22:22:52 2018
Command line        : /avere/apps/ior/bin/ior -a POSIX -B -r -k -z -v -o /mnt/beegfs/testing/testfile -i 2 -m -t 32m -b 256M -d 1
Machine             : Linux ip-0A021005
Start time skew across all tasks: 0.00 sec
TestID              : 0
StartTime           : Wed Nov 21 22:22:52 2018
Path                : /mnt/beegfs/testing
FS                  : 11.8 TiB   Used FS: 0.0%   Inodes: 0.0 Mi   Used Inodes: -nan%
Participating tasks: 8
Options:
api                 : POSIX
apiVersion          :
test filename       : /mnt/beegfs/testing/testfile
access              : single-shared-file
type                : independent
segments            : 1
ordering in a file  : random
ordering inter file : no tasks offsets
tasks               : 8
clients per node    : 8
repetitions         : 2
xfersize            : 32 MiB
blocksize           : 256 MiB
aggregate filesize  : 2 GiB
Results:
access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ---------- ---------  --------   --------   --------   --------   ----
delaying 1 seconds . . .
Commencing read performance test: Wed Nov 21 22:22:53 2018
read      481.88     262144     32768      0.020854   4.23       2.97       4.25       0
delaying 1 seconds . . .
Commencing read performance test: Wed Nov 21 22:22:58 2018
read      609.04     262144     32768      0.020336   3.34       2.06       3.36       1
Max Read:  609.04 MiB/sec (638.62 MB/sec)
Summary of all tests:
Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev   Max(OPs)   Min(OPs)  Mean(OPs)     StdDev    Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt   blksiz    xsize aggs(MiB)   API RefNum
read          609.04     481.88     545.46      63.58      19.03      15.06      17.05       1.99    3.80633     0      8   8    2   0     0        1         0    0      1 268435456 33554432    2048.0 POSIX      0
Finished            : Wed Nov 21 22:23:02 2018
I also ran a test without the -B option and got the same result (no reads recorded).
Thanks,
Cormac.

On Wed, Nov 21, 2018 at 1:46 PM Carns, Philip H. via Darshan-users <darshan-users at lists.mcs.anl.gov> wrote:
Hi Cormac,

That's strange.  The -i 2 and -m options are making IOR attempt to read two files: testfile.0 and testfile.1.  Darshan is definitely aware of those files, but not only are the read counters missing, the open counters are missing as well; it only shows a stat from each rank for the files in question.

Can you confirm that both of the files exist and are big enough to accommodate the requested read volume?  I wonder if IOR might stat the files up front and exit early if they aren't the correct size?

Could you also share the output of "ldd ior"?  I'm curious to make sure there isn't anything unusual about the libraries linked in, but usually if that were a problem you wouldn't get a log at all.

Also, one last idea: does the behavior change if you remove the -B (O_DIRECT) option?  That shouldn't matter from Darshan's perspective, but it can't hurt to check.

thanks,
-Phil



On 2018-11-20 19:59:55-05:00 Darshan-users wrote:

Hello,
I recently installed Darshan 3.1.6 on a Microsoft Azure VM running CentOS 7.4, and I got an unexpected result using Darshan to characterize the I/O of an IOR benchmark experiment.
export LD_PRELOAD=/avere/apps/spack/linux-centos7-x86_64/gcc-4.8.5/darshan-runtime-3.1.6-daohky4yevagxajjl33lwk472lcgn6g4/lib/libdarshan.so
BLOCKSIZE=256M
mpirun -np 8 ior -a POSIX -B -r -k -z -v -o $FILEPATH -i 2 -m -t 32m -b 256M -d 1
After the job completed, a Darshan log file was created; the attached text report (darshan-parser_ior_read_shared.out) was generated using the following command:
darshan-parser  --all testuser_ior_id22-20-80229-14589323501426222819_1.darshan >& darshan-parser_ior_read_shared.out
The above IOR benchmark is a read-only benchmark against a shared file system, but the resulting Darshan report indicates there are no READ operations. Any ideas why the Darshan report has no read operations? (Note: if I add a -w option to the above IOR benchmark, i.e. do a write and a read to the shared filesystem, Darshan only reports the writes and not the reads.)
Any help would be appreciated,
Thanks for your support,
Cormac.
_______________________________________________
Darshan-users mailing list
Darshan-users at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/darshan-users