[Darshan-users] Unexpected Darshan I/O characterization running IOR BM

Cormac Garvey cormac.t.garvey at gmail.com
Tue Nov 27 16:59:43 CST 2018


Hi Phil,
Thank you very much for your help tracking down this error.

I have verified that the modifications you made work, and I now get the
expected results with IOR.

Best Regards,
Cormac.

On Tue, Nov 27, 2018 at 8:28 AM Carns, Philip H. <carns at mcs.anl.gov> wrote:

> Here is a PR that fixes the issue and makes the posix read counters show
> up properly in IOR with Darshan preloaded:
>
> https://xgitlab.cels.anl.gov/darshan/darshan/merge_requests/28
>
> It's a little frustrating to hit cases like this where glibc is using a
> different symbol than we expect, but I'm not sure what else to do about it.
>
> Thanks for the detailed bug report!
>
> -Phil
>
>
>
> On 2018-11-27 08:56:07-05:00 Carns, Philip H. wrote:
>
>
> Thanks for being patient :)
>
> It turns out that I can reproduce this on Ubuntu 18.10 as well.  It looks
> like the problem is that ior is getting to the glibc open call via a path
> that we aren't wrapping.  Here's the stack trace from breaking on open() in
> ior with darshan preloaded:
>
> Breakpoint 1, __libc_open64 (file=file@entry=0x7fffffffc490 "test.0",
>     oflag=16386) at ../sysdeps/unix/sysv/linux/open64.c:37
> 37 in ../sysdeps/unix/sysv/linux/open64.c
> (gdb) where
> #0  __libc_open64 (file=file@entry=0x7fffffffc490 "test.0", oflag=16386)
>     at ../sysdeps/unix/sysv/linux/open64.c:37
> #1  0x00007ffff79aac1b in __open_2 (file=file@entry=0x7fffffffc490 "test.0",
>     oflag=<optimized out>) at open_2.c:28
> #2  0x000055555555a5a0 in open (__oflag=<optimized out>,
>     __path=0x7fffffffc490 "test.0")
>     at /usr/include/x86_64-linux-gnu/bits/fcntl2.h:57
> #3  POSIX_Open (testFileName=0x7fffffffc490 "test.0", param=0x555555574340)
>     at ../../src/aiori-POSIX.c:386
> #4  0x000055555556092e in TestIoSys (test=test@entry=0x555555574340)
>     at ../../src/ior.c:1459
> #5  0x00005555555621b8 in ior_main (argc=<optimized out>, argv=<optimized out>)
>     at ../../src/ior.c:148
> #6  0x00007ffff78c509b in __libc_start_main (main=0x5555555577e0 <main>,
>     argc=19, argv=0x7fffffffcc08, init=<optimized out>, fini=<optimized out>,
>     rtld_fini=<optimized out>, stack_end=0x7fffffffcbf8)
>     at ../csu/libc-start.c:308
> #7  0x000055555555781a in _start ()
>
> Notice that the open call is defined in a header (fcntl2.h) and actually
> calls __open_2() instead when it transitions into the glibc shared
> library.
>
> Here are the relevant lines from fcntl2.h on my machine:
>
> __fortify_function int
> open (const char *__path, int __oflag, ...)
> {
>   if (__va_arg_pack_len () > 1)
>     __open_too_many_args ();
>   if (__builtin_constant_p (__oflag))
>     {
>       if (__OPEN_NEEDS_MODE (__oflag) && __va_arg_pack_len () < 1)
>         {
>           __open_missing_mode ();
>           return __open_2 (__path, __oflag);
>         }
>       return __open_alias (__path, __oflag, __va_arg_pack ());
>     }
>   if (__va_arg_pack_len () < 1)
>     return __open_2 (__path, __oflag);
>   return __open_alias (__path, __oflag, __va_arg_pack ());
> }
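>
> As a standalone illustration of the branch taken above (this sketch is not
> taken from IOR, and whether an actual IOR build takes this exact path
> depends on its compile flags): a two-argument open() whose flags are not a
> compile-time constant falls through to the __open_2() call when the code is
> compiled with optimization and _FORTIFY_SOURCE enabled.
>
> #define _GNU_SOURCE
> #include <fcntl.h>
>
> /* With optimization and _FORTIFY_SOURCE enabled, __builtin_constant_p(flags)
>  * is false and no mode argument is supplied, so the inline open() above
>  * resolves to __open_2(path, flags) rather than the plain open symbol. */
> int open_testfile(const char *path, int use_direct)
> {
>     int flags = O_RDWR;
>     if (use_direct)
>         flags |= O_DIRECT;   /* decided at run time, so not constant-folded */
>     return open(path, flags);
> }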
>
> I haven't tracked down why we are getting here (there are a number of
> compile-time conditionals that can alter the open() path) or what to do
> about it yet.  I just wanted to confirm that it has been reproduced on
> another system.
>
> thanks,
> -Phil
>
>
> On 2018-11-26 19:36:18-05:00 Cormac Garvey wrote:
>
>
> Hi Phil,
> Here is the information you requested (see attached files):
> strace_ior_write.log - strace of a single ior process (no mpirun) writing
> the file
> strace_ior_read.log - strace of a single ior process (no mpirun) reading
> the files generated in the write step
> I do not have the original build environment for the ior binary I am
> using, but I am sure beegfs was not enabled in the build: I did not
> explicitly enable it, and I recall that during the build process it could
> not find the necessary beegfs header files. I have attached the output
> from running nm on the ior executable (nm ior >& nm_ior.log) instead.
> Also:
> ldd /avere/apps/ior/bin/ior
>         linux-vdso.so.1 =>  (0x00007ffd26b7e000)
>         libm.so.6 => /lib64/libm.so.6 (0x00007f981c0a7000)
>         libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.3.223/linux/mpi/intel64/lib/libmpi.so.12 (0x00007f981b8d7000)
>         libc.so.6 => /lib64/libc.so.6 (0x00007f981b514000)
>         /lib64/ld-linux-x86-64.so.2 (0x000055a13c2f4000)
>         librt.so.1 => /lib64/librt.so.1 (0x00007f981b30c000)
>         libdl.so.2 => /lib64/libdl.so.2 (0x00007f981b107000)
>         libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f981aef1000)
>         libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f981acd5000)
> Thanks for your support,
> Cormac.
>
> On Mon, Nov 26, 2018 at 2:16 PM Carns, Philip H. <carns at mcs.anl.gov>
> wrote:
>
>> I just glanced through the latest IOR revision in git and realized that
>> it has some conditional logic for beegfs that will use a beegfs-specific
>> library for some calls.
>>
>> I'm not sure whether that is being triggered in your case, but it's a
>> possibility.  Darshan doesn't have wrappers for that API.
>>
>> Could you check to see if HAVE_BEEGFS_BEEGFS_H is set in your
>> src/config.h file?
>>
>> thanks,
>> -Phil
>>
>>
>>
>>
>> On 2018-11-26 15:50:10-05:00 Darshan-users wrote:
>>
>> Thanks, Cormac.  I don't see any obvious problems in your output.  We'll
>> see if we can reproduce it here.
>>
>> In the meantime, if you want to debug further, you could try running IOR
>> as a single process (no mpirun or mpiexec) through strace to see if there
>> is anything unusual in the file open path.
>>
>> thanks,
>> -Phil
>>
>>
>>
>> On 2018-11-21 17:37:50-05:00 Cormac Garvey wrote:
>>
>> Hi Phil,
>> Thanks for looking into this for me.
>> Here is the information you requested.
>> ldd ./ior
>>         linux-vdso.so.1 =>  (0x00007fff0237c000)
>>         libm.so.6 => /lib64/libm.so.6 (0x00007fce3e88f000)
>>         libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.3.223/linux/mpi/intel64/lib/libmpi.so.12 (0x00007fce3e0bf000)
>>         libc.so.6 => /lib64/libc.so.6 (0x00007fce3dcfc000)
>>         /lib64/ld-linux-x86-64.so.2 (0x000055de86c95000)
>>         librt.so.1 => /lib64/librt.so.1 (0x00007fce3daf4000)
>>         libdl.so.2 => /lib64/libdl.so.2 (0x00007fce3d8ef000)
>>         libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fce3d6d9000)
>>         libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fce3d4bd000)
>> [testusera@ip-0A021004 bin]$
>> Here is the info on testfile.0 and testfile.1 before executing the IOR
>> read test.
>> [testusera@ip-0A021004 testing]$ pwd
>> /mnt/beegfs/testing
>> [testusera@ip-0A021004 testing]$ ls -lt
>> total 4194304
>> -rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:18 testfile.1
>> -rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:17 testfile.0
>> [testusera@ip-0A021004 testing]$
>> [testusera@ip-0A021004 testing]$ stat testfile.0
>>   File: ‘testfile.0’
>>   Size: 2147483648      Blocks: 4194304    IO Block: 524288 regular file
>> Device: 29h/41d Inode: 2501562826468697566  Links: 1
>> Access: (0644/-rw-r--r--)  Uid: (1264001104/testusera)   Gid: (1264000513/domain users)
>> Context: system_u:object_r:tmp_t:s0
>> Access: 2018-11-21 22:17:41.000000000 +0000
>> Modify: 2018-11-21 22:17:56.000000000 +0000
>> Change: 2018-11-21 22:17:56.000000000 +0000
>>  Birth: -
>> Results from running the IOR read test:
>> IOR-3.2.0: MPI Coordinated Test of Parallel I/O
>> Began               : Wed Nov 21 22:22:52 2018
>> Command line        : /avere/apps/ior/bin/ior -a POSIX -B -r -k -z -v -o /mnt/beegfs/testing/testfile -i 2 -m -t 32m -b 256M -d 1
>> Machine             : Linux ip-0A021005
>> Start time skew across all tasks: 0.00 sec
>> TestID              : 0
>> StartTime           : Wed Nov 21 22:22:52 2018
>> Path                : /mnt/beegfs/testing
>> FS                  : 11.8 TiB   Used FS: 0.0%   Inodes: 0.0 Mi   Used Inodes: -nan%
>> Participating tasks: 8
>> Options:
>> api                 : POSIX
>> apiVersion          :
>> test filename       : /mnt/beegfs/testing/testfile
>> access              : single-shared-file
>> type                : independent
>> segments            : 1
>> ordering in a file  : random
>> ordering inter file : no tasks offsets
>> tasks               : 8
>> clients per node    : 8
>> repetitions         : 2
>> xfersize            : 32 MiB
>> blocksize           : 256 MiB
>> aggregate filesize  : 2 GiB
>> Results:
>> access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
>> ------    ---------  ---------- ---------  --------   --------   --------   --------   ----
>> delaying 1 seconds . . .
>> Commencing read performance test: Wed Nov 21 22:22:53 2018
>> read      481.88     262144     32768      0.020854   4.23       2.97       4.25       0
>> delaying 1 seconds . . .
>> Commencing read performance test: Wed Nov 21 22:22:58 2018
>> read      609.04     262144     32768      0.020336   3.34       2.06       3.36       1
>> Max Read:  609.04 MiB/sec (638.62 MB/sec)
>> Summary of all tests:
>> Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev   Max(OPs)   Min(OPs)  Mean(OPs)     StdDev    Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt   blksiz    xsize aggs(MiB)   API RefNum
>> read          609.04     481.88     545.46      63.58      19.03      15.06      17.05       1.99    3.80633     0      8   8    2   0     0        1         0    0      1 268435456 33554432    2048.0 POSIX      0
>> Finished            : Wed Nov 21 22:23:02 2018
>> I also ran a test without the -B option and got the same result (no reads
>> recorded).
>> Thanks,
>> Cormac.
>>
>> On Wed, Nov 21, 2018 at 1:46 PM Carns, Philip H. via Darshan-users <
>> darshan-users at lists.mcs.anl.gov> wrote:
>>
>>> Hi Cormac,
>>>
>>> That's strange.  The -i 2 and -m options make IOR attempt to read
>>> two files: testfile.0 and testfile.1.  Darshan is definitely aware of those
>>> files, yet not only the read counters but also the open counters are
>>> missing.  It only shows a stat from each rank for the files in question.
>>>
>>> Can you confirm that both of the files exist and are big enough to
>>> accommodate the requested read volume?  I wonder if IOR might stat the
>>> files up front and exit early if they aren't the correct size?
>>>
>>> Could you also share the output of "ldd ior"?  I'm curious to make sure
>>> there isn't anything unusual about the libraries linked in, but usually if
>>> that were a problem you wouldn't get a log at all.
>>>
>>> Also, one last idea: does the behavior change if you remove the -B
>>> (O_DIRECT) option?  That shouldn't matter from Darshan's perspective, but
>>> it can't hurt to check.
>>>
>>> thanks,
>>> -Phil
>>>
>>>
>>>
>>> On 2018-11-20 19:59:55-05:00 Darshan-users wrote:
>>>
>>> Hello,
>>> I recently installed Darshan 3.1.6 on a Microsoft Azure VM running
>>> CentOS 7.4.
>>> I got an unexpected result using Darshan to characterize the I/O for an
>>> IOR benchmark
>>> experiment.
>>> export LD_PRELOAD=/avere/apps/spack/linux-centos7-x86_64/gcc-4.8.5/darshan-runtime-3.1.6-daohky4yevagxajjl33lwk472lcgn6g4/lib/libdarshan.so
>>> BLOCKSIZE=256M
>>> mpirun -np 8 ior -a POSIX -B -r -k -z -v -o $FILEPATH -i 2 -m -t 32m -b 256M -d 1
>>> After the job completed, a Darshan log file was created. The resulting
>>> text report (darshan-parser_ior_read_shared.out, attached) was generated
>>> using the following command:
>>> darshan-parser --all testuser_ior_id22-20-80229-14589323501426222819_1.darshan >& darshan-parser_ior_read_shared.out
>>> The above IOR benchmark is a read-only benchmark against a shared file
>>> system, but the resulting Darshan report indicates there are no READ
>>> operations. Any ideas why the resulting Darshan report has no read
>>> operations? (Note: if I add a -w option to the above IOR benchmark, i.e.
>>> do a write and a read to a shared filesystem, Darshan only reports the
>>> writes and not the reads.)
>>> Any help would be appreciated,
>>> Thanks for your support,
>>> Cormac.
>>>
>>
>>