[Darshan-users] Unexpected Darshan I/O characterization running IOR BM

Carns, Philip H. carns at mcs.anl.gov
Mon Nov 26 15:16:36 CST 2018


I just glanced through the latest IOR revision in git and realized that it has some conditional logic for BeeGFS that uses a BeeGFS-specific library for some calls.

I'm not sure if that is being triggered in your case or not, but that's a possibility.  Darshan doesn't have wrappers for that API.

Could you check to see if HAVE_BEEGFS_BEEGFS_H is set in your src/config.h file?
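
A minimal sketch of that check in Python, assuming the usual autoconf convention in which a detected header shows up as an active "#define HAVE_BEEGFS_BEEGFS_H 1" line and an undetected one as a commented-out "#undef". The function name is illustrative, and the path assumes the script is run from the top of the IOR source tree.

```python
import re
from pathlib import Path


def macro_is_defined(config_text: str, macro: str) -> bool:
    """Return True if config_text contains an active '#define <macro>' line."""
    # A commented-out '/* #undef MACRO */' line does not match, because the
    # line has to start (after optional whitespace) with '#define'.
    pattern = re.compile(r"^\s*#\s*define\s+" + re.escape(macro) + r"\b")
    return any(pattern.match(line) for line in config_text.splitlines())


if __name__ == "__main__":
    config = Path("src/config.h")  # assumed location inside the IOR build tree
    if config.exists():
        found = macro_is_defined(config.read_text(), "HAVE_BEEGFS_BEEGFS_H")
        print("HAVE_BEEGFS_BEEGFS_H is set" if found else "HAVE_BEEGFS_BEEGFS_H is not set")
    else:
        print("src/config.h not found; run from the top of the IOR source tree")
```

If the macro is set, the BeeGFS-specific code was compiled in, though that alone does not show it is being used at run time.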

thanks,
-Phil




On 2018-11-26 15:50:10-05:00 Darshan-users wrote:

Thanks Cormac.  I don't see any obvious problems from your output.  We'll see if we can reproduce here.

In the meantime, if you want to debug further, you could try running IOR as a single process (no mpirun or mpiexec) under strace to see if there is anything unusual in the file open path.
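
One way to sift such a trace, sketched in Python. It assumes the strace output was saved to a text file (for example with strace's -o option) and that file opens appear as open(...) or openat(...) lines containing the quoted path. The function name and the file name ior.strace are hypothetical.

```python
import re

# Matches the start of an open()/openat() call, optionally preceded by a PID
# column (present when strace follows child processes).
OPEN_CALL = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(?:open|openat)\(")


def open_calls_for(trace_lines, path_fragment):
    """Return the open/openat lines whose arguments mention path_fragment."""
    return [
        line.rstrip("\n")
        for line in trace_lines
        if OPEN_CALL.match(line) and path_fragment in line
    ]


if __name__ == "__main__":
    try:
        with open("ior.strace") as trace:  # hypothetical trace file name
            for call in open_calls_for(trace, "/mnt/beegfs/testing/testfile"):
                print(call)
    except FileNotFoundError:
        print("ior.strace not found; save the strace output there first")
```

If no open or openat line for the test file turns up, that would suggest the file is being opened through some other interface.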

thanks,
-Phil



On 2018-11-21 17:37:50-05:00 Cormac Garvey wrote:

Hi Phil,
Thanks for looking into this for me.
Here is the information you requested.
ldd ./ior
        linux-vdso.so.1 =>  (0x00007fff0237c000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fce3e88f000)
        libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.3.223/linux/mpi/intel64/lib/libmpi.so.12 (0x00007fce3e0bf000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fce3dcfc000)
        /lib64/ld-linux-x86-64.so.2 (0x000055de86c95000)
        librt.so.1 => /lib64/librt.so.1 (0x00007fce3daf4000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fce3d8ef000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fce3d6d9000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fce3d4bd000)
[testusera at ip-0A021004 bin]$
Here is the info on testfile.0 and testfile.1 before executing the IOR read test.
[testusera at ip-0A021004 testing]$ pwd
/mnt/beegfs/testing
[testusera at ip-0A021004 testing]$ ls -lt
total 4194304
-rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:18 testfile.1
-rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:17 testfile.0
[testusera at ip-0A021004 testing]$
[testusera at ip-0A021004 testing]$ stat testfile.0
  File: ‘testfile.0’
  Size: 2147483648      Blocks: 4194304    IO Block: 524288 regular file
Device: 29h/41d Inode: 2501562826468697566  Links: 1
Access: (0644/-rw-r--r--)  Uid: (1264001104/testusera)   Gid: (1264000513/domain users)
Context: system_u:object_r:tmp_t:s0
Access: 2018-11-21 22:17:41.000000000 +0000
Modify: 2018-11-21 22:17:56.000000000 +0000
Change: 2018-11-21 22:17:56.000000000 +0000
 Birth: -
Result from running the IOR read test.
IOR-3.2.0: MPI Coordinated Test of Parallel I/O
Began               : Wed Nov 21 22:22:52 2018
Command line        : /avere/apps/ior/bin/ior -a POSIX -B -r -k -z -v -o /mnt/beegfs/testing/testfile -i 2 -m -t 32m -b 256M -d 1
Machine             : Linux ip-0A021005
Start time skew across all tasks: 0.00 sec
TestID              : 0
StartTime           : Wed Nov 21 22:22:52 2018
Path                : /mnt/beegfs/testing
FS                  : 11.8 TiB   Used FS: 0.0%   Inodes: 0.0 Mi   Used Inodes: -nan%
Participating tasks: 8
Options:
api                 : POSIX
apiVersion          :
test filename       : /mnt/beegfs/testing/testfile
access              : single-shared-file
type                : independent
segments            : 1
ordering in a file  : random
ordering inter file : no tasks offsets
tasks               : 8
clients per node    : 8
repetitions         : 2
xfersize            : 32 MiB
blocksize           : 256 MiB
aggregate filesize  : 2 GiB
Results:
access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ---------- ---------  --------   --------   --------   --------   ----
delaying 1 seconds . . .
Commencing read performance test: Wed Nov 21 22:22:53 2018
read      481.88     262144     32768      0.020854   4.23       2.97       4.25       0
delaying 1 seconds . . .
Commencing read performance test: Wed Nov 21 22:22:58 2018
read      609.04     262144     32768      0.020336   3.34       2.06       3.36       1
Max Read:  609.04 MiB/sec (638.62 MB/sec)
Summary of all tests:
Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev   Max(OPs)   Min(OPs)  Mean(OPs)     StdDev    Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt   blksiz    xsize aggs(MiB)   API RefNum
read          609.04     481.88     545.46      63.58      19.03      15.06      17.05       1.99    3.80633     0      8   8    2   0     0        1         0    0      1 268435456 33554432    2048.0 POSIX      0
Finished            : Wed Nov 21 22:23:02 2018
I also ran a test without the -B option and got the same result (no reads recorded).
Thanks,
Cormac.

On Wed, Nov 21, 2018 at 1:46 PM Carns, Philip H. via Darshan-users <darshan-users at lists.mcs.anl.gov> wrote:
Hi Cormac,

That's strange.  The -i 2 and -m options are making IOR attempt to read two files: testfile.0 and testfile.1.  Darshan is definitely aware of those files, but the open counters are missing as well as the read counters.  It only shows a stat from each rank for the files in question.

Can you confirm that both of the files exist and are big enough to accommodate the requested read volume?  I wonder if IOR might stat the files up front and exit early if they aren't the correct size?
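
For reference, the arithmetic behind "big enough" can be sketched in Python. This assumes IOR's size suffixes are binary (k, m, g as powers of 1024) and that in shared-file mode each file must hold tasks x segments x block size bytes. The helper names are illustrative, and the paths assume the files live under /mnt/beegfs/testing.

```python
import os

# Binary multipliers for the size suffixes (assumed to match IOR's -b/-t parsing).
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(text: str) -> int:
    """Convert a size such as '256M' or '32m' to bytes."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def required_file_bytes(tasks: int, block_size: str, segments: int = 1) -> int:
    """Bytes one shared file must hold so every task can read its blocks."""
    return tasks * segments * parse_size(block_size)


def large_enough(path: str, needed: int) -> bool:
    """True if path is an existing regular file of at least `needed` bytes."""
    return os.path.isfile(path) and os.path.getsize(path) >= needed


if __name__ == "__main__":
    needed = required_file_bytes(tasks=8, block_size="256M")
    for name in ("/mnt/beegfs/testing/testfile.0", "/mnt/beegfs/testing/testfile.1"):
        print(name, "ok" if large_enough(name, needed) else "missing or too small")
```

With 8 tasks and -b 256M this works out to 2147483648 bytes (2 GiB) per file.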

Could you also share the output of "ldd ior"?  I'm curious to make sure there isn't anything unusual about the libraries linked in, but usually if that were a problem you wouldn't get a log at all.

One last idea: does the behavior change if you remove the -B (O_DIRECT) option?  That shouldn't matter from Darshan's perspective, but it might not hurt to check.

thanks,
-Phil



On 2018-11-20 19:59:55-05:00 Darshan-users wrote:

Hello,
I recently installed Darshan 3.1.6 on a Microsoft Azure VM running CentOS 7.4.
I got an unexpected result using Darshan to characterize the I/O for an IOR benchmark
experiment.
export LD_PRELOAD=/avere/apps/spack/linux-centos7-x86_64/gcc-4.8.5/darshan-runtime-3.1.6-daohky4yevagxajjl33lwk472lcgn6g4/lib/libdarshan.so
BLOCKSIZE=256M
mpirun -np 8 ior -a POSIX -B -r -k -z -v -o $FILEPATH -i 2 -m -t 32m -b 256M -d 1
After the job completed, a Darshan log file was created. The resulting text report (darshan-parser_ior_read_shared.out, attached) was generated using the following command:
darshan-parser  --all testuser_ior_id22-20-80229-14589323501426222819_1.darshan >& darshan-parser_ior_read_shared.out
The above IOR benchmark is a read-only benchmark against a shared file system, but the resulting Darshan report indicates there are no READ operations.
Any ideas why the resulting Darshan report has no read operations? (Note: if I add the -w option to the above IOR benchmark, i.e. do a write and a read to the shared file system, Darshan reports only
the writes and not the reads.)
Any help would be appreciated,
Thanks for your support,
Cormac.