[Darshan-users] Unexpected Darshan I/O characterization running IOR BM
Carns, Philip H.
carns at mcs.anl.gov
Mon Nov 26 14:49:38 CST 2018
Thanks Cormac. I don't see any obvious problems from your output. We'll see if we can reproduce here.
In the meantime, if you want to debug further, you could try running IOR as a single process (no mpirun or mpiexec) through strace to see if there is anything unusual in the file open path.
thanks,
-Phil
On 2018-11-21 17:37:50-05:00 Cormac Garvey wrote:
Hi Phil,
Thanks for looking into this for me.
Here is the information you requested.
ldd ./ior
linux-vdso.so.1 => (0x00007fff0237c000)
libm.so.6 => /lib64/libm.so.6 (0x00007fce3e88f000)
libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.3.223/linux/mpi/intel64/lib/libmpi.so.12 (0x00007fce3e0bf000)
libc.so.6 => /lib64/libc.so.6 (0x00007fce3dcfc000)
/lib64/ld-linux-x86-64.so.2 (0x000055de86c95000)
librt.so.1 => /lib64/librt.so.1 (0x00007fce3daf4000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fce3d8ef000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fce3d6d9000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fce3d4bd000)
[testusera at ip-0A021004 bin]$
Here is the info on testfile.0 and testfile.1 before executing the IOR read test.
[testusera at ip-0A021004 testing]$ pwd
/mnt/beegfs/testing
[testusera at ip-0A021004 testing]$ ls -lt
total 4194304
-rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:18 testfile.1
-rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:17 testfile.0
[testusera at ip-0A021004 testing]$
[testusera at ip-0A021004 testing]$ stat testfile.0
File: ‘testfile.0’
Size: 2147483648 Blocks: 4194304 IO Block: 524288 regular file
Device: 29h/41d Inode: 2501562826468697566 Links: 1
Access: (0644/-rw-r--r--) Uid: (1264001104/testusera) Gid: (1264000513/domain users)
Context: system_u:object_r:tmp_t:s0
Access: 2018-11-21 22:17:41.000000000 +0000
Modify: 2018-11-21 22:17:56.000000000 +0000
Change: 2018-11-21 22:17:56.000000000 +0000
Birth: -
Result from running the IOR read test.
IOR-3.2.0: MPI Coordinated Test of Parallel I/O
Began : Wed Nov 21 22:22:52 2018
Command line : /avere/apps/ior/bin/ior -a POSIX -B -r -k -z -v -o /mnt/beegfs/testing/testfile -i 2 -m -t 32m -b 256M -d 1
Machine : Linux ip-0A021005
Start time skew across all tasks: 0.00 sec
TestID : 0
StartTime : Wed Nov 21 22:22:52 2018
Path : /mnt/beegfs/testing
FS : 11.8 TiB Used FS: 0.0% Inodes: 0.0 Mi Used Inodes: -nan%
Participating tasks: 8
Options:
api : POSIX
apiVersion :
test filename : /mnt/beegfs/testing/testfile
access : single-shared-file
type : independent
segments : 1
ordering in a file : random
ordering inter file : no tasks offsets
tasks : 8
clients per node : 8
repetitions : 2
xfersize : 32 MiB
blocksize : 256 MiB
aggregate filesize : 2 GiB
Results:
access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ---------- ---------  --------   --------   --------   --------   ----
delaying 1 seconds . . .
Commencing read performance test: Wed Nov 21 22:22:53 2018
read      481.88     262144     32768      0.020854   4.23       2.97       4.25       0
delaying 1 seconds . . .
Commencing read performance test: Wed Nov 21 22:22:58 2018
read      609.04     262144     32768      0.020336   3.34       2.06       3.36       1
Max Read: 609.04 MiB/sec (638.62 MB/sec)
Summary of all tests:
Operation Max(MiB) Min(MiB) Mean(MiB) StdDev Max(OPs) Min(OPs) Mean(OPs) StdDev Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggs(MiB) API RefNum
read 609.04 481.88 545.46 63.58 19.03 15.06 17.05 1.99 3.80633 0 8 8 2 0 0 1 0 0 1 268435456 33554432 2048.0 POSIX 0
Finished : Wed Nov 21 22:23:02 2018
I also ran a test without the -B option and got the same result (no reads recorded).
Thanks,
Cormac.
On Wed, Nov 21, 2018 at 1:46 PM Carns, Philip H. via Darshan-users <darshan-users at lists.mcs.anl.gov> wrote:
Hi Cormac,
That's strange. The -i 2 and -m options make IOR attempt to read two files: testfile.0 and testfile.1. Darshan is definitely aware of those files, but the open counters are missing as well as the read counters. The log only shows a stat from each rank for the files in question.
Can you confirm that both of the files exist and are big enough to accommodate the requested read volume? I wonder if IOR might stat the files up front and exit early if they aren't the correct size?
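The size check asked for above can be sketched in a few lines of Python. This is a minimal sketch, not part of IOR or Darshan: the function names are hypothetical, the file paths are the ones reported elsewhere in this thread, and it assumes that each shared file must hold tasks x segments x block size bytes (8 x 1 x 256 MiB for the command in question).

```python
import os

MIB = 1024 * 1024


def required_file_bytes(num_tasks, block_mib, segments=1):
    """Bytes one shared IOR file must hold: tasks x segments x block size."""
    return num_tasks * segments * block_mib * MIB


def undersized_files(paths, needed_bytes):
    """Return (path, size) pairs for files that are missing or too small.

    A missing file is reported with size -1.
    """
    problems = []
    for path in paths:
        size = os.path.getsize(path) if os.path.exists(path) else -1
        if size < needed_bytes:
            problems.append((path, size))
    return problems


# Values from the IOR command in this thread: 8 tasks, -b 256M, one segment.
needed = required_file_bytes(num_tasks=8, block_mib=256)
print(needed)  # 2147483648

# Paths as reported in this thread; adjust for your own mount point.
files = ["/mnt/beegfs/testing/testfile.0", "/mnt/beegfs/testing/testfile.1"]
for path, size in undersized_files(files, needed):
    print(f"{path}: size {size} is below the required {needed} bytes")
```

If nothing is printed after the first line, both files exist and are at least as large as the requested read volume.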
Could you also share the output of "ldd ior"? I'm curious to make sure there isn't anything unusual about the libraries linked in, but usually if that were a problem you wouldn't get a log at all.
One last idea: does the behavior change if you remove the -B (O_DIRECT) option? That shouldn't matter from Darshan's perspective, but it might not hurt to check.
thanks,
-Phil
On 2018-11-20 19:59:55-05:00 Darshan-users wrote:
Hello,
I recently installed Darshan 3.1.6 on a Microsoft Azure VM running CentOS 7.4.
I got an unexpected result using Darshan to characterize the I/O for an IOR benchmark
experiment.
export LD_PRELOAD=/avere/apps/spack/linux-centos7-x86_64/gcc-4.8.5/darshan-runtime-3.1.6-daohky4yevagxajjl33lwk472lcgn6g4/lib/libdarshan.so
BLOCKSIZE=256M
mpirun -np 8 ior -a POSIX -B -r -k -z -v -o $FILEPATH -i 2 -m -t 32m -b 256M -d 1
After the job completed, a Darshan log file was created. The resulting text report (darshan-parser_ior_read_shared.out, attached) was generated using the following command:
darshan-parser --all testuser_ior_id22-20-80229-14589323501426222819_1.darshan >& darshan-parser_ior_read_shared.out
The above IOR benchmark is a read-only benchmark against a shared file system, but the resulting Darshan report indicates there are no read operations.
Any ideas why the resulting Darshan report has no read operations? (Note: if I add the -w option to the above IOR benchmark, i.e. do a write and a read to a shared filesystem, Darshan reports only the writes and not the reads.)
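To see at a glance which POSIX counters a darshan-parser text report actually recorded for each file, a small script along the following lines may help. It is a minimal sketch with a hypothetical function name. It assumes the Darshan 3.x per-record line layout (module, rank, record id, counter, value, file name, mount point, fs type), and that file names contain no whitespace. The sample lines are made up for illustration and are not taken from the attached report.

```python
from collections import defaultdict

# Counters of interest; these names are used by Darshan's POSIX module.
COUNTERS = ("POSIX_OPENS", "POSIX_READS", "POSIX_WRITES", "POSIX_STATS")


def summarize_posix_counters(lines):
    """Sum selected POSIX counters per file name across all ranks."""
    totals = defaultdict(lambda: dict.fromkeys(COUNTERS, 0))
    for line in lines:
        if line.startswith("#"):
            continue  # header and comment lines
        fields = line.split()
        # Expected: module, rank, record id, counter, value, file name, ...
        if len(fields) < 6 or fields[0] != "POSIX" or fields[3] not in COUNTERS:
            continue
        totals[fields[5]][fields[3]] += int(fields[4])
    return {name: dict(counts) for name, counts in totals.items()}


# Illustrative input only (record id, mount point and fs type are invented).
sample = [
    "# <module>\t<rank>\t<record id>\t<counter>\t<value>\t<file name>\t<mount pt>\t<fs type>",
    "POSIX\t0\t123\tPOSIX_OPENS\t0\t/mnt/beegfs/testing/testfile.0\t/mnt/beegfs\tbeegfs",
    "POSIX\t0\t123\tPOSIX_READS\t0\t/mnt/beegfs/testing/testfile.0\t/mnt/beegfs\tbeegfs",
    "POSIX\t0\t123\tPOSIX_STATS\t1\t/mnt/beegfs/testing/testfile.0\t/mnt/beegfs\tbeegfs",
    "POSIX\t1\t123\tPOSIX_STATS\t1\t/mnt/beegfs/testing/testfile.0\t/mnt/beegfs\tbeegfs",
]
summary = summarize_posix_counters(sample)
for name, counts in summary.items():
    print(name, counts)
```

For a real report, pass the open file object instead of the sample list, for example `summarize_posix_counters(open("darshan-parser_ior_read_shared.out"))`. A file that shows stats but zero opens and zero reads matches the symptom described here.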
Any help would be appreciated,
Thanks for your support,
Cormac.
_______________________________________________
Darshan-users mailing list
Darshan-users at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/darshan-users