[Darshan-users] Unexpected Darshan I/O characterization running IOR BM
Cormac Garvey
cormac.t.garvey at gmail.com
Mon Nov 26 18:35:00 CST 2018
Hi Phil,
Here is the information you requested. (See attached files)
strace_ior_write.log - Running strace on a single ior process (no mpirun)
(write file)
strace_ior_read.log - Running strace on a single ior process (no mpirun)
(read files generated in write step)
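
For anyone working through logs like these, one quick check is to tally the syscalls by name and see whether open and read calls on the test files reach the kernel at all. Below is a minimal Python sketch, assuming the default strace line format `name(args) = result`. The helper name `count_syscalls` and the sample lines are hypothetical and are not copied from the attached logs.

```python
import re
from collections import Counter

# Matches the start of a default-format strace line, e.g. 'open("/x", O_RDONLY) = 3'.
# An optional leading PID column (as written by strace -f) is skipped.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")

def count_syscalls(lines, path_fragment=None):
    """Count syscalls by name; if path_fragment is given, count only lines mentioning it."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if not match:
            continue  # exit notices, signals, and other non-syscall lines
        if path_fragment is not None and path_fragment not in line:
            continue
        counts[match.group(1)] += 1
    return counts

# Hypothetical sample lines for illustration only.
sample = [
    'stat("/mnt/beegfs/testing/testfile.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0',
    'open("/mnt/beegfs/testing/testfile.0", O_RDONLY|O_DIRECT) = 3',
    'read(3, "\\0\\0\\0"..., 33554432) = 33554432',
    '+++ exited with 0 +++',
]
print(count_syscalls(sample))
print(count_syscalls(sample, path_fragment="testfile"))
```

Filtering by path only catches calls that name the file, so reads (which take a file descriptor) show up only in the unfiltered count.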
I do not have the original build environment for the ior binary I am using,
but I am sure BeeGFS support was not enabled in the build. I did not
explicitly enable it, and I recall that the build process could not find the
necessary BeeGFS header files. I have attached the output from running nm
on the ior executable (nm ior >& nm_ior.log) instead.
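
The nm output can be searched for BeeGFS-related symbols with a few lines of Python. This is only a sketch: it assumes that symbols from a BeeGFS-specific library would contain the substring "beegfs" in their names, which is a guess rather than something confirmed in this thread. The helper `find_symbols` and the sample lines are hypothetical.

```python
def find_symbols(nm_lines, needle):
    """Return (type, name) pairs from nm output whose name contains needle (case-insensitive)."""
    hits = []
    for line in nm_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        # Defined symbols have three columns (address, type, name);
        # undefined ones have two (type, name). The last two fields cover both.
        sym_type, name = fields[-2], fields[-1]
        if needle.lower() in name.lower():
            hits.append((sym_type, name))
    return hits

# Hypothetical sample lines, not copied from nm_ior.log.
sample = [
    "0000000000404a10 T main",
    "                 U MPI_Init",
    "                 U open64@@GLIBC_2.2.5",
    "                 U read@@GLIBC_2.2.5",
]
print(find_symbols(sample, "beegfs"))
print(find_symbols(sample, "open"))
```

To run it against the real file, pass `open("nm_ior.log")` as `nm_lines`. An empty result for "beegfs" would be consistent with a build that did not pick up the BeeGFS headers.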
Also, here is the ldd output for the binary:
ldd /avere/apps/ior/bin/ior
linux-vdso.so.1 => (0x00007ffd26b7e000)
libm.so.6 => /lib64/libm.so.6 (0x00007f981c0a7000)
libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.3.223/linux/mpi/intel64/lib/libmpi.so.12 (0x00007f981b8d7000)
libc.so.6 => /lib64/libc.so.6 (0x00007f981b514000)
/lib64/ld-linux-x86-64.so.2 (0x000055a13c2f4000)
librt.so.1 => /lib64/librt.so.1 (0x00007f981b30c000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f981b107000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f981aef1000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f981acd5000)
Thanks for your support,
Cormac.
On Mon, Nov 26, 2018 at 2:16 PM Carns, Philip H. <carns at mcs.anl.gov> wrote:
> I just glanced through the latest IOR revision in git and realized that it
> has some conditional logic for beegfs that will use a beegfs-specific
> library for some calls.
>
> I'm not sure if that is being triggered in your case or not, but that's a
> possibility. Darshan doesn't have wrappers for that API.
>
> Could you check to see if HAVE_BEEGFS_BEEGFS_H is set in your src/config.h
> file?
>
> thanks,
> -Phil
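
The config.h check suggested above can be scripted. The following is a minimal sketch, assuming the usual autoconf convention in which an enabled feature appears as `#define NAME 1` and a disabled one as a commented-out `#undef`. The helper name `macro_is_defined` is hypothetical.

```python
import re

def macro_is_defined(config_text, macro):
    """True if config_text contains an active '#define MACRO' line.

    Commented-out lines such as '/* #undef MACRO */' do not count.
    """
    pattern = re.compile(r"^\s*#\s*define\s+" + re.escape(macro) + r"\b", re.MULTILINE)
    return bool(pattern.search(config_text))

# Two illustrative config.h fragments (assumed autoconf style).
enabled = "#define HAVE_BEEGFS_BEEGFS_H 1\n"
disabled = "/* #undef HAVE_BEEGFS_BEEGFS_H */\n"

print(macro_is_defined(enabled, "HAVE_BEEGFS_BEEGFS_H"))
print(macro_is_defined(disabled, "HAVE_BEEGFS_BEEGFS_H"))
```

With the IOR source tree at hand, the call would be `macro_is_defined(open("src/config.h").read(), "HAVE_BEEGFS_BEEGFS_H")`.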
>
>
>
>
> On 2018-11-26 15:50:10-05:00 Darshan-users wrote:
>
> Thanks Cormac. I don't see any obvious problems from your output. We'll
> see if we can reproduce here.
>
> In the meantime, if you want to debug further, you could try running IOR
> as a single process (no mpirun or mpiexec) through strace to see if there
> is anything unusual in the file open path.
>
> thanks,
> -Phil
>
>
>
> On 2018-11-21 17:37:50-05:00 Cormac Garvey wrote:
>
> Hi Phil,
> Thanks for looking into this for me.
> Here is the information you requested.
> ldd ./ior
> linux-vdso.so.1 => (0x00007fff0237c000)
> libm.so.6 => /lib64/libm.so.6 (0x00007fce3e88f000)
> libmpi.so.12 => /opt/intel/compilers_and_libraries_2016.3.223/linux/mpi/intel64/lib/libmpi.so.12 (0x00007fce3e0bf000)
> libc.so.6 => /lib64/libc.so.6 (0x00007fce3dcfc000)
> /lib64/ld-linux-x86-64.so.2 (0x000055de86c95000)
> librt.so.1 => /lib64/librt.so.1 (0x00007fce3daf4000)
> libdl.so.2 => /lib64/libdl.so.2 (0x00007fce3d8ef000)
> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fce3d6d9000)
> libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fce3d4bd000)
> [testusera at ip-0A021004 bin]$
> Here is the info on testfile.0 and testfile.1 before executing the IOR
> read test.
> [testusera at ip-0A021004 testing]$ pwd
> /mnt/beegfs/testing
> [testusera at ip-0A021004 testing]$ ls -lt
> total 4194304
> -rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:18 testfile.1
> -rw-r--r--. 1 testusera domain users 2147483648 Nov 21 22:17 testfile.0
> [testusera at ip-0A021004 testing]$
> [testusera at ip-0A021004 testing]$ stat testfile.0
> File: ‘testfile.0’
> Size: 2147483648 Blocks: 4194304 IO Block: 524288 regular file
> Device: 29h/41d Inode: 2501562826468697566 Links: 1
> Access: (0644/-rw-r--r--) Uid: (1264001104/testusera) Gid: (1264000513/domain users)
> Context: system_u:object_r:tmp_t:s0
> Access: 2018-11-21 22:17:41.000000000 +0000
> Modify: 2018-11-21 22:17:56.000000000 +0000
> Change: 2018-11-21 22:17:56.000000000 +0000
> Birth: -
> Result from running the IOR read test.
> IOR-3.2.0: MPI Coordinated Test of Parallel I/O
> Began : Wed Nov 21 22:22:52 2018
> Command line : /avere/apps/ior/bin/ior -a POSIX -B -r -k -z -v -o /mnt/beegfs/testing/testfile -i 2 -m -t 32m -b 256M -d 1
> Machine : Linux ip-0A021005
> Start time skew across all tasks: 0.00 sec
> TestID : 0
> StartTime : Wed Nov 21 22:22:52 2018
> Path : /mnt/beegfs/testing
> FS : 11.8 TiB Used FS: 0.0% Inodes: 0.0 Mi Used Inodes: -nan%
> Participating tasks: 8
> Options:
> api : POSIX
> apiVersion :
> test filename : /mnt/beegfs/testing/testfile
> access : single-shared-file
> type : independent
> segments : 1
> ordering in a file : random
> ordering inter file : no tasks offsets
> tasks : 8
> clients per node : 8
> repetitions : 2
> xfersize : 32 MiB
> blocksize : 256 MiB
> aggregate filesize : 2 GiB
> Results:
> access    bw(MiB/s)  block(KiB)  xfer(KiB)  open(s)   wr/rd(s)  close(s)  total(s)  iter
> ------    ---------  ----------  ---------  --------  --------  --------  --------  ----
> delaying 1 seconds . . .
> Commencing read performance test: Wed Nov 21 22:22:53 2018
> read      481.88     262144      32768      0.020854  4.23      2.97      4.25      0
> delaying 1 seconds . . .
> Commencing read performance test: Wed Nov 21 22:22:58 2018
> read      609.04     262144      32768      0.020336  3.34      2.06      3.36      1
> Max Read: 609.04 MiB/sec (638.62 MB/sec)
> Summary of all tests:
> Operation Max(MiB) Min(MiB) Mean(MiB) StdDev Max(OPs) Min(OPs) Mean(OPs) StdDev Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggs(MiB) API RefNum
> read 609.04 481.88 545.46 63.58 19.03 15.06 17.05 1.99 3.80633 0 8 8 2 0 0 1 0 0 1 268435456 33554432 2048.0 POSIX 0
> Finished : Wed Nov 21 22:23:02 2018
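
The figures in the IOR report above are internally consistent, which suggests the reads did take place even though Darshan recorded none. The arithmetic can be checked with a short Python sketch. It assumes the aggregate size is tasks times block size times segments, and that IOR's StdDev column is a population standard deviation; both are inferences from the numbers shown, not statements from the IOR documentation.

```python
from statistics import mean, pstdev

MIB = 1024 * 1024

tasks, segments = 8, 1
block_bytes = 256 * MIB   # -b 256M
xfer_bytes = 32 * MIB     # -t 32m

# Aggregate size read per iteration (assumed: tasks * blocksize * segments).
aggregate_bytes = tasks * block_bytes * segments
aggregate_mib = aggregate_bytes / MIB

# Bandwidth for iteration 0: aggregate size over the reported total time.
bw_iter0 = aggregate_mib / 4.25

reported_bw = [481.88, 609.04]                    # MiB/s, from the Results table
bw_mean = mean(reported_bw)
bw_stddev = pstdev(reported_bw)                   # population standard deviation
max_mb_per_s = max(reported_bw) * MIB / 1e6       # MiB/s converted to MB/s
max_ops = max(reported_bw) / (xfer_bytes / MIB)   # 32 MiB transfers per second

print(aggregate_bytes, round(bw_iter0, 2), round(bw_mean, 2),
      round(bw_stddev, 2), round(max_mb_per_s, 2), round(max_ops, 2))
```

The aggregate comes to 2147483648 bytes, the same as the size of each test file, and the derived values line up with the 481.88, 545.46, 63.58, 638.62 and 19.03 in the report. The second iteration does not reproduce exactly from the displayed 3.36 s because that time is rounded.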
> I also ran a test without the -B option and got the same result (no reads
> recorded).
> Thanks,
> Cormac.
>
> On Wed, Nov 21, 2018 at 1:46 PM Carns, Philip H. via Darshan-users <
> darshan-users at lists.mcs.anl.gov> wrote:
>
>> Hi Cormac,
>>
>> That's strange. The -i 2 and -m options are making IOR attempt to read
>> two files: testfile.0 and testfile.1. Darshan is definitely aware of those
>> files, but not only are the read counters missing but also the open
>> counters. It only shows a stat from each rank for the files in question.
>>
>> Can you confirm that both of the files exist and are big enough to
>> accommodate the requested read volume? I wonder if IOR might stat the
>> files up front and exit early if they aren't the correct size?
>>
>> Could you also share the output of "ldd ior"? I'm curious to make sure
>> there isn't anything unusual about the libraries linked in, but usually if
>> that were a problem you wouldn't get a log at all.
>>
>> Also one last idea, does the behavior change if you remove the -B
>> (O_DIRECT) option? That shouldn't matter from Darshan's perspective, but
>> it might not hurt to check.
>>
>> thanks,
>> -Phil
>>
>>
>>
>> On 2018-11-20 19:59:55-05:00 Darshan-users wrote:
>>
>> Hello,
>> I recently installed Darshan 3.1.6 on a Microsoft Azure VM running CentOS
>> 7.4.
>> I got an unexpected result when using Darshan to characterize the I/O of
>> an IOR benchmark experiment.
>> export LD_PRELOAD=/avere/apps/spack/linux-centos7-x86_64/gcc-4.8.5/darshan-runtime-3.1.6-daohky4yevagxajjl33lwk472lcgn6g4/lib/libdarshan.so
>> BLOCKSIZE=256M
>> mpirun -np 8 ior -a POSIX -B -r -k -z -v -o $FILEPATH -i 2 -m -t 32m -b 256M -d 1
>> After the job completed, a Darshan log file was created. The resulting
>> text report (darshan-parser_ior_read_shared.out, attached) was
>> generated using the following command:
>> darshan-parser --all testuser_ior_id22-20-80229-14589323501426222819_1.darshan >& darshan-parser_ior_read_shared.out
>> The above IOR benchmark is a read-only benchmark against a shared file
>> system, but the resulting Darshan report indicates there are no READ
>> operations. Any ideas why the resulting Darshan report has no read
>> operations? (Note: if I add the -w option to the above IOR benchmark, i.e.
>> do a write and a read to a shared filesystem, Darshan reports only the
>> writes and not the reads.)
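
To see at a glance which operations a darshan-parser text report actually recorded, the per-record counter lines can be summed. This is a minimal Python sketch, assuming the Darshan 3.x record layout of module, rank, record id, counter name, value, followed by file name, mount point and filesystem type. The helper `sum_counters` and the sample lines are hypothetical and are not taken from the attached report.

```python
from collections import defaultdict

# Additive integer counters of interest; other counters (maxima, modes, timers) are skipped.
COUNTERS = ("POSIX_OPENS", "POSIX_READS", "POSIX_WRITES", "POSIX_STATS")

def sum_counters(parser_lines, module="POSIX", wanted=COUNTERS):
    """Sum the selected counters over all ranks and files in darshan-parser output."""
    totals = defaultdict(int)
    for line in parser_lines:
        if line.startswith("#"):
            continue  # header and comment lines
        fields = line.split(None, 5)  # whitespace-separated; the remainder stays in fields[5]
        if len(fields) < 5 or fields[0] != module:
            continue
        counter, value = fields[3], fields[4]
        if counter in wanted:
            totals[counter] += int(value)
    return dict(totals)

# Hypothetical sample lines showing the symptom described here: stats but no opens or reads.
sample = [
    "# <module>\t<rank>\t<record id>\t<counter>\t<value>\t<file name>\t<mount pt>\t<fs type>",
    "POSIX\t0\t1234\tPOSIX_OPENS\t0\t/mnt/beegfs/testing/testfile.0\t/mnt/beegfs\tbeegfs",
    "POSIX\t0\t1234\tPOSIX_READS\t0\t/mnt/beegfs/testing/testfile.0\t/mnt/beegfs\tbeegfs",
    "POSIX\t0\t1234\tPOSIX_STATS\t1\t/mnt/beegfs/testing/testfile.0\t/mnt/beegfs\tbeegfs",
    "POSIX\t1\t1234\tPOSIX_STATS\t1\t/mnt/beegfs/testing/testfile.0\t/mnt/beegfs\tbeegfs",
]
print(sum_counters(sample))
```

Passing `open("darshan-parser_ior_read_shared.out")` as `parser_lines` would apply the same tally to the real report.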
>> Any help would be appreciated,
>> Thanks for your support,
>> Cormac.
>>
>> _______________________________________________
>> Darshan-users mailing list
>> Darshan-users at lists.mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nm_ior.log
Type: application/octet-stream
Size: 12473 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/darshan-users/attachments/20181126/0bcb3a45/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: strace_ior_read.log
Type: application/octet-stream
Size: 113780 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/darshan-users/attachments/20181126/0bcb3a45/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: strace_ior_write.log
Type: application/octet-stream
Size: 114314 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/darshan-users/attachments/20181126/0bcb3a45/attachment-0005.obj>