<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body>
<p>Hi Phil,</p>
<p>first of all, thanks for your very detailed analysis.</p>
<p>I ran some further tests, and the behaviour changes slightly,
for example, when switching from MPIIFORT to MPIF90 (which is
essentially a wrapper around GNU gfortran):</p>
<p>[planucar@r033c01s05 source]$ mpif90 -v<br>
mpif90 for the Intel(R) MPI Library 2018 Update 4 for Linux*<br>
Copyright(C) 2003-2018, Intel Corporation. All rights reserved.<br>
Using built-in specs.<br>
COLLECT_GCC=gfortran<br>
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper<br>
Target: x86_64-redhat-linux<br>
Configured with: ../configure --prefix=/usr
--mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=<a class="moz-txt-link-freetext" href="http://bugzilla.redhat.com/bugzilla">http://bugzilla.redhat.com/bugzilla</a>
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --with-linker-hash-style=gnu
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto
--enable-plugin --enable-initfini-array --disable-libgcj
--with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install
--with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install
--enable-gnu-indirect-function --with-tune=generic
--with-arch_32=x86-64 --build=x86_64-redhat-linux<br>
Thread model: posix<br>
gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC)<br>
</p>
<p><br>
</p>
<p>In this case the estimate is more accurate (although some
"time" is still missing from that estimate too). Of course, this is a
simple benchmark, so in the end the results are not that useful.<br>
<br>
But my question is: how can we recover the "missing" I/O time?<br>
<br>
In fact, we would otherwise get the incorrect impression that the I/O
behaviour is more efficient than what we actually observe, and in a real
situation this could be a limitation of the tool.<br>
<br>
Do you agree?<br>
<br>
thanks again<br>
</p>
<p><br>
</p>
<p>Piero<br>
</p>
<p><br>
</p>
<div class="moz-cite-prefix">Il 12/02/2020 21:37, Carns, Philip H.
ha scritto:<br>
</div>
<blockquote type="cite"
cite="mid:BN8PR09MB3620A79D72833535382E1281F71B0@BN8PR09MB3620.namprd09.prod.outlook.com">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
Thank you Piero, that does help quite a bit. I don't know the
cause, but I do see why Darshan reports a rate much higher than
the benchmark.<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
The clue is in the write() timing if you zero in on a specific
file, in this case serial.dat.1:</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<span>[carns@carns-x1-7g Downloads]$ darshan-parser
benchio_1202.dn.darshan |grep serial.dat.1 |grep F_WRITE<br>
</span>
<div>POSIX 0 2553187295521652424 POSIX_F_WRITE_START_TIMESTAMP
0.350832
/gpfs/scratch/userinternal/planucar/benchio-master/shared-file/source/benchio_files/serial.dat.1
/gpfs/scratch gpfs<br>
</div>
<div>POSIX 0 2553187295521652424 POSIX_F_WRITE_END_TIMESTAMP
4.519859
/gpfs/scratch/userinternal/planucar/benchio-master/shared-file/source/benchio_files/serial.dat.1
/gpfs/scratch gpfs<br>
</div>
<div>POSIX 0 2553187295521652424 POSIX_F_WRITE_TIME 1.336200
/gpfs/scratch/userinternal/planucar/benchio-master/shared-file/source/benchio_files/serial.dat.1
/gpfs/scratch gpfs<br>
</div>
<span></span></div>
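<div>As a rough illustration of how these lines can be picked apart (a sketch only, not part of darshan-util): the Python snippet below assumes the eight whitespace-separated columns visible in the output above (module, rank, record id, counter name, value, file name, mount point, file system type) and that file names contain no spaces. The helper name <code>parse_counters</code> is made up for this example.</div>
<pre>
```python
# Sample lines rebuilt from the darshan-parser output quoted above.
PATH = ("/gpfs/scratch/userinternal/planucar/benchio-master/"
        "shared-file/source/benchio_files/serial.dat.1")
RECORD = "2553187295521652424"
SAMPLE_LINES = [
    f"POSIX 0 {RECORD} POSIX_F_WRITE_START_TIMESTAMP 0.350832 {PATH} /gpfs/scratch gpfs",
    f"POSIX 0 {RECORD} POSIX_F_WRITE_END_TIMESTAMP 4.519859 {PATH} /gpfs/scratch gpfs",
    f"POSIX 0 {RECORD} POSIX_F_WRITE_TIME 1.336200 {PATH} /gpfs/scratch gpfs",
]


def parse_counters(lines, file_suffix):
    """Collect {counter_name: value} for records whose file name ends with file_suffix.

    Hypothetical helper: assumes exactly eight whitespace-separated fields per line.
    """
    counters = {}
    for line in lines:
        fields = line.split()
        # Skip comment lines and anything that does not have the assumed layout.
        if len(fields) != 8 or fields[0].startswith("#"):
            continue
        module, rank, record_id, name, value, path, mount, fs_type = fields
        if path.endswith(file_suffix):
            counters[name] = float(value)
    return counters


counters = parse_counters(SAMPLE_LINES, "serial.dat.1")
for name, value in sorted(counters.items()):
    print(name, value)
```
</pre>
<div>In practice the lines would come from the darshan-parser output itself rather than from a hard-coded list.</div>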
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
The POSIX_F_WRITE_TIME counter shows the cumulative time, in seconds, that was
spent waiting on glibc write() calls: 1.3 seconds.</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
The POSIX_F_WRITE_{START/END}_TIMESTAMP counters show the elapsed time
between when the first write started and when the last write ended:
4.2 seconds.
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
Somehow almost 70% of the elapsed time measured by the benchmark
is spent doing something between individual write() calls that
make it to glibc. The writes seem to be relatively sparse in
time.</div>
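<div>The arithmetic behind these figures can be checked with a few lines of Python. This is only a worked example using the three counter values quoted above; the function name <code>write_gap_summary</code> is invented for the illustration.</div>
<pre>
```python
# Counter values copied from the darshan-parser output above (seconds).
WRITE_START = 0.350832   # POSIX_F_WRITE_START_TIMESTAMP
WRITE_END = 4.519859     # POSIX_F_WRITE_END_TIMESTAMP
WRITE_TIME = 1.336200    # POSIX_F_WRITE_TIME (cumulative time inside write())


def write_gap_summary(start, end, cumulative):
    """Return (elapsed, gap, gap_fraction) for one file record."""
    elapsed = end - start        # first write start to last write end
    gap = elapsed - cumulative   # time not spent inside write() calls
    return elapsed, gap, gap / elapsed


elapsed, gap, fraction = write_gap_summary(WRITE_START, WRITE_END, WRITE_TIME)
print(f"elapsed between first and last write: {elapsed:.2f} s")
print(f"time outside write() calls:           {gap:.2f} s")
print(f"fraction of elapsed time outside:     {fraction:.0%}")
```
</pre>
<div>With these values the elapsed time is about 4.17 s, of which about 2.83 s (roughly 68%) falls outside the write() calls that Darshan timed.</div>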
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
This doesn't make any sense if you look at the benchmark code,
though. The write loop is very simple, in fact it is just a
tight loop calling write():</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<a
href="https://github.com/EPCCed/benchio/blob/master/shared-file/source/mpiio.F90#L154"
style="" moz-do-not-send="true">https://github.com/EPCCed/benchio/blob/master/shared-file/source/mpiio.F90#L154</a></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
Maybe someone on the list with more Fortran expertise than me
can comment on why this might happen? Is it possible that the
Fortran runtime is doing something (buffering?) that consumes
time in the Fortran-level write call before the data is actually
relayed to the system?</div>
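<div>To make the buffering hypothesis concrete, here is an analogy in Python (it does not model the Fortran runtime, and whether the Intel or GNU runtime behaves this way is exactly the open question). A user-space buffer sits between the application-level write calls and a counting "raw" layer that stands in for the system-level write(). The class name <code>CountingRaw</code> and the buffer and chunk sizes are arbitrary choices for this sketch.</div>
<pre>
```python
import io


class CountingRaw(io.RawIOBase):
    """Stand-in for the system-level write(): counts calls and bytes."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.nbytes = 0

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.nbytes += len(b)
        return len(b)


raw = CountingRaw()
buffered = io.BufferedWriter(raw, buffer_size=64 * 1024)

chunk = bytes(1024)          # 1 KiB per application-level write
for _ in range(256):         # 256 application-level writes
    buffered.write(chunk)
buffered.flush()

print("application-level writes:", 256)
print("raw-level writes:        ", raw.calls)
print("bytes reaching raw layer:", raw.nbytes)
```
</pre>
<div>A tool that only times the raw layer sees far fewer, larger writes than the application issued, and none of the time spent copying data into the buffer.</div>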
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
As a side note we could use other heuristics for performance,
but they all have their tradeoffs. The one that would obviously
work here (measuring time from first to last I/O) would break
for other common patterns.<br>
</div>
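<div>A small sketch of the tradeoff, under an assumed access pattern that is not taken from the log: an application that writes three short checkpoints with long compute phases in between. The interval values and function names below are hypothetical.</div>
<pre>
```python
def cumulative_time(intervals):
    """Sum of the durations of the individual I/O calls."""
    return sum(end - start for start, end in intervals)


def first_to_last_time(intervals):
    """Elapsed time from the start of the first call to the end of the last."""
    first_start = min(start for start, _ in intervals)
    last_end = max(end for _, end in intervals)
    return last_end - first_start


# Hypothetical pattern: three 1-second checkpoints, 9 seconds of compute between.
checkpointing = [(0.0, 1.0), (10.0, 11.0), (20.0, 21.0)]

print("cumulative I/O time:", cumulative_time(checkpointing))
print("first-to-last time: ", first_to_last_time(checkpointing))
```
</pre>
<div>Here the cumulative measure gives 3 seconds of I/O while the first-to-last measure gives 21 seconds, so the second heuristic would charge the compute phases to I/O and badly underestimate the bandwidth for this kind of application.</div>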
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
thanks,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255,
255, 255);">
-Phil<br>
</div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
face="Calibri, sans-serif" color="#000000"><b>From:</b> Piero
LANUCARA <a class="moz-txt-link-rfc2396E" href="mailto:p.lanucara@cineca.it">&lt;p.lanucara@cineca.it&gt;</a><br>
<b>Sent:</b> Wednesday, February 12, 2020 1:47 PM<br>
<b>To:</b> Carns, Philip H. <a class="moz-txt-link-rfc2396E" href="mailto:carns@mcs.anl.gov">&lt;carns@mcs.anl.gov&gt;</a>; Snyder,
Shane <a class="moz-txt-link-rfc2396E" href="mailto:ssnyder@mcs.anl.gov">&lt;ssnyder@mcs.anl.gov&gt;</a>; Harms, Kevin
<a class="moz-txt-link-rfc2396E" href="mailto:harms@alcf.anl.gov">&lt;harms@alcf.anl.gov&gt;</a><br>
<b>Cc:</b> <a class="moz-txt-link-abbreviated" href="mailto:darshan-users@lists.mcs.anl.gov">darshan-users@lists.mcs.anl.gov</a>
<a class="moz-txt-link-rfc2396E" href="mailto:darshan-users@lists.mcs.anl.gov">&lt;darshan-users@lists.mcs.anl.gov&gt;</a><br>
<b>Subject:</b> Re: [Darshan-users] Darshan & EPCC benchio
different behaviour</font>
<div> </div>
</div>
<div>
<p>p.s</p>
<p>sorry, I forgot to say that we built the environment (both
Darshan and the benchmark) using the mpiifort script from the Intel MPI release:<br>
</p>
<p> mpiifort -v<br>
mpiifort for the Intel(R) MPI Library 2018 Update 4 for Linux*<br>
Copyright(C) 2003-2018, Intel Corporation. All rights
reserved.</p>
<p><br>
</p>
<p>regards</p>
<p><br>
</p>
<p>Piero<br>
</p>
<div class="x_moz-cite-prefix">Il 12/02/2020 16:54, Piero
LANUCARA ha scritto:<br>
</div>
<blockquote type="cite">
<p>Hi Phil</p>
<p><br>
</p>
<p>please see the attachment.</p>
<p>"dn" stands for "different names".</p>
<p><br>
</p>
<p>cheers</p>
<p>Piero<br>
</p>
<div class="x_moz-cite-prefix">Il 12/02/2020 16:13, Carns,
Philip H. ha scritto:<br>
</div>
<blockquote type="cite">
<style type="text/css" style="display:none">
<!--
p
{margin-top:0;
margin-bottom:0}
-->
</style>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
Ah, great, thank you for the confirmation.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
In that case it looks like Darshan is instrumenting
properly at run time, but I think Kevin is on the right
track that Darshan's heuristics for calculating
performance in post processing are getting confused for
some reason.<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
Probably GPFS is client-side caching aggressively in the
single client case, but that wouldn't explain why the
benchmark output reports a much different number than
Darshan. They should both perceive roughly the
same performance; neither the benchmark itself nor Darshan
knows whether caching is happening.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
It's hard to see where the performance heuristic went
wrong from looking at the log, in large part because the
app repeatedly opens a file with the same name (there is a
clue to this in the OPEN counters; the same file name is
opened 20 times):</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<span># WARNING: POSIX_OPENS counter includes both
POSIX_FILENOS and POSIX_DUPS counts<br>
</span><span>POSIX 0 6563482044800691889 POSIX_OPENS 20
/gpfs/scratch/userinternal/planucar/benchio-master/shared-file/source/benchio_files/serial.dat
/gpfs/scratch gpfs</span></div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
Every time the file is opened (regardless of whether it
was unlinked in between or not), Darshan keeps adding
counters to the same record, which are associated with
that serial.dat file name. So things like close()
timestamps become nonsensical because Darshan records when
the first close() starts and when the last one finishes:</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<span>[carns@carns-x1-7g Downloads]$ darshan-parser
benchio_1202.darshan |grep CLOSE<br>
</span>
<div>POSIX 0 6563482044800691889
POSIX_F_CLOSE_START_TIMESTAMP
4.536853
/gpfs/scratch/userinternal/planucar/benchio-master/shared-file/source/benchio_files/serial.dat
/gpfs/scratch gpfs<br>
</div>
<div>POSIX 0 6563482044800691889
POSIX_F_CLOSE_END_TIMESTAMP 43.041987
/gpfs/scratch/userinternal/planucar/benchio-master/shared-file/source/benchio_files/serial.dat
/gpfs/scratch gpfs<br>
</div>
<span></span></div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
(This doesn't mean there was one close() that took ~40
seconds; in this case there were many close() calls and
~40 seconds elapsed between the start of the first one and
completion of the last one).</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
If it is possible for you to modify the benchmark (as an
experiment) so that it chooses a new file name on each
iteration, then I think it would probably disentangle the
counters enough for us to tell what went wrong.<br>
</div>
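<div>The effect described above can be mimicked with a toy model. This is not Darshan's actual data structure; the class <code>FileRecord</code>, the function <code>run</code>, and all timing values are invented for the illustration. The only behaviour taken from the discussion is that counters are keyed by file name and that the close timestamps keep the start of the first close() and the end of the last one.</div>
<pre>
```python
class FileRecord:
    """Toy per-file-name record (not Darshan's real data structure)."""

    def __init__(self):
        self.closes = 0
        self.close_start = None  # start time of the first close() seen
        self.close_end = None    # end time of the most recent close() seen

    def record_close(self, start, end):
        self.closes += 1
        if self.close_start is None:
            self.close_start = start
        self.close_end = end


def run(names):
    """Simulate one open/write/close cycle per entry in names."""
    records = {}
    for i, name in enumerate(names):
        start = 4.5 + 2.0 * i   # hypothetical close() start time, seconds
        end = start + 0.01      # hypothetical close() duration of 10 ms
        records.setdefault(name, FileRecord()).record_close(start, end)
    return records


same_name = run(["serial.dat"] * 20)
new_names = run([f"serial.dat.{i}" for i in range(1, 21)])

rec = same_name["serial.dat"]
print("same name:", rec.closes, "closes, span",
      round(rec.close_end - rec.close_start, 2), "s")

rec1 = new_names["serial.dat.1"]
print("new names:", rec1.closes, "close, span",
      round(rec1.close_end - rec1.close_start, 2), "s")
```
</pre>
<div>With one shared name the single record spans all 20 iterations, whereas with a fresh name per iteration each record covers exactly one close() and its timestamps become meaningful again.</div>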
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
thanks,</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
-Phil<br>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font
style="font-size:11pt" face="Calibri, sans-serif"
color="#000000"><b>From:</b> Piero LANUCARA
<a class="x_moz-txt-link-rfc2396E"
href="mailto:p.lanucara@cineca.it"
moz-do-not-send="true">&lt;p.lanucara@cineca.it&gt;</a><br>
<b>Sent:</b> Wednesday, February 12, 2020 9:47 AM<br>
<b>To:</b> Carns, Philip H. <a
class="x_moz-txt-link-rfc2396E"
href="mailto:carns@mcs.anl.gov" moz-do-not-send="true">
&lt;carns@mcs.anl.gov&gt;</a>; Snyder, Shane <a
class="x_moz-txt-link-rfc2396E"
href="mailto:ssnyder@mcs.anl.gov"
moz-do-not-send="true">
&lt;ssnyder@mcs.anl.gov&gt;</a>; Harms, Kevin <a
class="x_moz-txt-link-rfc2396E"
href="mailto:harms@alcf.anl.gov"
moz-do-not-send="true">
&lt;harms@alcf.anl.gov&gt;</a><br>
<b>Cc:</b> <a class="x_moz-txt-link-abbreviated"
href="mailto:darshan-users@lists.mcs.anl.gov"
moz-do-not-send="true">
darshan-users@lists.mcs.anl.gov</a> <a
class="x_moz-txt-link-rfc2396E"
href="mailto:darshan-users@lists.mcs.anl.gov"
moz-do-not-send="true">
&lt;darshan-users@lists.mcs.anl.gov&gt;</a><br>
<b>Subject:</b> Re: [Darshan-users] Darshan & EPCC
benchio different behaviour</font>
<div> </div>
</div>
<div>
<p>Hi Phil, it is POSIX.</p>
<p>this is a well-known benchmark, so you can easily
verify it!</p>
<p><br>
</p>
<p>anyway, the relevant routine looks something like this:</p>
<p><br>
</p>
<p>! Serial write is unconditionally compiled<br>
subroutine serialwrite(filename, iodata, n1, n2, n3,
cartcomm)<br>
<br>
character*(*) :: filename<br>
<br>
integer :: n1, n2, n3<br>
double precision, dimension(0:n1+1,0:n2+1,0:n3+1) ::
iodata<br>
<br>
integer :: cartcomm, ierr, rank, size<br>
integer, parameter :: iounit = 10<br>
<br>
integer :: i<br>
<br>
call MPI_Comm_size(cartcomm, size, ierr)<br>
call MPI_Comm_rank(cartcomm, rank, ierr)<br>
<br>
! Write same amount of data as the parallel write but
do it all from rank 0<br>
! This is just to get a baseline figure for serial IO
performance - note<br>
! that the contents of the file will be different from
the parallel calls<br>
<br>
if (rank == 0) then<br>
<br>
open(file=filename, unit=iounit, access='stream')<br>
<br>
do i = 1, size<br>
write(unit=iounit) iodata(1:n1, 1:n2, 1:n3)<br>
end do<br>
<br>
close(iounit)<br>
<br>
end if<br>
<br>
end subroutine serialwrite<br>
<br>
</p>
<p><br>
</p>
<p>Piero<br>
</p>
<div class="x_x_moz-cite-prefix">Il 12/02/2020 14:13,
Carns, Philip H. ha scritto:<br>
</div>
<blockquote type="cite">
<style type="text/css" style="display:none">
<!--
p
{margin-top:0;
margin-bottom:0}
-->
</style>
<div
style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
Hi Piero,</div>
<div
style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div
style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
In the serial case, is the rank that's doing I/O still
using MPI-IO, or is it making calls directly to POSIX
in that case?</div>
<div
style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div
style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
The Darshan log for the serial case doesn't show any
MPI-IO activity, but I'm not sure if that's accurate,
or if it's an indication that we missed some
instrumentation.</div>
<div
style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
<br>
</div>
<div
style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
thanks,</div>
<div
style="font-family:Calibri,Arial,Helvetica,sans-serif;
font-size:12pt; color:rgb(0,0,0);
background-color:rgb(255,255,255)">
-Phil<br>
</div>
<hr tabindex="-1" style="display:inline-block;
width:98%">
<div id="x_x_divRplyFwdMsg" dir="ltr"><font
style="font-size:11pt" face="Calibri, sans-serif"
color="#000000"><b>From:</b> Darshan-users
<a class="x_x_moz-txt-link-rfc2396E"
href="mailto:darshan-users-bounces@lists.mcs.anl.gov"
moz-do-not-send="true">
&lt;darshan-users-bounces@lists.mcs.anl.gov&gt;</a>
on behalf of Piero LANUCARA <a
class="x_x_moz-txt-link-rfc2396E"
href="mailto:p.lanucara@cineca.it"
moz-do-not-send="true">
&lt;p.lanucara@cineca.it&gt;</a><br>
<b>Sent:</b> Wednesday, February 12, 2020 5:29 AM<br>
<b>To:</b> Snyder, Shane <a
class="x_x_moz-txt-link-rfc2396E"
href="mailto:ssnyder@mcs.anl.gov"
moz-do-not-send="true">
&lt;ssnyder@mcs.anl.gov&gt;</a>; Harms, Kevin <a
class="x_x_moz-txt-link-rfc2396E"
href="mailto:harms@alcf.anl.gov"
moz-do-not-send="true">
&lt;harms@alcf.anl.gov&gt;</a><br>
<b>Cc:</b> <a class="x_x_moz-txt-link-abbreviated"
href="mailto:darshan-users@lists.mcs.anl.gov"
moz-do-not-send="true">
darshan-users@lists.mcs.anl.gov</a> <a
class="x_x_moz-txt-link-rfc2396E"
href="mailto:darshan-users@lists.mcs.anl.gov"
moz-do-not-send="true">
&lt;darshan-users@lists.mcs.anl.gov&gt;</a><br>
<b>Subject:</b> Re: [Darshan-users] Darshan &
EPCC benchio different behaviour</font>
<div> </div>
</div>
<div class="x_x_BodyFragment"><font size="2"><span
style="font-size:11pt">
<div class="x_x_PlainText">Hi Shane, Kevin<br>
<br>
thanks for the update.<br>
<br>
I have attached new, updated files (log and PDF) to
this email.<br>
<br>
Also, the log from BENCHIO is attached.<br>
<br>
thanks again<br>
<br>
regards<br>
<br>
Piero<br>
<br>
<br>
Il 11/02/2020 20:15, Shane Snyder ha scritto:<br>
> Definitely looks like something strange is
happening when Darshan is <br>
> estimating the time spent in I/O operations
(as seen in the very first <br>
> figure, observed write time barely even
registers) in the serial case, <br>
> which is ultimately used to provide the
performance estimate.<br>
><br>
> If you could provide them, the raw Darshan
logs would be really <br>
> helpful. That should make it clear whether
it's an instrumentation <br>
> issue (i.e., under-accounting for time
spent in I/O operations at <br>
> runtime) or if it's an issue with the
heuristics in the PDF summary <br>
> tool you are using, as Kevin points out. If
it's the latter, having an <br>
> example log to test modifications to our
heuristics would be very <br>
> helpful to us.<br>
><br>
> Thanks,<br>
> --Shane<br>
><br>
> On 2/11/20 8:36 AM, Harms, Kevin wrote:<br>
>> Piero,<br>
>><br>
>> the performance estimate is based on
heuristics, it's possible the <br>
>> 'serial' model is breaking some
assumptions about how the I/O is <br>
>> done. Is every rank opening the file,
but only rank 0 is doing actual <br>
>> I/O?<br>
>><br>
>> If possible, you could provide the
log and we could check to see <br>
>> what the counters look like.<br>
>><br>
>> kevin<br>
>><br>
>>
________________________________________<br>
>> From: Piero LANUCARA <a
class="x_x_moz-txt-link-rfc2396E"
href="mailto:p.lanucara@cineca.it"
moz-do-not-send="true">
&lt;p.lanucara@cineca.it&gt;</a><br>
>> Sent: Tuesday, February 11, 2020 2:28
AM<br>
>> To: Harms, Kevin<br>
>> Cc: <a
class="x_x_moz-txt-link-abbreviated"
href="mailto:darshan-users@lists.mcs.anl.gov"
moz-do-not-send="true">
darshan-users@lists.mcs.anl.gov</a><br>
>> Subject: Re: [Darshan-users] Darshan
& EPCC benchio different behaviour<br>
>><br>
>> Hi Kevin<br>
>><br>
>> first of all, thanks for the
investigation. I did some further tests and it<br>
>> seems that the issue may appear with
Fortran (MPI, mainly Intel MPI) <br>
>> codes.<br>
>><br>
>> Is this information useful?<br>
>><br>
>> regards<br>
>> Piero<br>
>> Il 07/02/2020 16:07, Harms, Kevin ha
scritto:<br>
>>> Piero,<br>
>>><br>
>>> just to confirm, the serial
case is still running in parallel, <br>
>>> 36 processes, but the I/O is only
from rank 0?<br>
>>><br>
>>> kevin<br>
>>><br>
>>>
________________________________________<br>
>>> From: Darshan-users <a
class="x_x_moz-txt-link-rfc2396E"
href="mailto:darshan-users-bounces@lists.mcs.anl.gov"
moz-do-not-send="true">
&lt;darshan-users-bounces@lists.mcs.anl.gov&gt;</a> on <br>
>>> behalf of Piero LANUCARA <a
class="x_x_moz-txt-link-rfc2396E"
href="mailto:p.lanucara@cineca.it"
moz-do-not-send="true">
&lt;p.lanucara@cineca.it&gt;</a><br>
>>> Sent: Wednesday, February 5, 2020
4:56 AM<br>
>>> To: <a
class="x_x_moz-txt-link-abbreviated"
href="mailto:darshan-users@lists.mcs.anl.gov"
moz-do-not-send="true">
darshan-users@lists.mcs.anl.gov</a><br>
>>> Subject: Re: [Darshan-users]
Darshan & EPCC benchio different behaviour<br>
>>><br>
>>> p.s<br>
>>><br>
>>> to be more "verbose" I add to the
discussion:<br>
>>><br>
>>> Darshan output for the "serial" run
(serial.pdf)<br>
>>><br>
>>> Darshan output for the MPI-IO run
(mpiio.pdf)<br>
>>><br>
>>> benchio output for "serial" run
(serial.out)<br>
>>><br>
>>> benchio output for "MPI-IO" run
(mpi-io.out)<br>
>>><br>
>>> thanks<br>
>>><br>
>>> Piero<br>
>>><br>
>>> Il 04/02/2020 19:44, Piero LANUCARA
ha scritto:<br>
>>>> Dear all<br>
>>>><br>
>>>> I'm using Darshan to measure
EPCC benchio benchmark<br>
>>>> (<a
href="https://github.com/EPCCed/benchio"
moz-do-not-send="true">https://github.com/EPCCed/benchio</a>)
behaviour on a given x86 Tier1<br>
>>>> machine.<br>
>>>><br>
>>>> running two benchio tests
(MPI-IO and serial), a different behaviour<br>
>>>> appears<br>
>>>><br>
>>>> while the Darshan PDF summary is
able to recover the estimated time and<br>
>>>> bandwidth in the MPI-IO case,
the "serial" run is badly<br>
>>>> misestimated by Darshan (the
reported time is lower and the bandwidth higher<br>
>>>> than in the benchio output).<br>
>>>><br>
>>>> Suggestions are welcome<br>
>>>><br>
>>>> thanks<br>
>>>><br>
>>>> Piero<br>
>>>><br>
>>>>
_______________________________________________<br>
>>>> Darshan-users mailing list<br>
>>>> <a
class="x_x_moz-txt-link-abbreviated"
href="mailto:Darshan-users@lists.mcs.anl.gov"
moz-do-not-send="true">
Darshan-users@lists.mcs.anl.gov</a><br>
>>>> <a
href="https://lists.mcs.anl.gov/mailman/listinfo/darshan-users"
moz-do-not-send="true">https://lists.mcs.anl.gov/mailman/listinfo/darshan-users</a><br>
>>
_______________________________________________<br>
>> Darshan-users mailing list<br>
>> <a
class="x_x_moz-txt-link-abbreviated"
href="mailto:Darshan-users@lists.mcs.anl.gov"
moz-do-not-send="true">
Darshan-users@lists.mcs.anl.gov</a><br>
>> <a
href="https://lists.mcs.anl.gov/mailman/listinfo/darshan-users"
moz-do-not-send="true">https://lists.mcs.anl.gov/mailman/listinfo/darshan-users</a><br>
><br>
>
_______________________________________________<br>
> Darshan-users mailing list<br>
> <a class="x_x_moz-txt-link-abbreviated"
href="mailto:Darshan-users@lists.mcs.anl.gov"
moz-do-not-send="true">
Darshan-users@lists.mcs.anl.gov</a><br>
> <a
href="https://lists.mcs.anl.gov/mailman/listinfo/darshan-users"
moz-do-not-send="true">https://lists.mcs.anl.gov/mailman/listinfo/darshan-users</a><br>
</div>
</span></font></div>
</blockquote>
</div>
</blockquote>
<br>
<fieldset class="x_mimeAttachmentHeader"></fieldset>
<pre class="x_moz-quote-pre">_______________________________________________
Darshan-users mailing list
<a class="x_moz-txt-link-abbreviated" href="mailto:Darshan-users@lists.mcs.anl.gov" moz-do-not-send="true">Darshan-users@lists.mcs.anl.gov</a>
<a class="x_moz-txt-link-freetext" href="https://lists.mcs.anl.gov/mailman/listinfo/darshan-users" moz-do-not-send="true">https://lists.mcs.anl.gov/mailman/listinfo/darshan-users</a>
</pre>
</blockquote>
</div>
</blockquote>
</body>
</html>