[Darshan-users] Darshan & EPCC benchio different behaviour

Piero LANUCARA p.lanucara at cineca.it
Wed Feb 12 04:29:49 CST 2020


Hi Shane, Kevin

thanks for the update.

I have attached updated files (log and PDF) to this email.

Also, the log from BENCHIO is attached.

thanks again

regards

Piero


On 11/02/2020 20:15, Shane Snyder wrote:
> Definitely looks like something strange is happening in the serial 
> case when Darshan is estimating the time spent in I/O operations (as 
> seen in the very first figure, observed write time barely even 
> registers), which is ultimately used to provide the performance estimate.
>
> If you could provide them, the raw Darshan logs would be really 
> helpful. That should make it clear whether it's an instrumentation 
> issue (i.e., under-accounting for time spent in I/O operations at 
> runtime) or if it's an issue with the heuristics in the PDF summary 
> tool you are using, as Kevin points out. If it's the latter, having an 
> example log to test modifications to our heuristics would be very 
> helpful to us.
>
> Thanks,
> --Shane
>
> On 2/11/20 8:36 AM, Harms, Kevin wrote:
>> Piero,
>>
>>    the performance estimate is based on heuristics, so it's possible the 
>> 'serial' model is breaking some assumptions about how the I/O is 
>> done. Is every rank opening the file, with only rank 0 doing actual 
>> I/O?
>>
>>    If possible, you could provide the log and we could check to see 
>> what the counters look like.
>>
>> kevin
>>
>> ________________________________________
>> From: Piero LANUCARA <p.lanucara at cineca.it>
>> Sent: Tuesday, February 11, 2020 2:28 AM
>> To: Harms, Kevin
>> Cc: darshan-users at lists.mcs.anl.gov
>> Subject: Re: [Darshan-users] Darshan & EPCC benchio different behaviour
>>
>> Hi Kevin
>>
>> first of all, thanks for the investigation. I did some further tests and it
>> seems like the issue may appear when using Fortran (MPI, mainly Intel MPI) 
>> codes.
>>
>> Is this information useful?
>>
>> regards
>> Piero
>> On 07/02/2020 16:07, Harms, Kevin wrote:
>>> Piero,
>>>
>>>     just to confirm, the serial case is still running in parallel, 
>>> 36 processes, but the I/O is only from rank 0?
>>>
>>> kevin
>>>
>>> ________________________________________
>>> From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on 
>>> behalf of Piero LANUCARA <p.lanucara at cineca.it>
>>> Sent: Wednesday, February 5, 2020 4:56 AM
>>> To: darshan-users at lists.mcs.anl.gov
>>> Subject: Re: [Darshan-users] Darshan & EPCC benchio different behaviour
>>>
>>> P.S.
>>>
>>> To be more "verbose", I am adding to the discussion:
>>>
>>> Darshan output for the "serial" run (serial.pdf)
>>>
>>> Darshan output for the MPI-IO run (mpiio.pdf)
>>>
>>> benchio output for "serial" run (serial.out)
>>>
>>> benchio output for "MPI-IO" run (mpi-io.out)
>>>
>>> thanks
>>>
>>> Piero
>>>
>>> On 04/02/2020 19:44, Piero LANUCARA wrote:
>>>> Dear all
>>>>
>>>> I'm using Darshan to measure the behaviour of the EPCC benchio benchmark
>>>> (https://github.com/EPCCed/benchio) on a given x86 Tier1
>>>> machine.
>>>>
>>>> Running two benchio tests (MPI-IO and serial), I see different
>>>> behaviour.
>>>>
>>>> While the Darshan PDF summary reports time and bandwidth estimates
>>>> consistent with benchio in the MPI-IO case, the "serial" run is badly
>>>> misestimated by Darshan (the time is lower and the bandwidth higher
>>>> than in the benchio output).
>>>>
>>>> Suggestions are welcome
>>>>
>>>> thanks
>>>>
>>>> Piero
>>>>
>>>> _______________________________________________
>>>> Darshan-users mailing list
>>>> Darshan-users at lists.mcs.anl.gov
>>>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users
-------------- next part --------------
A non-text attachment was scrubbed...
Name: benchio_1202.darshan
Type: application/octet-stream
Size: 877 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/darshan-users/attachments/20200212/844476f4/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: benchio_1202.darshan.pdf
Type: application/pdf
Size: 67310 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/darshan-users/attachments/20200212/844476f4/attachment-0001.pdf>
-------------- next part --------------
 
 Simple Parallel IO benchmark
 ----------------------------
 
 Running on           32  process(es)
 Process grid is (           2 ,            4 ,            4 )
 Array size is   (         256 ,          256 ,          256 )
 Global size is  (         512 ,         1024 ,         1024 )
 
 Total amount of data =    4096.00000000000       MiB
 
 Clock resolution is    1.00000000000000      , usecs
 
 ------
 Serial                                                          
 ------
 
 Writing to benchio_files/serial.dat                                        
 time =    4.30992698669434      , rate =    950.364127430749       MiB/s
 time =    4.26962995529175      , rate =    959.333722802710       MiB/s
 time =    4.28477692604065      , rate =    955.942414436243       MiB/s
 time =    4.28777194023132      , rate =    955.274687435690       MiB/s
 time =    4.28683900833130      , rate =    955.482580997231       MiB/s
 time =    4.29409790039062      , rate =    953.867400095232       MiB/s
 time =    4.28207302093506      , rate =    956.546042062957       MiB/s
 time =    4.26365089416504      , rate =    960.679028765119       MiB/s
 time =    4.26587200164795      , rate =    960.178832936777       MiB/s
 time =    4.26457810401917      , rate =    960.470156740643       MiB/s
 mintime =    4.26365089416504      , maxrate =    960.679028765119       MiB/s
 avgtime =    4.28092167377472      , avgrate =    956.813899370335       MiB/s
 Deleting: benchio_files/serial.dat                                        
 
 
 --------
 Finished
 --------
 

real	0m45.555s
user	0m0.025s
sys	0m0.025s
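
As a sanity check on the benchio figures above, here is a minimal Python sketch that recomputes the global array size, the total data volume, and the per-repetition write rates from the values printed in the log. The 8-byte element size is an assumption (double precision), chosen because it reproduces the reported 4096 MiB; the variable names are illustrative and not taken from benchio itself.

```python
import math

# Values transcribed from the benchio log above.
process_grid = (2, 4, 4)        # 2 * 4 * 4 = 32 processes
local_array = (256, 256, 256)   # per-process array size
times = [
    4.30992698669434, 4.26962995529175, 4.28477692604065,
    4.28777194023132, 4.28683900833130, 4.29409790039062,
    4.28207302093506, 4.26365089416504, 4.26587200164795,
    4.26457810401917,
]

# Assumption: each array element is an 8-byte double.
BYTES_PER_ELEMENT = 8

# Global array size is the local size scaled by the process grid.
global_size = tuple(p * n for p, n in zip(process_grid, local_array))

# Total data written per repetition, in MiB.
total_mib = math.prod(global_size) * BYTES_PER_ELEMENT / 2**20

# Rate for each repetition is total data divided by elapsed time.
rates = [total_mib / t for t in times]

avg_time = sum(times) / len(times)
avg_rate = sum(rates) / len(rates)   # mean of the per-repetition rates
total_write_time = sum(times)

print("global size     :", global_size)
print("total data (MiB):", total_mib)
print("min time (s)    :", min(times))
print("max rate (MiB/s):", max(rates))
print("avg time (s)    :", avg_time)
print("avg rate (MiB/s):", avg_rate)
print("sum of times (s):", total_write_time)
```

Two things follow from this check. The reported avgrate matches the arithmetic mean of the ten per-repetition rates, not 4096 MiB divided by avgtime (which would give about 956.80 MiB/s). The ten write times sum to roughly 42.8 s, which accounts for most of the 45.555 s wall-clock time reported by `time`, so any estimate in which write time barely registers is inconsistent with what benchio itself measured.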

