[Darshan-users] Darshan & EPCC benchio different behaviour

Carns, Philip H. carns at mcs.anl.gov
Wed Feb 12 07:13:20 CST 2020


Hi Piero,

In the serial case, is the rank that's doing I/O still using MPI-IO, or is it making calls directly to POSIX in that case?

The Darshan log for the serial case doesn't show any MPI-IO activity, but I'm not sure if that's accurate, or if it's an indication that we missed some instrumentation.
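
One way to check this from a log is sketched below in Python, under stated assumptions: it totals selected counters per rank from the text that `darshan-parser` prints for a log. The function name `summarize_counters` is hypothetical, and the column order (module, rank, record id, counter, value, ...) and the counter names mentioned afterwards are assumptions about the parser output of the Darshan version in use, so please verify them against your own log.

```python
from collections import defaultdict


def summarize_counters(parser_text, wanted):
    """Sum selected counters per (rank, counter) from darshan-parser text.

    Assumes each data line is whitespace-separated as:
    module, rank, record id, counter name, value, file name, ...
    Lines starting with '#' are treated as header comments.
    """
    totals = defaultdict(float)
    for line in parser_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split()
        if len(fields) < 5:
            continue  # not a counter line in the assumed layout
        _module, rank, _record_id, counter, value = fields[:5]
        if counter not in wanted:
            continue
        try:
            totals[(int(rank), counter)] += float(value)
        except ValueError:
            continue  # ignore lines whose rank or value is not numeric
    return dict(totals)
```

Calling it with, for example, `{"POSIX_BYTES_WRITTEN", "POSIX_F_WRITE_TIME", "MPIIO_INDEP_WRITES", "MPIIO_COLL_WRITES"}` on the parser output of the serial log would show whether any MPI-IO write counters are nonzero and which ranks account for the POSIX bytes and write time.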

thanks,
-Phil
________________________________
From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on behalf of Piero LANUCARA <p.lanucara at cineca.it>
Sent: Wednesday, February 12, 2020 5:29 AM
To: Snyder, Shane <ssnyder at mcs.anl.gov>; Harms, Kevin <harms at alcf.anl.gov>
Cc: darshan-users at lists.mcs.anl.gov <darshan-users at lists.mcs.anl.gov>
Subject: Re: [Darshan-users] Darshan & EPCC benchio different behaviour

Hi Shane, Kevin

thanks for the update.

I have attached updated files (log and PDF) to this email.

Also, the log from BENCHIO is attached.

thanks again

regards

Piero


On 11/02/2020 20:15, Shane Snyder wrote:
> Definitely looks like something strange is happening when Darshan is
> estimating the time spent in I/O operations (as seen in the very first
> figure, observed write time barely even registers) in the serial case,
> which is ultimately used to provide the performance estimate.
>
> If you could provide them, the raw Darshan logs would be really
> helpful. That should make it clear whether it's an instrumentation
> issue (i.e., under-accounting for time spent in I/O operations at
> runtime) or if it's an issue with the heuristics in the PDF summary
> tool you are using, as Kevin points out. If it's the latter, having an
> example log to test modifications to our heuristics would be very
> helpful to us.
>
> Thanks,
> --Shane
>
> On 2/11/20 8:36 AM, Harms, Kevin wrote:
>> Piero,
>>
>>    the performance estimate is based on heuristics; it's possible the
>> 'serial' model is breaking some assumptions about how the I/O is
>> done. Is every rank opening the file, with only rank 0 doing actual
>> I/O?
>>
>>    If possible, you could provide the log and we could check to see
>> what the counters look like.
>>
>> kevin
>>
>> ________________________________________
>> From: Piero LANUCARA <p.lanucara at cineca.it>
>> Sent: Tuesday, February 11, 2020 2:28 AM
>> To: Harms, Kevin
>> Cc: darshan-users at lists.mcs.anl.gov
>> Subject: Re: [Darshan-users] Darshan & EPCC benchio different behaviour
>>
>> Hi Kevin
>>
>> First of all, thanks for the investigation. I did some further tests, and it
>> seems the issue may appear with Fortran MPI codes (mainly Intel MPI).
>>
>> Is this information useful?
>>
>> regards
>> Piero
>> On 07/02/2020 16:07, Harms, Kevin wrote:
>>> Piero,
>>>
>>>     just to confirm, the serial case is still running in parallel,
>>> 36 processes, but the I/O is only from rank 0?
>>>
>>> kevin
>>>
>>> ________________________________________
>>> From: Darshan-users <darshan-users-bounces at lists.mcs.anl.gov> on
>>> behalf of Piero LANUCARA <p.lanucara at cineca.it>
>>> Sent: Wednesday, February 5, 2020 4:56 AM
>>> To: darshan-users at lists.mcs.anl.gov
>>> Subject: Re: [Darshan-users] Darshan & EPCC benchio different behaviour
>>>
>>> p.s
>>>
>>> to be more "verbose" I add to the discussion:
>>>
>>> Darshan output for the "serial" run (serial.pdf)
>>>
>>> Darshan output for the MPI-IO run (mpiio.pdf)
>>>
>>> benchio output for "serial" run (serial.out)
>>>
>>> benchio output for "MPI-IO" run (mpi-io.out)
>>>
>>> thanks
>>>
>>> Piero
>>>
>>> On 04/02/2020 19:44, Piero LANUCARA wrote:
>>>> Dear all
>>>>
>>>> I'm using Darshan to measure the behaviour of the EPCC benchio benchmark
>>>> (https://github.com/EPCCed/benchio) on a given x86 Tier1
>>>> machine.
>>>>
>>>> Running two benchio tests (MPI-IO and serial), I see different
>>>> behaviour.
>>>>
>>>> While the Darshan PDF summary recovers the estimated time and
>>>> bandwidth in the MPI-IO case, the "serial" run is badly
>>>> misestimated by Darshan (the time is lower and the bandwidth higher,
>>>> respectively, than in the benchio output).
>>>>
>>>> Suggestions are welcome
>>>>
>>>> thanks
>>>>
>>>> Piero
>>>>
>>>> _______________________________________________
>>>> Darshan-users mailing list
>>>> Darshan-users at lists.mcs.anl.gov
>>>> https://lists.mcs.anl.gov/mailman/listinfo/darshan-users