[petsc-users] MPI Communication times

Zhang, Junchao jczhang at mcs.anl.gov
Wed Mar 20 16:02:04 CDT 2019


See the "Mess   AvgLen  Reduct" numbers in each log stage.  Mess is the total number of messages sent in an event over all processes.  AvgLen is the average message length. Reduct is the number of global reductions.
Each event, such as VecScatterBegin/End, reports its maximum execution time over all processes and a max/min ratio.  %T is sum(execution time of the event on each process)/sum(execution time of the stage on each process). %T indicates how expensive the event is relative to its stage, so it is a number you should pay attention to.
If your code is imbalanced (i.e., has a big max/min ratio), then the performance numbers are skewed and become misleading, because some processes are just waiting for others. In that case, besides -log_view, you can add -log_sync, which adds an extra MPI_Barrier before each event so that all processes start it at the same time. With that, the numbers are easier to interpret.
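
As a rough illustration of the quantities described above, here is a minimal Python sketch that computes the maximum time, the max/min ratio, and %T from per-process timings. The function name event_stats and the timing values are hypothetical and are not taken from any real -log_view output; the sketch only mirrors the definitions given in this message, not PETSc's actual implementation.

```python
def event_stats(event_times, stage_times):
    """Summarize one logged event.

    event_times[i]: seconds rank i spent in the event (hypothetical input).
    stage_times[i]: seconds rank i spent in the enclosing stage.
    Returns (max time, max/min ratio, %T).
    """
    t_max = max(event_times)
    t_min = min(event_times)
    # A large ratio means some ranks spend far longer in the event than others.
    ratio = t_max / t_min if t_min > 0 else float("inf")
    # %T = sum of event time over ranks / sum of stage time over ranks.
    pct_t = 100.0 * sum(event_times) / sum(stage_times)
    return t_max, ratio, pct_t


# Made-up timings for 4 ranks: the last rank spends 4x longer in the event.
event = [0.5, 0.5, 0.5, 2.0]
stage = [10.0, 10.0, 10.0, 10.0]

t_max, ratio, pct_t = event_stats(event, stage)
print(f"max = {t_max:.2f} s, max/min = {ratio:.1f}, %T = {pct_t:.2f}")
```

With these made-up numbers the max/min ratio is 4, which is the kind of imbalance where the reported time says more about waiting than about communication cost.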
src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging.

--Junchao Zhang


On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users <petsc-users at mcs.anl.gov> wrote:
Hello,

I am working on timing my model, which we made MPI scalable using PETSc DMDAs. I want to know more about the output log and how to calculate the total communication time for my runs. So far I see we have "MPI Messages" and "MPI Message Lengths" in the log, along with the VecScatterEnd and VecScatterBegin reports.

My question is: how do I interpret these numbers to get a rough estimate of how much overhead we have just from MPI communication time in my model runs?

Thanks,

