[petsc-users] log_summary time ratio and flops ratio

Matthew Knepley knepley at gmail.com
Fri Feb 5 11:18:18 CST 2016


On Fri, Feb 5, 2016 at 11:12 AM, Xiangdong <epscodes at gmail.com> wrote:

> Hello everyone,
>
> When I looked at the log_summary, I found that the flops ratio of VecDot
> is 1.1, but the time ratio is 1.8. Also, the flops ratio of MatMult is 1.2,
> but the time ratio is 1.6. I am using the simple SOR preconditioner on 256
> procs. Since each of them consumes about 20% of the total time, is it worth
> looking into this imbalance issue? If yes, how do I know which procs are
> causing the issue? What tools should I start with?
>

1) Always send all the output

2) You have a large number of dot products, and I am guessing you have a
slow reduce on this machine (we could confirm this from the data at the end
of log_summary). The waiting time comes from slow collective communication,
load imbalance, system noise, etc. You can see this by comparing VecDot()
with VecMDot(): for the latter there is enough work to cover up the
latency, and its time imbalance is only 1.1, not 1.8. Thus I think it is
clear that your problem is due to a slow communication network rather than
load imbalance. (If you want to check the reduce latency independently of
PETSc, see the standalone timing sketch after point 3.)

3) Thus you would be better served with a stronger preconditioner (more
flops) that results in fewer iterations (less synchronization); a sketch of
where that switch goes in code is below.
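As an illustration only (not a recommendation of a particular
preconditioner), the switch can be made either with runtime options such as
-pc_type asm -sub_pc_type ilu, or in code roughly like the following sketch,
where 'snes' stands in for whatever SNES your application already creates:

/* Sketch only: grab the KSP/PC from an existing SNES and try a stronger
   preconditioner than SOR. Error checking follows the usual PETSc idiom. */
#include <petscsnes.h>

PetscErrorCode UseStrongerPC(SNES snes)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);    /* e.g. additive Schwarz instead of SOR */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* let -pc_type etc. still override at runtime */
  PetscFunctionReturn(0);
}

With ASM you would also choose the subdomain solver (e.g. -sub_pc_type ilu);
whether any particular choice wins depends on your problem, so it is worth
timing a few candidates.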
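Separately, to check the slow-reduce hypothesis from point 2 independently
of PETSc, a minimal MPI sketch like the one below (my own illustration; the
file name and repetition count are arbitrary) times a 1-double
MPI_Allreduce, which is essentially the communication VecDot() performs
after the local dot products. Run it on the same 256 ranks you use for the
application.

/* allreduce_time.c: time a 1-double MPI_Allreduce, the collective behind VecDot().
   Build with something like: mpicc allreduce_time.c -o allreduce_time */
#include <mpi.h>
#include <stdio.h>

int main(int argc,char **argv)
{
  int    rank,size,i,nreps = 1000;
  double local,global,t0,t,tmax;

  MPI_Init(&argc,&argv);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  MPI_Comm_size(MPI_COMM_WORLD,&size);
  local = (double)rank;

  /* warm up, then synchronize so all ranks start the timing together */
  MPI_Allreduce(&local,&global,1,MPI_DOUBLE,MPI_SUM,MPI_COMM_WORLD);
  MPI_Barrier(MPI_COMM_WORLD);

  t0 = MPI_Wtime();
  for (i = 0; i < nreps; i++) {
    MPI_Allreduce(&local,&global,1,MPI_DOUBLE,MPI_SUM,MPI_COMM_WORLD);
  }
  t = MPI_Wtime() - t0;

  /* report the slowest rank's average time per reduction */
  MPI_Reduce(&t,&tmax,1,MPI_DOUBLE,MPI_MAX,0,MPI_COMM_WORLD);
  if (!rank) printf("%d ranks: %g s per MPI_Allreduce (max over ranks)\n",size,tmax/nreps);
  MPI_Finalize();
  return 0;
}

For comparison, your log shows 817452 VecDot() calls taking 1.5972e+02 s on
the slowest rank, i.e. roughly 2e-4 s per call including the local work and
any waiting.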

  Thanks,

     Matt


> Thanks.
>
> Xiangdong
>
>
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> VecView               44 1.0 3.9026e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.8e+01  1  0  0  0  0   1  0  0  0  0     0
> VecDot            817452 1.0 1.5972e+02 1.8 1.51e+10 1.1 0.0e+00 0.0e+00 8.2e+05 18  3  0  0 40  18  3  0  0 40 22969
> VecMDot           613089 1.0 5.8918e+01 1.1 2.27e+10 1.1 0.0e+00 0.0e+00 6.1e+05  8  4  0  0 30   8  4  0  0 30 93397
> VecNorm           618619 1.0 1.4426e+02 2.2 1.14e+10 1.1 0.0e+00 0.0e+00 6.2e+05 15  2  0  0 30  15  2  0  0 30 19145
> VecScale             938 1.0 8.3907e-03 1.2 8.67e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 250859
> VecCopy             8159 1.0 1.0394e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet              1884 1.0 3.9701e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY          1021815 1.0 2.2148e+01 2.1 1.89e+10 1.1 0.0e+00 0.0e+00 0.0e+00  2  4  0  0  0   2  4  0  0  0 207057
> VecAYPX           613089 1.0 6.5360e+00 1.9 1.13e+10 1.1 0.0e+00 0.0e+00 0.0e+00  1  2  0  0  0   1  2  0  0  0 420336
> VecAXPBYCZ           191 1.0 3.7365e-03 1.4 7.06e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 458830
> VecWAXPY             938 1.0 3.8502e-02 1.4 8.67e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 54669
> VecMAXPY          613089 1.0 1.3276e+01 2.2 2.27e+10 1.1 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0 414499
> VecLoad                5 1.0 2.0913e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecScatterBegin   820695 1.0 1.8959e+01 2.8 0.00e+00 0.0 1.1e+09 4.6e+03 0.0e+00  2  0100100  0   2  0100100  0     0
> VecScatterEnd     820695 1.0 5.2002e+01 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   4  0  0  0  0     0
> VecReduceArith      2814 1.0 1.8517e-02 1.2 5.20e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 681995
> VecReduceComm        938 1.0 2.6041e+00 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 9.4e+02  0  0  0  0  0   0  0  0  0  0     0
> MatMult           817452 1.0 1.9379e+02 1.6 2.04e+11 1.2 1.0e+09 4.6e+03 0.0e+00 23 40100100  0  23 40100100  0 253083
> MatSOR            818390 1.0 1.9608e+02 1.5 2.00e+11 1.1 0.0e+00 0.0e+00 0.0e+00 22 40  0  0  0  22 40  0  0  0 247472
> MatAssemblyBegin     939 1.0 7.9671e+00 6.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.9e+03  1  0  0  0  0   1  0  0  0  0     0
> MatAssemblyEnd       939 1.0 2.0570e-01 1.2 0.00e+00 0.0 2.6e+03 1.2e+03 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatZeroEntries       938 1.0 1.8202e-01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> SNESSolve            210 1.0 6.8585e+02 1.0 5.06e+11 1.2 1.1e+09 4.6e+03 2.1e+06 98100100100100  98100100100100 178411
> SNESFunctionEval    2296 1.0 2.0512e+01 1.3 0.00e+00 0.0 2.9e+06 3.5e+03 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
> SNESJacobianEval     938 1.0 2.7680e+01 1.0 0.00e+00 0.0 1.2e+06 3.5e+03 1.9e+03  4  0  0  0  0   4  0  0  0  0     0
> SNESLineSearch       938 1.0 9.2491e+00 1.0 6.07e+07 1.1 1.2e+06 3.5e+03 9.4e+02  1  0  0  0  0   1  0  0  0  0  1593
> KSPSetUp             938 1.0 2.7661e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve             938 1.0 6.3433e+02 1.0 5.06e+11 1.2 1.0e+09 4.6e+03 2.0e+06 91100100100 99  91100100100 99 192852
> PCSetUp              938 1.0 2.1005e-04 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCApply           818390 1.0 1.9675e+02 1.5 2.00e+11 1.1 0.0e+00 0.0e+00 0.0e+00 22 40  0  0  0  22 40  0  0  0 246626
>
> ------------------------------------------------------------------------------------------------------------------------
>



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener