[petsc-users] log_summary time ratio and flops ratio
Matthew Knepley
knepley at gmail.com
Fri Feb 5 11:18:18 CST 2016
On Fri, Feb 5, 2016 at 11:12 AM, Xiangdong <epscodes at gmail.com> wrote:
> Hello everyone,
>
> When I looked at the log_summary, I found that flops ratio of VecDot is
> 1.1, but the time ratio is 1.8. Also the flops ratio of MatMult is 1.2, but
> the time ratio is 1.6. I am using the simple SOR preconditioner on 256
> procs. Since each of them consumes about 20% of total time, is it worth
> looking into these imbalance issue? If yes, how do I know which procs are
> causing the issue? What tools should I start with?
>
1) Always send all the output
2) You have a large number of dot products, and I am guessing you have a
slow reduce on this machine (we could see the data at the end of
log_summary).
The waiting time comes from slow collective communication, load
imbalance, system noise, etc. You can see this by looking at the difference
between
VecDot() and VecMDot(). For the latter there is enough work to cover
up the latency, and its imbalance is only 1.1, not 1.8. Thus I think it is
clear that your
problem is due to a slow communication network rather than load
imbalance.
3) Thus you would be better served with a stronger preconditioner (more
flops) which resulted in fewer iterations (less synchronization).
Thanks,
Matt
> Thanks.
>
> Xiangdong
>
>
> ------------------------------------------------------------------------------------------------------------------------
> Event Count Time (sec) Flops
> --- Global --- --- Stage --- Total
> Max Ratio Max Ratio Max Ratio Mess Avg len
> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
>
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> VecView 44 1.0 3.9026e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 4.8e+01 1 0 0 0 0 1 0 0 0 0 0
> VecDot 817452 1.0 1.5972e+02 1.8 1.51e+10 1.1 0.0e+00 0.0e+00
> 8.2e+05 18 3 0 0 40 18 3 0 0 40 22969
> VecMDot 613089 1.0 5.8918e+01 1.1 2.27e+10 1.1 0.0e+00 0.0e+00
> 6.1e+05 8 4 0 0 30 8 4 0 0 30 93397
> VecNorm 618619 1.0 1.4426e+02 2.2 1.14e+10 1.1 0.0e+00 0.0e+00
> 6.2e+05 15 2 0 0 30 15 2 0 0 30 19145
> VecScale 938 1.0 8.3907e-03 1.2 8.67e+06 1.1 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 250859
> VecCopy 8159 1.0 1.0394e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecSet 1884 1.0 3.9701e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAXPY 1021815 1.0 2.2148e+01 2.1 1.89e+10 1.1 0.0e+00 0.0e+00
> 0.0e+00 2 4 0 0 0 2 4 0 0 0 207057
> VecAYPX 613089 1.0 6.5360e+00 1.9 1.13e+10 1.1 0.0e+00 0.0e+00
> 0.0e+00 1 2 0 0 0 1 2 0 0 0 420336
> VecAXPBYCZ 191 1.0 3.7365e-03 1.4 7.06e+06 1.1 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 458830
> VecWAXPY 938 1.0 3.8502e-02 1.4 8.67e+06 1.1 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 54669
> VecMAXPY 613089 1.0 1.3276e+01 2.2 2.27e+10 1.1 0.0e+00 0.0e+00
> 0.0e+00 1 4 0 0 0 1 4 0 0 0 414499
> VecLoad 5 1.0 2.0913e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecScatterBegin 820695 1.0 1.8959e+01 2.8 0.00e+00 0.0 1.1e+09 4.6e+03
> 0.0e+00 2 0100100 0 2 0100100 0 0
> VecScatterEnd 820695 1.0 5.2002e+01 4.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 4 0 0 0 0 4 0 0 0 0 0
> VecReduceArith 2814 1.0 1.8517e-02 1.2 5.20e+07 1.1 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 681995
> VecReduceComm 938 1.0 2.6041e+00 3.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 9.4e+02 0 0 0 0 0 0 0 0 0 0 0
> MatMult 817452 1.0 1.9379e+02 1.6 2.04e+11 1.2 1.0e+09 4.6e+03
> 0.0e+00 23 40100100 0 23 40100100 0 253083
> MatSOR 818390 1.0 1.9608e+02 1.5 2.00e+11 1.1 0.0e+00 0.0e+00
> 0.0e+00 22 40 0 0 0 22 40 0 0 0 247472
> MatAssemblyBegin 939 1.0 7.9671e+00 6.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.9e+03 1 0 0 0 0 1 0 0 0 0 0
> MatAssemblyEnd 939 1.0 2.0570e-01 1.2 0.00e+00 0.0 2.6e+03 1.2e+03
> 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatZeroEntries 938 1.0 1.8202e-01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> SNESSolve 210 1.0 6.8585e+02 1.0 5.06e+11 1.2 1.1e+09 4.6e+03
> 2.1e+06 98100100100100 98100100100100 178411
> SNESFunctionEval 2296 1.0 2.0512e+01 1.3 0.00e+00 0.0 2.9e+06 3.5e+03
> 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
> SNESJacobianEval 938 1.0 2.7680e+01 1.0 0.00e+00 0.0 1.2e+06 3.5e+03
> 1.9e+03 4 0 0 0 0 4 0 0 0 0 0
> SNESLineSearch 938 1.0 9.2491e+00 1.0 6.07e+07 1.1 1.2e+06 3.5e+03
> 9.4e+02 1 0 0 0 0 1 0 0 0 0 1593
> KSPSetUp 938 1.0 2.7661e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.0e+01 0 0 0 0 0 0 0 0 0 0 0
> KSPSolve 938 1.0 6.3433e+02 1.0 5.06e+11 1.2 1.0e+09 4.6e+03
> 2.0e+06 91100100100 99 91100100100 99 192852
> PCSetUp 938 1.0 2.1005e-04 3.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> PCApply 818390 1.0 1.9675e+02 1.5 2.00e+11 1.1 0.0e+00 0.0e+00
> 0.0e+00 22 40 0 0 0 22 40 0 0 0 246626
>
> ------------------------------------------------------------------------------------------------------------------------
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20160205/6fe7e3da/attachment.html>
More information about the petsc-users
mailing list