[petsc-users] Question about the performance of KSP solver
Mark Adams
mfadams at lbl.gov
Sun Feb 27 09:23:13 CST 2022
First, you probably want -ksp_type cg (your elasticity system should be
symmetric positive definite). ILU sucks for elasticity; try 'gamg', and I
would also configure with hypre and try that as well.
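For example, something like the following option sets (just a sketch; the
hypre variant assumes your PETSc was configured with hypre, e.g.
--download-hypre):

  -ksp_type cg -pc_type gamg
  -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg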
Now, you are getting very little increase in the MatMult flop rate (about
8.0 --> 8.6 Gflop/s). That is the problem (one quick bandwidth check is
sketched after this list). Possible causes:
* You have a "bad" (slow) network.
* Your problem is too small (with respect to your network) to get speedup
past 16 procs.
* Bad partitioning.
* You have a fair amount of load imbalance (MatMult 671 1.0
4.7602e+01 *2.0*, a max/min time ratio of 2.0), and it gets worse at 32
procs. That can come from bad partitioning or a bad network.
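One simple way to check the bandwidth side of this is the streams benchmark
that ships with PETSc (a sketch; it assumes a source install with PETSC_DIR
and PETSC_ARCH set):

  cd $PETSC_DIR
  make streams NPMAX=32

If the reported bandwidth stops scaling well before 32 processes, MatMult
will not speed up much past that point either.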
With a decent network you probably want to keep at least 25K equations per
process. This depends on the other issues above, but it is a start. With a
good network you should be able to get down to about 10K equations per
process. (For reference, at about 7*10^6 equations you have roughly 220K
equations per process on 32 cores.)
It is best to start with a simple model problem, like a cube (cube-shaped
subdomains are ideal), and isolate the issues one at a time.
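A minimal sketch of that kind of comparison (./your_app and its mesh options
are placeholders for your own code; compare the MatMult time, ratio, and
Mflop/s between the two logs):

  mpiexec -n 16 ./your_app -ksp_type cg -pc_type gamg -log_view :log16.txt
  mpiexec -n 32 ./your_app -ksp_type cg -pc_type gamg -log_view :log32.txt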
Mark
On Sun, Feb 27, 2022 at 9:51 AM Gong Yujie <yc17470 at connect.um.edu.mo>
wrote:
> Hi,
>
> I'm using GMRES with an ASM preconditioner and ILU(2) as the sub-domain
> solver to solve an elasticity problem. First I used 16 cores to test the
> computation time, then ran the same code with the same parameters on 32
> cores, but I only get about a 10% speedup. From the log file I found that
> the times for KSPSolve() and MatSolve() decrease only a little. My PETSc
> version is 3.16.0, configured with --with-debugging=0. The matrix size is
> about 7*10^6. Some details of the log are shown below:
>
> 16-cores:
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
> MatMult              664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00  7 13 49 20  0   7 13 49 20  0  8010
> MatSolve             663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70  0  0  0  33 70  0  0  0 10932
> MatLUFactorNum         1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00  1  7  0  0  0   1  7  0  0  0 35056
> MatILUFactorSym        1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
> KSPSetUp               2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve               1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89  44 93 98 40 90 11437
> KSPGMRESOrthog       641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02  3  9  0  0 43   3  9  0  0 44 14578
> PCSetUp                2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00  4  7  0  2  0   4  7  0  2  0  9591
> PCSetUpOnBlocks        1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00  3  7  0  0  0   3  7  0  0  0 10002
> PCApply              663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20  0  33 70 49 20  0 10701
> PCApplyOnBlocks      663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70  0  0  0  33 70  0  0  0 10910
>
> 32-cores:
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
> MatMult              671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00  7 13 49 23  0   7 13 49 23  0  8637
> MatSolve             670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71  0  0  0  33 71  0  0  0 12544
> MatLUFactorNum         1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00  1  7  0  0  0   1  7  0  0  0 60743
> MatILUFactorSym        1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
> KSPSetUp               2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve               1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89  44 93 98 47 90 13592
> KSPGMRESOrthog       648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02  2  9  0  0 43   2  9  0  0 44 16450
> PCSetUp                2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00  2  7  0  2  0   2  7  0  2  0 17440
> PCSetUpOnBlocks        1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00  2  7  0  0  0   2  7  0  0  0 18267
> PCApply              670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23  0  34 71 49 23  0 12245
> PCApplyOnBlocks      670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71  0  0  0  33 71  0  0  0 12517
>
> Hope you can help me!
>
> Best Regards,
> Yujie
>