[petsc-users] Effect of -pc_gamg_threshold vs PETSc version

Mark Adams mfadams at lbl.gov
Thu Apr 13 11:27:58 CDT 2023


Hi Jeremy,

We did make some changes for performance reasons that we could not avoid,
but I have never seen anything like this, so let's dig into it.

0) What is your test problem? E.g., a 3D Laplacian with Q1 finite elements.

First, you can get GAMG diagnostics by running with '-info :pc' and grepping
the output for GAMG.

Second, you are going to want to look at *iteration counts* and *solve
times*, and you want to separate the solve time (KSPSolve) from the GAMG
setup time.
If you have your own timer, dig into the -log_view data and get the "KSPSolve"
time (solve time) and "RAP" or "P'AP" for the setup time.
You could run one warm-up solve and time a second one separately. That is
what I do.
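A minimal sketch of what I mean, assuming A, b, and x are already assembled
in your driver (the function and stage names are just placeholders): do one
warm-up solve so the GAMG setup cost is paid, then time a second solve by
itself using log stages, and run with -log_view (plus -info :pc for the GAMG
diagnostics above):

#include <petscksp.h>

/* Warm-up solve followed by a separately timed solve; the two stages show up
   as separate sections in the -log_view output. */
PetscErrorCode timed_solves(Mat A, Vec b, Vec x)
{
  KSP           ksp;
  PetscLogStage warmup, timed;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp)); /* e.g. -ksp_type cg -pc_type gamg */

  PetscCall(PetscLogStageRegister("WarmUpSolve", &warmup));
  PetscCall(PetscLogStageRegister("TimedSolve", &timed));

  /* first solve pays for PCSetUp: coarsening and the RAP (P'AP) products */
  PetscCall(PetscLogStagePush(warmup));
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(PetscLogStagePop());

  /* second solve reuses the setup, so this stage isolates the solve time */
  PetscCall(PetscLogStagePush(timed));
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(PetscLogStagePop());

  PetscCall(KSPDestroy(&ksp));
  PetscFunctionReturn(0);
}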

*Iteration count:*
You want to look at the eigen estimates for Chebyshev.
If you have an SPD problem then you want to use CG and not the default
GMRES.
If the eigen estimates are low, GAMG convergence can suffer, but this is
usually catastrophic.
*If your iteration counts increase dramatically then this could be the
issue.*
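A minimal sketch of those two checks, assuming the KSP already has GAMG
attached (the function name is just a placeholder); this gives the same
information as -ksp_type cg, -ksp_converged_reason, and -ksp_view on the
command line:

#include <petscksp.h>

/* Use CG (SPD problem) instead of the default GMRES, record the iteration
   count, and print the solver so the per-level Chebyshev eigen estimates
   can be compared across PETSc versions. */
PetscErrorCode check_iterations(KSP ksp, Vec b, Vec x)
{
  PetscInt its;

  PetscFunctionBeginUser;
  PetscCall(KSPSetType(ksp, KSPCG));
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(KSPGetIterationNumber(ksp, &its));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "iterations: %" PetscInt_FMT "\n", its));
  /* the view output includes the Chebyshev eigenvalue estimates on each level */
  PetscCall(KSPView(ksp, PETSC_VIEWER_STDOUT_WORLD));
  PetscFunctionReturn(0);
}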

*Time / iteration and setup time:*
You can also see the grid sizes and the average number of nnz/row. These will
affect the time per iteration and the setup time.

3.17) Looking at the change log for 3.17 (
https://petsc.org/main/changes/317/#:~:text=maximum%20of%20ten-,PCMG,-%3A)
we made a few changes:
* Moved the default smoother from SOR to Jacobi, because Jacobi works on GPUs.
* Some eigen estimate changes that you should look at. In particular, you
should set the MatOptions if your matrix is SPD (see the sketch after this
list).
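Here is a minimal sketch of that MatOptions hint, assuming your matrix really
is SPD (as it should be for elasticity with the rigid-body near nullspace):

#include <petscmat.h>

/* Tell PETSc the operator is symmetric/SPD so GAMG does not have to guess;
   this information can feed into the graph handling and the eigen estimates. */
PetscErrorCode mark_spd(Mat A)
{
  PetscFunctionBeginUser;
  PetscCall(MatSetOption(A, MAT_SYMMETRIC, PETSC_TRUE));
  PetscCall(MatSetOption(A, MAT_SPD, PETSC_TRUE));
  PetscFunctionReturn(0);
}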

SOR usually converges faster, but is slower per iteration.
*Maybe Jacobi runs a lot faster for you.*
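If you want to check that, a minimal sketch (assuming the default "mg_levels_"
prefix for the GAMG level smoothers) is to switch the smoother back to SOR for
one run and compare iteration counts and KSPSolve times; the call below is
equivalent to passing -mg_levels_pc_type sor on the command line:

#include <petscksp.h>

/* Hardwire the old default level smoother; call this before KSPSetFromOptions()
   so the option is picked up when the GAMG levels are configured. */
PetscErrorCode use_sor_smoother(void)
{
  PetscFunctionBeginUser;
  PetscCall(PetscOptionsSetValue(NULL, "-mg_levels_pc_type", "sor"));
  PetscFunctionReturn(0);
}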

+ Check iteration counts
+ Check that the eigen estimates did not change. If they did, then we can dig
into that.

3.18) Big changes:
https://petsc.org/main/changes/318/#:~:text=based%20aggregation%20algorithm-,PC,-%3A
* Some small things, but *the -pc_gamg_sym_graph bullet might be (one of)
your problem(s)*. Related to the MatOptions bullet above.
* The "aggressive" coarsening strategy (it used to be called "square_graph",
but the old syntax is still supported) is different, because the old one was
very slow.
   I have noticed that the rate of coarsening changes a little with the new
method, but not much.
   *But the way the threshold works with the new method is a bit different, so
that could explain some of this* (see the options sketch below).
   (The new method calls MIS twice; the old method calls MIS on A'A.)
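As a concrete starting point for experiments, here is a minimal sketch that
hardwires the options mentioned above through the options database
(equivalent to passing them on the command line; call it before
KSPSetFromOptions()). The values are only examples, with the threshold taken
from your runs:

#include <petscksp.h>

/* Options discussed above, set programmatically; the values are examples. */
PetscErrorCode set_gamg_graph_options(void)
{
  PetscFunctionBeginUser;
  /* treat the graph as symmetric (see the -pc_gamg_sym_graph bullet) */
  PetscCall(PetscOptionsSetValue(NULL, "-pc_gamg_sym_graph", "true"));
  /* drop tolerance; its effect changed with the new coarsening method */
  PetscCall(PetscOptionsSetValue(NULL, "-pc_gamg_threshold", "0.0001"));
  /* old "square_graph" syntax for aggressive coarsening, still accepted */
  PetscCall(PetscOptionsSetValue(NULL, "-pc_gamg_square_graph", "1"));
  PetscFunctionReturn(0);
}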

There are two things that you want to check:
1) Eigen estimates. If the eigen estimates are too small, iteration counts can
increase a lot, or, usually, the solver just fails.
*See if there are any changes in the eigen estimates for Chebyshev.*
2) The rate of coarsening, which affects the number of NNZ per row. If it is
too slow, NNZ goes up and the coarse grid construction (RAP) costs go way up.
Check that the coarse grid sizes, which are related to NNZ per row, do not
change. I think they do, and we can dig into it.
*A quick proxy for (2) is the "grid complexity" output. This should be
around 1.01 to 1.2.*
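A minimal sketch for surfacing both of those, assuming the KSP has GAMG
attached (this is the same information you get from -ksp_view; the grid
complexity and grid-size lines can also be grepped out of the '-info :pc'
output):

#include <petscksp.h>

/* Build the GAMG hierarchy and print it: the view should show the number of
   levels, the coarse grid sizes, and the Chebyshev eigen estimates used by
   the smoother on each level. */
PetscErrorCode inspect_gamg_hierarchy(KSP ksp)
{
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(KSPSetUp(ksp));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCView(pc, PETSC_VIEWER_STDOUT_WORLD));
  PetscFunctionReturn(0);
}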

Anyway, sorry for the changes.
I hate changing GAMG for this reason and I hate AMG for this reason!

Thanks,
Mark



On Thu, Apr 13, 2023 at 8:17 AM Jeremy Theler <jeremy at seamplex.com> wrote:

> When using GAMG+cg for linear elasticity and providing the near
> nullspace computed by MatNullSpaceCreateRigidBody(), I used to find
> "experimentally" that a small value of -pc_gamg_threshold in the order
> of 0.0001 would slightly decrease the solve time.
>
> Starting with 3.18, I started seeing that any positive value for the
> threshold would increase the solve time. I did a quick parametric
> (serial) run solving an elastic problem with a matrix size of approx
> 570k x 570k for different values of GAMG threshold and different PETSc
> versions (compiled with the same compiler, options and flags).
>
> I noted that
>
>  1. starting from 3.18, a threshold of 0.0001 that used to improve the
> speed now worsens it.
>  2. PETSc 3.17 looks like a "sweet spot" of speed
>
> I would like to hear any comments you might have.
>
> The wall time shown includes the time needed to read the mesh and
> assemble the stiffness matrix. It is a refined version of the NAFEMS
> LE10 benchmark described here:
>
> https://seamplex.com/feenox/examples/mechanical.html#nafems-le10-thick-plate-pressure-benchmark
>
> If you want, I could dump the matrix, rhs and near nullspace vectors
> and share them.
>
> --
> jeremy theler
>
>