[petsc-users] Expected weak scaling behaviour for AMG libraries?
Matthew Knepley
knepley at gmail.com
Fri Nov 8 08:11:30 CST 2024
On Thu, Nov 7, 2024 at 10:08 AM Khurana, Parv <p.khurana22 at imperial.ac.uk>
wrote:
> Hello Mark and Matthew,
>
> Apologies for the delay in reply (I was gone for a vacation). Really
> appreciate the prompt response.
>
> I am now planning to redo these tests with the load balancing suggestions
> you have provided. *Would you suggest any load balancing options to use
> as default when dealing with unstructured meshes in general*? I
>
The default load balancing should be good. We do not use it in CI tests
because it is not reproducible across machines. When you use

  -petscpartitioner_type simple

you are turning off the default load balancing. Don't do that.
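For example, with the ex12 options from your message, a run using the default
partitioning might look like this (just a sketch; I am assuming PETSc was built
with ParMetis, in which case parmetis is the default and the explicit flag can
be omitted):

  mpiexec -n 8 ./ex12 -run_type full -dm_plex_dim 3 -dm_plex_simplex 1 \
      -dm_plex_box_faces 6,6,6 -dm_refine 5 -bc_type dirichlet \
      -petscspace_degree 1 -petscpartitioner_type parmetis
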
> use PETSc as an external linear solver for my software, where I supply a
> Poisson system discretised using 3D simplicial elements and FEM - which are
> solved using AMG. I observed bad weak scaling behaviour for my application
> for 20k DOF/rank, which prompted me to test something similar only in
> PETSc.
>
> I chose ex12 instead of ex56 because it uses 3D FEM. I am not sure if I
> can make ex56 work for tetrahedrons out of the box. Maybe ex13 is more
> suited as Mark mentioned.
>
> On points 3 and 4 from Matthew:
> The plot below is from the numbers extracted from the -log_view option for
> all the runs. I have attached a sample log file from my runs, and pasted a
> sample output in the email.
>
Yes, this has the simple partitioning, which is not what you want.
Thanks,
Matt
>
> ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------
>
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> KSPSolve               2 1.0 1.4079e-01 1.0 2.14e+07 2.0 1.2e+03 1.1e+04 4.4e+01  2  4 26 16 17   2  4 26 16 18   875
> SNESSolve              1 1.0 2.9310e+00 1.0 1.69e+08 1.1 1.7e+03 2.0e+04 6.1e+01 46 46 37 38 23  46 46 37 38 25   445
> PCApply               23 1.0 1.2774e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
>
> ------------------------------------------------------------------------------------------------------------------------
>
> Thanks and Best,
> Parv
>
> ------------------------------
> *From:* Mark Adams <mfadams at lbl.gov>
> *Sent:* 31 October 2024 11:30
> *To:* Matthew Knepley <knepley at gmail.com>
> *Cc:* Khurana, Parv <p.khurana22 at imperial.ac.uk>; petsc-users at mcs.anl.gov
> <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Expected weak scaling behaviour for AMG
> libraries?
>
>
> As Matt said, snes ex56 is better because it does a convergence test that
> refines the grid. You need/want these two parameters to have the same
> argument (e.g., 2,2,1): -dm_plex_box_faces 2,2,1
> -petscpartitioner_simple_process_grid 2,2,1.
> This will put one cell per process.
>
> Then you use: -max_conv_its N, to specify the N levels of refinement to
> do. It will run the 2,2,1 first then a 4,4,2, etc., N times.
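>
> As a concrete sketch (numbers are illustrative, other ex56 options are
> omitted, and I am assuming the simple partitioner so that the process-grid
> option takes effect), a 4-rank run of this recipe could be:
>
>   mpiexec -n 4 ./ex56 -petscpartitioner_type simple \
>       -dm_plex_box_faces 2,2,1 \
>       -petscpartitioner_simple_process_grid 2,2,1 \
>       -max_conv_its 3
>
> which solves on the 2,2,1 mesh, then 4,4,2, then 8,8,4.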
>
> /src/snes/tests/ex13.c is designed for benchmarking and it uses
> '-petscpartitioner_simple_node_grid 1,1,1 [default]' to give you a
> two-level partitioner. You need to have, in each direction i,
> dm_plex_box_faces_i = petscpartitioner_simple_process_grid_i *
> petscpartitioner_simple_node_grid_i.
> Again, you should put one cell per process (NP = product of
> dm_plex_box_faces args) and use -dm_refine N to get a single solve.
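>
> For instance (numbers purely illustrative, again assuming the simple
> partitioner), 2 nodes with 4 ranks each (8 ranks in total, one cell per
> rank) satisfy the constraint above with:
>
>   mpiexec -n 8 ./ex13 -petscpartitioner_type simple \
>       -dm_plex_box_faces 2,2,2 \
>       -petscpartitioner_simple_node_grid 2,1,1 \
>       -petscpartitioner_simple_process_grid 1,2,2 \
>       -dm_refine 4
>
> since in every direction box_faces_i = process_grid_i * node_grid_i
> (2 = 1*2, 2 = 2*1, 2 = 2*1).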
>
> Mark
>
>
>
> On Wed, Oct 30, 2024 at 11:02 PM Matthew Knepley <knepley at gmail.com>
> wrote:
>
> On Wed, Oct 30, 2024 at 4:13 PM Khurana, Parv <p.khurana22 at imperial.ac.uk>
> wrote:
>
> Hello PETSc Community,
> I am trying to understand the scaling behaviour of AMG methods in PETSc
> (Hypre for now) and how many DOFs/Rank are needed for a performant AMG
> solve.
> I’m currently conducting weak scaling tests using
> src/snes/tutorials/ex12.c in 3D, applying Dirichlet BCs with FEM at P=1.
> The tests keep DOFs per processor constant while increasing the mesh size
> and processor count, specifically:
>
> - *20000 and 80000 DOF/RANK* configurations.
> - Running SNES twice, using GMRES with a tolerance of 1e-5 and
> preconditioning with Hypre-BoomerAMG.
>
> A couple of quick points to make sure there is no confusion:
>
> 1) Partitioner type "simple" is for the CI. It is a very bad partition,
> and should not be used for timing. The default is ParMetis which should be
> good enough.
>
> 2) You start out with 6^3 = 216 elements, distribute them, and then refine.
> This will give _really_ bad load balance for any process count that does not
> divide 216. For the larger runs you usually want to start with something
> bigger. You can use -dm_refine_pre to refine before distribution.
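>
> For example (numbers are only illustrative), something like
>
>   -dm_plex_box_faces 4,4,4 -dm_refine_pre 2 -dm_refine 3
>
> distributes a mesh built on a 16x16x16 box grid rather than the 6x6x6 one,
> which balances much more evenly over large process counts.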
>
> 3) It is not clear you are timing just the solver (SNESSolve). It could be
> that extraneous things are taking time. When asking questions like this,
> please always send the output of -log_view for timing, and at least
> -ksp_monitor_true_residual for convergence.
>
> 4) SNES ex56 is the example we use for GAMG scalability testing
>
> Thanks,
>
> Matt
>
> Unfortunately, parallel efficiency degrades noticeably with increased
> processor counts. Are there any insights or rules of thumb for using AMG
> more effectively? I have been looking at this issue for a while
> now and would love to engage in a further discussion. Please find below the
> weak scaling results and the options I use to run the tests.
> *#Run type*
> -run_type full
> -petscpartitioner_type simple
>
> *#Mesh settings*
> -dm_plex_dim 3
> -dm_plex_simplex 1
> -dm_refine 5 #Varied this
> -dm_plex_box_faces 6,6,6
>
> *#BCs and FEM space*
> -bc_type dirichlet
> -petscspace_degree 1
>
> *#Solver settings*
> -snes_max_it 2
> -ksp_type gmres
> -ksp_rtol 1.0e-5
> #Same settings as what we use for LOR
> -pc_type hypre
> -pc_hypre_type boomeramg
> -pc_hypre_boomeramg_coarsen_type hmis
> -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi
> -pc_hypre_boomeramg_strong_threshold 0.7
> -pc_hypre_boomeramg_interp_type ext+i
> -pc_hypre_boomeramg_P_max 2
> -pc_hypre_boomeramg_truncfactor 0.3
>
> Best,
> Parv
>
>
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/
[Attachment scrubbed by the list archive: image.png, the weak scaling plot referenced above.
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20241108/ab886669/attachment-0001.png>]