[petsc-users] GAMG speed

Thu Aug 1 13:21:22 CDT 2013

   What kind of mesh are you using? Are you using DMDA? If you are using DMDA (and have written your code to use it "correctly"), then it should be trivial to run with geometric multigrid, which should be a bit faster than GAMG.

   For example, with src/snes/examples/tutorials/ex19.c I run ./ex19 -pc_type mg -da_refine 4, which refines the original DMDA 4 times and uses geometric multigrid with 5 levels.
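
   As a rough illustration of the level count above, the sketch below computes the grid size per direction on each level when a DMDA is refined. It assumes a refinement factor of 2, that a non-periodic direction with M points refines to 2(M-1)+1 points, and that a periodic direction refines to 2M points. The function name refined_sizes and the 4-point coarse grid are made up for this example, so check them against your own DMDA setup.

```python
def refined_sizes(coarse_points, refinements, periodic=False, factor=2):
    """Return the number of grid points per direction on each level, coarsest first.

    Assumed refinement rule (an assumption of this sketch, not taken from the thread):
      non-periodic: M -> factor * (M - 1) + 1
      periodic:     M -> factor * M
    """
    sizes = [coarse_points]
    for _ in range(refinements):
        m = sizes[-1]
        sizes.append(factor * m if periodic else factor * (m - 1) + 1)
    return sizes


# Hypothetical 4-point coarse grid, refined 4 times as with -da_refine 4.
non_periodic = refined_sizes(4, 4)
periodic = refined_sizes(4, 4, periodic=True)

print("levels:", len(non_periodic))       # one level per refinement plus the coarse grid
print("non-periodic sizes:", non_periodic)
print("periodic sizes:", periodic)
```

   Under these assumptions 4 refinements give 5 levels in both cases, and the periodic case ends at 64 points per direction, i.e. 64^3 = 262144 unknowns in 3D, the same size as the system in the quoted output.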

   Barry

On Aug 1, 2013, at 1:14 PM, Michele Rosso <mrosso at uci.edu> wrote:

> Hi,
> 
> I am successfully using PETSc (v3.4.2) to solve a 3D Poisson equation with CG + GAMG, as was suggested to me in a previous thread.
> So far I am using GAMG with the default settings, i.e.
> 
> -pc_type gamg -pc_gamg_agg_nsmooths 1 
> 
> The speed of the solution is satisfactory, but I would like to know if you have any suggestions to speed it up further, in particular
> whether there are any parameters worth looking into to achieve an even faster solution, for example the number of levels.
> So far I am using Dirichlet BCs for my test case, but I will soon have periodic conditions: in that case, does GAMG require particular settings?
> Finally, I have not tried geometric multigrid: do you think it is worth a shot?
> 
> Here are my current settings:
> 
> I run with
> 
> -pc_type gamg -pc_gamg_agg_nsmooths 1 -ksp_view -options_left
> 
> and the output is:
> 
> KSP Object: 4 MPI processes
>   type: cg
>   maximum iterations=10000
>   tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
>   left preconditioning
>   using nonzero initial guess
>   using UNPRECONDITIONED norm type for convergence test
> PC Object: 4 MPI processes
>   type: gamg
>     MG: type is MULTIPLICATIVE, levels=3 cycles=v
>       Cycles per PCApply=1
>       Using Galerkin computed coarse grid matrices
>   Coarse grid solver -- level -------------------------------
>     KSP Object:    (mg_coarse_)     4 MPI processes
>       type: preonly
>       maximum iterations=1, initial guess is zero
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using NONE norm type for convergence test
>     PC Object:    (mg_coarse_)     4 MPI processes
>       type: bjacobi
>         block Jacobi: number of blocks = 4
>         Local solve info for each block is in the following KSP and PC objects:
>       [0] number of local blocks = 1, first local block number = 0
>                 [0] local block number 0
>           KSP Object:        (mg_coarse_sub_)         1 MPI processes
>           type: preonly
>           maximum iterations=1, initial guess is zero
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using NONE norm type for convergence test
>         PC Object:        (mg_coarse_sub_)         1 MPI processes
>           type: lu
>             LU: out-of-place factorization
>             tolerance for zero pivot 2.22045e-14
>             using diagonal shift on blocks to prevent zero pivot
>             matrix ordering: nd
>             factor fill ratio given 5, needed 4.13207
>               Factored matrix follows:
>                 Matrix Object:                 1 MPI processes
>                   type: seqaij
>                   rows=395, cols=395
>                   package used to perform factorization: petsc
>                   total: nonzeros=132379, allocated nonzeros=132379
>                   total number of mallocs used during MatSetValues calls =0
>                     not using I-node routines
>           linear system matrix = precond matrix:
>           Matrix Object:           1 MPI processes
>             type: seqaij
>             rows=395, cols=395
>             total: nonzeros=32037, allocated nonzeros=32037
>             total number of mallocs used during MatSetValues calls =0
>               not using I-node routines
>           KSP Object:        (mg_coarse_sub_)         1 MPI processes
>           type: preonly
>           maximum iterations=1, initial guess is zero
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using NONE norm type for convergence test
>         PC Object:        (mg_coarse_sub_)         1 MPI processes
>           type: lu
>             LU: out-of-place factorization
>             tolerance for zero pivot 2.22045e-14
>             using diagonal shift on blocks to prevent zero pivot
>             matrix ordering: nd
>             factor fill ratio given 5, needed 0
>               Factored matrix follows:
>                 Matrix Object:                 1 MPI processes
>                   type: seqaij
>                   rows=0, cols=0
>                   package used to perform factorization: petsc
>                   total: nonzeros=1, allocated nonzeros=1
>                   total number of mallocs used during MatSetValues calls =0
>                     not using I-node routines
>           linear system matrix = precond matrix:
>           Matrix Object:           1 MPI processes
>             type: seqaij
>             rows=0, cols=0
>             total: nonzeros=0, allocated nonzeros=0
>             total number of mallocs used during MatSetValues calls =0
>               not using I-node routines
>           - - - - - - - - - - - - - - - - - -
>           KSP Object:        (mg_coarse_sub_)         1 MPI processes
>           type: preonly
>           maximum iterations=1, initial guess is zero
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using NONE norm type for convergence test
>         PC Object:        (mg_coarse_sub_)         1 MPI processes
>           type: lu
>             LU: out-of-place factorization
>             tolerance for zero pivot 2.22045e-14
>             using diagonal shift on blocks to prevent zero pivot
>             matrix ordering: nd
>             factor fill ratio given 5, needed 0
>               Factored matrix follows:
>                 Matrix Object:                 1 MPI processes
>                   type: seqaij
>                   rows=0, cols=0
>                   package used to perform factorization: petsc
>                   total: nonzeros=1, allocated nonzeros=1
>                   total number of mallocs used during MatSetValues calls =0
>                     not using I-node routines
>           linear system matrix = precond matrix:
>           Matrix Object:           1 MPI processes
>             type: seqaij
>             rows=0, cols=0
>             total: nonzeros=0, allocated nonzeros=0
>             total number of mallocs used during MatSetValues calls =0
>               not using I-node routines
>           KSP Object:        (mg_coarse_sub_)         1 MPI processes
>           type: preonly
>           maximum iterations=1, initial guess is zero
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using NONE norm type for convergence test
>         PC Object:        (mg_coarse_sub_)         1 MPI processes
>           type: lu
>             LU: out-of-place factorization
>             tolerance for zero pivot 2.22045e-14
>             using diagonal shift on blocks to prevent zero pivot
>             matrix ordering: nd
>             factor fill ratio given 5, needed 0
>               Factored matrix follows:
>                 Matrix Object:                 1 MPI processes
>                   type: seqaij
>                   rows=0, cols=0
>                   package used to perform factorization: petsc
>                   total: nonzeros=1, allocated nonzeros=1
>                   total number of mallocs used during MatSetValues calls =0
>                     not using I-node routines
>           linear system matrix = precond matrix:
>           Matrix Object:           1 MPI processes
>             type: seqaij
>             rows=0, cols=0
>             total: nonzeros=0, allocated nonzeros=0
>             total number of mallocs used during MatSetValues calls =0
>               not using I-node routines
>       [1] number of local blocks = 1, first local block number = 1
>         [1] local block number 0
>         - - - - - - - - - - - - - - - - - -
>       [2] number of local blocks = 1, first local block number = 2
>         [2] local block number 0
>         - - - - - - - - - - - - - - - - - -
>       [3] number of local blocks = 1, first local block number = 3
>         [3] local block number 0
>         - - - - - - - - - - - - - - - - - -
>       linear system matrix = precond matrix:
>       Matrix Object:       4 MPI processes
>         type: mpiaij
>         rows=395, cols=395
>         total: nonzeros=32037, allocated nonzeros=32037
>         total number of mallocs used during MatSetValues calls =0
>           not using I-node (on process 0) routines
>   Down solver (pre-smoother) on level 1 -------------------------------
>     KSP Object:    (mg_levels_1_)     4 MPI processes
>       type: chebyshev
>         Chebyshev: eigenvalue estimates:  min = 0.0636225, max = 1.33607
>       maximum iterations=2
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using nonzero initial guess
>       using NONE norm type for convergence test
>     PC Object:    (mg_levels_1_)     4 MPI processes
>       type: jacobi
>       linear system matrix = precond matrix:
>       Matrix Object:       4 MPI processes
>         type: mpiaij
>         rows=23918, cols=23918
>         total: nonzeros=818732, allocated nonzeros=818732
>         total number of mallocs used during MatSetValues calls =0
>           not using I-node (on process 0) routines
>   Up solver (post-smoother) same as down solver (pre-smoother)
>   Down solver (pre-smoother) on level 2 -------------------------------
>     KSP Object:    (mg_levels_2_)     4 MPI processes
>       type: chebyshev
>         Chebyshev: eigenvalue estimates:  min = 0.0971369, max = 2.03987
>       maximum iterations=2
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using nonzero initial guess
>       using NONE norm type for convergence test
>     PC Object:    (mg_levels_2_)     4 MPI processes
>       type: jacobi
>       linear system matrix = precond matrix:
>       Matrix Object:       4 MPI processes
>         type: mpiaij
>         rows=262144, cols=262144
>         total: nonzeros=1835008, allocated nonzeros=1835008
>         total number of mallocs used during MatSetValues calls =0
>   Up solver (post-smoother) same as down solver (pre-smoother)
>   linear system matrix = precond matrix:
>   Matrix Object:   4 MPI processes
>     type: mpiaij
>     rows=262144, cols=262144
>     total: nonzeros=1835008, allocated nonzeros=1835008
>     total number of mallocs used during MatSetValues calls =0
> #PETSc Option Table entries:
> -ksp_view
> -options_left
> -pc_gamg_agg_nsmooths 1
> -pc_type gamg
> #End of PETSc Option Table entries
> There are no unused options.
> 
> 
> Thank you,
> Michele