[petsc-users] Problem with AMG packages
Barry Smith
bsmith at mcs.anl.gov
Tue Oct 8 16:16:55 CDT 2013
We need the output from running with -log_summary -pc_mg_log.
Also, you can run with PETSc's own AMG, called GAMG (run with -pc_type gamg). This will give the most useful information about where the time is being spent.
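For example, something along these lines (the executable name is a stand-in for your own code; the options are standard PETSc runtime options picked up by KSPSetFromOptions):

  mpiexec -n 2048 ./your_app -ksp_type gmres -ksp_monitor -pc_type gamg -log_summary -pc_mg_log

-log_summary prints per-event timings (PCSetUp, PCApply, MatMult, ...) at the end of the run, and -pc_mg_log logs each multigrid level separately, which is what you need to see whether the setup or the solve is the bottleneck.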
Barry
On Oct 8, 2013, at 4:11 PM, Pierre Jolivet <jolivet at ann.jussieu.fr> wrote:
> Dear all,
> I'm trying to compare linear solvers for a simple Poisson equation in 3D.
> I thought that MG was the way to go, but looking at my log, the
> performance looks abysmal (I know that the matrices are way too small,
> but if I go bigger, it never even completes a single iteration). Even
> though this is neither the BoomerAMG nor the ML mailing list, could you
> please tell me if PETSc sets some default flags that make the setup for
> those solvers so slow for this simple problem? The performance of (G)ASM
> is much better in comparison.
>
> Thanks in advance for your help.
>
> PS: first the BoomerAMG log, then ML (much more verbose, sorry).
>
> 0 KSP Residual norm 1.599647112604e+00
> 1 KSP Residual norm 5.450838232404e-02
> 2 KSP Residual norm 3.549673478318e-03
> 3 KSP Residual norm 2.901826808841e-04
> 4 KSP Residual norm 2.574235778729e-05
> 5 KSP Residual norm 2.253410171682e-06
> 6 KSP Residual norm 1.871067784877e-07
> 7 KSP Residual norm 1.681162800670e-08
> 8 KSP Residual norm 2.120841512414e-09
> KSP Object: 2048 MPI processes
> type: gmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=200, initial guess is zero
> tolerances: relative=1e-08, absolute=1e-50, divergence=10000
> left preconditioning
> using PRECONDITIONED norm type for convergence test
> PC Object: 2048 MPI processes
> type: hypre
> HYPRE BoomerAMG preconditioning
> HYPRE BoomerAMG: Cycle type V
> HYPRE BoomerAMG: Maximum number of levels 25
> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
> HYPRE BoomerAMG: Threshold for strong coupling 0.25
> HYPRE BoomerAMG: Interpolation truncation factor 0
> HYPRE BoomerAMG: Interpolation: max elements per row 0
> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
> HYPRE BoomerAMG: Maximum row sums 0.9
> HYPRE BoomerAMG: Sweeps down 1
> HYPRE BoomerAMG: Sweeps up 1
> HYPRE BoomerAMG: Sweeps on coarse 1
> HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
> HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
> HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
> HYPRE BoomerAMG: Relax weight (all) 1
> HYPRE BoomerAMG: Outer relax weight (all) 1
> HYPRE BoomerAMG: Using CF-relaxation
> HYPRE BoomerAMG: Measure type local
> HYPRE BoomerAMG: Coarsen type Falgout
> HYPRE BoomerAMG: Interpolation type classical
> linear system matrix = precond matrix:
> Matrix Object: 2048 MPI processes
> type: mpiaij
> rows=4173281, cols=4173281
> total: nonzeros=102576661, allocated nonzeros=102576661
> total number of mallocs used during MatSetValues calls =0
> not using I-node (on process 0) routines
> --- system solved with PETSc (in 1.005199e+02 seconds)
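A note on the BoomerAMG defaults shown above: Falgout coarsening, classical interpolation, and a strong threshold of 0.25 are tuned for 2D problems; for 3D the hypre documentation suggests a strong threshold around 0.5, and aggressive coarsening cuts setup time and memory considerably. An untested starting point (these are all existing PETSc options for hypre):

  -pc_hypre_boomeramg_strong_threshold 0.5 -pc_hypre_boomeramg_coarsen_type HMIS -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_agg_nl 1

This usually trades a few extra iterations for a much cheaper setup.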
>
> 0 KSP Residual norm 2.368804472986e-01
> 1 KSP Residual norm 5.676430019132e-02
> 2 KSP Residual norm 1.898005876002e-02
> 3 KSP Residual norm 6.193922902926e-03
> 4 KSP Residual norm 2.008448794493e-03
> 5 KSP Residual norm 6.390465670228e-04
> 6 KSP Residual norm 2.157709394389e-04
> 7 KSP Residual norm 7.295973819979e-05
> 8 KSP Residual norm 2.358343271482e-05
> 9 KSP Residual norm 7.489696222066e-06
> 10 KSP Residual norm 2.390946857593e-06
> 11 KSP Residual norm 8.068086385140e-07
> 12 KSP Residual norm 2.706607789749e-07
> 13 KSP Residual norm 8.636910863376e-08
> 14 KSP Residual norm 2.761981175852e-08
> 15 KSP Residual norm 8.755459874369e-09
> 16 KSP Residual norm 2.708848598341e-09
> 17 KSP Residual norm 8.968748876265e-10
> KSP Object: 2048 MPI processes
> type: gmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=200, initial guess is zero
> tolerances: relative=1e-08, absolute=1e-50, divergence=10000
> left preconditioning
> using PRECONDITIONED norm type for convergence test
> PC Object: 2048 MPI processes
> type: ml
> MG: type is MULTIPLICATIVE, levels=3 cycles=v
> Cycles per PCApply=1
> Using Galerkin computed coarse grid matrices
> Coarse grid solver -- level -------------------------------
> KSP Object: (mg_coarse_) 2048 MPI processes
> type: preonly
> maximum iterations=1, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (mg_coarse_) 2048 MPI processes
> type: redundant
> Redundant preconditioner: First (color=0) of 2048 PCs follows
> KSP Object: (mg_coarse_redundant_) 1 MPI processes
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (mg_coarse_redundant_) 1 MPI processes
> type: lu
> LU: out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> using diagonal shift on blocks to prevent zero pivot
> matrix ordering: nd
> factor fill ratio given 5, needed 4.38504
> Factored matrix follows:
> Matrix Object: 1 MPI processes
> type: seqaij
> rows=2055, cols=2055
> package used to perform factorization: petsc
> total: nonzeros=2476747, allocated nonzeros=2476747
> total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 1638 nodes, limit used is 5
> linear system matrix = precond matrix:
> Matrix Object: 1 MPI processes
> type: seqaij
> rows=2055, cols=2055
> total: nonzeros=564817, allocated nonzeros=1093260
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> linear system matrix = precond matrix:
> Matrix Object: 2048 MPI processes
> type: mpiaij
> rows=2055, cols=2055
> total: nonzeros=564817, allocated nonzeros=564817
> total number of mallocs used during MatSetValues calls =0
> not using I-node (on process 0) routines
> Down solver (pre-smoother) on level 1 -------------------------------
> KSP Object: (mg_levels_1_) 2048 MPI processes
> type: richardson
> Richardson: damping factor=1
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_1_) 2048 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
> linear system matrix = precond matrix:
> Matrix Object: 2048 MPI processes
> type: mpiaij
> rows=30194, cols=30194
> total: nonzeros=3368414, allocated nonzeros=3368414
> total number of mallocs used during MatSetValues calls =0
> not using I-node (on process 0) routines
> Up solver (post-smoother) same as down solver (pre-smoother)
> Down solver (pre-smoother) on level 2 -------------------------------
> KSP Object: (mg_levels_2_) 2048 MPI processes
> type: richardson
> Richardson: damping factor=1
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_2_) 2048 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
> linear system matrix = precond matrix:
> Matrix Object: 2048 MPI processes
> type: mpiaij
> rows=531441, cols=531441
> total: nonzeros=12476324, allocated nonzeros=12476324
> total number of mallocs used during MatSetValues calls =0
> not using I-node (on process 0) routines
> Up solver (post-smoother) same as down solver (pre-smoother)
> linear system matrix = precond matrix:
> Matrix Object: 2048 MPI processes
> type: mpiaij
> rows=531441, cols=531441
> total: nonzeros=12476324, allocated nonzeros=12476324
> total number of mallocs used during MatSetValues calls =0
> not using I-node (on process 0) routines
> --- system solved with PETSc (in 2.407844e+02 seconds)
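A note on the ML run: the coarse problem above (2055 rows) is handled by the redundant PC, i.e. the Galerkin coarse matrix is gathered onto every one of the 2048 processes and each factors it with LU; the factor holds about 2.5 million nonzeros for a 2055-row matrix, so it is more than half dense. That gather plus 2048 redundant factorizations is a plausible reason the ML setup is slow at this scale, though only the -log_summary -pc_mg_log output can confirm it. If it is confirmed, untested knobs to try include how far ML coarsens and what solves the coarse grid, e.g.:

  -pc_ml_maxNlevels <n> -pc_ml_maxCoarseSize <n>          (how far ML coarsens)
  -mg_coarse_ksp_type gmres -mg_coarse_pc_type bjacobi    (a cheaper, inexact coarse solve)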