[petsc-users] Problem with AMG packages

Pierre Jolivet jolivet at ann.jussieu.fr
Tue Oct 8 16:11:31 CDT 2013


Dear all,
I'm trying to compare linear solvers for a simple Poisson equation in 3D.
I thought that MG was the way to go, but looking at my logs, the
performance looks abysmal (I know that the matrices are way too small, but
if I go bigger, the solver never even completes a single iteration...). Even
though this is neither the BoomerAMG nor the ML mailing list, could you
please tell me if PETSc sets some default flags that make the setup of
those solvers so slow for this simple problem? In comparison, the
performance of (G)ASM is much better.
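
For reference, here is roughly how I am invoking the three solvers. The
option names are the standard PETSc ones, but the values below are
illustrative and not copied verbatim from my job scripts:

    # common Krylov settings (consistent with the -ksp_view output below)
    -ksp_type gmres -ksp_rtol 1e-8 -ksp_max_it 200 -ksp_view -log_summary

    # run 1: hypre BoomerAMG
    -pc_type hypre -pc_hypre_type boomeramg

    # run 2: Trilinos ML
    -pc_type ml

    # run 3: (G)ASM for comparison
    -pc_type gasm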

Thanks in advance for your help.

PS: first the BoomerAMG log, then the ML one (much more verbose, sorry).

  0 KSP Residual norm 1.599647112604e+00
  1 KSP Residual norm 5.450838232404e-02
  2 KSP Residual norm 3.549673478318e-03
  3 KSP Residual norm 2.901826808841e-04
  4 KSP Residual norm 2.574235778729e-05
  5 KSP Residual norm 2.253410171682e-06
  6 KSP Residual norm 1.871067784877e-07
  7 KSP Residual norm 1.681162800670e-08
  8 KSP Residual norm 2.120841512414e-09
KSP Object: 2048 MPI processes
  type: gmres
    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    GMRES: happy breakdown tolerance 1e-30
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 2048 MPI processes
  type: hypre
    HYPRE BoomerAMG preconditioning
    HYPRE BoomerAMG: Cycle type V
    HYPRE BoomerAMG: Maximum number of levels 25
    HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
    HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
    HYPRE BoomerAMG: Threshold for strong coupling 0.25
    HYPRE BoomerAMG: Interpolation truncation factor 0
    HYPRE BoomerAMG: Interpolation: max elements per row 0
    HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
    HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
    HYPRE BoomerAMG: Maximum row sums 0.9
    HYPRE BoomerAMG: Sweeps down         1
    HYPRE BoomerAMG: Sweeps up           1
    HYPRE BoomerAMG: Sweeps on coarse    1
    HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
    HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
    HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
    HYPRE BoomerAMG: Relax weight  (all)      1
    HYPRE BoomerAMG: Outer relax weight (all) 1
    HYPRE BoomerAMG: Using CF-relaxation
    HYPRE BoomerAMG: Measure type        local
    HYPRE BoomerAMG: Coarsen type        Falgout
    HYPRE BoomerAMG: Interpolation type  classical
  linear system matrix = precond matrix:
  Matrix Object:   2048 MPI processes
    type: mpiaij
    rows=4173281, cols=4173281
    total: nonzeros=102576661, allocated nonzeros=102576661
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines
 --- system solved with PETSc (in 1.005199e+02 seconds)
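
(In case it is relevant: the BoomerAMG run above uses the defaults reported
by -ksp_view, i.e. Falgout coarsening, classical interpolation and a strong
threshold of 0.25, which is generally considered too low for 3D. These are
the overrides I was planning to try next; the values are guesses on my part,
not something I have validated:

    # candidate BoomerAMG overrides for 3D (untested guesses)
    -pc_hypre_boomeramg_strong_threshold 0.5
    -pc_hypre_boomeramg_coarsen_type HMIS
    -pc_hypre_boomeramg_interp_type ext+i
    -pc_hypre_boomeramg_P_max 4
    -pc_hypre_boomeramg_agg_nl 1
)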

  0 KSP Residual norm 2.368804472986e-01
  1 KSP Residual norm 5.676430019132e-02
  2 KSP Residual norm 1.898005876002e-02
  3 KSP Residual norm 6.193922902926e-03
  4 KSP Residual norm 2.008448794493e-03
  5 KSP Residual norm 6.390465670228e-04
  6 KSP Residual norm 2.157709394389e-04
  7 KSP Residual norm 7.295973819979e-05
  8 KSP Residual norm 2.358343271482e-05
  9 KSP Residual norm 7.489696222066e-06
 10 KSP Residual norm 2.390946857593e-06
 11 KSP Residual norm 8.068086385140e-07
 12 KSP Residual norm 2.706607789749e-07
 13 KSP Residual norm 8.636910863376e-08
 14 KSP Residual norm 2.761981175852e-08
 15 KSP Residual norm 8.755459874369e-09
 16 KSP Residual norm 2.708848598341e-09
 17 KSP Residual norm 8.968748876265e-10
KSP Object: 2048 MPI processes
  type: gmres
    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    GMRES: happy breakdown tolerance 1e-30
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 2048 MPI processes
  type: ml
    MG: type is MULTIPLICATIVE, levels=3 cycles=v
      Cycles per PCApply=1
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (mg_coarse_)     2048 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (mg_coarse_)     2048 MPI processes
      type: redundant
        Redundant preconditioner: First (color=0) of 2048 PCs follows
      KSP Object:      (mg_coarse_redundant_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_coarse_redundant_)       1 MPI processes
        type: lu
          LU: out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: nd
          factor fill ratio given 5, needed 4.38504
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=2055, cols=2055
                package used to perform factorization: petsc
                total: nonzeros=2476747, allocated nonzeros=2476747
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 1638 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=2055, cols=2055
          total: nonzeros=564817, allocated nonzeros=1093260
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       2048 MPI processes
        type: mpiaij
        rows=2055, cols=2055
        total: nonzeros=564817, allocated nonzeros=564817
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (mg_levels_1_)     2048 MPI processes
      type: richardson
        Richardson: damping factor=1
      maximum iterations=2
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_1_)     2048 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       2048 MPI processes
        type: mpiaij
        rows=30194, cols=30194
        total: nonzeros=3368414, allocated nonzeros=3368414
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object:    (mg_levels_2_)     2048 MPI processes
      type: richardson
        Richardson: damping factor=1
      maximum iterations=2
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_2_)     2048 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       2048 MPI processes
        type: mpiaij
        rows=531441, cols=531441
        total: nonzeros=12476324, allocated nonzeros=12476324
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:   2048 MPI processes
    type: mpiaij
    rows=531441, cols=531441
    total: nonzeros=12476324, allocated nonzeros=12476324
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines
 --- system solved with PETSc (in 2.407844e+02 seconds)
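
(Also for reference: the ML run above builds only 3 levels and solves the
2055 x 2055 coarse problem with a redundant LU on each of the 2048
processes. The options below are the ones I understand control that part of
the setup; the values are placeholders, not settings I have tried yet:

    # coarse-level controls for -pc_type ml (placeholder values)
    -pc_ml_maxNlevels 4                 # number of ML levels
    -mg_coarse_pc_type redundant        # coarse PC (the current default shown above)
    -mg_coarse_pc_redundant_number 16   # factor on 16 subgroups instead of one copy per process
)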
