Smoother settings for AMG

Wed Jul 29 15:54:35 CDT 2009

Hi,

I am trying to solve a system of equations and I am having difficulty
picking the right smoothers for AMG (using ML as pc_type) in PETSc for
parallel execution. First here is what happens in terms of CG (ksp_type)
iteration counts (both columns use block jacobi):

cpus	|	AMG w/ ICC(0) x1	|	AMG w/ SOR x4
------------------------------------------------------
1	|		43		|		243
4	|		699		|		379

x1 or x4 means 1 or 4 iterations of smoother application at each AMG
level (all details from ksp view for the 4 cpu run are below). The main
observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls
apart in parallel. SOR on the other hand experiences a 1.5X increase in
iteration count which is totally expected from the quality of coarsening
ML delivers in parallel.

I basically would like to find a way (if possible) to have the number of
iterations in parallel stay with 1-2X of 1 cpu iteration count for the
AMG w/ ICC case. Is there a way to achieve this?

Thanks,
Harun

%%%%%%%%%%%%%%%%%%%%%%%%%
AMG w/ ICC(0) x1 ksp_view
%%%%%%%%%%%%%%%%%%%%%%%%%
KSP Object:
  type: cg
  maximum iterations=10000
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
  left preconditioning
PC Object:
  type: ml
    MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1,
post-smooths=1
  Coarse gride solver -- level 0 -------------------------------
    KSP Object:(mg_coarse_)
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_coarse_)
      type: redundant
        Redundant preconditioner: First (color=0) of 4 PCs follows
      KSP Object:(mg_coarse_redundant_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
      PC Object:(mg_coarse_redundant_)
        type: lu
          LU: out-of-place factorization
            matrix ordering: nd
          LU: tolerance for zero pivot 1e-12
          LU: factor fill ratio needed 2.17227
               Factored matrix follows
              Matrix Object:
                type=seqaij, rows=283, cols=283
                total: nonzeros=21651, allocated nonzeros=21651
                  using I-node routines: found 186 nodes, limit used is
5
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=283, cols=283
          total: nonzeros=9967, allocated nonzeros=14150
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=283, cols=283
        total: nonzeros=9967, allocated nonzeros=9967
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:(mg_levels_1_)
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_levels_1_)
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC
objects:
      KSP Object:(mg_levels_1_sub_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
      PC Object:(mg_levels_1_sub_)
        type: icc
          ICC: 0 levels of fill
          ICC: factor fill ratio allocated 1
          ICC: using Manteuffel shift
          ICC: factor fill ratio needed 0.514899
               Factored matrix follows
              Matrix Object:
                type=seqsbaij, rows=2813, cols=2813
                total: nonzeros=48609, allocated nonzeros=48609
                    block size is 1
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=2813, cols=2813
          total: nonzeros=94405, allocated nonzeros=94405
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=10654, cols=10654
        total: nonzeros=376634, allocated nonzeros=376634
          not using I-node (on process 0) routines
  Up solver (post-smoother) on level 1 -------------------------------
    KSP Object:(mg_levels_1_)
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_levels_1_)
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC
objects:
      KSP Object:(mg_levels_1_sub_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
      PC Object:(mg_levels_1_sub_)
        type: icc
          ICC: 0 levels of fill
          ICC: factor fill ratio allocated 1
          ICC: using Manteuffel shift
          ICC: factor fill ratio needed 0.514899
               Factored matrix follows
              Matrix Object:
                type=seqsbaij, rows=2813, cols=2813
                total: nonzeros=48609, allocated nonzeros=48609
                    block size is 1
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=2813, cols=2813
          total: nonzeros=94405, allocated nonzeros=94405
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=10654, cols=10654
        total: nonzeros=376634, allocated nonzeros=376634
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object:(mg_levels_2_)
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_levels_2_)
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC
objects:
      KSP Object:(mg_levels_2_sub_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
      PC Object:(mg_levels_2_sub_)
        type: icc
          ICC: 0 levels of fill
          ICC: factor fill ratio allocated 1
          ICC: using Manteuffel shift
          ICC: factor fill ratio needed 0.519045
               Factored matrix follows
              Matrix Object:
                type=seqsbaij, rows=101164, cols=101164
                total: nonzeros=1378558, allocated nonzeros=1378558
                    block size is 1
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=101164, cols=101164
          total: nonzeros=2655952, allocated nonzeros=5159364
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=411866, cols=411866
        total: nonzeros=10941434, allocated nonzeros=42010332
          not using I-node (on process 0) routines
  Up solver (post-smoother) on level 2 -------------------------------
    KSP Object:(mg_levels_2_)
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_levels_2_)
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC
objects:
      KSP Object:(mg_levels_2_sub_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
      PC Object:(mg_levels_2_sub_)
        type: icc
          ICC: 0 levels of fill
          ICC: factor fill ratio allocated 1
          ICC: using Manteuffel shift
          ICC: factor fill ratio needed 0.519045
               Factored matrix follows
              Matrix Object:
                type=seqsbaij, rows=101164, cols=101164
                total: nonzeros=1378558, allocated nonzeros=1378558
                    block size is 1
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=101164, cols=101164
          total: nonzeros=2655952, allocated nonzeros=5159364
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=411866, cols=411866
        total: nonzeros=10941434, allocated nonzeros=42010332
          not using I-node (on process 0) routines
  linear system matrix = precond matrix:
  Matrix Object:
    type=mpiaij, rows=411866, cols=411866
    total: nonzeros=10941434, allocated nonzeros=42010332
      not using I-node (on process 0) routines

%%%%%%%%%%%%%%%%%%%%%%
AMG w/ SOR x4 ksp_view
%%%%%%%%%%%%%%%%%%%%%%

KSP Object:
  type: cg
  maximum iterations=10000
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
  left preconditioning
PC Object:
  type: ml
    MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1,
post-smooths=1
  Coarse gride solver -- level 0 -------------------------------
    KSP Object:(mg_coarse_)
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_coarse_)
      type: redundant
        Redundant preconditioner: First (color=0) of 4 PCs follows
      KSP Object:(mg_coarse_redundant_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
      PC Object:(mg_coarse_redundant_)
        type: lu
          LU: out-of-place factorization
            matrix ordering: nd
          LU: tolerance for zero pivot 1e-12
          LU: factor fill ratio needed 2.17227
               Factored matrix follows
              Matrix Object:
                type=seqaij, rows=283, cols=283
                total: nonzeros=21651, allocated nonzeros=21651
                  using I-node routines: found 186 nodes, limit used is
5
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=283, cols=283
          total: nonzeros=9967, allocated nonzeros=14150
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=283, cols=283
        total: nonzeros=9967, allocated nonzeros=9967
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:(mg_levels_1_)
      type: richardson
        Richardson: damping factor=1
      maximum iterations=4, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_levels_1_)
      type: sor
        SOR: type = local_symmetric, iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=10654, cols=10654
        total: nonzeros=376634, allocated nonzeros=376634
          not using I-node (on process 0) routines
  Up solver (post-smoother) on level 1 -------------------------------
    KSP Object:(mg_levels_1_)
      type: richardson
        Richardson: damping factor=1
      maximum iterations=4
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_levels_1_)
      type: sor
        SOR: type = local_symmetric, iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=10654, cols=10654
        total: nonzeros=376634, allocated nonzeros=376634
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object:(mg_levels_2_)
      type: richardson
        Richardson: damping factor=1
      maximum iterations=4, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_levels_2_)
      type: sor
        SOR: type = local_symmetric, iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=411866, cols=411866
        total: nonzeros=10941434, allocated nonzeros=42010332
          not using I-node (on process 0) routines
  Up solver (post-smoother) on level 2 -------------------------------
    KSP Object:(mg_levels_2_)
      type: richardson
        Richardson: damping factor=1
      maximum iterations=4
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(mg_levels_2_)
      type: sor
        SOR: type = local_symmetric, iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=411866, cols=411866
        total: nonzeros=10941434, allocated nonzeros=42010332
          not using I-node (on process 0) routines
  linear system matrix = precond matrix:
  Matrix Object:
    type=mpiaij, rows=411866, cols=411866
    total: nonzeros=10941434, allocated nonzeros=42010332
      not using I-node (on process 0) routines