[petsc-users] [MEF-QUAR] Re: Any changes in ML usage between 3.1-p8 -> 3.3-p6?

Wed Apr 17 11:26:47 CDT 2013

On 04.05.2013 08:34, Jozsef Bakosi wrote:
> Hi folks,
> 
> In switching from 3.1-p8 to 3.3-p6, keeping the same ML ml-6.2.tar.gz, I get
> an indefinite preconditioner with the newer PETSc version. Has anything
> substantial changed in how PCs are handled, e.g. in the defaults?
> 
> I know this request is pretty general, I would just like to know where to start
> looking, where changes in PETSc might be clobbering the (supposedly same)
> behavior of ML.
> 

Alright, here is a little more information about what we see. Running the same
setup/solve with ML (same ML and application source code), but switching from
PETSc 3.1-p8 to 3.3-p6, behaves differently and in some cases diverges where
the old version converged.

I attach the output from KSPView() called after KSPSetUp() for the 3.1-p8
(old.out) and for the 3.3-p6 (new.out), both running on 4 MPI ranks.

A diff reveals some notable differences:

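  * using (PRECONDITIONED -> NONE) norm type for convergence test for
    PPE_mg_coarse_ and PPE_mg_coarse_redundant_

  * (using -> not using) I-node routines for the factored matrix of
    PPE_mg_coarse_redundant_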

  * tolerance for zero pivot (1e-12 -> 2.22045e-14) for PPE_mg_levels_[12]_sub_
    (stayed the same for PPE_mg_coarse_redundant_)
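
For anyone who wants to repeat the comparison, here is a minimal sketch in Python of how the two KSPView() outputs could be diffed. It is only an illustration: the helper names (normalize, view_diff, pivot_tolerances) are hypothetical and not part of PETSc, and the normalization assumes that the "N MPI processes" suffixes and the spacing after "Object:" are the only layout noise worth removing. Other format-only changes (for example "type=seqaij, rows=..." being split across two lines in the newer output) will still show up in the diff.

```python
import difflib
import re

# Trailing "N MPI processes" annotations appear only in the newer output
# format; drop them so they do not show up as differences.
_MPI_SUFFIX = re.compile(r"\s+\d+ MPI processes\s*$")
_PREFIX = re.compile(r"PC Object:\((\w+)\)")
_PIVOT = re.compile(r"tolerance for zero pivot\s+(\S+)")


def normalize(text):
    """Return non-empty lines with layout-only noise removed."""
    lines = []
    for raw in text.splitlines():
        line = _MPI_SUFFIX.sub("", raw).strip()
        line = re.sub(r"\s+", " ", line)            # collapse runs of spaces
        line = re.sub(r"Object:\s*\(", "Object:(", line)
        if line:
            lines.append(line)
    return lines


def view_diff(old_text, new_text):
    """Unified diff (no context lines) of two normalized KSPView outputs."""
    return list(difflib.unified_diff(
        normalize(old_text), normalize(new_text),
        fromfile="old.out", tofile="new.out", lineterm="", n=0))


def pivot_tolerances(text):
    """Map each PC options prefix to its reported zero-pivot tolerance."""
    result = {}
    prefix = None
    for line in normalize(text):
        match = _PREFIX.search(line)
        if match:
            prefix = match.group(1)
            continue
        match = _PIVOT.search(line)
        if match and prefix is not None:
            result[prefix] = float(match.group(1))
    return result
```

Passing the contents of old.out and new.out to view_diff() lists the changed lines, and calling pivot_tolerances() on each file gives a per-prefix table of the zero-pivot tolerances that can be compared directly.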

So we are wondering what might have changed in the PETSc defaults around how
PCs, in particular ML, are used.

Please let me know if I can give you more information.

Thanks,
Jozsef
-------------- next part --------------
KSP Object:(PPE_)
  type: cg
  maximum iterations=10000
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
  left preconditioning
  using nonzero initial guess
  using UNPRECONDITIONED norm type for convergence test
PC Object:(PPE_)
  type: ml
    MG: type is MULTIPLICATIVE, levels=3 cycles=v
      Cycles per PCApply=1
  Coarse grid solver -- level 0 presmooths=1 postsmooths=1 -----
    KSP Object:(PPE_mg_coarse_)
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using PRECONDITIONED norm type for convergence test
    PC Object:(PPE_mg_coarse_)
      type: redundant
        Redundant preconditioner: First (color=0) of 4 PCs follows
      KSP Object:(PPE_mg_coarse_redundant_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using PRECONDITIONED norm type for convergence test
      PC Object:(PPE_mg_coarse_redundant_)
        type: lu
          LU: out-of-place factorization
          tolerance for zero pivot 2.22045e-16
          matrix ordering: nd
          factor fill ratio given 5, needed 3.80946
            Factored matrix follows:
              Matrix Object:
                type=seqaij, rows=338, cols=338
                package used to perform factorization: petsc
                total: nonzeros=49302, allocated nonzeros=49302
                  using I-node routines: found 273 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=338, cols=338
          total: nonzeros=12942, allocated nonzeros=26026
            not using I-node routines
      KSP Object:(PPE_mg_coarse_redundant_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using PRECONDITIONED norm type for convergence test
      PC Object:(PPE_mg_coarse_redundant_)
        type: lu
          LU: out-of-place factorization
          tolerance for zero pivot 2.22045e-16
          matrix ordering: nd
          factor fill ratio given 5, needed 3.80946
            Factored matrix follows:
              Matrix Object:
                type=seqaij, rows=338, cols=338
                package used to perform factorization: petsc
                total: nonzeros=49302, allocated nonzeros=49302
                  using I-node routines: found 273 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=338, cols=338
          total: nonzeros=12942, allocated nonzeros=26026
            not using I-node routines
      KSP Object:(PPE_mg_coarse_redundant_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using PRECONDITIONED norm type for convergence test
      PC Object:(PPE_mg_coarse_redundant_)
        type: lu
          LU: out-of-place factorization
          tolerance for zero pivot 2.22045e-16
          matrix ordering: nd
          factor fill ratio given 5, needed 3.80946
            Factored matrix follows:
              Matrix Object:
                type=seqaij, rows=338, cols=338
                package used to perform factorization: petsc
                total: nonzeros=49302, allocated nonzeros=49302
                  using I-node routines: found 273 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=338, cols=338
          total: nonzeros=12942, allocated nonzeros=26026
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=338, cols=338
        total: nonzeros=12942, allocated nonzeros=12942
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 1 smooths=1 --------------------
    KSP Object:(PPE_mg_levels_1_)
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NO norm type for convergence test
    PC Object:(PPE_mg_levels_1_)
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:(PPE_mg_levels_1_sub_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NO norm type for convergence test
      PC Object:(PPE_mg_levels_1_sub_)
        type: icc
          0 levels of fill
          tolerance for zero pivot 1e-12
          using Manteuffel shift
          matrix ordering: natural
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=2265, cols=2265
          total: nonzeros=60131, allocated nonzeros=60131
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=9124, cols=9124
        total: nonzeros=267508, allocated nonzeros=267508
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 smooths=1 --------------------
    KSP Object:(PPE_mg_levels_2_)
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NO norm type for convergence test
    PC Object:(PPE_mg_levels_2_)
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:(PPE_mg_levels_2_sub_)
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NO norm type for convergence test
      PC Object:(PPE_mg_levels_2_sub_)
        type: icc
          0 levels of fill
          tolerance for zero pivot 1e-12
          using Manteuffel shift
          matrix ordering: natural
        linear system matrix = precond matrix:
        Matrix Object:
          type=seqaij, rows=59045, cols=59045
          total: nonzeros=1504413, allocated nonzeros=1594215
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=mpiaij, rows=236600, cols=236600
        total: nonzeros=6183334, allocated nonzeros=12776400
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:
    type=mpiaij, rows=236600, cols=236600
    total: nonzeros=6183334, allocated nonzeros=12776400
      not using I-node (on process 0) routines

-------------- next part --------------
KSP Object:(PPE_) 4 MPI processes
  type: cg
  maximum iterations=10000
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
  left preconditioning
  using nonzero initial guess
  using UNPRECONDITIONED norm type for convergence test
PC Object:(PPE_) 4 MPI processes
  type: ml
    MG: type is MULTIPLICATIVE, levels=3 cycles=v
      Cycles per PCApply=1
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (PPE_mg_coarse_)     4 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (PPE_mg_coarse_)     4 MPI processes
      type: redundant
        Redundant preconditioner: First (color=0) of 4 PCs follows
      KSP Object:      (PPE_mg_coarse_redundant_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (PPE_mg_coarse_redundant_)       1 MPI processes
        type: lu
          LU: out-of-place factorization
          tolerance for zero pivot 2.22045e-16
          matrix ordering: nd
          factor fill ratio given 5, needed 3.80946
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=338, cols=338
                package used to perform factorization: petsc
                total: nonzeros=49302, allocated nonzeros=49302
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=338, cols=338
          total: nonzeros=12942, allocated nonzeros=26026
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       4 MPI processes
        type: mpiaij
        rows=338, cols=338
        total: nonzeros=12942, allocated nonzeros=12942
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (PPE_mg_levels_1_)     4 MPI processes
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (PPE_mg_levels_1_)     4 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (PPE_mg_levels_1_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (PPE_mg_levels_1_sub_)       1 MPI processes
        type: icc
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using Manteuffel shift
          matrix ordering: natural
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=2265, cols=2265
          total: nonzeros=60131, allocated nonzeros=60131
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       4 MPI processes
        type: mpiaij
        rows=9124, cols=9124
        total: nonzeros=267508, allocated nonzeros=267508
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) on level 1 -------------------------------
    KSP Object:    (PPE_mg_levels_1_)     4 MPI processes
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (PPE_mg_levels_1_)     4 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (PPE_mg_levels_1_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (PPE_mg_levels_1_sub_)       1 MPI processes
        type: icc
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using Manteuffel shift
          matrix ordering: natural
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=2265, cols=2265
          total: nonzeros=60131, allocated nonzeros=60131
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       4 MPI processes
        type: mpiaij
        rows=9124, cols=9124
        total: nonzeros=267508, allocated nonzeros=267508
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object:    (PPE_mg_levels_2_)     4 MPI processes
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (PPE_mg_levels_2_)     4 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (PPE_mg_levels_2_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (PPE_mg_levels_2_sub_)       1 MPI processes
        type: icc
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using Manteuffel shift
          matrix ordering: natural
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=59045, cols=59045
          total: nonzeros=1504413, allocated nonzeros=1594215
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       4 MPI processes
        type: mpiaij
        rows=236600, cols=236600
        total: nonzeros=6183334, allocated nonzeros=12776400
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) on level 2 -------------------------------
    KSP Object:    (PPE_mg_levels_2_)     4 MPI processes
      type: richardson
        Richardson: damping factor=0.9
      maximum iterations=1
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (PPE_mg_levels_2_)     4 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 4
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (PPE_mg_levels_2_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (PPE_mg_levels_2_sub_)       1 MPI processes
        type: icc
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using Manteuffel shift
          matrix ordering: natural
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=59045, cols=59045
          total: nonzeros=1504413, allocated nonzeros=1594215
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       4 MPI processes
        type: mpiaij
        rows=236600, cols=236600
        total: nonzeros=6183334, allocated nonzeros=12776400
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  linear system matrix = precond matrix:
  Matrix Object:   4 MPI processes
    type: mpiaij
    rows=236600, cols=236600
    total: nonzeros=6183334, allocated nonzeros=12776400
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines