<div dir="ltr">Dear PETSc users/developers,<div><br></div><div>I recently switched from PETSc-3.4 to PETSc-3.7 and found that some default settings for the "mg" (multigrid) preconditioner have changed.</div><div><br></div><div>We were solving a linear system, passing the following options on the command line:</div><div><font face="monospace, monospace">-ksp_type      fgmres </font><div><font face="monospace, monospace">-ksp_max_it    100000</font></div><div><font face="monospace, monospace">-ksp_rtol      0.000001 </font></div><div><font face="monospace, monospace">-pc_type       mg</font></div><div><font face="monospace, monospace">-ksp_view</font></div><div><br></div><div>The output of the KSP view is as follows:</div><div><br></div><div><div><font face="monospace, monospace">KSP Object: 128 MPI processes</font></div><div><font face="monospace, monospace">  type: fgmres</font></div><div><font face="monospace, monospace">    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</font></div><div><font face="monospace, monospace">    GMRES: happy breakdown tolerance 1e-30</font></div><div><font face="monospace, monospace">  maximum iterations=100000, initial guess is zero</font></div><div><font face="monospace, monospace">  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000</font></div><div><font face="monospace, monospace">  right preconditioning</font></div><div><font face="monospace, monospace">  using UNPRECONDITIONED norm type for convergence test</font></div><div><font face="monospace, monospace">PC Object: 128 MPI processes</font></div><div><font face="monospace, monospace">  type: mg</font></div><div><font face="monospace, monospace">    MG: type is MULTIPLICATIVE, levels=1 cycles=v</font></div><div><font face="monospace, monospace">      Cycles per PCApply=1</font></div><div><font face="monospace, monospace">      Not using Galerkin computed coarse grid matrices</font></div><div><font 
face="monospace, monospace">  Coarse grid solver -- level -------------------------------</font></div><div><font face="monospace, monospace">    KSP Object:    (mg_levels_0_)     128 MPI processes</font></div><div><font face="monospace, monospace">      type: chebyshev</font></div><div><font face="monospace, monospace">        Chebyshev: eigenvalue estimates:  min = 0.223549, max = 2.45903</font></div><div><font face="monospace, monospace">        Chebyshev: estimated using:  [0 0.1; 0 1.1]</font></div><div><font face="monospace, monospace">        KSP Object:        (mg_levels_0_est_)         128 MPI processes</font></div><div><font face="monospace, monospace">          type: gmres</font></div><div><font face="monospace, monospace">            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</font></div><div><font face="monospace, monospace">            GMRES: happy breakdown tolerance 1e-30</font></div><div><font face="monospace, monospace">          maximum iterations=10, initial guess is zero</font></div><div><font face="monospace, monospace">          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</font></div><div><font face="monospace, monospace">          left preconditioning</font></div><div><font face="monospace, monospace">          using NONE norm type for convergence test</font></div><div><font face="monospace, monospace">        PC Object:        (mg_levels_0_)         128 MPI processes</font></div><div><font face="monospace, monospace">          type: sor</font></div><div><font face="monospace, monospace">            SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</font></div><div><font face="monospace, monospace">          linear system matrix followed by preconditioner matrix:</font></div><div><font face="monospace, monospace">          Matrix Object:           128 MPI processes</font></div><div><font face="monospace, monospace">            type: 
mpiaij</font></div><div><font face="monospace, monospace">            rows=279669, cols=279669</font></div><div><font face="monospace, monospace">            total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">            total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">              not using I-node (on process 0) routines</font></div><div><font face="monospace, monospace">          Matrix Object:           128 MPI processes</font></div><div><font face="monospace, monospace">            type: mpiaij</font></div><div><font face="monospace, monospace">            rows=279669, cols=279669</font></div><div><font face="monospace, monospace">            total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">            total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">              not using I-node (on process 0) routines</font></div><div><font face="monospace, monospace">      maximum iterations=1, initial guess is zero</font></div><div><font face="monospace, monospace">      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</font></div><div><font face="monospace, monospace">      left preconditioning</font></div><div><font face="monospace, monospace">      using NONE norm type for convergence test</font></div><div><font face="monospace, monospace">    PC Object:    (mg_levels_0_)     128 MPI processes</font></div><div><font face="monospace, monospace">      type: sor</font></div><div><font face="monospace, monospace">        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</font></div><div><font face="monospace, monospace">      linear system matrix followed by preconditioner matrix:</font></div><div><font face="monospace, monospace">      Matrix Object:       128 MPI processes</font></div><div><font 
face="monospace, monospace">        type: mpiaij</font></div><div><font face="monospace, monospace">        rows=279669, cols=279669</font></div><div><font face="monospace, monospace">        total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">        total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">          not using I-node (on process 0) routines</font></div><div><font face="monospace, monospace">      Matrix Object:       128 MPI processes</font></div><div><font face="monospace, monospace">        type: mpiaij</font></div><div><font face="monospace, monospace">        rows=279669, cols=279669</font></div><div><font face="monospace, monospace">        total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">        total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">          not using I-node (on process 0) routines</font></div><div><font face="monospace, monospace">  linear system matrix followed by preconditioner matrix:</font></div><div><font face="monospace, monospace">  Matrix Object:   128 MPI processes</font></div><div><font face="monospace, monospace">    type: mpiaij</font></div><div><font face="monospace, monospace">    rows=279669, cols=279669</font></div><div><font face="monospace, monospace">    total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">    total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">      not using I-node (on process 0) routines</font></div><div><font face="monospace, monospace">  Matrix Object:   128 MPI processes</font></div><div><font face="monospace, monospace">    type: mpiaij</font></div><div><font face="monospace, monospace">    rows=279669, cols=279669</font></div><div><font face="monospace, monospace">    
total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">    total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">      not using I-node (on process 0) routines</font></div></div><div><br></div><div>When I build the same program using PETSc-3.7 and run it with the same options we observe that the runtime increases and the convergence is slightly different. The output of the KSP view is:</div><div><br></div><div><font face="monospace, monospace">KSP Object: 128 MPI processes<br></font></div><div><div><font face="monospace, monospace">  type: fgmres</font></div><div><font face="monospace, monospace">    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</font></div><div><font face="monospace, monospace">    GMRES: happy breakdown tolerance 1e-30</font></div><div><font face="monospace, monospace">  maximum iterations=100000, initial guess is zero</font></div><div><font face="monospace, monospace">  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000.</font></div><div><font face="monospace, monospace">  right preconditioning</font></div><div><font face="monospace, monospace">  using UNPRECONDITIONED norm type for convergence test</font></div><div><font face="monospace, monospace">PC Object: 128 MPI processes</font></div><div><font face="monospace, monospace">  type: mg</font></div><div><font face="monospace, monospace">    MG: type is MULTIPLICATIVE, levels=1 cycles=v</font></div><div><font face="monospace, monospace">      Cycles per PCApply=1</font></div><div><font face="monospace, monospace">      Not using Galerkin computed coarse grid matrices</font></div><div><font face="monospace, monospace">  Coarse grid solver -- level -------------------------------</font></div><div><font face="monospace, monospace">    KSP Object:    (mg_levels_0_)     128 MPI processes</font></div><div><font face="monospace, 
monospace">      type: chebyshev</font></div><div><font face="monospace, monospace">        Chebyshev: eigenvalue estimates:  min = 0.223549, max = 2.45903</font></div><div><font face="monospace, monospace">        Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</font></div><div><font face="monospace, monospace">        KSP Object:        (mg_levels_0_esteig_)         128 MPI processes</font></div><div><font face="monospace, monospace">          type: gmres</font></div><div><font face="monospace, monospace">            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</font></div><div><font face="monospace, monospace">            GMRES: happy breakdown tolerance 1e-30</font></div><div><font face="monospace, monospace">          maximum iterations=10, initial guess is zero</font></div><div><font face="monospace, monospace">          <b style="background-color:rgb(255,153,0)">tolerances:  relative=1e-12</b>, absolute=1e-50, divergence=10000.</font></div><div><font face="monospace, monospace">          left preconditioning</font></div><div><font face="monospace, monospace">          <b style="background-color:rgb(255,153,0)">using PRECONDITIONED norm type for convergence test</b></font></div><div><font face="monospace, monospace">      <span style="background-color:rgb(255,153,0)">maximum iterations=2</span>, initial guess is zero</font></div><div><font face="monospace, monospace">      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</font></div><div><font face="monospace, monospace">      left preconditioning</font></div><div><font face="monospace, monospace">      using NONE norm type for convergence test</font></div><div><font face="monospace, monospace">    PC Object:    (mg_levels_0_)     128 MPI processes</font></div><div><font face="monospace, monospace">      type: sor</font></div><div><font face="monospace, monospace">        SOR: type = local_symmetric, 
iterations = 1, local iterations = 1, omega = 1.</font></div><div><font face="monospace, monospace">      linear system matrix followed by preconditioner matrix:</font></div><div><font face="monospace, monospace">      Mat Object:       128 MPI processes</font></div><div><font face="monospace, monospace">        type: mpiaij</font></div><div><font face="monospace, monospace">        rows=279669, cols=279669</font></div><div><font face="monospace, monospace">        total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">        total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">          not using I-node (on process 0) routines</font></div><div><font face="monospace, monospace">      Mat Object:       128 MPI processes</font></div><div><font face="monospace, monospace">        type: mpiaij</font></div><div><font face="monospace, monospace">        rows=279669, cols=279669</font></div><div><font face="monospace, monospace">        total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">        total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">          not using I-node (on process 0) routines</font></div><div><font face="monospace, monospace">  linear system matrix followed by preconditioner matrix:</font></div><div><font face="monospace, monospace">  Mat Object:   128 MPI processes</font></div><div><font face="monospace, monospace">    type: mpiaij</font></div><div><font face="monospace, monospace">    rows=279669, cols=279669</font></div><div><font face="monospace, monospace">    total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">    total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">      not using I-node (on process 0) 
routines</font></div><div><font face="monospace, monospace">  Mat Object:   128 MPI processes</font></div><div><font face="monospace, monospace">    type: mpiaij</font></div><div><font face="monospace, monospace">    rows=279669, cols=279669</font></div><div><font face="monospace, monospace">    total: nonzeros=6427943, allocated nonzeros=6427943</font></div><div><font face="monospace, monospace">    total number of mallocs used during MatSetValues calls =0</font></div><div><font face="monospace, monospace">      not using I-node (on process 0) routines</font></div><div><br></div></div><div>I was able to get a closer solution by adding the following options: </div><div><font face="monospace, monospace">-mg_levels_0_esteig_ksp_norm_type   none </font></div><div><font face="monospace, monospace">-mg_levels_0_esteig_ksp_rtol        1.0e-5 </font></div><div><font face="monospace, monospace">-mg_levels_ksp_max_it               1</font><br></div><div><br></div><div>But I still cannot reach the same runtime we were observing with PETSc-3.4. Could you please advise me whether I should specify any other options?</div><div><br></div><div>Thank you very much for your support,</div><div>Federico Golfre' Andreasi</div><div><br></div></div></div>