<div dir="ltr">Hi all, I've run into an unexpected issue with GAMG stagnating for a certain condition. I'm running a 3D high order DG discretization for compressible navier-stokes, using matrix-free gmres+amg, with the relevant petsc configuration:<div><br></div><div>-pc_type gamg </div><div>-ksp_type fgmres </div><div>-pc_gamg_agg_nsmooths 0 </div><div>-mg_levels_ksp_type gmres </div><div>-mg_levels_pc_type bjacobi </div><div>-mg_levels_ksp_max_it 20 </div><div>-mg_levels_ksp_rtol 0.0001 </div><div>-pc_mg_cycle_type v </div><div>-pc_mg_type full<br><div><br></div><div>So FGMRES on top, with AMG using ILU block jacobi + GMRES as a smoother. -ksp_view output pasted at the bottom here. This setup has been working fairly robustly.</div></div><div><br></div><div>I'm testing two small mesh resolutions, with 1,536 cells and 6,144 cells each, where in the jacobian each cell is a 50x50 dense block, with 4 off-diagonal block neighbors each. With that, I'm testing 2 configurations of the same problem, one with mach 0.1 and the other with mach 0.01 (where the latter makes system much worse conditioned, a kind of stress test.)</div><div><br></div><div>In serial everything converges well to relative tolerance 0.01:</div><div>1,536 cells, Mach 0.1: 2 iterations</div><div>6,144 cells, Mach 0.1: 2 iterations</div><div>1,536 cells, Mach 0.01: 5 iterations</div><div>6,144 cells, Mach 0.01: 5 iterations</div><div><br></div><div>In parallel most things converge well, with -np 16 cores here:</div><div>1,536 cells, Mach 0.1: 3 iterations</div><div>6,144 cells, Mach 0.1: 4 iterations</div><div>1,536 cells, Mach 0.01: 11 iterations</div><div><br></div><div>but for the 6,144 cell Mach 0.01 case, it's catastrophically worse:</div><div><div> 0 SNES Function norm 6.934657276072e+05 </div><div> 0 KSP Residual norm 6.934657276072e+05 </div><div> 1 KSP Residual norm 6.934440650708e+05 </div><div> 2 KSP Residual norm 6.934157525695e+05 </div><div> 3 KSP Residual norm 6.934145135179e+05 </div></div><div>...</div><div><div> 48 KSP Residual norm 6.830785654915e+05 </div><div> 49 KSP Residual norm 6.821332742917e+05 </div><div> 50 KSP Residual norm 6.807807049444e+05 </div></div><div><br></div><div>and quickly stalls entirely and won't converge in 100s of iterations. 
The exact same case in serial shows nice convergence:

  0 SNES Function norm 6.934657276072e+05
    0 KSP Residual norm 6.934657276072e+05
    1 KSP Residual norm 1.705989154365e+05
    2 KSP Residual norm 3.183292610749e+04
    3 KSP Residual norm 1.568738082749e+04
    4 KSP Residual norm 9.875297457387e+03
    5 KSP Residual norm 6.489083537720e+03
  Linear solve converged due to CONVERGED_RTOL iterations 5

And the only marginally coarser 1,536-cell case with the same physics is also healthy in parallel on 16 ranks:

  0 SNES Function norm 2.400990060398e+05
    0 KSP Residual norm 2.400990060398e+05
    1 KSP Residual norm 2.391625967890e+05
    2 KSP Residual norm 1.388195699805e+05
    3 KSP Residual norm 3.072388366914e+04
    4 KSP Residual norm 2.151010198865e+04
    5 KSP Residual norm 1.305330349765e+04
    6 KSP Residual norm 8.126579575968e+03
    7 KSP Residual norm 6.186198840355e+03
    8 KSP Residual norm 4.673764041449e+03
    9 KSP Residual norm 3.332141521573e+03
   10 KSP Residual norm 2.811481187948e+03
   11 KSP Residual norm 2.189632613389e+03
  Linear solve converged due to CONVERGED_RTOL iterations 11

Any thoughts here? Is there anything obviously wrong with my setup? Is there any way to reduce the dependence of the iteration count on the parallel decomposition? Obviously I expect somewhat more iterations in parallel, but I didn't expect such a catastrophic failure.

Thanks as always,
Mark

-ksp_view output:
0 TS dt 30. time 0.
  0 SNES Function norm 2.856641938332e+04
    0 KSP Residual norm 2.856641938332e+04
    1 KSP Residual norm 1.562096645358e+03
    2 KSP Residual norm 3.008746074553e+02
    3 KSP Residual norm 1.463990835793e+02
  Linear solve converged due to CONVERGED_RTOL iterations 3
KSP Object: 16 MPI processes
  type: fgmres
    restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=100, initial guess is zero
  tolerances: relative=0.01, absolute=1e-06, divergence=10.
  right preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 16 MPI processes
  type: gamg
    type is FULL, levels=5 cycles=v
      Using externally compute Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level = 0. 0. 0.
        Threshold scaling factor for each level not specified = 1.
        AGG specific options
          Symmetric graph false
          Number of levels to square graph 1
          Number smoothing steps 0
  Coarse grid solver -- level -------------------------------
    KSP Object: (mg_coarse_) 16 MPI processes
      type: preonly
      maximum iterations=10000, initial guess is zero
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_coarse_) 16 MPI processes
      type: bjacobi
        number of blocks = 16
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object: (mg_coarse_sub_) 1 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_coarse_sub_) 1 MPI processes
        type: lu
          out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
          matrix ordering: nd
          factor fill ratio given 5., needed 1.10526
          Factored matrix follows:
            Mat Object: 1 MPI processes
              type: seqaij
              rows=25, cols=25, bs=5
              package used to perform factorization: petsc
              total: nonzeros=525, allocated nonzeros=525
              total number of mallocs used during MatSetValues calls =0
                using I-node routines: found 5 nodes, limit used is 5
        linear system matrix = precond matrix:
        Mat Object: 1 MPI processes
          type: seqaij
          rows=25, cols=25, bs=5
          total: nonzeros=475, allocated nonzeros=475
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 5 nodes, limit used is 5
      linear system matrix = precond matrix:
      Mat Object: 16 MPI processes
        type: mpiaij
        rows=25, cols=25, bs=5
        total: nonzeros=475, allocated nonzeros=475
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 5 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object: (mg_levels_1_) 16 MPI processes
      type: gmres
        restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        happy breakdown tolerance 1e-30
      maximum iterations=20, nonzero initial guess
      tolerances: relative=0.0001, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_1_) 16 MPI processes
      type: bjacobi
        number of blocks = 16
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object: (mg_levels_1_sub_) 1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_levels_1_sub_) 1 MPI processes
        type: ilu
          out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          matrix ordering: natural
          factor fill ratio given 1., needed 1.
          Factored matrix follows:
            Mat Object: 1 MPI processes
              type: seqaij
              rows=75, cols=75, bs=5
              package used to perform factorization: petsc
              total: nonzeros=1925, allocated nonzeros=1925
              total number of mallocs used during MatSetValues calls =0
                using I-node routines: found 15 nodes, limit used is 5
        linear system matrix = precond matrix:
        Mat Object: 1 MPI processes
          type: seqaij
          rows=75, cols=75, bs=5
          total: nonzeros=1925, allocated nonzeros=1925
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 15 nodes, limit used is 5
      linear system matrix = precond matrix:
      Mat Object: 16 MPI processes
        type: mpiaij
        rows=75, cols=75, bs=5
        total: nonzeros=1925, allocated nonzeros=1925
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 15 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object: (mg_levels_2_) 16 MPI processes
      type: gmres
        restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        happy breakdown tolerance 1e-30
      maximum iterations=20, nonzero initial guess
      tolerances: relative=0.0001, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_2_) 16 MPI processes
      type: bjacobi
        number of blocks = 16
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object: (mg_levels_2_sub_) 1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_levels_2_sub_) 1 MPI processes
        type: ilu
          out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          matrix ordering: natural
          factor fill ratio given 1., needed 1.
          Factored matrix follows:
            Mat Object: 1 MPI processes
              type: seqaij
              rows=35, cols=35, bs=5
              package used to perform factorization: petsc
              total: nonzeros=675, allocated nonzeros=675
              total number of mallocs used during MatSetValues calls =0
                using I-node routines: found 7 nodes, limit used is 5
        linear system matrix = precond matrix:
        Mat Object: 1 MPI processes
          type: seqaij
          rows=35, cols=35, bs=5
          total: nonzeros=675, allocated nonzeros=675
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 7 nodes, limit used is 5
      linear system matrix = precond matrix:
      Mat Object: 16 MPI processes
        type: mpiaij
        rows=305, cols=305, bs=5
        total: nonzeros=8675, allocated nonzeros=8675
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 7 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
    KSP Object: (mg_levels_3_) 16 MPI processes
      type: gmres
        restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        happy breakdown tolerance 1e-30
      maximum iterations=20, nonzero initial guess
      tolerances: relative=0.0001, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_3_) 16 MPI processes
      type: bjacobi
        number of blocks = 16
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object: (mg_levels_3_sub_) 1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_levels_3_sub_) 1 MPI processes
        type: ilu
          out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          matrix ordering: natural
          factor fill ratio given 1., needed 1.
          Factored matrix follows:
            Mat Object: 1 MPI processes
              type: seqaij
              rows=50, cols=50, bs=5
              package used to perform factorization: petsc
              total: nonzeros=1050, allocated nonzeros=1050
              total number of mallocs used during MatSetValues calls =0
                using I-node routines: found 10 nodes, limit used is 5
        linear system matrix = precond matrix:
        Mat Object: 1 MPI processes
          type: seqaij
          rows=50, cols=50, bs=5
          total: nonzeros=1050, allocated nonzeros=1050
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 10 nodes, limit used is 5
      linear system matrix = precond matrix:
      Mat Object: 16 MPI processes
        type: mpiaij
        rows=1090, cols=1090, bs=5
        total: nonzeros=32050, allocated nonzeros=32050
        total number of mallocs used during MatSetValues calls =0
          using nonscalable MatPtAP() implementation
          using I-node (on process 0) routines: found 10 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
    KSP Object: (mg_levels_4_) 16 MPI processes
      type: gmres
        restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        happy breakdown tolerance 1e-30
      maximum iterations=20, nonzero initial guess
      tolerances: relative=0.0001, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_4_) 16 MPI processes
      type: bjacobi
        number of blocks = 16
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object: (mg_levels_4_sub_) 1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_levels_4_sub_) 1 MPI processes
        type: ilu
          out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          matrix ordering: natural
          factor fill ratio given 1., needed 1.
          Factored matrix follows:
            Mat Object: 1 MPI processes
              type: seqaij
              rows=4850, cols=4850, bs=5
              package used to perform factorization: petsc
              total: nonzeros=1117500, allocated nonzeros=1117500
              total number of mallocs used during MatSetValues calls =0
                using I-node routines: found 970 nodes, limit used is 5
        linear system matrix = precond matrix:
        Mat Object: 1 MPI processes
          type: seqaij
          rows=4850, cols=4850, bs=5
          total: nonzeros=1117500, allocated nonzeros=1117500
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 970 nodes, limit used is 5
      linear system matrix followed by preconditioner matrix:
      Mat Object: 16 MPI processes
        type: mffd
        rows=76800, cols=76800
          Matrix-free approximation:
            err=1.49012e-08 (relative error in function evaluation)
            Using wp compute h routine
            Does not compute normU
      Mat Object: 16 MPI processes
        type: mpiaij
        rows=76800, cols=76800, bs=5
        total: nonzeros=18880000, allocated nonzeros=18880000
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 970 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix followed by preconditioner matrix:
  Mat Object: 16 MPI processes
    type: mffd
    rows=76800, cols=76800
      Matrix-free approximation:
        err=1.49012e-08 (relative error in function evaluation)
        Using wp compute h routine
        Does not compute normU
  Mat Object: 16 MPI processes
    type: mpiaij
    rows=76800, cols=76800, bs=5
    total: nonzeros=18880000, allocated nonzeros=18880000
    total number of mallocs used during MatSetValues calls =0
      using I-node (on process 0) routines: found 970 nodes, limit used is 5
      Line search: Using full step: fnorm 2.856641938332e+04 gnorm 3.868815397561e+03
  1 SNES Function norm 3.868815397561e+03