Hello<div><br></div><div>I am the process of trying out some of the multigrid functionality in PETSc and not having much luck. The simple system I am trying to solve is adjoint system of equations resulting from the finite volume discretization of the Euler equation on a 147,456 cell mesh resulting in a linear system of equations of size 5*147,456=737280. All of the test are done on a single processor and use petsc-3.2-p7. My current technique for solving this is to use following options:</div>

<div><div><br></div><div>-matload_block_size 5 -mat_type seqbaij -ksp_type gmres -ksp_max_it 100 -ksp_gmres_restart 50  -ksp_monitor -pc_type asm   -pc_asm_overlap 1 -sub_pc_factor_mat_ordering_type rcm -sub_pc_factor_levels 1 </div>

</div><div>Which results in ~1e-6 convergence in ~50 iterations.  Next I naively tried the following options:</div><div><br></div><div><div>-mat_type seqaij -ksp_type gmres -ksp_max_it 100 -ksp_gmres_restart 50 -ksp_monitor -ksp_view -pc_type ml</div>

</div><div><br></div><div>The result of the ksp_monitor and ksp_view is:</div><div><br></div><div><div>  0 KSP Residual norm 7.366926114851e+70 </div><div>  1 KSP Residual norm 1.744597669120e+61 </div><div>KSP Object: 1 MPI processes</div>

<div>  type: gmres</div><div>    GMRES: restart=50, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>    GMRES: happy breakdown tolerance 1e-30</div><div>  maximum iterations=100, initial guess is zero</div>

<div>  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>  left preconditioning</div><div>  using PRECONDITIONED norm type for convergence test</div><div>PC Object: 1 MPI processes</div><div>  type: ml</div>

<div>    MG: type is MULTIPLICATIVE, levels=5 cycles=v</div><div>      Cycles per PCApply=1</div><div>      Using Galerkin computed coarse grid matrices</div><div>  Coarse grid solver -- level -------------------------------</div>

<div>    KSP Object:    (mg_coarse_)     1 MPI processes    </div><div>      type: preonly</div><div>      maximum iterations=1, initial guess is zero</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div>

<div>      left preconditioning</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_coarse_)     1 MPI processes    </div><div>      type: lu</div><div>        LU: out-of-place factorization</div>

<div>        tolerance for zero pivot 1e-12</div><div>        matrix ordering: nd</div><div>        factor fill ratio given 5, needed 1</div><div>          Factored matrix follows:</div><div>            Matrix Object:             1 MPI processes            </div>

<div>              type: seqaij</div><div>              rows=1, cols=1</div><div>              package used to perform factorization: petsc</div><div>              total: nonzeros=1, allocated nonzeros=1</div><div>              total number of mallocs used during MatSetValues calls =0</div>

<div>                not using I-node routines</div><div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=1, cols=1</div>

<div>        total: nonzeros=1, allocated nonzeros=1</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node routines</div><div>  Down solver (pre-smoother) on level 1 -------------------------------</div>

<div>    KSP Object:    (mg_levels_1_)     1 MPI processes    </div><div>      type: richardson</div><div>        Richardson: damping factor=1</div><div>      maximum iterations=1</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div>

<div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using PRECONDITIONED norm type for convergence test</div><div>    PC Object:    (mg_levels_1_)     1 MPI processes    </div><div>

      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</div><div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div>

<div>        type: seqaij</div><div>        rows=2, cols=2</div><div>        total: nonzeros=4, allocated nonzeros=4</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          using I-node routines: found 1 nodes, limit used is 5</div>

<div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  Down solver (pre-smoother) on level 2 -------------------------------</div><div>    KSP Object:    (mg_levels_2_)     1 MPI processes    </div>

<div>      type: richardson</div><div>        Richardson: damping factor=1</div><div>      maximum iterations=1</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>      left preconditioning</div>

<div>      using nonzero initial guess</div><div>      using PRECONDITIONED norm type for convergence test</div><div>    PC Object:    (mg_levels_2_)     1 MPI processes    </div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</div>

<div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=93, cols=93</div><div>        total: nonzeros=3861, allocated nonzeros=3861</div>

<div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node routines</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  Down solver (pre-smoother) on level 3 -------------------------------</div>

<div>    KSP Object:    (mg_levels_3_)     1 MPI processes    </div><div>      type: richardson</div><div>        Richardson: damping factor=1</div><div>      maximum iterations=1</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div>

<div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using PRECONDITIONED norm type for convergence test</div><div>    PC Object:    (mg_levels_3_)     1 MPI processes    </div><div>

      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</div><div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div>

<div>        type: seqaij</div><div>        rows=8303, cols=8303</div><div>        total: nonzeros=697606, allocated nonzeros=697606</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node routines</div>

<div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  Down solver (pre-smoother) on level 4 -------------------------------</div><div>    KSP Object:    (mg_levels_4_)     1 MPI processes    </div>

<div>      type: richardson</div><div>        Richardson: damping factor=1</div><div>      maximum iterations=1</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>      left preconditioning</div>

<div>      using nonzero initial guess</div><div>      using PRECONDITIONED norm type for convergence test</div><div>    PC Object:    (mg_levels_4_)     1 MPI processes    </div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</div>

<div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=737280, cols=737280</div><div>        total: nonzeros=46288425, allocated nonzeros=46288425</div>

<div>        total number of mallocs used during MatSetValues calls =0</div><div>          using I-node routines: found 147456 nodes, limit used is 5</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div>

<div>  linear system matrix = precond matrix:</div><div>  Matrix Object:   1 MPI processes  </div><div>    type: seqaij</div><div>    rows=737280, cols=737280</div><div>    total: nonzeros=46288425, allocated nonzeros=46288425</div>

<div>    total number of mallocs used during MatSetValues calls =0</div><div>      using I-node routines: found 147456 nodes, limit used is 5</div></div><div><div>Time:   11.8599750995636     </div><div> Actual res norm:  0.132693075876745     </div>

</div><div><br></div><div>It stopped after the second iteration because it "converged" over 6 orders of magnitude. Of course, it didn't actually converge and the actual residual remained unchanged. Given that that the coarse solver is LU and all the smoothers are KSPRichardson with 1 iteration and SOR smoothing, I should be able to use GMRES.  I also tried it with </div>

<div>-ksp_type fgrmes  and all the other options the same. </div><div><br></div><div>This time, it doesn't blowup, but also doesn't make significant progress. The actual residual at end of 100 iterations is larger than the real initial residual. (The ksp_view for this is identical except for the change is ksp type to fgmres and the use of right preconditioning. </div>

<div><br></div><div> 0 KSP Residual norm 1.326930758772e-01 </div><div><div> 10 KSP Residual norm 1.326906693969e-01 </div><div> 20 KSP Residual norm 1.326879007861e-01 </div><div> 30 KSP Residual norm 1.326851758083e-01 </div>

<div> 40 KSP Residual norm 1.326824509983e-01 </div><div> 50 KSP Residual norm 1.333174206652e-01 </div><div> 60 KSP Residual norm 1.279860083890e-01 </div><div> 70 KSP Residual norm 1.227523658429e-01 </div><div> 80 KSP Residual norm 1.181123717224e-01 </div>

<div> 90 KSP Residual norm 1.139616660987e-01 </div><div>100 KSP Residual norm 1.102198901480e-01 </div></div><div> Time:   67.6706080436707     </div><div><div> Actual res norm:  0.443508987516716  </div></div><div><br></div>

<div>I then started trying some of the ML options. Specifically,</div><div>-pc_mg_smoothup 3</div><div><div>-pc_mg_smoothdown 3</div></div><div><br></div><div>Now it returns a floating point exception:</div><div><div> 0 KSP Residual norm 1.326930758772e-01 </div>

<div>[0]PETSC ERROR: --------------------- Error Message ------------------------------------</div><div>[0]PETSC ERROR: Floating point exception!</div><div>[0]PETSC ERROR: Infinite or not-a-number generated in norm!</div>

<div>[0]PETSC ERROR: ------------------------------------------------------------------------</div><div>[0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 7, Thu Mar 15 09:30:51 CDT 2012 </div><div>[0]PETSC ERROR: See docs/changes/index.html for recent updates.</div>

<div>[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.</div><div>[0]PETSC ERROR: See docs/index.html for manual pages.</div><div>[0]PETSC ERROR: ------------------------------------------------------------------------</div>

<div>[0]PETSC ERROR: ./main on a real-opt named mica by kenway Sun Apr 28 18:32:22 2013</div><div>[0]PETSC ERROR: Libraries linked from /home/kenway/Downloads/petsc-3.2-p7/real-opt/lib</div><div>[0]PETSC ERROR: Configure run at Sun Apr 28 15:16:05 2013</div>

<div>[0]PETSC ERROR: Configure options --with-shared-libraries --download-superlu_dist=yes --download-parmetis=yes --with-fortran-interfaces=1 --with-debuggig=no -with-scalar-type=real --PETSC_ARCH=real-opt --download-hypre=yes --download-spai=yes --download-sundials=yes --download-ml=yes</div>

<div>[0]PETSC ERROR: ------------------------------------------------------------------------</div><div>[0]PETSC ERROR: VecNorm() line 167 in src/vec/vec/interface/rvector.c</div><div>[0]PETSC ERROR: VecNormalize() line 261 in src/vec/vec/interface/rvector.c</div>

<div>[0]PETSC ERROR: GMREScycle() line 128 in src/ksp/ksp/impls/gmres/gmres.c</div><div>[0]PETSC ERROR: KSPSolve_GMRES() line 231 in src/ksp/ksp/impls/gmres/gmres.c</div><div>[0]PETSC ERROR: KSPSolve() line 423 in src/ksp/ksp/interface/itfunc.c</div>

<div>[0]PETSC ERROR: PCMGMCycle_Private() line 55 in src/ksp/pc/impls/mg/mg.c</div><div>[0]PETSC ERROR: PCMGMCycle_Private() line 49 in src/ksp/pc/impls/mg/mg.c</div><div>[0]PETSC ERROR: PCMGMCycle_Private() line 49 in src/ksp/pc/impls/mg/mg.c</div>

<div>[0]PETSC ERROR: PCMGMCycle_Private() line 49 in src/ksp/pc/impls/mg/mg.c</div><div>[0]PETSC ERROR: PCApply_MG() line 320 in src/ksp/pc/impls/mg/mg.c</div><div>[0]PETSC ERROR: PCApply() line 383 in src/ksp/pc/interface/precon.c</div>

<div>[0]PETSC ERROR: FGMREScycle() line 174 in src/ksp/ksp/impls/gmres/fgmres/fgmres.c</div><div>[0]PETSC ERROR: KSPSolve_FGMRES() line 299 in src/ksp/ksp/impls/gmres/fgmres/fgmres.c</div><div>[0]PETSC ERROR: KSPSolve() line 423 in src/ksp/ksp/interface/itfunc.c</div>

<div> Time:   20.1171851158142     </div><div> Actual res norm:  0.132693075877161     </div></div><div><br></div><div>So I tried running with </div><div><br></div><div>-pc_mg_smoothup 1</div><div><div>-pc_mg_smoothdown 1</div>

</div><div><br></div><div>This runs, but somehow does not result in the same sequence of KSP objects on the various levels. </div><div><div>  0 KSP Residual norm 1.326930758772e-01 </div><div> 10 KSP Residual norm 1.310779571550e-01 </div>

<div> 20 KSP Residual norm 1.290623886203e-01 </div><div> 30 KSP Residual norm 1.271370286802e-01 </div><div> 40 KSP Residual norm 1.252953431171e-01 </div><div> 50 KSP Residual norm 3.224565651498e-01 </div><div> 60 KSP Residual norm 2.974317106914e-01 </div>

<div> 70 KSP Residual norm 2.708573182063e-01 </div><div> 80 KSP Residual norm 2.503317725548e-01 </div><div> 90 KSP Residual norm 2.338612668565e-01 </div><div>100 KSP Residual norm 2.202653375200e-01 </div><div>KSP Object: 1 MPI processes</div>

<div>  type: fgmres</div><div>    GMRES: restart=50, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>    GMRES: happy breakdown tolerance 1e-30</div><div>  maximum iterations=100, initial guess is zero</div>

<div>  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>  right preconditioning</div><div>  using UNPRECONDITIONED norm type for convergence test</div><div>PC Object: 1 MPI processes</div><div>  type: ml</div>

<div>    MG: type is MULTIPLICATIVE, levels=5 cycles=v</div><div>      Cycles per PCApply=1</div><div>      Using Galerkin computed coarse grid matrices</div><div>  Coarse grid solver -- level -------------------------------</div>

<div>    KSP Object:    (mg_coarse_)     1 MPI processes    </div><div>      type: preonly</div><div>      maximum iterations=1, initial guess is zero</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div>

<div>      left preconditioning</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_coarse_)     1 MPI processes    </div><div>      type: lu</div><div>        LU: out-of-place factorization</div>

<div>        tolerance for zero pivot 1e-12</div><div>        matrix ordering: nd</div><div>        factor fill ratio given 5, needed 1</div><div>          Factored matrix follows:</div><div>            Matrix Object:             1 MPI processes            </div>

<div>              type: seqaij</div><div>              rows=1, cols=1</div><div>              package used to perform factorization: petsc</div><div>              total: nonzeros=1, allocated nonzeros=1</div><div>              total number of mallocs used during MatSetValues calls =0</div>

<div>                not using I-node routines</div><div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=1, cols=1</div>

<div>        total: nonzeros=1, allocated nonzeros=1</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node routines</div><div>  Down solver (pre-smoother) on level 1 -------------------------------</div>

<div>    KSP Object:    (mg_levels_1_)     1 MPI processes    </div><div>      type: richardson</div><div>        Richardson: damping factor=1</div><div>      maximum iterations=1, initial guess is zero</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div>

<div>      left preconditioning</div><div>      using PRECONDITIONED norm type for convergence test</div><div>    PC Object:    (mg_levels_1_)     1 MPI processes    </div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</div>

<div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=2, cols=2</div><div>        total: nonzeros=4, allocated nonzeros=4</div>

<div>        total number of mallocs used during MatSetValues calls =0</div><div>          using I-node routines: found 1 nodes, limit used is 5</div><div>  Up solver (post-smoother) on level 1 -------------------------------</div>

<div>    KSP Object:    (mg_levels_1_)     1 MPI processes    </div><div>      type: gmres</div><div>        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div>

<div>        GMRES: happy breakdown tolerance 1e-30</div><div>      maximum iterations=1</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>      left preconditioning</div><div>      using nonzero initial guess</div>

<div>      using PRECONDITIONED norm type for convergence test</div><div>    PC Object:    (mg_levels_1_)     1 MPI processes    </div><div>      type: ilu</div><div>        ILU: out-of-place factorization</div><div>        0 levels of fill</div>

<div>        tolerance for zero pivot 1e-12</div><div>        using diagonal shift to prevent zero pivot</div><div>        matrix ordering: natural</div><div>        factor fill ratio given 1, needed 1</div><div>          Factored matrix follows:</div>

<div>            Matrix Object:             1 MPI processes            </div><div>              type: seqaij</div><div>              rows=2, cols=2</div><div>              package used to perform factorization: petsc</div>

<div>              total: nonzeros=4, allocated nonzeros=4</div><div>              total number of mallocs used during MatSetValues calls =0</div><div>                using I-node routines: found 1 nodes, limit used is 5</div>

<div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=2, cols=2</div><div>        total: nonzeros=4, allocated nonzeros=4</div>

<div>        total number of mallocs used during MatSetValues calls =0</div><div>          using I-node routines: found 1 nodes, limit used is 5</div><div>  Down solver (pre-smoother) on level 2 -------------------------------</div>

<div>    KSP Object:    (mg_levels_2_)     1 MPI processes    </div><div>      type: richardson</div><div>        Richardson: damping factor=1</div><div>      maximum iterations=1, initial guess is zero</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div>

<div>      left preconditioning</div><div>      using PRECONDITIONED norm type for convergence test</div><div>    PC Object:    (mg_levels_2_)     1 MPI processes    </div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</div>

<div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=93, cols=93</div><div>        total: nonzeros=3861, allocated nonzeros=3861</div>

<div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node routines</div><div>  Up solver (post-smoother) on level 2 -------------------------------</div><div>    KSP Object:    (mg_levels_2_)     1 MPI processes    </div>

<div>      type: gmres</div><div>        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>        GMRES: happy breakdown tolerance 1e-30</div><div>      maximum iterations=1</div>

<div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using PRECONDITIONED norm type for convergence test</div>

<div>    PC Object:    (mg_levels_2_)     1 MPI processes    </div><div>      type: ilu</div><div>        ILU: out-of-place factorization</div><div>        0 levels of fill</div><div>        tolerance for zero pivot 1e-12</div>

<div>        using diagonal shift to prevent zero pivot</div><div>        matrix ordering: natural</div><div>        factor fill ratio given 1, needed 1</div><div>          Factored matrix follows:</div><div>            Matrix Object:             1 MPI processes            </div>

<div>              type: seqaij</div><div>              rows=93, cols=93</div><div>              package used to perform factorization: petsc</div><div>              total: nonzeros=3861, allocated nonzeros=3861</div><div>

              total number of mallocs used during MatSetValues calls =0</div><div>                not using I-node routines</div><div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div>

<div>        type: seqaij</div><div>        rows=93, cols=93</div><div>        total: nonzeros=3861, allocated nonzeros=3861</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node routines</div>

<div>  Down solver (pre-smoother) on level 3 -------------------------------</div><div>    KSP Object:    (mg_levels_3_)     1 MPI processes    </div><div>      type: richardson</div><div>        Richardson: damping factor=1</div>

<div>      maximum iterations=1, initial guess is zero</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>      left preconditioning</div><div>      using PRECONDITIONED norm type for convergence test</div>

<div>    PC Object:    (mg_levels_3_)     1 MPI processes    </div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</div><div>      linear system matrix = precond matrix:</div>

<div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=8303, cols=8303</div><div>        total: nonzeros=697606, allocated nonzeros=697606</div><div>        total number of mallocs used during MatSetValues calls =0</div>

<div>          not using I-node routines</div><div>  Up solver (post-smoother) on level 3 -------------------------------</div><div>    KSP Object:    (mg_levels_3_)     1 MPI processes    </div><div>      type: gmres</div>

<div>        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>        GMRES: happy breakdown tolerance 1e-30</div><div>      maximum iterations=1</div><div>

      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using PRECONDITIONED norm type for convergence test</div>

<div>    PC Object:    (mg_levels_3_)     1 MPI processes    </div><div>      type: ilu</div><div>        ILU: out-of-place factorization</div><div>        0 levels of fill</div><div>        tolerance for zero pivot 1e-12</div>

<div>        using diagonal shift to prevent zero pivot</div><div>        matrix ordering: natural</div><div>        factor fill ratio given 1, needed 1</div><div>          Factored matrix follows:</div><div>            Matrix Object:             1 MPI processes            </div>

<div>              type: seqaij</div><div>              rows=8303, cols=8303</div><div>              package used to perform factorization: petsc</div><div>              total: nonzeros=697606, allocated nonzeros=697606</div>

<div>              total number of mallocs used during MatSetValues calls =0</div><div>                not using I-node routines</div><div>      linear system matrix = precond matrix:</div><div>      Matrix Object:       1 MPI processes      </div>

<div>        type: seqaij</div><div>        rows=8303, cols=8303</div><div>        total: nonzeros=697606, allocated nonzeros=697606</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node routines</div>

<div>  Down solver (pre-smoother) on level 4 -------------------------------</div><div>    KSP Object:    (mg_levels_4_)     1 MPI processes    </div><div>      type: richardson</div><div>        Richardson: damping factor=1</div>

<div>      maximum iterations=1, initial guess is zero</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>      left preconditioning</div><div>      using PRECONDITIONED norm type for convergence test</div>

<div>    PC Object:    (mg_levels_4_)     1 MPI processes    </div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1</div><div>      linear system matrix = precond matrix:</div>

<div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=737280, cols=737280</div><div>        total: nonzeros=46288425, allocated nonzeros=46288425</div><div>        total number of mallocs used during MatSetValues calls =0</div>

<div>          using I-node routines: found 147456 nodes, limit used is 5</div><div>  Up solver (post-smoother) on level 4 -------------------------------</div><div>    KSP Object:    (mg_levels_4_)     1 MPI processes    </div>

<div>      type: gmres</div><div>        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>        GMRES: happy breakdown tolerance 1e-30</div><div>      maximum iterations=1</div>

<div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using PRECONDITIONED norm type for convergence test</div>

<div>    PC Object:    (mg_levels_4_)     1 MPI processes    </div><div>      type: ilu</div><div>        ILU: out-of-place factorization</div><div>        0 levels of fill</div><div>        tolerance for zero pivot 1e-12</div>

<div>        using diagonal shift to prevent zero pivot</div><div>        matrix ordering: natural</div><div>        factor fill ratio given 1, needed 1</div><div>          Factored matrix follows:</div><div>            Matrix Object:             1 MPI processes            </div>

<div>              type: seqaij</div><div>              rows=737280, cols=737280</div><div>              package used to perform factorization: petsc</div><div>              total: nonzeros=46288425, allocated nonzeros=46288425</div>

<div>              total number of mallocs used during MatSetValues calls =0</div><div>                using I-node routines: found 147456 nodes, limit used is 5</div><div>      linear system matrix = precond matrix:</div>

<div>      Matrix Object:       1 MPI processes      </div><div>        type: seqaij</div><div>        rows=737280, cols=737280</div><div>        total: nonzeros=46288425, allocated nonzeros=46288425</div><div>        total number of mallocs used during MatSetValues calls =0</div>

<div>          using I-node routines: found 147456 nodes, limit used is 5</div><div>  linear system matrix = precond matrix:</div><div>  Matrix Object:   1 MPI processes  </div><div>    type: seqaij</div><div>    rows=737280, cols=737280</div>

<div>    total: nonzeros=46288425, allocated nonzeros=46288425</div><div>    total number of mallocs used during MatSetValues calls =0</div><div>      using I-node routines: found 147456 nodes, limit used is 5</div><div> Time:   118.043510913849     </div>

<div> Actual res norm:   1.11054776641920     </div></div><div><br></div><div>Now, the Down Solver is still KSPRichardson, but the UP solvers have turned into a default GMRES, ILU solver. In either case, this didn't help the convergence, and the final residual is larger than then initial. </div>

<div><br></div><div>I have also experimented with BoomerAMG, and have run into similar problems; all of the various options result in a preconditioner that is not significantly better tan not using any preconditioner at all. </div>

<div><br></div><div>Any suggestions would be greatly appreciated. </div><div><br></div><div>Thank you,</div><div><br></div><div>Gaetan Kenway</div><div><br></div><div><br></div><div>The compete source code listing is below:</div>

<div><br></div><div><div>program main</div><div><br></div><div>  ! Test different petsc solution techniques</div><div>  implicit none</div><div><br></div><div>#include "include/finclude/petsc.h"</div><div><br></div>

<div>  KSP ksp</div><div>  Vec RHS, x, res</div><div>  Mat A</div><div>  PetscViewer fd</div><div><br></div><div>  integer :: ierr, rank</div><div>  real*8 :: timeA, timeB,err_nrm</div><div>  ! Initialize PETSc</div><div>

  call PetscInitialize('petsc_options', ierr)</div><div>  call mpi_comm_rank(PETSC_COMM_WORLD, rank, ierr)</div><div>  ! Load matrix</div><div>  call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"drdw.bin", FILE_MODE_READ, fd, ierr)</div>

<div>  call MatCreate(PETSC_COMM_WORLD, A, ierr)</div><div>  call MatSetFromOptions(A, ierr)</div><div>  call MatLoad(A, fd, ierr)</div><div>  call PetscViewerDestroy(fd, ierr)</div><div><br></div><div>  ! Create vector</div>

<div>  call MatGetVecs(A, RHS, x, ierr)</div><div>  call vecduplicate(x, res, ierr)</div><div>  ! Load RHS</div><div>  call PetscViewerBinaryOpen(PETSC_COMM_WORLD, "didw.bin", FILE_MODE_READ, fd, ierr)</div><div>

  call VecSetFromOptions(RHS, ierr)</div><div>  call VecLoad(RHS, fd, ierr)</div><div>  call PetscViewerDestroy(fd, ierr)</div><div><br></div><div>  ! Create KSP object</div><div>  call KSPCreate(PEtSC_COMM_WORLD, ksp, ierr)</div>

<div>  call KSPsetFromOptions(ksp, ierr)</div><div>  call KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN, ierr)</div><div><br></div><div>  ! Solve system</div><div>  timeA = mpi_wtime()</div><div>  call KSPSolve(ksp, RHS, x, ierr)</div>

<div>  timeB = mpi_wtime()</div><div>  if (rank == 0) then</div><div>     print *,'Time:',timeB-timeA</div><div>  end if</div><div>  ! Check actual error</div><div>  call MatMult(A, x, res, ierr)</div><div>  call VecNorm(res, NORM_2, err_nrm, ierr)</div>

<div>  call VecAxPy(res, -1.0_8, RHS, ierr)</div><div>  call VecNorm(res, NORM_2, err_nrm, ierr)</div><div>  if (rank == 0) then</div><div>     print *,'Actual res norm:',err_nrm</div><div>  end if</div><div><br></div>

<div>  ! Destroy</div><div>  call KSPDestroy(ksp, ierr)</div><div>  call MatDestroy(A, ierr)</div><div>  call VecDestroy(RHS, ierr)</div><div>  call VecDestroy(x, ierr)</div><div>  call VecDestroy(res, ierr)</div><div><br>

</div><div>  ! Finalize</div><div>  call PETSCFinalize(ierr)</div><div>end program main</div></div><div><br></div>