<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jun 23, 2017 at 12:54 AM, Jason Lefley <span dir="ltr"><<a href="mailto:jason.lefley@aclectic.com" target="_blank">jason.lefley@aclectic.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div dir="auto" style="word-wrap:break-word"><div dir="auto" style="word-wrap:break-word"><br><div><blockquote type="cite"><div>On Jun 22, 2017, at 5:35 PM, Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>> wrote:</div><br class="m_7684181004605111704Apple-interchange-newline"><div><div dir="ltr" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><div class="gmail_extra"><div class="gmail_quote">On Thu, Jun 22, 2017 at 3:20 PM, Jason Lefley<span class="m_7684181004605111704Apple-converted-space"> </span><span dir="ltr"><<a href="mailto:jason.lefley@aclectic.com" target="_blank">jason.lefley@aclectic.<wbr>com</a>></span><span class="m_7684181004605111704Apple-converted-space"> </span>wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div style="word-wrap:break-word"><div>Thanks for the prompt replies. I ran with gamg and the results look more promising. I tried the suggested -mg_* options and did not see improvement. The -ksp_view and -ksp_monitor_true_residual output from those tests and the solver_test source (modified ex34.c) follow:</div></div></blockquote><div><br></div><div>Okay, the second step is to replicate the smoother for the GMG, which will have a smaller and scalable setup time. The</div><div>smoother could be weak, or the restriction could be bad.</div></div></div></div></div></blockquote><div><br></div><div>I inspected the ksp_view output from the run with -pc_type gamg and ran the program again with -pc_type mg and the pre-conditioner options from the gamg run:</div><div><br></div><div><div>$ mpirun -n 16 ./solver_test -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -ksp_view -ksp_monitor_true_residual -ksp_type cg -pc_type mg -pc_mg_levels 5 -mg_coarse_ksp_type preonly -mg_coarse_pc_type bjacobi -mg_coarse_sub_ksp_type preonly -mg_coarse_sub_pc_type lu -mg_coarse_sub_pc_factor_<wbr>shift_type INBLOCKS -mg_levels_ksp_type chebyshev -mg_levels_pc_type sor -mg_levels_esteig_ksp_type gmres</div><div>right hand side 2 norm: 512.</div><div>right hand side infinity norm: 0.999097</div><div>building operator with Dirichlet boundary conditions, global grid size: 128 x 128 x 128</div><div>building operator with Dirichlet boundary conditions, global grid size: 16 x 16 x 16</div><div>building operator with Dirichlet boundary conditions, global grid size: 32 x 32 x 32</div><div>building operator with Dirichlet boundary conditions, global grid size: 64 x 64 x 64</div><div>building operator with Dirichlet boundary conditions, global grid size: 8 x 8 x 8</div><div>  0 KSP preconditioned resid norm 9.806726045668e+02 true resid norm 5.120000000000e+02 ||r(i)||/||b|| 1.000000000000e+00</div><div>  1 KSP preconditioned resid norm 3.621361277232e+02 true resid norm 1.429430352211e+03 ||r(i)||/||b|| 2.791856156662e+00</div><div>  2 KSP preconditioned resid norm 2.362961522860e+01 true resid norm 1.549620746006e+03 ||r(i)||/||b|| 3.026603019544e+00</div><div>  3 KSP preconditioned resid norm 7.695073339717e+01 true resid norm 1.542148820317e+03 ||r(i)||/||b|| 3.012009414681e+00</div><div>  4 KSP preconditioned resid norm 3.765270470793e+01 true resid norm 1.536405551882e+03 ||r(i)||/||b|| 3.000792093520e+00</div><div>  5 KSP preconditioned resid norm 6.761970780882e+01 true resid norm 1.502842623846e+03 ||r(i)||/||b|| 2.935239499698e+00</div><div>  6 KSP preconditioned resid norm 5.995995646652e+01 true resid norm 1.447456652501e+03 ||r(i)||/||b|| 2.827063774415e+00</div><div>  7 KSP preconditioned resid norm 4.388139142285e+01 true resid norm 1.413766393419e+03 ||r(i)||/||b|| 2.761262487146e+00</div><div>  8 KSP preconditioned resid norm 2.295909410512e+01 true resid norm 1.371727148377e+03 ||r(i)||/||b|| 2.679154586673e+00</div><div>  9 KSP preconditioned resid norm 1.961908891359e+01 true resid norm 1.339113282715e+03 ||r(i)||/||b|| 2.615455630302e+00</div><div> 10 KSP preconditioned resid norm 6.893687291220e+01 true resid norm 1.229592829746e+03 ||r(i)||/||b|| 2.401548495598e+00</div><div> 11 KSP preconditioned resid norm 3.833567365382e+01 true resid norm 1.118085982483e+03 ||r(i)||/||b|| 2.183761684536e+00</div><div> 12 KSP preconditioned resid norm 1.939604089596e+01 true resid norm 9.852672187664e+02 ||r(i)||/||b|| 1.924350036653e+00</div><div> 13 KSP preconditioned resid norm 2.252075208204e+01 true resid norm 8.159187018709e+02 ||r(i)||/||b|| 1.593591214592e+00</div><div> 14 KSP preconditioned resid norm 2.642782719810e+01 true resid norm 7.253214735753e+02 ||r(i)||/||b|| 1.416643503077e+00</div><div> 15 KSP preconditioned resid norm 2.548817259250e+01 true resid norm 6.070018478722e+02 ||r(i)||/||b|| 1.185550484125e+00</div><div> 16 KSP preconditioned resid norm 5.281972692525e+01 true resid norm 4.815894400238e+02 ||r(i)||/||b|| 9.406043750466e-01</div><div> 17 KSP preconditioned resid norm 2.402884696592e+01 true resid norm 4.144462871860e+02 ||r(i)||/||b|| 8.094654046603e-01</div><div> 18 KSP preconditioned resid norm 1.043080941574e+01 true resid norm 3.729148183697e+02 ||r(i)||/||b|| 7.283492546283e-01</div><div> 19 KSP preconditioned resid norm 1.490375076082e+01 true resid norm 3.122057027160e+02 ||r(i)||/||b|| 6.097767631172e-01</div><div> 20 KSP preconditioned resid norm 3.249426166084e+00 true resid norm 2.704136970440e+02 ||r(i)||/||b|| 5.281517520390e-01</div><div> 21 KSP preconditioned resid norm 4.898441103047e+00 true resid norm 2.346045017813e+02 ||r(i)||/||b|| 4.582119175416e-01</div><div> 22 KSP preconditioned resid norm 6.674657659594e+00 true resid norm 1.870390126135e+02 ||r(i)||/||b|| 3.653105715107e-01</div><div> 23 KSP preconditioned resid norm 5.475921158065e+00 true resid norm 1.732176093821e+02 ||r(i)||/||b|| 3.383156433245e-01</div><div> 24 KSP preconditioned resid norm 2.776421930727e+00 true resid norm 1.562809743536e+02 ||r(i)||/||b|| 3.052362780343e-01</div><div> 25 KSP preconditioned resid norm 3.424602247354e+00 true resid norm 1.375628929963e+02 ||r(i)||/||b|| 2.686775253835e-01</div><div> 26 KSP preconditioned resid norm 2.212037280808e+00 true resid norm 1.221828497054e+02 ||r(i)||/||b|| 2.386383783309e-01</div><div> 27 KSP preconditioned resid norm 1.365474968893e+00 true resid norm 1.082476112493e+02 ||r(i)||/||b|| 2.114211157213e-01</div><div> 28 KSP preconditioned resid norm 2.638907538318e+00 true resid norm 8.864935716757e+01 ||r(i)||/||b|| 1.731432757179e-01</div><div> 29 KSP preconditioned resid norm 1.719908158919e+00 true resid norm 7.632670876324e+01 ||r(i)||/||b|| 1.490756030532e-01</div><div> 30 KSP preconditioned resid norm 7.985033219249e-01 true resid norm 6.949169231958e+01 ||r(i)||/||b|| 1.357259615617e-01</div><div> 31 KSP preconditioned resid norm 3.811670663811e+00 true resid norm 6.151000812796e+01 ||r(i)||/||b|| 1.201367346249e-01</div><div> 32 KSP preconditioned resid norm 7.888148376757e+00 true resid norm 5.694823999920e+01 ||r(i)||/||b|| 1.112270312484e-01</div><div> 33 KSP preconditioned resid norm 7.545633821809e-01 true resid norm 4.589854278402e+01 ||r(i)||/||b|| 8.964559137503e-02</div><div> 34 KSP preconditioned resid norm 2.271801800991e+00 true resid norm 3.728668301821e+01 ||r(i)||/||b|| 7.282555276994e-02</div><div> 35 KSP preconditioned resid norm 3.961087334680e+00 true resid norm 3.169140910721e+01 ||r(i)||/||b|| 6.189728341253e-02</div><div> 36 KSP preconditioned resid norm 9.139405064634e-01 true resid norm 2.825299509385e+01 ||r(i)||/||b|| 5.518163104268e-02</div><div> 37 KSP preconditioned resid norm 3.403605053170e-01 true resid norm 2.102215336663e+01 ||r(i)||/||b|| 4.105889329421e-02</div><div> 38 KSP preconditioned resid norm 4.614799224677e-01 true resid norm 1.651863757642e+01 ||r(i)||/||b|| 3.226296401644e-02</div><div> 39 KSP preconditioned resid norm 1.996074237552e+00 true resid norm 1.439868559977e+01 ||r(i)||/||b|| 2.812243281205e-02</div><div> 40 KSP preconditioned resid norm 1.106018322401e+00 true resid norm 1.313250681787e+01 ||r(i)||/||b|| 2.564942737865e-02</div><div> 41 KSP preconditioned resid norm 2.639402464711e-01 true resid norm 1.164910167179e+01 ||r(i)||/||b|| 2.275215170271e-02</div><div> 42 KSP preconditioned resid norm 1.749941228669e-01 true resid norm 1.053438524789e+01 ||r(i)||/||b|| 2.057497118729e-02</div><div> 43 KSP preconditioned resid norm 6.464433193720e-01 true resid norm 9.105614545741e+00 ||r(i)||/||b|| 1.778440340965e-02</div><div> 44 KSP preconditioned resid norm 5.990029838187e-01 true resid norm 8.803151647663e+00 ||r(i)||/||b|| 1.719365556184e-02</div><div> 45 KSP preconditioned resid norm 1.871777684116e-01 true resid norm 8.140591972598e+00 ||r(i)||/||b|| 1.589959369648e-02</div><div> 46 KSP preconditioned resid norm 4.316459571157e-01 true resid norm 7.640223567698e+00 ||r(i)||/||b|| 1.492231165566e-02</div><div> 47 KSP preconditioned resid norm 9.563142801536e-02 true resid norm 7.094762567710e+00 ||r(i)||/||b|| 1.385695814006e-02</div><div> 48 KSP preconditioned resid norm 2.380088757747e-01 true resid norm 6.064559746487e+00 ||r(i)||/||b|| 1.184484325486e-02</div><div> 49 KSP preconditioned resid norm 2.230779501200e-01 true resid norm 4.923827478633e+00 ||r(i)||/||b|| 9.616850544205e-03</div><div> 50 KSP preconditioned resid norm 2.905071000609e-01 true resid norm 4.426620956264e+00 ||r(i)||/||b|| 8.645744055203e-03</div><div> 51 KSP preconditioned resid norm 3.430194707482e-02 true resid norm 3.873957688918e+00 ||r(i)||/||b|| 7.566323611167e-03</div><div> 52 KSP preconditioned resid norm 4.329652082337e-02 true resid norm 3.430571122778e+00 ||r(i)||/||b|| 6.700334224177e-03</div><div> 53 KSP preconditioned resid norm 1.610976212900e-01 true resid norm 3.052757228648e+00 ||r(i)||/||b|| 5.962416462203e-03</div><div> 54 KSP preconditioned resid norm 6.113252183681e-02 true resid norm 2.876793151138e+00 ||r(i)||/||b|| 5.618736623317e-03</div><div> 55 KSP preconditioned resid norm 2.731463237482e-02 true resid norm 2.441017091077e+00 ||r(i)||/||b|| 4.767611506010e-03</div><div> 56 KSP preconditioned resid norm 5.193746161496e-02 true resid norm 2.114917193241e+00 ||r(i)||/||b|| 4.130697643049e-03</div><div> 57 KSP preconditioned resid norm 2.959513516137e-01 true resid norm 1.903828747377e+00 ||r(i)||/||b|| 3.718415522220e-03</div><div> 58 KSP preconditioned resid norm 8.093802579621e-02 true resid norm 1.759070727559e+00 ||r(i)||/||b|| 3.435685014763e-03</div><div> 59 KSP preconditioned resid norm 3.558590388480e-02 true resid norm 1.356337866126e+00 ||r(i)||/||b|| 2.649097394777e-03</div><div> 60 KSP preconditioned resid norm 6.506508837044e-02 true resid norm 1.214979249890e+00 ||r(i)||/||b|| 2.373006347441e-03</div><div> 61 KSP preconditioned resid norm 3.120758675191e-02 true resid norm 9.993321163196e-01 ||r(i)||/||b|| 1.951820539687e-03</div><div> 62 KSP preconditioned resid norm 1.034431089486e-01 true resid norm 9.193137244810e-01 ||r(i)||/||b|| 1.795534618127e-03</div><div> 63 KSP preconditioned resid norm 2.763120051285e-02 true resid norm 8.479698661132e-01 ||r(i)||/||b|| 1.656191144752e-03</div><div> 64 KSP preconditioned resid norm 1.937546528918e-02 true resid norm 7.431839535619e-01 ||r(i)||/||b|| 1.451531159301e-03</div><div> 65 KSP preconditioned resid norm 2.133391792161e-02 true resid norm 7.089428437765e-01 ||r(i)||/||b|| 1.384653991751e-03</div><div> 66 KSP preconditioned resid norm 8.676771000819e-03 true resid norm 6.511166875850e-01 ||r(i)||/||b|| 1.271712280439e-03</div><div>KSP Object: 16 MPI processes</div><div>  type: cg</div><div>  maximum iterations=10000, initial guess is zero</div><div>  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>  left preconditioning</div><div>  using PRECONDITIONED norm type for convergence test</div><div>PC Object: 16 MPI processes</div><div>  type: mg</div><div>    MG: type is MULTIPLICATIVE, levels=5 cycles=v</div><div>      Cycles per PCApply=1</div><div>      Not using Galerkin computed coarse grid matrices</div><div>  Coarse grid solver -- level ------------------------------<wbr>-</div><div>    KSP Object:    (mg_coarse_)     16 MPI processes</div><div>      type: preonly</div><div>      maximum iterations=10000, initial guess is zero</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_coarse_)     16 MPI processes</div><div>      type: bjacobi</div><div>        block Jacobi: number of blocks = 16</div><div>        Local solve is same for all blocks, in the following KSP and PC objects:</div><div>      KSP Object:      (mg_coarse_sub_)       1 MPI processes</div><div>        type: preonly</div><div>        maximum iterations=10000, initial guess is zero</div><div>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>        left preconditioning</div><div>        using NONE norm type for convergence test</div><div>      PC Object:      (mg_coarse_sub_)       1 MPI processes</div><div>        type: lu</div><div>          LU: out-of-place factorization</div><div>          tolerance for zero pivot 2.22045e-14</div><div>          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]</div><div>          matrix ordering: nd</div><div>          factor fill ratio given 5., needed 2.3125</div><div>            Factored matrix follows:</div><div>              Mat Object:               1 MPI processes</div><div>                type: seqaij</div><div>                rows=32, cols=32</div><div>                package used to perform factorization: petsc</div><div>                total: nonzeros=370, allocated nonzeros=370</div><div>                total number of mallocs used during MatSetValues calls =0</div><div>                  not using I-node routines</div><div>        linear system matrix = precond matrix:</div><div>        Mat Object:         1 MPI processes</div><div>          type: seqaij</div><div>          rows=32, cols=32</div><div>          total: nonzeros=160, allocated nonzeros=160</div><div>          total number of mallocs used during MatSetValues calls =0</div><div>            not using I-node routines</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=512, cols=512</div><div>        total: nonzeros=3200, allocated nonzeros=3200</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>  Down solver (pre-smoother) on level 1 ------------------------------<wbr>-</div><div>    KSP Object:    (mg_levels_1_)     16 MPI processes</div><div>      type: chebyshev</div><div>        Chebyshev: eigenvalue estimates:  min = 0.153005, max = 1.68306</div><div>        Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>        KSP Object:        (mg_levels_1_esteig_)         16 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10, initial guess is zero</div><div>          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>          left preconditioning</div><div>          using PRECONDITIONED norm type for convergence test</div><div>      maximum iterations=2</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_levels_1_)     16 MPI processes</div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=4096, cols=4096</div><div>        total: nonzeros=27136, allocated nonzeros=27136</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  Down solver (pre-smoother) on level 2 ------------------------------<wbr>-</div><div>    KSP Object:    (mg_levels_2_)     16 MPI processes</div><div>      type: chebyshev</div><div>        Chebyshev: eigenvalue estimates:  min = 0.152793, max = 1.68072</div><div>        Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>        KSP Object:        (mg_levels_2_esteig_)         16 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10, initial guess is zero</div><div>          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>          left preconditioning</div><div>          using PRECONDITIONED norm type for convergence test</div><div>      maximum iterations=2</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_levels_2_)     16 MPI processes</div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=32768, cols=32768</div><div>        total: nonzeros=223232, allocated nonzeros=223232</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  Down solver (pre-smoother) on level 3 ------------------------------<wbr>-</div><div>    KSP Object:    (mg_levels_3_)     16 MPI processes</div><div>      type: chebyshev</div><div>        Chebyshev: eigenvalue estimates:  min = 0.144705, max = 1.59176</div><div>        Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>        KSP Object:        (mg_levels_3_esteig_)         16 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10, initial guess is zero</div><div>          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>          left preconditioning</div><div>          using PRECONDITIONED norm type for convergence test</div><div>      maximum iterations=2</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_levels_3_)     16 MPI processes</div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=262144, cols=262144</div><div>        total: nonzeros=1810432, allocated nonzeros=1810432</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  Down solver (pre-smoother) on level 4 ------------------------------<wbr>-</div><div>    KSP Object:    (mg_levels_4_)     16 MPI processes</div><div>      type: chebyshev</div><div>        Chebyshev: eigenvalue estimates:  min = 0.140039, max = 1.54043</div><div>        Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>        KSP Object:        (mg_levels_4_esteig_)         16 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10, initial guess is zero</div><div>          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>          left preconditioning</div><div>          using PRECONDITIONED norm type for convergence test</div><div>      maximum iterations=2</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_levels_4_)     16 MPI processes</div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=2097152, cols=2097152</div><div>        total: nonzeros=14581760, allocated nonzeros=14581760</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  linear system matrix = precond matrix:</div><div>  Mat Object:   16 MPI processes</div><div>    type: mpiaij</div><div>    rows=2097152, cols=2097152</div><div>    total: nonzeros=14581760, allocated nonzeros=14581760</div><div>    total number of mallocs used during MatSetValues calls =0</div><div>Residual 2 norm 0.651117</div><div>Residual infinity norm 0.00799571</div><div><br></div><div><br></div><div>I did a diff on the ksp_view from the above run from the output from the run with -pc_type gamg and the only differences include the needed factor fill ratio (gamg: 1, mg: 2.3125), the size and non-zero counts of the matrices used in the multi-grid levels, the Chebyshev eigenvalue estimates, and the usage of I-node routines (gamg:  using I-node routines: found 3 nodes, limit used is 5, mg: not using I-node routines).</div></div><div><br></div><div>Adding -pc_mg_galerkin results in some improvement but still not as good as with gamg:</div><div><br></div><div><div>$ mpirun -n 16 ./solver_test -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -ksp_view -ksp_monitor_true_residual -ksp_type cg -pc_type mg -pc_mg_levels 5 -mg_coarse_ksp_type preonly -mg_coarse_pc_type bjacobi -mg_coarse_sub_ksp_type preonly -mg_coarse_sub_pc_type lu -mg_coarse_sub_pc_factor_<wbr>shift_type INBLOCKS -mg_levels_ksp_type chebyshev -mg_levels_pc_type sor -mg_levels_esteig_ksp_type gmres -pc_mg_galerkin</div><div>right hand side 2 norm: 512.</div><div>right hand side infinity norm: 0.999097</div><div>building operator with Dirichlet boundary conditions, global grid size: 128 x 128 x 128</div><div>  0 KSP preconditioned resid norm 1.073621701581e+00 true resid norm 5.120000000000e+02 ||r(i)||/||b|| 1.000000000000e+00</div><div>  1 KSP preconditioned resid norm 2.316341151889e-01 true resid norm 1.169096072553e+03 ||r(i)||/||b|| 2.283390766706e+00</div><div>  2 KSP preconditioned resid norm 1.054910990128e-01 true resid norm 4.444993518786e+02 ||r(i)||/||b|| 8.681627966378e-01</div><div>  3 KSP preconditioned resid norm 3.671488511570e-02 true resid norm 1.169431518627e+02 ||r(i)||/||b|| 2.284045934818e-01</div><div>  4 KSP preconditioned resid norm 1.055769111265e-02 true resid norm 3.161333456265e+01 ||r(i)||/||b|| 6.174479406767e-02</div><div>  5 KSP preconditioned resid norm 2.557907008002e-03 true resid norm 9.319742572653e+00 ||r(i)||/||b|| 1.820262221221e-02</div><div>  6 KSP preconditioned resid norm 5.039866236685e-04 true resid norm 2.418858575838e+00 ||r(i)||/||b|| 4.724333155934e-03</div><div>  7 KSP preconditioned resid norm 1.132965683654e-04 true resid norm 4.979511177091e-01 ||r(i)||/||b|| 9.725607767757e-04</div><div>  8 KSP preconditioned resid norm 5.458028025084e-05 true resid norm 1.150321233127e-01 ||r(i)||/||b|| 2.246721158452e-04</div><div>  9 KSP preconditioned resid norm 3.742558792121e-05 true resid norm 8.485603638598e-02 ||r(i)||/||b|| 1.657344460664e-04</div><div> 10 KSP preconditioned resid norm 1.121838737544e-05 true resid norm 4.699890661073e-02 ||r(i)||/||b|| 9.179473947407e-05</div><div> 11 KSP preconditioned resid norm 4.452473763175e-06 true resid norm 1.071140093264e-02 ||r(i)||/||b|| 2.092070494657e-05</div><div>KSP Object: 16 MPI processes</div><div>  type: cg</div><div>  maximum iterations=10000, initial guess is zero</div><div>  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>  left preconditioning</div><div>  using PRECONDITIONED norm type for convergence test</div><div>PC Object: 16 MPI processes</div><div>  type: mg</div><div>    MG: type is MULTIPLICATIVE, levels=5 cycles=v</div><div>      Cycles per PCApply=1</div><div>      Using Galerkin computed coarse grid matrices</div><div>  Coarse grid solver -- level ------------------------------<wbr>-</div><div>    KSP Object:    (mg_coarse_)     16 MPI processes</div><div>      type: preonly</div><div>      maximum iterations=10000, initial guess is zero</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_coarse_)     16 MPI processes</div><div>      type: bjacobi</div><div>        block Jacobi: number of blocks = 16</div><div>        Local solve is same for all blocks, in the following KSP and PC objects:</div><div>      KSP Object:      (mg_coarse_sub_)       1 MPI processes</div><div>        type: preonly</div><div>        maximum iterations=10000, initial guess is zero</div><div>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>        left preconditioning</div><div>        using NONE norm type for convergence test</div><div>      PC Object:      (mg_coarse_sub_)       1 MPI processes</div><div>        type: lu</div><div>          LU: out-of-place factorization</div><div>          tolerance for zero pivot 2.22045e-14</div><div>          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]</div><div>          matrix ordering: nd</div><div>          factor fill ratio given 5., needed 2.3125</div><div>            Factored matrix follows:</div><div>              Mat Object:               1 MPI processes</div><div>                type: seqaij</div><div>                rows=32, cols=32</div><div>                package used to perform factorization: petsc</div><div>                total: nonzeros=370, allocated nonzeros=370</div><div>                total number of mallocs used during MatSetValues calls =0</div><div>                  not using I-node routines</div><div>        linear system matrix = precond matrix:</div><div>        Mat Object:         1 MPI processes</div><div>          type: seqaij</div><div>          rows=32, cols=32</div><div>          total: nonzeros=160, allocated nonzeros=160</div><div>          total number of mallocs used during MatSetValues calls =0</div><div>            not using I-node routines</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=512, cols=512</div><div>        total: nonzeros=3200, allocated nonzeros=3200</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node (on process 0) routines</div><div>  Down solver (pre-smoother) on level 1 ------------------------------<wbr>-</div><div>    KSP Object:    (mg_levels_1_)     16 MPI processes</div><div>      type: chebyshev</div><div>        Chebyshev: eigenvalue estimates:  min = 0.153005, max = 1.68306</div><div>        Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>        KSP Object:        (mg_levels_1_esteig_)         16 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10, initial guess is zero</div><div>          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>          left preconditioning</div><div>          using PRECONDITIONED norm type for convergence test</div><div>      maximum iterations=2</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_levels_1_)     16 MPI processes</div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=4096, cols=4096</div><div>        total: nonzeros=27136, allocated nonzeros=27136</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node (on process 0) routines</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  Down solver (pre-smoother) on level 2 ------------------------------<wbr>-</div><div>    KSP Object:    (mg_levels_2_)     16 MPI processes</div><div>      type: chebyshev</div><div>        Chebyshev: eigenvalue estimates:  min = 0.152793, max = 1.68072</div><div>        Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>        KSP Object:        (mg_levels_2_esteig_)         16 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10, initial guess is zero</div><div>          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>          left preconditioning</div><div>          using PRECONDITIONED norm type for convergence test</div><div>      maximum iterations=2</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_levels_2_)     16 MPI processes</div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=32768, cols=32768</div><div>        total: nonzeros=223232, allocated nonzeros=223232</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node (on process 0) routines</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  Down solver (pre-smoother) on level 3 ------------------------------<wbr>-</div><div>    KSP Object:    (mg_levels_3_)     16 MPI processes</div><div>      type: chebyshev</div><div>        Chebyshev: eigenvalue estimates:  min = 0.144705, max = 1.59176</div><div>        Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>        KSP Object:        (mg_levels_3_esteig_)         16 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10, initial guess is zero</div><div>          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>          left preconditioning</div><div>          using PRECONDITIONED norm type for convergence test</div><div>      maximum iterations=2</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_levels_3_)     16 MPI processes</div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=262144, cols=262144</div><div>        total: nonzeros=1810432, allocated nonzeros=1810432</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>          not using I-node (on process 0) routines</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  Down solver (pre-smoother) on level 4 ------------------------------<wbr>-</div><div>    KSP Object:    (mg_levels_4_)     16 MPI processes</div><div>      type: chebyshev</div><div>        Chebyshev: eigenvalue estimates:  min = 0.140039, max = 1.54043</div><div>        Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>        KSP Object:        (mg_levels_4_esteig_)         16 MPI processes</div><div>          type: gmres</div><div>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>            GMRES: happy breakdown tolerance 1e-30</div><div>          maximum iterations=10, initial guess is zero</div><div>          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>          left preconditioning</div><div>          using PRECONDITIONED norm type for convergence test</div><div>      maximum iterations=2</div><div>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>      left preconditioning</div><div>      using nonzero initial guess</div><div>      using NONE norm type for convergence test</div><div>    PC Object:    (mg_levels_4_)     16 MPI processes</div><div>      type: sor</div><div>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>      linear system matrix = precond matrix:</div><div>      Mat Object:       16 MPI processes</div><div>        type: mpiaij</div><div>        rows=2097152, cols=2097152</div><div>        total: nonzeros=14581760, allocated nonzeros=14581760</div><div>        total number of mallocs used during MatSetValues calls =0</div><div>  Up solver (post-smoother) same as down solver (pre-smoother)</div><div>  linear system matrix = precond matrix:</div><div>  Mat Object:   16 MPI processes</div><div>    type: mpiaij</div><div>    rows=2097152, cols=2097152</div><div>    total: nonzeros=14581760, allocated nonzeros=14581760</div><div>    total number of mallocs used during MatSetValues calls =0</div><div>Residual 2 norm 0.0107114</div><div>Residual infinity norm 6.84843e-05</div><div><br></div><div>What are the differences between gamg and mg with -pc_mg_galerkin option (apart from the default smoother/coarse grid solver options, which I identified by comparing the ksp_view output)? Perhaps there’s an issue with the restriction, as you suggest?</div></div></div></div></div></div></blockquote><div><br></div><div>Okay, when you say a Poisson problem, I assumed you meant</div><div><br></div><div>  div grad phi = f</div><div><br></div><div>However, now it appears that you have</div><div><br></div><div>  div D grad phi = f</div><div><br></div><div>Is this true? It would explain your results. Your coarse operator is inaccurate. AMG makes the coarse operator directly</div><div>from the matrix, so it incorporates coefficient variation. Galerkin projection makes the coarse operator using R A P</div><div>from your original operator A, and this is accurate enough to get good convergence. So your coefficient representation</div><div>on the coarse levels is really bad. If you want to use GMG, you need to figure out how to represent the coefficient on</div><div>coarser levels, which is sometimes called "renormalization".</div><div><br></div><div>   Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div dir="auto" style="word-wrap:break-word"><div dir="auto" style="word-wrap:break-word"><div><div><div>Thanks!</div></div><br><blockquote type="cite"><div><div dir="ltr" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>  Thanks,</div><div><br></div><div>    Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div style="word-wrap:break-word">$ mpirun -n 16 ./solver_test -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -ksp_view -ksp_monitor_true_residual -pc_type gamg -ksp_type cg<div><div>right hand side 2 norm: 512.</div><div>right hand side infinity norm: 0.999097</div><div>building operator with Dirichlet boundary conditions, global grid size: 128 x 128 x 128</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>0 KSP preconditioned resid norm 2.600515167901e+00 true resid norm 5.120000000000e+02 ||r(i)||/||b|| 1.000000000000e+00</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>1 KSP preconditioned resid norm 6.715532962879e-02 true resid norm 7.578946422553e+02 ||r(i)||/||b|| 1.480262973155e+00</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>2 KSP preconditioned resid norm 1.127682308441e-02 true resid norm 3.247852182315e+01 ||r(i)||/||b|| 6.343461293584e-02</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>3 KSP preconditioned resid norm 7.760468503025e-04 true resid norm 3.304142895659e+00 ||r(i)||/||b|| 6.453404093085e-03</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>4 KSP preconditioned resid norm 6.419777870067e-05 true resid norm 2.662993775521e-01 ||r(i)||/||b|| 5.201159717815e-04</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>5 KSP preconditioned resid norm 5.107540549482e-06 true resid norm 2.309528369351e-02 ||r(i)||/||b|| 4.510797596388e-05</div><div>KSP Object: 16 MPI processes</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>type: cg</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=10000, initial guess is zero</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>using PRECONDITIONED norm type for convergence test</div><div>PC Object: 16 MPI processes</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>type: gamg</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>MG: type is MULTIPLICATIVE, levels=5 cycles=v</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Cycles per PCApply=1</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Using Galerkin computed coarse grid matrices</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>GAMG specific options</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Threshold for dropping small values from graph 0.</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>AGG specific options</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>Symmetric graph false</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Coarse grid solver -- level ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_coarse_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: preonly</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=10000, initial guess is zero</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_coarse_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: bjacobi</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>block Jacobi: number of blocks = 16</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Local solve is same for all blocks, in the following KSP and PC objects:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:      (mg_coarse_sub_)       1 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: preonly</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=1, initial guess is zero</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:      (mg_coarse_sub_)       1 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: lu</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>LU: out-of-place factorization</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>tolerance for zero pivot 2.22045e-14</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>using diagonal shift on blocks to prevent zero pivot [INBLOCKS]</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>matrix ordering: nd</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>factor fill ratio given 5., needed 1.</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>Factored matrix follows:</div><div>             <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:               1 MPI processes</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>type: seqaij</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>rows=13, cols=13</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>package used to perform factorization: petsc</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=169, allocated nonzeros=169</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>                 <span class="m_7684181004605111704Apple-converted-space"> </span>using I-node routines: found 3 nodes, limit used is 5</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:         1 MPI processes</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>type: seqaij</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>rows=13, cols=13</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=169, allocated nonzeros=169</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>using I-node routines: found 3 nodes, limit used is 5</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=13, cols=13</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=169, allocated nonzeros=169</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>using I-node (on process 0) routines: found 3 nodes, limit used is 5</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Down solver (pre-smoother) on level 1 ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_levels_1_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: chebyshev</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Chebyshev: eigenvalue estimates:  min = 0.136516, max = 1.50168</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:        (mg_levels_1_esteig_)         16 MPI processes</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>type: gmres</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>GMRES: happy breakdown tolerance 1e-30</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=10, initial guess is zero</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>using PRECONDITIONED norm type for convergence test</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=2</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using nonzero initial guess</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_levels_1_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: sor</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=467, cols=467</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=68689, allocated nonzeros=68689</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>not using I-node (on process 0) routines</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Up solver (post-smoother) same as down solver (pre-smoother)</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Down solver (pre-smoother) on level 2 ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_levels_2_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: chebyshev</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Chebyshev: eigenvalue estimates:  min = 0.148872, max = 1.63759</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:        (mg_levels_2_esteig_)         16 MPI processes</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>type: gmres</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>GMRES: happy breakdown tolerance 1e-30</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=10, initial guess is zero</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>using PRECONDITIONED norm type for convergence test</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=2</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using nonzero initial guess</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_levels_2_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: sor</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=14893, cols=14893</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=1856839, allocated nonzeros=1856839</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>not using I-node (on process 0) routines</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Up solver (post-smoother) same as down solver (pre-smoother)</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Down solver (pre-smoother) on level 3 ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_levels_3_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: chebyshev</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Chebyshev: eigenvalue estimates:  min = 0.135736, max = 1.49309</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:        (mg_levels_3_esteig_)         16 MPI processes</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>type: gmres</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>GMRES: happy breakdown tolerance 1e-30</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=10, initial guess is zero</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>using PRECONDITIONED norm type for convergence test</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=2</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using nonzero initial guess</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_levels_3_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: sor</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=190701, cols=190701</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=6209261, allocated nonzeros=6209261</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>not using I-node (on process 0) routines</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Up solver (post-smoother) same as down solver (pre-smoother)</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Down solver (pre-smoother) on level 4 ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_levels_4_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: chebyshev</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Chebyshev: eigenvalue estimates:  min = 0.140039, max = 1.54043</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Chebyshev: eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:        (mg_levels_4_esteig_)         16 MPI processes</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>type: gmres</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>GMRES: happy breakdown tolerance 1e-30</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=10, initial guess is zero</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>using PRECONDITIONED norm type for convergence test</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=2</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using nonzero initial guess</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_levels_4_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: sor</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=2097152, cols=2097152</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=14581760, allocated nonzeros=14581760</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Up solver (post-smoother) same as down solver (pre-smoother)</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:   16 MPI processes</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>rows=2097152, cols=2097152</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=14581760, allocated nonzeros=14581760</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>Residual 2 norm 0.0230953</div><div>Residual infinity norm 0.000240174</div></div><div><br></div><div><br></div><div><br></div><div>$ mpirun -n 16 ./solver_test -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -ksp_view -ksp_monitor_true_residual -pc_type mg -ksp_type cg -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_richardson_self<wbr>_scale -mg_levels_ksp_max_it 5</div><div><div>right hand side 2 norm: 512.</div><div>right hand side infinity norm: 0.999097</div><div>building operator with Dirichlet boundary conditions, global grid size: 128 x 128 x 128</div><div>building operator with Dirichlet boundary conditions, global grid size: 16 x 16 x 16</div><div>building operator with Dirichlet boundary conditions, global grid size: 32 x 32 x 32</div><div>building operator with Dirichlet boundary conditions, global grid size: 64 x 64 x 64</div><div>building operator with Dirichlet boundary conditions, global grid size: 8 x 8 x 8</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>0 KSP preconditioned resid norm 1.957390963372e+03 true resid norm 5.120000000000e+02 ||r(i)||/||b|| 1.000000000000e+00</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>1 KSP preconditioned resid norm 7.501162328351e+02 true resid norm 3.373318498950e+02 ||r(i)||/||b|| 6.588512693262e-01</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>2 KSP preconditioned resid norm 7.658993705113e+01 true resid norm 1.827365322620e+02 ||r(i)||/||b|| 3.569072895742e-01</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>3 KSP preconditioned resid norm 9.059824824329e+02 true resid norm 1.426474831278e+02 ||r(i)||/||b|| 2.786083654840e-01</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>4 KSP preconditioned resid norm 4.091168582134e+02 true resid norm 1.292495057977e+02 ||r(i)||/||b|| 2.524404410112e-01</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>5 KSP preconditioned resid norm 7.422110759274e+01 true resid norm 1.258028404461e+02 ||r(i)||/||b|| 2.457086727463e-01</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>6 KSP preconditioned resid norm 4.619015396949e+01 true resid norm 1.213792421102e+02 ||r(i)||/||b|| 2.370688322464e-01</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>7 KSP preconditioned resid norm 6.391009527793e+01 true resid norm 1.124510270422e+02 ||r(i)||/||b|| 2.196309121917e-01</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>8 KSP preconditioned resid norm 7.446926604265e+01 true resid norm 1.077567310933e+02 ||r(i)||/||b|| 2.104623654166e-01</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>9 KSP preconditioned resid norm 4.220904319642e+01 true resid norm 9.988181971539e+01 ||r(i)||/||b|| 1.950816791316e-01</div><div> 10 KSP preconditioned resid norm 2.394387980018e+01 true resid norm 9.127579669592e+01 ||r(i)||/||b|| 1.782730404217e-01</div><div> 11 KSP preconditioned resid norm 1.360843954226e+01 true resid norm 8.771762326371e+01 ||r(i)||/||b|| 1.713234829369e-01</div><div> 12 KSP preconditioned resid norm 4.128223286694e+01 true resid norm 8.529182941649e+01 ||r(i)||/||b|| 1.665856043291e-01</div><div> 13 KSP preconditioned resid norm 2.183532094447e+01 true resid norm 8.263211340769e+01 ||r(i)||/||b|| 1.613908464994e-01</div><div> 14 KSP preconditioned resid norm 1.304178992338e+01 true resid norm 7.971822602122e+01 ||r(i)||/||b|| 1.556996601977e-01</div><div> 15 KSP preconditioned resid norm 7.573349141411e+00 true resid norm 7.520975377445e+01 ||r(i)||/||b|| 1.468940503407e-01</div><div> 16 KSP preconditioned resid norm 9.314890793459e+00 true resid norm 7.304954328407e+01 ||r(i)||/||b|| 1.426748892267e-01</div><div> 17 KSP preconditioned resid norm 4.445933446231e+00 true resid norm 6.978356031428e+01 ||r(i)||/||b|| 1.362960162388e-01</div><div> 18 KSP preconditioned resid norm 5.349719054065e+00 true resid norm 6.667516877214e+01 ||r(i)||/||b|| 1.302249390081e-01</div><div> 19 KSP preconditioned resid norm 3.295861671942e+00 true resid norm 6.182140339659e+01 ||r(i)||/||b|| 1.207449285090e-01</div><div> 20 KSP preconditioned resid norm 1.035616277789e+01 true resid norm 5.734720030036e+01 ||r(i)||/||b|| 1.120062505866e-01</div><div> 21 KSP preconditioned resid norm 3.211186072853e+01 true resid norm 5.552393909940e+01 ||r(i)||/||b|| 1.084451935535e-01</div><div> 22 KSP preconditioned resid norm 1.305589450595e+01 true resid norm 5.499062776214e+01 ||r(i)||/||b|| 1.074035698479e-01</div><div> 23 KSP preconditioned resid norm 2.686432456763e+00 true resid norm 5.207613218582e+01 ||r(i)||/||b|| 1.017111956754e-01</div><div> 24 KSP preconditioned resid norm 2.824784197849e+00 true resid norm 4.838619801451e+01 ||r(i)||/||b|| 9.450429299708e-02</div><div> 25 KSP preconditioned resid norm 1.071690618667e+00 true resid norm 4.607851421273e+01 ||r(i)||/||b|| 8.999709807174e-02</div><div> 26 KSP preconditioned resid norm 1.881879145107e+00 true resid norm 4.001593265961e+01 ||r(i)||/||b|| 7.815611847581e-02</div><div> 27 KSP preconditioned resid norm 1.572862295402e+00 true resid norm 3.838282973517e+01 ||r(i)||/||b|| 7.496646432650e-02</div><div> 28 KSP preconditioned resid norm 1.470751639074e+00 true resid norm 3.480847634691e+01 ||r(i)||/||b|| 6.798530536506e-02</div><div> 29 KSP preconditioned resid norm 1.024975253805e+01 true resid norm 3.242161363347e+01 ||r(i)||/||b|| 6.332346412788e-02</div><div> 30 KSP preconditioned resid norm 2.548780607710e+00 true resid norm 3.146609403253e+01 ||r(i)||/||b|| 6.145721490728e-02</div><div> 31 KSP preconditioned resid norm 1.560691471465e+00 true resid norm 2.970265802267e+01 ||r(i)||/||b|| 5.801300395052e-02</div><div> 32 KSP preconditioned resid norm 2.596714997356e+00 true resid norm 2.766969046763e+01 ||r(i)||/||b|| 5.404236419458e-02</div><div> 33 KSP preconditioned resid norm 7.034818331385e+00 true resid norm 2.684572557056e+01 ||r(i)||/||b|| 5.243305775501e-02</div><div> 34 KSP preconditioned resid norm 1.494072683898e+00 true resid norm 2.475430030960e+01 ||r(i)||/||b|| 4.834824279219e-02</div><div> 35 KSP preconditioned resid norm 2.080781323538e+01 true resid norm 2.334859550417e+01 ||r(i)||/||b|| 4.560272559409e-02</div><div> 36 KSP preconditioned resid norm 2.046655096031e+00 true resid norm 2.240354154839e+01 ||r(i)||/||b|| 4.375691708669e-02</div><div> 37 KSP preconditioned resid norm 7.606846976760e-01 true resid norm 2.109556419574e+01 ||r(i)||/||b|| 4.120227381981e-02</div><div> 38 KSP preconditioned resid norm 2.521301363193e+00 true resid norm 1.843497075964e+01 ||r(i)||/||b|| 3.600580226493e-02</div><div> 39 KSP preconditioned resid norm 3.726976470079e+00 true resid norm 1.794209917279e+01 ||r(i)||/||b|| 3.504316244686e-02</div><div> 40 KSP preconditioned resid norm 8.959884762705e-01 true resid norm 1.573137783532e+01 ||r(i)||/||b|| 3.072534733461e-02</div><div> 41 KSP preconditioned resid norm 1.227682448888e+00 true resid norm 1.501346415860e+01 ||r(i)||/||b|| 2.932317218476e-02</div><div> 42 KSP preconditioned resid norm 1.452770736534e+00 true resid norm 1.433942919922e+01 ||r(i)||/||b|| 2.800669765473e-02</div><div> 43 KSP preconditioned resid norm 5.675352390898e-01 true resid norm 1.216437815936e+01 ||r(i)||/||b|| 2.375855109250e-02</div><div> 44 KSP preconditioned resid norm 4.949409351772e-01 true resid norm 1.042812110399e+01 ||r(i)||/||b|| 2.036742403123e-02</div><div> 45 KSP preconditioned resid norm 2.002853875915e+00 true resid norm 9.309183650084e+00 ||r(i)||/||b|| 1.818199931657e-02</div><div> 46 KSP preconditioned resid norm 3.745525627399e-01 true resid norm 8.522457639380e+00 ||r(i)||/||b|| 1.664542507691e-02</div><div> 47 KSP preconditioned resid norm 1.811694613170e-01 true resid norm 7.531206553361e+00 ||r(i)||/||b|| 1.470938779953e-02</div><div> 48 KSP preconditioned resid norm 1.782171623447e+00 true resid norm 6.764441307706e+00 ||r(i)||/||b|| 1.321179942911e-02</div><div> 49 KSP preconditioned resid norm 2.299828236176e+00 true resid norm 6.702407994976e+00 ||r(i)||/||b|| 1.309064061519e-02</div><div> 50 KSP preconditioned resid norm 1.273834849543e+00 true resid norm 6.053797247633e+00 ||r(i)||/||b|| 1.182382274928e-02</div><div> 51 KSP preconditioned resid norm 2.719578737249e-01 true resid norm 5.470925517497e+00 ||r(i)||/||b|| 1.068540140136e-02</div><div> 52 KSP preconditioned resid norm 4.663757145206e-01 true resid norm 5.005785517882e+00 ||r(i)||/||b|| 9.776924839614e-03</div><div> 53 KSP preconditioned resid norm 1.292565284376e+00 true resid norm 4.881780753946e+00 ||r(i)||/||b|| 9.534728035050e-03</div><div> 54 KSP preconditioned resid norm 1.867369610632e-01 true resid norm 4.496564950399e+00 ||r(i)||/||b|| 8.782353418749e-03</div><div> 55 KSP preconditioned resid norm 5.249392115789e-01 true resid norm 4.092757959067e+00 ||r(i)||/||b|| 7.993667888803e-03</div><div> 56 KSP preconditioned resid norm 1.924525961621e-01 true resid norm 3.780501481010e+00 ||r(i)||/||b|| 7.383791955098e-03</div><div> 57 KSP preconditioned resid norm 5.779420386829e-01 true resid norm 3.213189014725e+00 ||r(i)||/||b|| 6.275759794385e-03</div><div> 58 KSP preconditioned resid norm 5.955339076981e-01 true resid norm 3.112032435949e+00 ||r(i)||/||b|| 6.078188351463e-03</div><div> 59 KSP preconditioned resid norm 3.750139060970e-01 true resid norm 2.999193364090e+00 ||r(i)||/||b|| 5.857799539239e-03</div><div> 60 KSP preconditioned resid norm 1.384679712935e-01 true resid norm 2.745891157615e+00 ||r(i)||/||b|| 5.363068667216e-03</div><div> 61 KSP preconditioned resid norm 7.632834890339e-02 true resid norm 2.176299405671e+00 ||r(i)||/||b|| 4.250584776702e-03</div><div> 62 KSP preconditioned resid norm 3.147491994853e-01 true resid norm 1.832893972188e+00 ||r(i)||/||b|| 3.579871039430e-03</div><div> 63 KSP preconditioned resid norm 5.052243308649e-01 true resid norm 1.775115122392e+00 ||r(i)||/||b|| 3.467021723421e-03</div><div> 64 KSP preconditioned resid norm 8.956523831283e-01 true resid norm 1.731441975933e+00 ||r(i)||/||b|| 3.381722609244e-03</div><div> 65 KSP preconditioned resid norm 7.897527588669e-01 true resid norm 1.682654829619e+00 ||r(i)||/||b|| 3.286435214100e-03</div><div> 66 KSP preconditioned resid norm 5.770941160165e-02 true resid norm 1.560734518349e+00 ||r(i)||/||b|| 3.048309606150e-03</div><div> 67 KSP preconditioned resid norm 3.553024960194e-02 true resid norm 1.389699750667e+00 ||r(i)||/||b|| 2.714257325521e-03</div><div> 68 KSP preconditioned resid norm 4.316233667769e-02 true resid norm 1.147051776028e+00 ||r(i)||/||b|| 2.240335500054e-03</div><div> 69 KSP preconditioned resid norm 3.793691994632e-02 true resid norm 1.012385825627e+00 ||r(i)||/||b|| 1.977316065678e-03</div><div> 70 KSP preconditioned resid norm 2.383460701011e-02 true resid norm 8.696480161436e-01 ||r(i)||/||b|| 1.698531281530e-03</div><div> 71 KSP preconditioned resid norm 6.376655007996e-02 true resid norm 7.779779636534e-01 ||r(i)||/||b|| 1.519488210261e-03</div><div> 72 KSP preconditioned resid norm 5.714768085413e-02 true resid norm 7.153671793501e-01 ||r(i)||/||b|| 1.397201522168e-03</div><div> 73 KSP preconditioned resid norm 1.708395350387e-01 true resid norm 6.312992319936e-01 ||r(i)||/||b|| 1.233006312487e-03</div><div> 74 KSP preconditioned resid norm 1.498516783452e-01 true resid norm 6.006527781743e-01 ||r(i)||/||b|| 1.173149957372e-03</div><div> 75 KSP preconditioned resid norm 1.218071938641e-01 true resid norm 5.769463903876e-01 ||r(i)||/||b|| 1.126848418726e-03</div><div> 76 KSP preconditioned resid norm 2.682030144251e-02 true resid norm 5.214035118381e-01 ||r(i)||/||b|| 1.018366234059e-03</div><div> 77 KSP preconditioned resid norm 9.794744927328e-02 true resid norm 4.660318995939e-01 ||r(i)||/||b|| 9.102185538943e-04</div><div> 78 KSP preconditioned resid norm 3.311394355245e-01 true resid norm 4.581129176231e-01 ||r(i)||/||b|| 8.947517922325e-04</div><div> 79 KSP preconditioned resid norm 7.771705063438e-02 true resid norm 4.103510898511e-01 ||r(i)||/||b|| 8.014669723654e-04</div><div> 80 KSP preconditioned resid norm 3.078123608908e-02 true resid norm 3.918493012988e-01 ||r(i)||/||b|| 7.653306665991e-04</div><div> 81 KSP preconditioned resid norm 2.759088686744e-02 true resid norm 3.289360804743e-01 ||r(i)||/||b|| 6.424532821763e-04</div><div> 82 KSP preconditioned resid norm 1.147671489846e-01 true resid norm 3.190902200515e-01 ||r(i)||/||b|| 6.232230860381e-04</div><div> 83 KSP preconditioned resid norm 1.101306468440e-02 true resid norm 2.900815313985e-01 ||r(i)||/||b|| 5.665654910126e-04</div><div>KSP Object: 16 MPI processes</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>type: cg</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=10000, initial guess is zero</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>using PRECONDITIONED norm type for convergence test</div><div>PC Object: 16 MPI processes</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>type: mg</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>MG: type is MULTIPLICATIVE, levels=5 cycles=v</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Cycles per PCApply=1</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Not using Galerkin computed coarse grid matrices</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Coarse grid solver -- level ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_coarse_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: preonly</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=10000, initial guess is zero</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_coarse_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: redundant</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Redundant preconditioner: First (color=0) of 16 PCs follows</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:        (mg_coarse_redundant_)         1 MPI processes</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>type: preonly</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=10000, initial guess is zero</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:        (mg_coarse_redundant_)         1 MPI processes</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>type: lu</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>LU: out-of-place factorization</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>tolerance for zero pivot 2.22045e-14</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>using diagonal shift on blocks to prevent zero pivot [INBLOCKS]</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>matrix ordering: nd</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>factor fill ratio given 5., needed 7.56438</div><div>             <span class="m_7684181004605111704Apple-converted-space"> </span>Factored matrix follows:</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:                 1 MPI processes</div><div>                 <span class="m_7684181004605111704Apple-converted-space"> </span>type: seqaij</div><div>                 <span class="m_7684181004605111704Apple-converted-space"> </span>rows=512, cols=512</div><div>                 <span class="m_7684181004605111704Apple-converted-space"> </span>package used to perform factorization: petsc</div><div>                 <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=24206, allocated nonzeros=24206</div><div>                 <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>not using I-node routines</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>         <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:           1 MPI processes</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>type: seqaij</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>rows=512, cols=512</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=3200, allocated nonzeros=3200</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>             <span class="m_7684181004605111704Apple-converted-space"> </span>not using I-node routines</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=512, cols=512</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=3200, allocated nonzeros=3200</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Down solver (pre-smoother) on level 1 ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_levels_1_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: richardson</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Richardson: using self-scale best computed damping factor</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=5</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using nonzero initial guess</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_levels_1_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: sor</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=4096, cols=4096</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=27136, allocated nonzeros=27136</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Up solver (post-smoother) same as down solver (pre-smoother)</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Down solver (pre-smoother) on level 2 ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_levels_2_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: richardson</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Richardson: using self-scale best computed damping factor</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=5</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using nonzero initial guess</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_levels_2_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: sor</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=32768, cols=32768</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=223232, allocated nonzeros=223232</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Up solver (post-smoother) same as down solver (pre-smoother)</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Down solver (pre-smoother) on level 3 ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_levels_3_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: richardson</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Richardson: using self-scale best computed damping factor</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=5</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using nonzero initial guess</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_levels_3_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: sor</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=262144, cols=262144</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=1810432, allocated nonzeros=1810432</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Up solver (post-smoother) same as down solver (pre-smoother)</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Down solver (pre-smoother) on level 4 ------------------------------<wbr>-</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP Object:    (mg_levels_4_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: richardson</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>Richardson: using self-scale best computed damping factor</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>maximum iterations=5</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>left preconditioning</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using nonzero initial guess</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>using NONE norm type for convergence test</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PC Object:    (mg_levels_4_)     16 MPI processes</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>type: sor</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div>     <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:       16 MPI processes</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>rows=2097152, cols=2097152</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=14581760, allocated nonzeros=14581760</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Up solver (post-smoother) same as down solver (pre-smoother)</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>linear system matrix = precond matrix:</div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>Mat Object:   16 MPI processes</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>type: mpiaij</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>rows=2097152, cols=2097152</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>total: nonzeros=14581760, allocated nonzeros=14581760</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>total number of mallocs used during MatSetValues calls =0</div><div>Residual 2 norm 0.290082</div><div>Residual infinity norm 0.00192869</div></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div>solver_test.c:</div><div><br></div><div><div>// modified version of ksp/ksp/examples/tutorials/ex3<wbr>4.c</div><div>// related: ksp/ksp/examples/tutorials/ex2<wbr>9.c</div><div>//          ksp/ksp/examples/tutorials/ex<wbr>32.c</div><div>//          ksp/ksp/examples/tutorials/ex<wbr>50.c</div><div><br></div><div>#include <petscdm.h></div><div>#include <petscdmda.h></div><div>#include <petscksp.h></div><div><br></div><div>extern PetscErrorCode ComputeMatrix(KSP,Mat,Mat,void<wbr>*);</div><div>extern PetscErrorCode ComputeRHS(KSP,Vec,void*);</div><div><br></div><div>typedef enum</div><div>{</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>DIRICHLET,</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>NEUMANN</div><div>} BCType;</div><div><br></div><div>#undef __FUNCT__</div><div>#define __FUNCT__ "main"</div><div>int main(int argc,char **argv)</div><div>{</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>KSP            ksp;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>DM             da;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal      norm;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscErrorCode ierr;</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscInt    i,j,k,mx,my,mz,xm,ym,zm,xs,ys<wbr>,zs;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscScalar Hx,Hy,Hz;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscScalar ***array;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>Vec         x,b,r;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>Mat         J;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>const char* bcTypes[2] = { "dirichlet", "neumann" };</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscInt    bcType = (PetscInt)DIRICHLET;</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscInitialize(&argc,&argv,<wbr>(char*)0,0);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = PetscOptionsBegin(PETSC_COMM_W<wbr>ORLD, "", "", "");CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = PetscOptionsEList("-bc_type", "Type of boundary condition", "", bcTypes, 2, bcTypes[0], &bcType, NULL);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = PetscOptionsEnd();CHKERRQ(ierr<wbr>);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPCreate(PETSC_COMM_WORLD,&ks<wbr>p);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = DMDACreate3d(PETSC_COMM_WORLD,<wbr>DM_BOUNDARY_NONE,DM_BOUNDARY_N<wbr>ONE,DM_BOUNDARY_NONE,DMDA_STEN<wbr>CIL_STAR,-12,-12,-12,PETSC_<wbr>DECIDE,PETSC_DECIDE,PETSC_<wbr>DECIDE,1,1,0,0,0,&da);CHKERRQ(<wbr>ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = DMDASetInterpolationType(da, DMDA_Q0);CHKERRQ(ierr);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPSetDM(ksp,da);CHKERRQ(ierr)<wbr>;</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPSetComputeRHS(ksp,ComputeRH<wbr>S,&bcType);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPSetComputeOperators(ksp,Com<wbr>puteMatrix,&bcType);CHKERRQ(<wbr>ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPSetFromOptions(ksp);CHKERRQ<wbr>(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPSolve(ksp,NULL,NULL);CHKERR<wbr>Q(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPGetSolution(ksp,&x);CHKERRQ<wbr>(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPGetRhs(ksp,&b);CHKERRQ(ierr<wbr>);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPGetOperators(ksp,NULL,&J);C<wbr>HKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = VecDuplicate(b,&r);CHKERRQ(ier<wbr>r);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatMult(J,x,r);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = VecAYPX(r,-1.0,b);CHKERRQ(ierr<wbr>);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = VecNorm(r,NORM_2,&norm);CHKERR<wbr>Q(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = PetscPrintf(PETSC_COMM_WORLD,"<wbr>Residual 2 norm %g\n",(double)norm);CHKERRQ(ie<wbr>rr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = VecNorm(r,NORM_INFINITY,&norm)<wbr>;CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = PetscPrintf(PETSC_COMM_WORLD,"<wbr>Residual infinity norm %g\n",(double)norm);CHKERRQ(ie<wbr>rr);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = VecDestroy(&r);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPDestroy(&ksp);CHKERRQ(ierr)<wbr>;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = DMDestroy(&da);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = PetscFinalize();</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>return 0;</div><div>}</div><div><br></div><div>#undef __FUNCT__</div><div>#define __FUNCT__ "ComputeRHS"</div><div>PetscErrorCode ComputeRHS(KSP ksp,Vec b,void *ctx)</div><div>{</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscErrorCode ierr;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscInt       i,j,k,mx,my,mz,xm,ym,zm,xs,ys,<wbr>zs;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscScalar    Hx,Hy,Hz;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscScalar    ***array;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>DM             da;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>BCType bcType = *(BCType*)ctx;</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscFunctionBeginUser;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = KSPGetDM(ksp,&da);CHKERRQ(ierr<wbr>);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = DMDAGetInfo(da, 0, &mx, &my, &mz, 0,0,0,0,0,0,0,0,0);CHKERRQ(ier<wbr>r);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>Hx   = 1.0 / (PetscReal)(mx);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>Hy   = 1.0 / (PetscReal)(my);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>Hz   = 1.0 / (PetscReal)(mz);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = DMDAGetCorners(da,&xs,&ys,&zs,<wbr>&xm,&ym,&zm);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = DMDAVecGetArray(da, b, &array);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>for (k = zs; k < zs + zm; k++)</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>for (j = ys; j < ys + ym; j++)</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>for (i = xs; i < xs + xm; i++)</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal x = ((PetscReal)i + 0.5) * Hx;</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal y = ((PetscReal)j + 0.5) * Hy;</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal z = ((PetscReal)k + 0.5) * Hz;</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>array[k][j][i] = PetscSinReal(x * 2.0 * PETSC_PI) * PetscCosReal(y * 2.0 * PETSC_PI) * PetscSinReal(z * 2.0 * PETSC_PI);</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = DMDAVecRestoreArray(da, b, &array);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = VecAssemblyBegin(b);CHKERRQ(ie<wbr>rr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = VecAssemblyEnd(b);CHKERRQ(ierr<wbr>);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal norm;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>VecNorm(b, NORM_2, &norm);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscPrintf(PETSC_COMM_<wbr>WORLD, "right hand side 2 norm: %g\n", (double)norm);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>VecNorm(b, NORM_INFINITY, &norm);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscPrintf(PETSC_COMM_<wbr>WORLD, "right hand side infinity norm: %g\n", (double)norm);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>/* force right hand side to be consistent for singular matrix */</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>/* note this is really a hack, normally the model would provide you with a consistent right handside */</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>if (bcType == NEUMANN)</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>MatNullSpace nullspace;</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatNullSpaceCreate(PETSC_COMM_<wbr>WORLD,PETSC_TRUE,0,0,&nullspac<wbr>e);CHKERRQ(ierr);</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatNullSpaceRemove(nullspace,b<wbr>);CHKERRQ(ierr);</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatNullSpaceDestroy(&nullspace<wbr>);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscFunctionReturn(0);</div><div>}</div><div><br></div><div><br></div><div>#undef __FUNCT__</div><div>#define __FUNCT__ "ComputeMatrix"</div><div>PetscErrorCode ComputeMatrix(KSP ksp, Mat J,Mat jac, void *ctx)</div><div>{</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscErrorCode ierr;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscInt       i,j,k,mx,my,mz,xm,ym,zm,xs,ys,<wbr>zs,num, numi, numj, numk;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscScalar    v[7],Hx,Hy,Hz;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>MatStencil     row, col[7];</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>DM             da;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>BCType bcType = *(BCType*)ctx;</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscFunctionBeginUser;</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>if (bcType == DIRICHLET)</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>PetscPrintf(PETSC_COMM_<wbr>WORLD, "building operator with Dirichlet boundary conditions, ");</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>else if (bcType == NEUMANN)</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>PetscPrintf(PETSC_COMM_<wbr>WORLD, "building operator with Neumann boundary conditions, ");</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>else</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>SETERRQ(PETSC_COMM_WORLD, PETSC_ERR_SUP, "unrecognized boundary condition type\n");</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr    = KSPGetDM(ksp,&da);CHKERRQ(ierr<wbr>);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr    = DMDAGetInfo(da,0,&mx,&my,&mz,0<wbr>,0,0,0,0,0,0,0,0);CHKERRQ(ierr<wbr>);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscPrintf(PETSC_COMM_<wbr>WORLD, "global grid size: %d x %d x %d\n", mx, my, mz);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>Hx      = 1.0 / (PetscReal)(mx);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>Hy      = 1.0 / (PetscReal)(my);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>Hz      = 1.0 / (PetscReal)(mz);</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal Hx2 = Hx * Hx;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal Hy2 = Hy * Hy;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal Hz2 = Hz * Hz;</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal scaleX = 1.0 / Hx2;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal scaleY = 1.0 / Hy2;</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscReal scaleZ = 1.0 / Hz2;</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr    = DMDAGetCorners(da,&xs,&ys,&zs,<wbr>&xm,&ym,&zm);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>for (k = zs; k < zs + zm; k++)</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>for (j = ys; j < ys + ym; j++)</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>for (i = xs; i < xs + xm; i++)</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>row.i = i;</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>row.j = j;</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>row.k = k;</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>if (i == 0 || j == 0 || k == 0 || i == mx - 1 || j == my - 1 || k == mz - 1)</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>num = 0;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>numi = 0;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>numj = 0;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>numk = 0;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>if (k != 0)</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>v[num] = -scaleZ;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].i = i;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].j = j;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].k = k - 1;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>num++;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>numk++;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>if (j != 0)</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>v[num] = -scaleY;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].i = i;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].j = j - 1;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].k = k;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>num++;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>numj++;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>if (i != 0)</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>v[num] = -scaleX;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].i = i - 1;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].j = j;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].k = k;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>num++;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>numi++;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>if (i != mx - 1)</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>v[num] = -scaleX;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].i = i + 1;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].j = j;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].k = k;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>num++;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>numi++;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>if (j != my - 1)</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>v[num] = -scaleY;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].i = i;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].j = j + 1;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].k = k;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>num++;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>numj++;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>if (k != mz - 1)</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>v[num] = -scaleZ;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].i = i;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].j = j;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].k = k + 1;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>num++;</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>numk++;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div><br></div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>if (bcType == NEUMANN)</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>v[num] = (PetscReal) (numk) * scaleZ + (PetscReal) (numj) * scaleY + (PetscReal) (numi) * scaleX;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>else if (bcType == DIRICHLET)</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                       <span class="m_7684181004605111704Apple-converted-space"> </span>v[num] = 2.0 * (scaleX + scaleY + scaleZ);</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div><br></div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].i = i;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].j = j;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[num].k = k;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>num++;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatSetValuesStencil(jac, 1, &row, num, col, v, INSERT_VALUES);</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>CHKERRQ(ierr);</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>else</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>v[0] = -scaleZ;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[0].i = i;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[0].j = j;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[0].k = k - 1;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>v[1] = -scaleY;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[1].i = i;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[1].j = j - 1;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[1].k = k;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>v[2] = -scaleX;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[2].i = i - 1;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[2].j = j;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[2].k = k;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>v[3] = 2.0 * (scaleX + scaleY + scaleZ);</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[3].i = i;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[3].j = j;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[3].k = k;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>v[4] = -scaleX;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[4].i = i + 1;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[4].j = j;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[4].k = k;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>v[5] = -scaleY;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[5].i = i;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[5].j = j + 1;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[5].k = k;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>v[6] = -scaleZ;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[6].i = i;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[6].j = j;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>col[6].k = k + 1;</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatSetValuesStencil(jac, 1, &row, 7, col, v, INSERT_VALUES);</div><div>                   <span class="m_7684181004605111704Apple-converted-space"> </span>CHKERRQ(ierr);</div><div>               <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>           <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatAssemblyBegin(jac,MAT_FINAL<wbr>_ASSEMBLY);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatAssemblyEnd(jac,MAT_FINAL_A<wbr>SSEMBLY);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>if (bcType == NEUMANN)</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>{</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>MatNullSpace   nullspace;</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatNullSpaceCreate(PETSC_COMM_<wbr>WORLD,PETSC_TRUE,0,0,&nullspac<wbr>e);CHKERRQ(ierr);</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatSetNullSpace(J,nullspace);C<wbr>HKERRQ(ierr);</div><div>       <span class="m_7684181004605111704Apple-converted-space"> </span>ierr = MatNullSpaceDestroy(&nullspace<wbr>);CHKERRQ(ierr);</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>}</div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>PetscFunctionReturn(0);</div><div>}</div></div><div><br></div><div><br></div><div><div><blockquote type="cite"><div>On Jun 22, 2017, at 9:23 AM, Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>> wrote:</div><br class="m_7684181004605111704m_-5560899730941051774Apple-interchange-newline"><div><div dir="ltr" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px"><div class="gmail_extra"><div class="gmail_quote">On Wed, Jun 21, 2017 at 8:12 PM, Jason Lefley<span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span><span dir="ltr"><<a href="mailto:jason.lefley@aclectic.com" target="_blank">jason.lefley@aclectic.<wbr>com</a>></span><span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hello,<br><br>We are attempting to use the PETSc KSP solver framework in a fluid dynamics simulation we developed. The solution is part of a pressure projection and solves a Poisson problem. We use a cell-centered layout with a regular grid in 3d. We started with ex34.c from the KSP tutorials since it has the correct calls for the 3d DMDA, uses a cell-centered layout, and states that it works with multi-grid. We modified the operator construction function to match the coefficients and Dirichlet boundary conditions used in our problem (we’d also like to support Neumann but left those out for now to keep things simple). As a result of the modified boundary conditions, our version does not perform a null space removal on the right hand side or operator as the original did. We also modified the right hand side to contain a sinusoidal pattern for testing. Other than these changes, our code is the same as the original ex34.c<br><br>With the default KSP options and using CG with the default pre-conditioner and without a pre-conditioner, we see good convergence. However, we’d like to accelerate the time to solution further and target larger problem sizes (>= 1024^3) if possible. Given these objectives, multi-grid as a pre-conditioner interests us. To understand the improvement that multi-grid provides, we ran ex45 from the KSP tutorials. ex34 with CG and no pre-conditioner appears to converge in a single iteration and we wanted to compare against a problem that has similar convergence patterns to our problem. Here’s the tests we ran with ex45:<br><br>mpirun -n 16 ./ex45 -da_grid_x 129 -da_grid_y 129 -da_grid_z 129<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 7.0178e+00<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 157<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 3.16874e-05<br><br>mpirun -n 16 ./ex45 -da_grid_x 129 -da_grid_y 129 -da_grid_z 129 -ksp_type cg -pc_type none<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 4.1072e+00<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 213<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 0.000138866<br><br>mpirun -n 16 ./ex45 -da_grid_x 129 -da_grid_y 129 -da_grid_z 129 -ksp_type cg<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 3.3962e+00<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 88<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 6.46242e-05<br><br>mpirun -n 16 ./ex45 -da_grid_x 129 -da_grid_y 129 -da_grid_z 129 -pc_type mg -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_levels_pc_type bjacobi<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 1.3201e+00<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 4<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 8.13339e-05<br><br>mpirun -n 16 ./ex45 -da_grid_x 129 -da_grid_y 129 -da_grid_z 129 -pc_type mg -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_levels_pc_type bjacobi -ksp_type cg<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 1.3008e+00<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 4<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 2.21474e-05<br><br>We found the multi-grid pre-conditioner options in the KSP tutorials makefile. These results make sense; both the default GMRES and CG solvers converge and CG without a pre-conditioner takes more iterations. The multi-grid pre-conditioned runs are pretty dramatically accelerated and require only a handful of iterations.<br><br>We ran our code (modified ex34.c as described above) with the same parameters:<br><br>mpirun -n 16 ./solver_test -da_grid_x 128 -da_grid_y 128 -da_grid_z 128<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 5.3729e+00<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 123<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 0.00595066<br><br>mpirun -n 16 ./solver_test -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -ksp_type cg -pc_type none<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 3.6154e+00<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 188<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 0.00505943<br><br>mpirun -n 16 ./solver_test -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -ksp_type cg<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 3.5661e+00<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 98<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 0.00967462<br><br>mpirun -n 16 ./solver_test -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -pc_type mg -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_levels_pc_type bjacobi<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 4.5606e+00<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 44<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 949.553<br></blockquote><div><br></div><div>1) Dave is right</div><div><br></div><div>2) In order to see how many iterates to expect, first try using algebraic multigrid</div><div><br></div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>-pc_type gamg</div><div><br></div><div>This should work out of the box for Poisson</div><div><br></div><div>3) For questions like this, we really need to see</div><div><br></div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>-ksp_view -ksp_monitor_true_residual</div><div><br></div><div>4) It sounds like you smoother is not strong enough. You could try</div><div><br></div><div> <span class="m_7684181004605111704Apple-converted-space"> </span>-mg_levels_ksp_type richardson -mg_levels_ksp_richardson_self<wbr>_scale -mg_levels_ksp_max_it 5</div><div><br></div><div>or maybe GMRES until it works.</div><div><br></div><div> Thanks,</div><div><br></div><div>   <span class="m_7684181004605111704Apple-converted-space"> </span>Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">mpirun -n 16 ./solver_test -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -pc_type mg -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_levels_pc_type bjacobi -ksp_type cg<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>time in KSPSolve(): 1.5481e+01<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>solver iterations: 198<br>       <span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span>KSP final norm of residual: 0.916558<br><br>We performed all tests with petsc-3.7.6.<br><br>The trends with CG and GMRES seem consistent with the results from ex45. However, with multi-grid, something doesn’t seem right. Convergence seems poor and the solves run for many more iterations than ex45 with multi-grid as a pre-conditioner. I extensively validated the code that builds the matrix and also confirmed that the solution produced by CG, when evaluated with the system of equations elsewhere in our simulation, produces the same residual as indicated by PETSc. Given that we only made minimal modifications to the original example code, it seems likely that the operators constructed for the multi-grid levels are ok.<br><br>We also tried a variety of other suggested parameters for the multi-grid pre-conditioner as suggested in related mailing list posts but we didn’t observe any significant improvements over the results above.<br><br>Is there anything we can do to check the validity of the coefficient matrices built for the different multi-grid levels? Does it look like there could be problems there? Or any other suggestions to achieve better results with multi-grid? I have the -log_view, -ksp_view, and convergence monitor output from the above tests and can post any of it if it would assist.<br><br>Thanks</blockquote></div><br><br clear="all"><span class="HOEnZb"><font color="#888888"><span class="m_7684181004605111704HOEnZb"><font color="#888888"><div><br></div>--<span class="m_7684181004605111704m_-5560899730941051774Apple-converted-space"> </span><br><div class="m_7684181004605111704m_-5560899730941051774gmail-m_-5669117813700017836gmail_signature"><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.caam.rice.edu/~mk51/" target="_blank">http://www.caam.rice.edu/~mk51<wbr>/</a></div></div></div></font></span></font></span></div></div></div></blockquote></div><span class="HOEnZb"><font color="#888888"><br></font></span></div></div></blockquote></div><span class="HOEnZb"><font color="#888888"><br><br clear="all"><div><br></div>--<span class="m_7684181004605111704Apple-converted-space"> </span><br><div class="m_7684181004605111704gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.caam.rice.edu/~mk51/" target="_blank">http://www.caam.rice.edu/~<wbr>mk51/</a></div></div></div></font></span></div></div></div></blockquote></div><br></div></div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.caam.rice.edu/~mk51/" target="_blank">http://www.caam.rice.edu/~mk51/</a><br></div></div></div>

</div></div>