Mark,<br><br>I am using petsc-dev that I pulled after you made the changes for the non-symmetric discretization allocations last week.<br>I think the difference in our results comes from different convergence tolerances. I&#39;m using an rtol of 1.D-012. It seems to be converging very nicely now.<br>

I think I dropped the option to set ksp and pc on levels after a bit, and that seems to have made the difference. GAMG should scale much better than HYPRE and ML, right?<br>


They both seem to work efficiently for really small core counts, but deteriorate with impressive speed as you go up the ladder.<br><br>    [0]PCSetUp_GAMG level 0 N=46330, n data rows=1, n data cols=1, nnz/row (ave)=6, np=4<br>

    [0]scaleFilterGraph 75.5527% nnz after filtering, with threshold 0.05, 6.95957 nnz ave.<br>    [0]maxIndSetAgg removed 0 of 46330 vertices. (0 local)<br>        [0]PCGAMGprolongator_AGG New grid 5903 nodes<br>            PCGAMGoptprol_AGG smooth P0: max eigen=1.923098e+00 min=3.858220e-02 PC=jacobi<br>

        [0]PCSetUp_GAMG 1) N=5903, n data cols=1, nnz/row (ave)=13, 4 active pes<br>    [0]scaleFilterGraph 52.8421% nnz after filtering, with threshold 0.05, 13.3249 nnz ave.<br>    [0]maxIndSetAgg removed 0 of 5903 vertices. (0 local)<br>

        [0]PCGAMGprolongator_AGG New grid 615 nodes<br>            PCGAMGoptprol_AGG smooth P0: max eigen=1.575363e+00 min=2.167886e-03 PC=jacobi<br>        [0]PCSetUp_GAMG 2) N=615, n data cols=1, nnz/row (ave)=21, 4 active pes<br>

    [0]scaleFilterGraph 24.7174% nnz after filtering, with threshold 0.05, 21.722 nnz ave.<br>    [0]maxIndSetAgg removed 0 of 615 vertices. (0 local)<br>        [0]PCGAMGprolongator_AGG New grid 91 nodes<br>            PCGAMGoptprol_AGG smooth P0: max eigen=1.676442e+00 min=2.270745e-03 PC=jacobi<br>

    [0]createLevel aggregate processors: npe: 4 --&gt; 1, neq=91<br>        [0]PCSetUp_GAMG 3) N=91, n data cols=1, nnz/row (ave)=37, 1 active pes<br>    [0]scaleFilterGraph 16.4384% nnz after filtering, with threshold 0.05, 37.7033 nnz ave.<br>

    [0]maxIndSetAgg removed 0 of 91 vertices. (0 local)<br>        [0]PCGAMGprolongator_AGG New grid 10 nodes<br>            PCGAMGoptprol_AGG smooth P0: max eigen=1.538313e+00 min=8.923063e-04 PC=jacobi<br>        [0]PCSetUp_GAMG 4) N=10, n data cols=1, nnz/row (ave)=10, 1 active pes<br>

    [0]PCSetUp_GAMG 5 levels, grid compexity = 1.29633<br>  Residual norms for pres_ solve.<br>  0 KSP preconditioned resid norm 4.680688832182e+06 true resid norm 2.621342052504e+03 ||r(i)||/||b|| 1.000000000000e+00<br>  2 KSP preconditioned resid norm 1.728993898497e+04 true resid norm 2.888375221014e+03 ||r(i)||/||b|| 1.101868876004e+00<br>

  4 KSP preconditioned resid norm 4.510102902646e+02 true resid norm 5.677727287161e+01 ||r(i)||/||b|| 2.165962004744e-02<br>  6 KSP preconditioned resid norm 3.959846836748e+01 true resid norm 1.973580779699e+00 ||r(i)||/||b|| 7.528894513455e-04<br>

  8 KSP preconditioned resid norm 3.175473803927e-01 true resid norm 4.315977395174e-02 ||r(i)||/||b|| 1.646476235732e-05<br> 10 KSP preconditioned resid norm 7.502408552205e-04 true resid norm 1.016040400933e-04 ||r(i)||/||b|| 3.876031363257e-08<br>

 12 KSP preconditioned resid norm 2.868067261023e-06 true resid norm 1.194542164810e-06 ||r(i)||/||b|| 4.556986997056e-10<br>KSP Object:(pres_) 4 MPI processes<br>  type: bcgsl<br>    BCGSL: Ell = 2<br>    BCGSL: Delta = 0<br>

  maximum iterations=5000<br>  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000<br>  left preconditioning<br>  has attached null space<br>  using nonzero initial guess<br>  using PRECONDITIONED norm type for convergence test<br>

PC Object:(pres_) 4 MPI processes<br>  type: gamg<br>    MG: type is MULTIPLICATIVE, levels=5 cycles=v<br>      Cycles per PCApply=1<br>      Using Galerkin computed coarse grid matrices<br>  Coarse grid solver -- level -------------------------------<br>

    KSP Object:    (pres_mg_coarse_)     4 MPI processes<br>      type: richardson<br>        Richardson: damping factor=1<br>      maximum iterations=1, initial guess is zero<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>

      left preconditioning<br>      using NONE norm type for convergence test<br>    PC Object:    (pres_mg_coarse_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 8, local iterations = 1, omega = 1<br>

      linear system matrix = precond matrix:<br>      Matrix Object:       4 MPI processes<br>        type: mpiaij<br>        rows=10, cols=10<br>        total: nonzeros=100, allocated nonzeros=100<br>        total number of mallocs used during MatSetValues calls =0<br>

          using I-node (on process 0) routines: found 2 nodes, limit used is 5<br>  Down solver (pre-smoother) on level 1 -------------------------------<br>    KSP Object:    (pres_mg_levels_1_)     4 MPI processes<br>      type: richardson<br>

        Richardson: damping factor=1<br>      maximum iterations=1<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>      left preconditioning<br>      using nonzero initial guess<br>      using NONE norm type for convergence test<br>

    PC Object:    (pres_mg_levels_1_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1<br>      linear system matrix = precond matrix:<br>      Matrix Object:       4 MPI processes<br>

        type: mpiaij<br>        rows=91, cols=91<br>        total: nonzeros=3431, allocated nonzeros=3431<br>        total number of mallocs used during MatSetValues calls =0<br>          not using I-node (on process 0) routines<br>

  Up solver (post-smoother) same as down solver (pre-smoother)<br>  Down solver (pre-smoother) on level 2 -------------------------------<br>    KSP Object:    (pres_mg_levels_2_)     4 MPI processes<br>      type: richardson<br>

        Richardson: damping factor=1<br>      maximum iterations=1<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>      left preconditioning<br>      using nonzero initial guess<br>      using NONE norm type for convergence test<br>

    PC Object:    (pres_mg_levels_2_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1<br>      linear system matrix = precond matrix:<br>      Matrix Object:       4 MPI processes<br>

        type: mpiaij<br>        rows=615, cols=615<br>        total: nonzeros=13359, allocated nonzeros=13359<br>        total number of mallocs used during MatSetValues calls =0<br>          not using I-node (on process 0) routines<br>

  Up solver (post-smoother) same as down solver (pre-smoother)<br>  Down solver (pre-smoother) on level 3 -------------------------------<br>    KSP Object:    (pres_mg_levels_3_)     4 MPI processes<br>      type: richardson<br>

        Richardson: damping factor=1<br>      maximum iterations=1<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>      left preconditioning<br>      using nonzero initial guess<br>      using NONE norm type for convergence test<br>

    PC Object:    (pres_mg_levels_3_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1<br>      linear system matrix = precond matrix:<br>      Matrix Object:       4 MPI processes<br>

        type: mpiaij<br>        rows=5903, cols=5903<br>        total: nonzeros=78657, allocated nonzeros=78657<br>        total number of mallocs used during MatSetValues calls =0<br>          not using I-node (on process 0) routines<br>

  Up solver (post-smoother) same as down solver (pre-smoother)<br>  Down solver (pre-smoother) on level 4 -------------------------------<br>    KSP Object:    (pres_mg_levels_4_)     4 MPI processes<br>      type: richardson<br>

        Richardson: damping factor=1<br>      maximum iterations=1<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>      left preconditioning<br>      has attached null space<br>      using nonzero initial guess<br>

      using NONE norm type for convergence test<br>    PC Object:    (pres_mg_levels_4_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1<br>      linear system matrix = precond matrix:<br>

      Matrix Object:       4 MPI processes<br>        type: mpiaij<br>        rows=46330, cols=46330<br>        total: nonzeros=322437, allocated nonzeros=615417<br>        total number of mallocs used during MatSetValues calls =0<br>

          not using I-node (on process 0) routines<br>  Up solver (post-smoother) same as down solver (pre-smoother)<br>  linear system matrix = precond matrix:<br>  Matrix Object:   4 MPI processes<br>    type: mpiaij<br>

    rows=46330, cols=46330<br>    total: nonzeros=322437, allocated nonzeros=615417<br>    total number of mallocs used during MatSetValues calls =0<br>      not using I-node (on process 0) routines<br><br><br><br><br><div class="gmail_quote">

On Tue, Mar 20, 2012 at 2:21 PM, Mark F. Adams <span dir="ltr">&lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div style="word-wrap:break-word">John,<div><br></div><div>I had some diagonal scaling stuff in my input which seemed to mess things up.  I don&#39;t understand that.  With your hypre parameters I get</div><div><br></div>


<div><div>     Complexity:    grid = 1.408828</div><div>                operator = 1.638900</div><div>                   cycle = 3.277856</div><div><br></div><div><div>  0 KSP preconditioned resid norm 2.246209947341e+06 true resid norm 2.621342052504e+03 ||r(i)||/||b|| 1.000000000000e+00</div>


<div>  2 KSP preconditioned resid norm 6.591054866442e+04 true resid norm 5.518411654910e+03 ||r(i)||/||b|| 2.105185643224e+00</div><div>  4 KSP preconditioned resid norm 2.721184454964e+03 true resid norm 1.937153214559e+03 ||r(i)||/||b|| 7.389929188022e-01</div>


<div>  6 KSP preconditioned resid norm 2.942012838854e+02 true resid norm 5.614763956317e+01 ||r(i)||/||b|| 2.141942502679e-02</div><div>  8 KSP preconditioned resid norm 2.143421596353e+01 true resid norm 5.306843482279e+00 ||r(i)||/||b|| 2.024475774617e-03</div>


<div> 10 KSP preconditioned resid norm 3.689048280659e+00 true resid norm 2.482945300243e-01 ||r(i)||/||b|| 9.472038560826e-05</div><div>Linear solve converged due to CONVERGED_RTOL iterations 10</div></div><div><br></div>


<div>with ML I get 18 iterations but if I add -pc_ml_Threshold .01 I get it to 12:</div><div><br></div><div><span style="white-space:pre-wrap">        </span>-@${MPIEXEC} -n 1 ./ex10 -f ./binaryoutput -pc_type ml -ksp_type bcgsl -pc_gamg_coarse_eq_limit 10 -pc_gamg_agg_nsmooths 1 -pc_gamg_sym_graph -mg_coarse_ksp_type richardson -mg_coarse_pc_type sor -mg_coarse_pc_sor_its 8 -ksp_monitor_true_residual -pc_gamg_verbose 2 -ksp_converged_reason -options_left -mg_levels_ksp_type richardson -mg_levels_pc_type sor -pc_ml_maxNlevels 5 -pc_ml_Threshold .01 </div>


<div><br></div></div><div><div>  0 KSP preconditioned resid norm 1.987800354481e+06 true resid norm 2.621342052504e+03 ||r(i)||/||b|| 1.000000000000e+00</div><div>  2 KSP preconditioned resid norm 4.845840795806e+04 true resid norm 9.664923970856e+03 ||r(i)||/||b|| 3.687013666005e+00</div>


<div>  4 KSP preconditioned resid norm 4.086337251141e+03 true resid norm 1.111442892542e+03 ||r(i)||/||b|| 4.239976585582e-01</div><div>  6 KSP preconditioned resid norm 1.496117919395e+03 true resid norm 4.243682354730e+02 ||r(i)||/||b|| 1.618896835946e-01</div>


<div>  8 KSP preconditioned resid norm 1.019912311314e+02 true resid norm 6.165476121107e+01 ||r(i)||/||b|| 2.352030371320e-02</div><div> 10 KSP preconditioned resid norm 1.282179114927e+01 true resid norm 4.434755525096e+00 ||r(i)||/||b|| 1.691788189512e-03</div>


<div> 12 KSP preconditioned resid norm 2.801790417375e+00 true resid norm 4.876299030996e-01 ||r(i)||/||b|| 1.860229963632e-04</div><div>Linear solve converged due to CONVERGED_RTOL iterations 12</div></div><div><br></div>


<div>and gamg:</div><div><br></div><div><span style="white-space:pre-wrap">        </span>-@${MPIEXEC} -n 1 ./ex10 -f ./binaryoutput -pc_type gamg -ksp_type bcgsl -pc_gamg_coarse_eq_limit 10 -pc_gamg_agg_nsmooths 1 -pc_gamg_sym_graph -mg_coarse_ksp_type richardson -mg_coarse_pc_type sor -mg_coarse_pc_sor_its 8 -ksp_monitor_true_residual -pc_gamg_verbose 2 -ksp_converged_reason -options_left -mg_levels_ksp_type richardson -mg_levels_pc_type sor </div>


<div><br></div><div>        [0]PCSetUp_GAMG 5 levels, grid compexity = 1.2916</div><div><div>  0 KSP preconditioned resid norm 6.288750978813e+06 true resid norm 2.621342052504e+03 ||r(i)||/||b|| 1.000000000000e+00</div>


<div>

  2 KSP preconditioned resid norm 3.009668424006e+04 true resid norm 4.394363256786e+02 ||r(i)||/||b|| 1.676379186222e-01</div><div>  4 KSP preconditioned resid norm 2.079756553216e+01 true resid norm 5.094584609440e+00 ||r(i)||/||b|| 1.943502414946e-03</div>


<div>  6 KSP preconditioned resid norm 4.323447593442e+00 true resid norm 3.146656048880e-01 ||r(i)||/||b|| 1.200398874261e-04</div><div><div>Linear solve converged due to CONVERGED_RTOL iterations 6</div></div>

</div><div><br></div><div>So this looks pretty different from what you are getting.  Is your code hardwiring anything?  Can you reproduce my results with ksp ex10.c?</div><div><br></div><div>Actually, I just realized that I am using petsc-dev.  What version of PETSc are you using?</div>


<div><br></div><div>Also, here is the makefile that I use to run these jobs:</div><div><br></div><div><div>ALL: runex10</div><div><br></div><div>include ${PETSC_DIR}/conf/variables</div><div>include ${PETSC_DIR}/conf/rules</div>


<div><br></div><div>runex10:</div><div><span style="white-space:pre-wrap">        </span>-@${MPIEXEC} -n 1 ./ex10 -f ./binaryoutput -pc_type gamg -ksp_type bcgsl -pc_gamg_coarse_eq_limit 10 -pc_gamg_agg_nsmooths 1 -pc_gamg_sym_graph -mg_coarse_ksp_type richardson -mg_coarse_pc_type sor -mg_coarse_pc_sor_its 8 -ksp_monitor_true_residual -pc_gamg_verbose 2 -ksp_converged_reason -options_left -mg_levels_ksp_type richardson -mg_levels_pc_type sor -pc_ml_maxNlevels 5 -pc_ml_Threshold .01 -pc_hypre_boomeramg_relax_type_coarse symmetric-SOR/Jacobi -pc_hypre_boomeramg_grid_sweeps_coarse 4 -pc_hypre_boomeramg_coarsen_type PMIS </div>


</div><div><br></div><div>You just need to run &#39;make ex10&#39; and then &#39;make -f this-file&#39;.</div><span><font color="#888888"><div><br></div><div>Mark</div></font></span><div><div><div>

<br><div><div>On Mar 20, 2012, at 2:45 PM, John Mousel wrote:</div><br><blockquote type="cite">Mark,<br><br>I run ML with the following options. <br><br>-ksp_type bcgsl -pc_type ml -pc_ml_maxNlevels 5 -mg_coarse_ksp_type richardson -mg_coarse_pc_type sor -mg_coarse_pc_sor_its 8 -ksp_monitor -ksp_view<br>


<br>Note the lack of scaling. For some reason scaling seems to mess with ML.<br>

As you can see below, ML converges very nicely.<br><br>With regards to HYPRE, this one took a bit of work to get convergence. The options that work are:<br><br>-ksp_type bcgsl -pc_type hypre -pc_hypre_type boomeramg -ksp_monitor_true_residual -pc_hypre_boomeramg_relax_type_coarse symmetric-SOR/Jacobi -pc_hypre_boomeramg_grid_sweeps_coarse 4  -pc_hypre_boomeramg_coarsen_type PMIS<br>


<br>The problem is that neither ML nor HYPRE seems to scale at all.<br><br>ML output:<br>  0 KSP preconditioned resid norm 1.538968715109e+06 true resid norm 2.621342052504e+03 ||r(i)||/||b|| 1.000000000000e+00<br>  2 KSP preconditioned resid norm 1.263129058693e+05 true resid norm 1.096298699671e+04 ||r(i)||/||b|| 4.182203915830e+00<br>


  4 KSP preconditioned resid norm 2.277379585186e+04 true resid norm 2.999721659930e+03 ||r(i)||/||b|| 1.144345758717e+00<br>  6 KSP preconditioned resid norm 4.766504457975e+03 true resid norm 6.733421603796e+02 ||r(i)||/||b|| 2.568692474667e-01<br>


  8 KSP preconditioned resid norm 2.139020425406e+03 true resid norm 1.360842101250e+02 ||r(i)||/||b|| 5.191394613876e-02<br> 10 KSP preconditioned resid norm 6.621380459944e+02 true resid norm 1.522758800025e+02 ||r(i)||/||b|| 5.809080881188e-02<br>


 12 KSP preconditioned resid norm 2.973409610262e+01 true resid norm 1.161046206089e+01 ||r(i)||/||b|| 4.429205280479e-03<br> 14 KSP preconditioned resid norm 2.532665482573e+00 true resid norm 2.557425874623e+00 ||r(i)||/||b|| 9.756170020543e-04<br>


 16 KSP preconditioned resid norm 2.375585214826e+00 true resid norm 2.441783841415e+00 ||r(i)||/||b|| 9.315014189327e-04<br> 18 KSP preconditioned resid norm 1.436338060675e-02 true resid norm 1.305304828818e-02 ||r(i)||/||b|| 4.979528816437e-06<br>


 20 KSP preconditioned resid norm 4.088293864561e-03 true resid norm 9.841243465634e-04 ||r(i)||/||b|| 3.754276728683e-07<br> 22 KSP preconditioned resid norm 6.140822977383e-04 true resid norm 1.682184150207e-04 ||r(i)||/||b|| 6.417263052718e-08<br>


 24 KSP preconditioned resid norm 2.685415483430e-05 true resid norm 1.065041542336e-05 ||r(i)||/||b|| 4.062962867890e-09<br> 26 KSP preconditioned resid norm 1.620776166579e-06 true resid norm 9.563268703474e-07 ||r(i)||/||b|| 3.648233809982e-10<br>


 28 KSP preconditioned resid norm 2.823291105652e-07 true resid norm 7.705418741178e-08 ||r(i)||/||b|| 2.939493811507e-11<br>KSP Object:(pres_) 4 MPI processes<br>  type: bcgsl<br>    BCGSL: Ell = 2<br>    BCGSL: Delta = 0<br>


  maximum iterations=5000<br>  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000<br>  left preconditioning<br>  has attached null space<br>  using nonzero initial guess<br>  using PRECONDITIONED norm type for convergence test<br>


PC Object:(pres_) 4 MPI processes<br>  type: ml<br>    MG: type is MULTIPLICATIVE, levels=5 cycles=v<br>      Cycles per PCApply=1<br>      Using Galerkin computed coarse grid matrices<br>  Coarse grid solver -- level -------------------------------<br>


    KSP Object:    (pres_mg_coarse_)     4 MPI processes<br>      type: richardson<br>        Richardson: damping factor=1<br>      maximum iterations=1, initial guess is zero<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>


      left preconditioning<br>      using PRECONDITIONED norm type for convergence test<br>    PC Object:    (pres_mg_coarse_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 8, local iterations = 1, omega = 1<br>


      linear system matrix = precond matrix:<br>      Matrix Object:       4 MPI processes<br>        type: mpiaij<br>        rows=4, cols=4<br>        total: nonzeros=16, allocated nonzeros=16<br>        total number of mallocs used during MatSetValues calls =0<br>


          not using I-node (on process 0) routines<br>  Down solver (pre-smoother) on level 1 -------------------------------<br>    KSP Object:    (pres_mg_levels_1_)     4 MPI processes<br>      type: richardson<br>        Richardson: damping factor=1<br>


      maximum iterations=1<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>      left preconditioning<br>      using nonzero initial guess<br>      using PRECONDITIONED norm type for convergence test<br>


    PC Object:    (pres_mg_levels_1_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1<br>      linear system matrix = precond matrix:<br>      Matrix Object:       4 MPI processes<br>


        type: mpiaij<br>        rows=25, cols=25<br>        total: nonzeros=303, allocated nonzeros=303<br>        total number of mallocs used during MatSetValues calls =0<br>          using I-node (on process 0) routines: found 4 nodes, limit used is 5<br>


  Up solver (post-smoother) same as down solver (pre-smoother)<br>  Down solver (pre-smoother) on level 2 -------------------------------<br>    KSP Object:    (pres_mg_levels_2_)     4 MPI processes<br>      type: richardson<br>


        Richardson: damping factor=1<br>      maximum iterations=1<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>      left preconditioning<br>      using nonzero initial guess<br>      using PRECONDITIONED norm type for convergence test<br>


    PC Object:    (pres_mg_levels_2_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1<br>      linear system matrix = precond matrix:<br>      Matrix Object:       4 MPI processes<br>


        type: mpiaij<br>        rows=423, cols=423<br>        total: nonzeros=7437, allocated nonzeros=7437<br>        total number of mallocs used during MatSetValues calls =0<br>          not using I-node (on process 0) routines<br>


  Up solver (post-smoother) same as down solver (pre-smoother)<br>  Down solver (pre-smoother) on level 3 -------------------------------<br>    KSP Object:    (pres_mg_levels_3_)     4 MPI processes<br>      type: richardson<br>


        Richardson: damping factor=1<br>      maximum iterations=1<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>      left preconditioning<br>      using nonzero initial guess<br>      using PRECONDITIONED norm type for convergence test<br>


    PC Object:    (pres_mg_levels_3_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1<br>      linear system matrix = precond matrix:<br>      Matrix Object:       4 MPI processes<br>


        type: mpiaij<br>        rows=6617, cols=6617<br>        total: nonzeros=88653, allocated nonzeros=88653<br>        total number of mallocs used during MatSetValues calls =0<br>          not using I-node (on process 0) routines<br>


  Up solver (post-smoother) same as down solver (pre-smoother)<br>  Down solver (pre-smoother) on level 4 -------------------------------<br>    KSP Object:    (pres_mg_levels_4_)     4 MPI processes<br>      type: richardson<br>


        Richardson: damping factor=1<br>      maximum iterations=1<br>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000<br>      left preconditioning<br>      has attached null space<br>      using nonzero initial guess<br>


      using PRECONDITIONED norm type for convergence test<br>    PC Object:    (pres_mg_levels_4_)     4 MPI processes<br>      type: sor<br>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1<br>


      linear system matrix = precond matrix:<br>      Matrix Object:       4 MPI processes<br>        type: mpiaij<br>        rows=46330, cols=46330<br>        total: nonzeros=322437, allocated nonzeros=615417<br>        total number of mallocs used during MatSetValues calls =0<br>


          not using I-node (on process 0) routines<br>  Up solver (post-smoother) same as down solver (pre-smoother)<br>  linear system matrix = precond matrix:<br>  Matrix Object:   4 MPI processes<br>    type: mpiaij<br>


    rows=46330, cols=46330<br>    total: nonzeros=322437, allocated nonzeros=615417<br>    total number of mallocs used during MatSetValues calls =0<br>      not using I-node (on process 0) routines<br><br><br><br>

John<br><br><br><br><div class="gmail_quote">On Tue, Mar 20, 2012 at 1:33 PM, Mark F. Adams <span dir="ltr">&lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div style="word-wrap:break-word">John, <div><br></div><div>I am getting poor results (diverging) from ML also:<div><br></div><div><div>  0 KSP preconditioned resid norm 3.699832960909e+22 true resid norm 1.310674055116e+03 ||r(i)||/||b|| 1.000000000000e+00</div>


<div>  2 KSP preconditioned resid norm 5.706378365783e+11 true resid norm 1.563902233018e+03 ||r(i)||/||b|| 1.193204539995e+00</div><div>  4 KSP preconditioned resid norm 5.570291685152e+11 true resid norm 1.564235542744e+03 ||r(i)||/||b|| 1.193458844050e+00</div>


<div>  6 KSP preconditioned resid norm 5.202150407298e+10 true resid norm 1.749929789082e+03 ||r(i)||/||b|| 1.335137277077e+00</div><div>Linear solve converged due to CONVERGED_RTOL iterations 6</div><div><br></div><div>


With GAMG I get:</div>

<div><br></div><div><div>  0 KSP preconditioned resid norm 7.731260075891e+06 true resid norm 1.310674055116e+03 ||r(i)||/||b|| 1.000000000000e+00</div><div>  2 KSP preconditioned resid norm 2.856415184685e+05 true resid norm 1.310410242531e+03 ||r(i)||/||b|| 9.997987199150e-01</div>


<div>  4 KSP preconditioned resid norm 1.528467019258e+05 true resid norm 1.284856538976e+03 ||r(i)||/||b|| 9.803021078816e-01</div><div>  6 KSP preconditioned resid norm 1.451091957899e+05 true resid norm 1.564309254168e+03 ||r(i)||/||b|| 1.193515083375e+00</div>


<div><br></div><div>&lt;snip&gt; </div><div><br></div><div>122 KSP preconditioned resid norm 2.486245341783e+01 true resid norm 1.404397185367e+00 ||r(i)||/||b|| 1.071507580306e-03</div><div>124 KSP preconditioned resid norm 1.482316853621e+01 true resid norm 4.488661881759e-01 ||r(i)||/||b|| 3.424697287811e-04</div>


<div>126 KSP preconditioned resid norm 1.481941150253e+01 true resid norm 4.484480100832e-01 ||r(i)||/||b|| 3.421506730318e-04</div><div>128 KSP preconditioned resid norm 8.191887347033e+00 true resid norm 6.678630367218e-01 ||r(i)||/||b|| 5.095569215816e-04</div>


</div><div><br></div><div>And HYPRE:</div><div><br></div><div><div>  0 KSP preconditioned resid norm 3.774510769907e+04 true resid norm 1.310674055116e+03 ||r(i)||/||b|| 1.000000000000e+00</div><div>  2 KSP preconditioned resid norm 1.843165835831e+04 true resid norm 8.502433792869e+02 ||r(i)||/||b|| 6.487069580482e-01</div>


<div>  4 KSP preconditioned resid norm 1.573176624705e+04 true resid norm 1.167264367302e+03 ||r(i)||/||b|| 8.905832558033e-01</div><div>  6 KSP preconditioned resid norm 1.657958380765e+04 true resid norm 8.684701624902e+02 ||r(i)||/||b|| 6.626133775216e-01</div>


<div>  8 KSP preconditioned resid norm 2.190304455083e+04 true resid norm 6.969893263600e+02 ||r(i)||/||b|| 5.317792960344e-01</div><div> 10 KSP preconditioned resid norm 2.485714630000e+04 true resid norm 6.642641436830e+02 ||r(i)||/||b|| 5.068110878446e-01</div>


<div> </div><div>&lt;snip&gt; </div><div><br></div><div> 62 KSP preconditioned resid norm 6.432516040957e+00 true resid norm 2.124960171419e-01 ||r(i)||/||b|| 1.621272781837e-04</div><div> 64 KSP preconditioned resid norm 5.731033176541e+00 true resid norm 1.338816774003e-01 ||r(i)||/||b|| 1.021471943216e-04</div>


<div> 66 KSP preconditioned resid norm 1.600285935522e-01 true resid norm 3.352408932031e-03 ||r(i)||/||b|| 2.557774695353e-06</div></div><div><br></div><div>ML and GAMG should act similarly, but ML seems to have a problem (see the preconditioned norm difference, and that it is diverging).  ML has a parameter:</div>


<div><br></div><div> -pc_ml_Threshold [.0] </div><div><br></div><div>Setting this to 0.05 (GAMG default) helps a bit, but it still diverges.</div><div><br></div><div>So it would be nice to figure out the difference between ML and GAMG, but that is secondary for you as they both suck.</div>


<div><br></div><div>HYPRE is a very different algorithm.  It looks like the smoothing in GAMG (and ML) may be the problem.  If I turn smoothing off (-pc_gamg_agg_nsmooths 0), I get for GAMG:</div><div><br></div><div><div>


  0 KSP preconditioned resid norm 2.186148437534e+05 true resid norm 1.310674055116e+03 ||r(i)||/||b|| 1.000000000000e+00</div><div>  2 KSP preconditioned resid norm 2.916843959765e+04 true resid norm 3.221533667508e+03 ||r(i)||/||b|| 2.457921292432e+00</div>


<div>  4 KSP preconditioned resid norm 2.396374655925e+04 true resid norm 1.834299897412e+03 ||r(i)||/||b|| 1.399508817812e+00</div><div>  6 KSP preconditioned resid norm 2.509576275453e+04 true resid norm 1.035475461174e+03 ||r(i)||/||b|| 7.900327752214e-01</div>


<div>  </div><div>&lt;snip&gt; </div><div><br></div><div> 64 KSP preconditioned resid norm 1.973859758284e+01 true resid norm 7.322674977169e+00 ||r(i)||/||b|| 5.586953482895e-03</div><div> 66 KSP preconditioned resid norm 3.371598890438e+01 true resid norm 7.152754930495e+00 ||r(i)||/||b|| 5.457310231004e-03</div>


<div> 68 KSP preconditioned resid norm 4.687839294418e+00 true resid norm 4.605726307025e-01 ||r(i)||/||b|| 3.514013487219e-04</div><div> 70 KSP preconditioned resid norm 1.487545519493e+00 true resid norm 1.558723789416e-01 ||r(i)||/||b|| 1.189253562571e-04</div>


<div> 72 KSP preconditioned resid norm 5.317329808718e-01 true resid norm 5.027178038177e-02 ||r(i)||/||b|| 3.835566911967e-05</div><div> 74 KSP preconditioned resid norm 3.405339702462e-01 true resid norm 1.897059263835e-02 ||r(i)||/||b|| 1.447392092969e-05</div>


</div><div><br></div><div>This is almost as good as HYPRE.</div><div><br></div><div>Another thing to keep in mind is the cost of each iteration, not just the number of iterations.  You can use -pc_hypre_boomeramg_print_statistics to get some data on this from HYPRE:</div>


<div><br></div><div><div> Average Convergence Factor = 0.537664</div><div><br></div><div>     Complexity:    grid = 1.780207</div><div>                operator = 2.624910</div><div>                   cycle = 5.249670</div>


</div><div><br></div><div>And GAMG prints this with verbose set:</div><div><br></div><div>[0]PCSetUp_GAMG 6 levels, grid compexity [sic] = 1.1316</div><div><br></div><div>I believe that the hypre &quot;Complexity:    grid&quot; is the same as my &quot;grid complexity&quot;.  So hypre actually looks more expensive at this point. </div>
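As a rough check on what these complexity numbers measure, the sketch below recomputes two ratios from the per-level matrix sizes in the 5-level GAMG -ksp_view output at the top of this thread (rows and nonzeros on each level). The function name and layout are only illustrative. For that run the nonzero-based ratio reproduces the printed "grid compexity = 1.29633", while the row-based ratio comes out near 1.14. That hints that the GAMG number may be closer to what hypre labels "operator" complexity, but this is an inference from one log and has not been checked against the GAMG source. Either way, the hypre numbers quoted above are larger than the GAMG one.

```python
# Per-level sizes copied from the 5-level GAMG -ksp_view output in this thread
# (finest level first): (rows, nonzeros).
levels = [
    (46330, 322437),
    (5903, 78657),
    (615, 13359),
    (91, 3431),
    (10, 100),
]


def complexities(levels):
    """Return (row-based, nonzero-based) complexity relative to the finest level."""
    fine_rows, fine_nnz = levels[0]
    grid = sum(rows for rows, _ in levels) / fine_rows
    operator = sum(nnz for _, nnz in levels) / fine_nnz
    return grid, operator


grid, operator = complexities(levels)
print(f"row-based (sum of rows / fine rows): {grid:.5f}")
print(f"nnz-based (sum of nnz / fine nnz):   {operator:.5f}")
```

The same two sums can be taken from any -ksp_view listing, which makes it easier to compare solvers on a like-for-like basis.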


<div><br></div><div>I&#39;ve worked on optimizing parameters for hypre with the hypre people and here is a set of arguments that I&#39;ve used:</div><div><br></div><div>-pc_hypre_boomeramg_no_CF -pc_hypre_boomeramg_agg_nl 1 -pc_hypre_boomeramg_coarsen_type HMIS -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_P_max 4 -pc_hypre_boomeramg_agg_num_paths 2</div>


<div><br></div><div>With these parameters I get:</div><div><br></div><div><div>     Complexity:    grid = 1.244140</div><div>                operator = 1.396722</div><div>                   cycle = 2.793442</div></div><div>


<br></div><div>and:</div><div><br></div><div><div>  0 KSP preconditioned resid norm 4.698624821403e+04 true resid norm 1.310674055116e+03 ||r(i)||/||b|| 1.000000000000e+00</div><div>  2 KSP preconditioned resid norm 2.207967626172e+04 true resid norm 3.466160021150e+03 ||r(i)||/||b|| 2.644562931280e+00</div>


<div>  4 KSP preconditioned resid norm 2.278468320876e+04 true resid norm 1.246784122467e+03 ||r(i)||/||b|| 9.512541410282e-01</div><div>   </div><div>&lt;snip&gt; </div><div><br></div><div> 56 KSP preconditioned resid norm 1.108460232262e+00 true resid norm 8.276869475681e-02 ||r(i)||/||b|| 6.314971631105e-05</div>


<div> 58 KSP preconditioned resid norm 3.617217454336e-01 true resid norm 3.764556404754e-02 ||r(i)||/||b|| 2.872229285428e-05</div><div> 60 KSP preconditioned resid norm 1.666532560770e-01 true resid norm 5.149302513338e-03 ||r(i)||/||b|| 3.928743758404e-06</div>


<div>Linear solve converged due to CONVERGED_RTOL iterations 60</div></div><div><br></div><div>So this actually converged faster with lower complexity.</div><div><br></div><div>Anyway, these results seem different from what you are getting, so I&#39;ve appended my options.  This uses ex10 in the KSP tutorials to read in your binary file.</div>
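To put "faster with lower complexity" into one number, here is a small sketch that multiplies the iteration count by the hypre "cycle" complexity for the two BoomerAMG runs in this message. It assumes the first set of statistics (cycle = 5.249670) belongs to the run whose log is shown up to iteration 66, and it treats iterations times cycle complexity as a crude work proxy. That proxy is an assumption for illustration, not a timing and not something hypre or PETSc reports.

```python
# (label, last iteration shown in the log, hypre "cycle" complexity) from this message.
runs = [
    ("boomeramg, first settings", 66, 5.249670),
    ("boomeramg, tuned settings", 60, 2.793442),
]


def work_proxy(iterations, cycle_complexity):
    """Rough cost estimate: iterations times cycle complexity (not a timing)."""
    return iterations * cycle_complexity


costs = {label: work_proxy(its, cycle) for label, its, cycle in runs}
for label, cost in costs.items():
    print(f"{label}: {cost:.1f} work units")
```

By this measure the tuned settings need roughly half the work of the first settings, even though the iteration counts differ by only six. Actual run time still has to be measured, for example with -log_summary.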


<div><br></div><div>Mark</div><div><br></div><div><div>#PETSc Option Table entries:</div><div>-f ./binaryoutput</div><div>-ksp_converged_reason</div><div>-ksp_diagonal_scale</div><div>-ksp_diagonal_scale_fix</div><div>-ksp_monitor_true_residual</div>


<div>-ksp_type bcgsl</div><div>-mg_coarse_ksp_type richardson</div><div>-mg_coarse_pc_sor_its 8</div><div>-mg_coarse_pc_type sor</div><div>-mg_levels_ksp_type richardson</div><div>-mg_levels_pc_type sor</div><div>-options_left</div>


<div>-pc_gamg_agg_nsmooths 0</div><div>-pc_gamg_coarse_eq_limit 10</div><div>-pc_gamg_sym_graph</div><div>-pc_gamg_verbose 2</div><div>-pc_hypre_boomeramg_P_max 4</div><div>-pc_hypre_boomeramg_agg_nl 1</div><div>-pc_hypre_boomeramg_agg_num_paths 2</div>


<div>-pc_hypre_boomeramg_coarsen_type HMIS</div><div>-pc_hypre_boomeramg_interp_type ext+i</div><div>-pc_hypre_boomeramg_no_CF</div><div>-pc_ml_Threshold .01</div><div>-pc_type gamg</div><div>-vecload_block_size 1</div><div>


#End of PETSc Option Table entries</div><div>There are 7 unused database options. They are:</div><div>Option left: name:-pc_hypre_boomeramg_P_max value: 4</div><div>Option left: name:-pc_hypre_boomeramg_agg_nl value: 1</div>


<div>Option left: name:-pc_hypre_boomeramg_agg_num_paths value: 2</div><div>Option left: name:-pc_hypre_boomeramg_coarsen_type value: HMIS</div><div>Option left: name:-pc_hypre_boomeramg_interp_type value: ext+i</div><div>


Option left: name:-pc_hypre_boomeramg_no_CF no value </div><div>Option left: name:-pc_ml_Threshold value: .01</div></div><div><br></div><div><br><div><div><div><div>On Mar 20, 2012, at 10:19 AM, John Mousel wrote:</div>

<br></div></div><blockquote type="cite"><div><div>Mark,<br><br>Sorry for the late reply. I&#39;ve been on travel and hadn&#39;t had a chance to pick this back up. I&#39;ve tried running with the suggested options:<br>

<br>-ksp_type bcgsl -pc_type gamg  -pc_gamg_coarse_eq_limit 10 -pc_gamg_agg_nsmooths 1 -pc_gamg_sym_graph -mg_coarse_ksp_type richardson -mg_coarse_pc_type sor -mg_coarse_pc_sor_its 8 -ksp_diagonal_scale -ksp_diagonal_scale_fix -ksp_monitor_true_residual -ksp_view -pc_gamg_verbose 1<br>


<br>With these options, the convergence starts to hang (see attached GAMG_kspview.txt). The hanging happens for both -mg_coarse_ksp_type richardson and preonly. It was my understanding from previous emails that using preonly made it so that only the preconditioner was run, which in this case would be 8 sweeps of SOR. If I get rid of the -pc_gamg_agg_nsmooths 1 (see GAMG_kspview_nosmooth.txt), the problem converges, but again the convergence is slow. Without this option, both Richardson and preonly converge in 172 iterations. <br>


<br>Matt, I&#39;ve checked and the problem does converge in the true residual using GAMG, ML, HYPRE, and ILU preconditioned BiCG. I explicitly ensure that a solution exists by projecting the rhs vector out of the null space of the transpose of the operator.<br>
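<div><br></div><div>The compatibility step described above can be sketched as follows, assuming a single known vector w with A^T w = 0. For the singular system Ax = b to be solvable, b must be orthogonal to w, so the component of b along w is subtracted. The function name and the example vectors are hypothetical and are not taken from the actual application code.</div>
<pre>
```python
def project_out(b, w):
    """Return b minus its component along w, so the result is orthogonal to w."""
    wb = sum(wi * bi for wi, bi in zip(w, b))
    ww = sum(wi * wi for wi in w)
    return [bi - (wb / ww) * wi for bi, wi in zip(b, w)]


# Hypothetical example: w is assumed to span the null space of A^T.
w = [1.0, 2.0, 1.0]
b = [3.0, 0.0, 1.0]
b_proj = project_out(b, w)

# The projected right-hand side is orthogonal to w up to rounding error.
residual = sum(wi * bi for wi, bi in zip(w, b_proj))
```
</pre>
<div>If the null space of the transpose has more than one dimension, the same subtraction would have to be applied against an orthogonal basis of that space.</div>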


<br>John<br><br><br><div class="gmail_quote">On Fri, Mar 16, 2012 at 2:04 PM, Mark F. Adams <span dir="ltr">&lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div style="word-wrap:break-word">John, did this get resolved?<span><font color="#888888"><div>Mark</div></font></span><div><div><div><br><div><div>On Mar 15, 2012, at 4:24 PM, John Mousel wrote:</div>

<br><blockquote type="cite">Mark,<div><br></div><div>Running without the options you mentioned before leads to slightly worse performance (175 iterations). </div><div>I have not been able to get the coarse grid solve to work with LU while running ML. It keeps experiencing a zero pivot, and all the combinations of shifting I&#39;ve tried haven&#39;t led me anywhere, hence the SOR on the coarse grid. Also, the ML manual suggests limiting the number of levels to 3 or 4 and performing a few sweeps of an iterative method as opposed to a direct solve. </div>


<div><br></div><div>John<br><br><div class="gmail_quote">On Thu, Mar 15, 2012 at 12:04 PM, Mark F. Adams <span dir="ltr">&lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div style="word-wrap:break-word">You also want:  -pc_gamg_agg_nsmooths 1<div><br></div><div>You are running plain aggregation.  If it is Poisson then smoothing is good.</div><div><br></div><div>Is this problem singular?  Can you try running ML with these parameters and see if its performance degrades?  The ML implementation uses the PETSC infrastructure and uses a very similar algorithm to GAMG-SA.  We should be able to get these two to match pretty well.</div>


<div><br></div><div>Mark<br><div><br></div><div><br><div><div><div><div>On Mar 15, 2012, at 12:21 PM, John Mousel wrote:</div><br></div></div><blockquote type="cite"><div><div>Mark,<br><br>I ran with those options removed (see the run options listed below). Things actually got slightly worse. Now it&#39;s up to 142 iterations. I have attached the ksp_view output.<br>


<br>-ksp_type bcgsl -pc_type gamg -pc_gamg_sym_graph -ksp_diagonal_scale -ksp_diagonal_scale_fix -mg_levels_ksp_type richardson -mg_levels_pc_type sor -pc_gamg_verbose 1 <br>

<br><br>John<br><br><br><div class="gmail_quote">On Thu, Mar 15, 2012 at 10:55 AM, Mark F. Adams <span dir="ltr">&lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div style="word-wrap:break-word">John, can you run again with:  -pc_gamg_verbose 1<div><br></div><div>And I would not use: -pc_mg_levels 4 -mg_coarse_ksp_type preonly -mg_coarse_pc_type sor -mg_coarse_pc_sor_its 8</div>


<div>

<br></div><div>1) I think -mg_coarse_ksp_type preonly and -mg_coarse_pc_sor_its 8 do not do what you think.  I think this is the same as 1 iteration.  I think you want &#39;richardson&#39; not &#39;preonly&#39;.</div><div>


<br></div><div>2) Why are you using sor as the coarse solver?  If your problem is singular then you want to use as many levels as possible to get the coarse grid to be tiny.  I&#39;m pretty sure HYPRE ignores the coarse solver parameters.  But ML uses them and it is converging well.</div>


<div><br></div><div>3) I would not specify the number of levels.  GAMG, and I think the rest, have internal logic for stopping at the right level.  If the coarse level is large and you use just 8 iterations of sor then convergence will suffer.</div>


<div><br></div><div>Mark</div><div><br><div><div><div><div>On Mar 15, 2012, at 11:13 AM, John Mousel wrote:</div><br></div></div><blockquote type="cite"><div><div>Mark,<br><br>The changes pulled through this morning. I&#39;ve run it with the options<br>


<br>-ksp_type bcgsl -pc_type gamg -pc_gamg_sym_graph -ksp_diagonal_scale -ksp_diagonal_scale_fix -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_pc_type sor -mg_coarse_ksp_type preonly -mg_coarse_pc_type sor -mg_coarse_pc_sor_its 8<br>


<br>and it converges in the true residual, but it&#39;s not converging as fast as anticipated. The matrix arises from a non-symmetric discretization of the Poisson equation. The solve takes GAMG 114 iterations, whereas ML takes 24 iterations, BoomerAMG takes 22 iterations, and -ksp_type bcgsl -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 4 takes around 170. I&#39;ve attached the -ksp_view results for ML, GAMG, and HYPRE. I&#39;ve attempted to make all the options the same on all levels for ML and GAMG. <br>


<br>Any thoughts?<br><br>John<br><br><br><div class="gmail_quote">On Wed, Mar 14, 2012 at 6:04 PM, Mark F. Adams <span dir="ltr">&lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>Humm, I see it with hg view (appended).  </div><div><br></div><div>Satish, my main repo looks hosed.  I see this:</div>


<div><br></div><div><div>~/Codes/petsc-dev&gt;hg update</div><div><div>abort: crosses branches (merge branches or use --clean to discard changes)</div></div><div>~/Codes/petsc-dev&gt;hg merge</div><div>abort: branch &#39;default&#39; has 3 heads - please merge with an explicit rev</div>


<div>(run &#39;hg heads .&#39; to see heads)</div><div>~/Codes/petsc-dev&gt;hg heads</div><div>changeset:   22496:8e2a98268179</div><div>tag:         tip</div><div>user:        Barry Smith &lt;<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>&gt;</div>


<div>date:        Wed Mar 14 16:42:25 2012 -0500</div><div>files:       src/vec/is/interface/f90-custom/zindexf90.c src/vec/vec/interface/f90-custom/zvectorf90.c</div><div>description:</div><div>undoing manually changes I put in because Satish had a better fix</div>


<div><br></div><div><br></div><div>changeset:   22492:bda4df63072d</div><div>user:        Mark F. Adams &lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;</div><div>date:        Wed Mar 14 17:39:52 2012 -0400</div>


<div>files:       src/ksp/pc/impls/gamg/tools.c</div><div>description:</div><div>fix for unsymmetric matrices.</div><div><br></div><div><br></div><div>changeset:   22469:b063baf366e4</div><div>user:        Mark F. Adams &lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;</div>


<div>date:        Wed Mar 14 14:22:28 2012 -0400</div><div>files:       src/ksp/pc/impls/gamg/tools.c</div><div>description:</div><div>added fix for preallocation for unsymetric matrices.</div><div><br></div></div><div>Mark</div>


<div><br></div><div>my &#39;hg view&#39; on my merge repo:</div><div><br></div><div>Revision: 22492</div><div>Branch: default</div><div>Author: Mark F. Adams &lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;  2012-03-14 17:39:52</div>


<div>Committer: Mark F. Adams &lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;  2012-03-14 17:39:52</div><div>Tags: tip</div><div>Parent: 22491:451bbbd291c2 (Small fixes to the BT linesearch)</div>


<div><br></div><div>    fix for unsymmetric matrices.</div><div><br></div><div><br></div><div>------------------------ src/ksp/pc/impls/gamg/tools.c ------------------------</div><div>@@ -103,7 +103,7 @@</div><div>   PetscErrorCode ierr;</div>


<div>   PetscInt       Istart,Iend,Ii,jj,ncols,nnz0,nnz1, NN, MM, nloc;</div><div>   PetscMPIInt    mype, npe;</div><div>-  Mat            Gmat = *a_Gmat, tGmat;</div><div>+  Mat            Gmat = *a_Gmat, tGmat, matTrans;</div>


<div>   MPI_Comm       wcomm = ((PetscObject)Gmat)-&gt;comm;</div><div>   const PetscScalar *vals;</div><div>   const PetscInt *idx;</div><div>@@ -127,6 +127,10 @@</div><div>   ierr = MatDiagonalScale( Gmat, diag, diag ); CHKERRQ(ierr);</div>


<div>   ierr = VecDestroy( &amp;diag );           CHKERRQ(ierr);</div><div> </div><div>+  if( symm ) {</div><div>+    ierr = MatTranspose( Gmat, MAT_INITIAL_MATRIX, &amp;matTrans );    CHKERRQ(ierr);</div><div>+  }</div>


<div>

+</div><div>   /* filter - dup zeros out matrix */</div><div>   ierr = PetscMalloc( nloc*sizeof(PetscInt), &amp;d_nnz ); CHKERRQ(ierr);</div><div>   ierr = PetscMalloc( nloc*sizeof(PetscInt), &amp;o_nnz ); CHKERRQ(ierr);</div>


<div>@@ -135,6 +139,12 @@</div><div>     d_nnz[jj] = ncols;</div><div>     o_nnz[jj] = ncols;</div><div>     ierr = MatRestoreRow(Gmat,Ii,&amp;ncols,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);</div><div>+    if( symm ) {</div>


<div>+      ierr = MatGetRow(matTrans,Ii,&amp;ncols,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);</div><div>+      d_nnz[jj] += ncols;</div><div>+      o_nnz[jj] += ncols;</div><div>+      ierr = MatRestoreRow(matTrans,Ii,&amp;ncols,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);</div>


<div>+    }</div><div>     if( d_nnz[jj] &gt; nloc ) d_nnz[jj] = nloc;</div><div>     if( o_nnz[jj] &gt; (MM-nloc) ) o_nnz[jj] = MM - nloc;</div><div>   }</div><div>@@ -142,6 +152,9 @@</div><div>   CHKERRQ(ierr);</div><div>


   ierr = PetscFree( d_nnz ); CHKERRQ(ierr); </div><div>   ierr = PetscFree( o_nnz ); CHKERRQ(ierr); </div><div>+  if( symm ) {</div><div>+    ierr = MatDestroy( &amp;matTrans );  CHKERRQ(ierr);</div><div>+  }</div><div>
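<div><br></div><div>The idea behind the patch above can be sketched outside PETSc: when the graph G is symmetrized as G + G^T, row i can hold at most nnz(row i of G) + nnz(row i of G^T) entries, capped at the number of columns. This is a serial, illustrative sketch with hypothetical names and data; it does not reproduce the diagonal/off-diagonal (d_nnz/o_nnz) split of the parallel code.</div>
<pre>
```python
def symmetrized_nnz_bound(rows, n):
    """rows[i] is the set of column indices in row i of an n x n sparse matrix G.

    Returns an upper bound on the nonzeros per row of G + G^T:
    (entries in row i of G) + (entries in row i of G^T), capped at n.
    """
    col_counts = [0] * n  # entries in column j of G == entries in row j of G^T
    for cols in rows:
        for j in cols:
            col_counts[j] += 1
    return [min(len(rows[i]) + col_counts[i], n) for i in range(n)]


def symmetrized_pattern(rows, n):
    """Exact sparsity pattern of G + G^T, used here only to check the bound."""
    sym = [set(cols) for cols in rows]
    for i, cols in enumerate(rows):
        for j in cols:
            sym[j].add(i)
    return sym


# Small structurally non-symmetric example (hypothetical data).
G = [{0, 1}, {1}, {0, 2}]
bound = symmetrized_nnz_bound(G, 3)
exact = [len(s) for s in symmetrized_pattern(G, 3)]
```
</pre>
<div>Over-estimating the row counts costs only some extra memory, whereas under-estimating them is what leads to the &quot;New nonzero ... caused a malloc&quot; error reported further down in this thread.</div>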


<div>

<div> </div><div><br></div><div><br></div><div><br></div><div><div>On Mar 14, 2012, at 5:53 PM, John Mousel wrote:</div><br><blockquote type="cite">Mark,<br><br>No change. Can you give me the location that you patched so I can check to make sure it pulled?<br>


I don&#39;t see it on the petsc-dev change log.<br><br>John<br><br><div class="gmail_quote">On Wed, Mar 14, 2012 at 4:40 PM, Mark F. Adams <span dir="ltr">&lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">John, I&#39;ve committed these changes, give it a try.<br>

<span><font color="#888888"><br>

Mark<br>

</font></span><div><div><br>

On Mar 14, 2012, at 3:46 PM, Satish Balay wrote:<br>

<br>

&gt; This is the usual merge [with uncommitted changes] issue.<br>

&gt;<br>

&gt; You could use &#39;hg shelf&#39; extension to shelve your local changes and<br>

&gt; then do a merge [as Sean would suggest] - or do the merge in a<br>

&gt; separate/clean clone [I normally do this..]<br>

&gt;<br>

&gt; i.e<br>

&gt; cd ~/Codes<br>

&gt; hg clone petsc-dev petsc-dev-merge<br>

&gt; cd petsc-dev-merge<br>

&gt; hg pull ssh://<a href="mailto:petsc@petsc.cs.iit.edu" target="_blank">petsc@petsc.cs.iit.edu</a>//hg/petsc/petsc-dev   #just to be sure, look for latest changes before merge..<br>

&gt; hg merge<br>

&gt; hg commit<br>

&gt; hg push ssh://<a href="mailto:petsc@petsc.cs.iit.edu" target="_blank">petsc@petsc.cs.iit.edu</a>//hg/petsc/petsc-dev<br>

&gt;<br>

&gt; [now update your petsc-dev to latest]<br>

&gt; cd ~/Codes/petsc-dev<br>

&gt; hg pull<br>

&gt; hg update<br>

&gt;<br>

&gt; Satish<br>

&gt;<br>

&gt; On Wed, 14 Mar 2012, Mark F. Adams wrote:<br>

&gt;<br>

&gt;&gt; Great, that seems to work.<br>

&gt;&gt;<br>

&gt;&gt; I did a &#39;hg commit tools.c&#39;<br>

&gt;&gt;<br>

&gt;&gt; and I want to push this file only.  I guess it&#39;s the only thing in the change set so &#39;hg push&#39; should be fine.  But I see this:<br>

&gt;&gt;<br>

&gt;&gt; ~/Codes/petsc-dev/src/ksp/pc/impls/gamg&gt;hg update<br>

&gt;&gt; abort: crosses branches (merge branches or use --clean to discard changes)<br>

&gt;&gt; ~/Codes/petsc-dev/src/ksp/pc/impls/gamg&gt;hg merge<br>

&gt;&gt; abort: outstanding uncommitted changes (use &#39;hg status&#39; to list changes)<br>

&gt;&gt; ~/Codes/petsc-dev/src/ksp/pc/impls/gamg&gt;hg status<br>

&gt;&gt; M include/petscmat.h<br>

&gt;&gt; M include/private/matimpl.h<br>

&gt;&gt; M src/ksp/pc/impls/gamg/agg.c<br>

&gt;&gt; M src/ksp/pc/impls/gamg/gamg.c<br>

&gt;&gt; M src/ksp/pc/impls/gamg/gamg.h<br>

&gt;&gt; M src/ksp/pc/impls/gamg/geo.c<br>

&gt;&gt; M src/mat/coarsen/coarsen.c<br>

&gt;&gt; M src/mat/coarsen/impls/hem/hem.c<br>

&gt;&gt; M src/mat/coarsen/impls/mis/mis.c<br>

&gt;&gt;<br>

&gt;&gt; Am I ready to do a push?<br>

&gt;&gt;<br>

&gt;&gt; Thanks,<br>

&gt;&gt; Mark<br>

&gt;&gt;<br>

&gt;&gt; On Mar 14, 2012, at 2:44 PM, Satish Balay wrote:<br>

&gt;&gt;<br>

&gt;&gt;&gt; If commit is the last hg operation that you&#39;ve done - then &#39;hg rollback&#39; would undo this commit.<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; Satish<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt; On Wed, 14 Mar 2012, Mark F. Adams wrote:<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; Damn, I&#39;m not preallocating the graph perfectly for unsymmetric matrices and PETSc now dies on this.<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; I have a fix but I committed it with other changes that I do not want to commit.  The changes are all in one file so I should be able to just commit this file.<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; Anyone know how to delete a commit?<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; I&#39;ve tried:<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; ~/Codes/petsc-dev/src/ksp/pc/impls/gamg&gt;hg strip 22487:26ffb9eef17f<br>

&gt;&gt;&gt;&gt; hg: unknown command &#39;strip&#39;<br>

&gt;&gt;&gt;&gt; &#39;strip&#39; is provided by the following extension:<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;   mq  manage a stack of patches<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; use &quot;hg help extensions&quot; for information on enabling extensions<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; But have not figured out how to load extensions.<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; Mark<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt; On Mar 14, 2012, at 12:54 PM, John Mousel wrote:<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; Mark,<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; I have a non-symmetric matrix. I am running with the following options.<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; -pc_type gamg -pc_gamg_sym_graph -ksp_monitor_true_residual<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; and with the inclusion of -pc_gamg_sym_graph, I get a new malloc error:<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; 0]PETSC ERROR: --------------------- Error Message ------------------------------------<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: Argument out of range!<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: New nonzero at (5150,9319) caused a malloc!<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: ------------------------------------------------------------------------<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: Petsc Development HG revision: 587b25035091aaa309c87c90ac64c13408ecf34e  HG Date: Wed Mar 14 09:22:54 2012 -0500<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: See docs/changes/index.html for recent updates.<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: See docs/index.html for manual pages.<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: ------------------------------------------------------------------------<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: ../JohnRepo/VFOLD_exe on a linux-deb named <a href="http://wv.iihr.uiowa.edu/" target="_blank">wv.iihr.uiowa.edu</a> by jmousel Wed Mar 14 11:51:35 2012<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: Libraries linked from /home/jmousel/NumericalLibraries/petsc-hg/petsc-dev/linux-debug/lib<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: Configure run at Wed Mar 14 09:46:39 2012<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: Configure options --download-blacs=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mpich=1 --download-parmetis=1 --download-scalapack=1 --with-blas-lapack-dir=/opt/intel11/mkl/lib/em64t --with-cc=gcc --with-cmake=/usr/local/bin/cmake --with-cxx=g++ --with-fc=ifort PETSC_ARCH=linux-debug<br>


&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: ------------------------------------------------------------------------<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: MatSetValues_MPIAIJ() line 506 in /home/jmousel/NumericalLibraries/petsc-hg/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: MatSetValues() line 1141 in /home/jmousel/NumericalLibraries/petsc-hg/petsc-dev/src/mat/interface/matrix.c<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: scaleFilterGraph() line 155 in /home/jmousel/NumericalLibraries/petsc-hg/petsc-dev/src/ksp/pc/impls/gamg/tools.c<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: PCGAMGgraph_AGG() line 865 in /home/jmousel/NumericalLibraries/petsc-hg/petsc-dev/src/ksp/pc/impls/gamg/agg.c<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: PCSetUp_GAMG() line 516 in /home/jmousel/NumericalLibraries/petsc-hg/petsc-dev/src/ksp/pc/impls/gamg/gamg.c<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: PCSetUp() line 832 in /home/jmousel/NumericalLibraries/petsc-hg/petsc-dev/src/ksp/pc/interface/precon.c<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: KSPSetUp() line 261 in /home/jmousel/NumericalLibraries/petsc-hg/petsc-dev/src/ksp/ksp/interface/itfunc.c<br>

&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: KSPSolve() line 385 in /home/jmousel/NumericalLibraries/petsc-hg/petsc-dev/src/ksp/ksp/interface/itfunc.c<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; John<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; On Wed, Mar 14, 2012 at 11:27 AM, Mark F. Adams &lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt; wrote:<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; On Mar 14, 2012, at 11:56 AM, John Mousel wrote:<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; Mark,<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; The matrix is asymmetric. Does this require the setting of an option?<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; Yes:  -pc_gamg_sym_graph<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt; Mark<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; I pulled petsc-dev this morning, so I should have (at least close to) the latest code.<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; John<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; On Wed, Mar 14, 2012 at 10:54 AM, Mark F. Adams &lt;<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>&gt; wrote:<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; On Mar 14, 2012, at 11:08 AM, John Mousel wrote:<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; I&#39;m getting the following error when using GAMG.<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; petsc-dev/src/ksp/pc/impls/gamg/agg.c:508: smoothAggs: Assertion `sgid==-1&#39; failed.<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; Is it possible that your matrix is structurally asymmetric?<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; This code is evolving fast and so you will need to move to the dev version if you are not already using it. (I think I fixed a bug that hit this assert).<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; When I try to alter the type of aggregation at the command line using -pc_gamg_type pa, I&#39;m getting<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message ------------------------------------<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; [1]PETSC ERROR: Unknown type. Check for miss-spelling or missing external package needed for type:<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; see <a href="http://www.mcs.anl.gov/petsc/documentation/installation.html#external" target="_blank">http://www.mcs.anl.gov/petsc/documentation/installation.html#external</a>!<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; [1]PETSC ERROR: Unknown GAMG type pa given!<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; Has there been a change in the aggregation options? I just pulled petsc-dev this morning.<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; Yes, this option is gone now.  You can use -pc_gamg_type agg for now.<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt; Mark<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;&gt; John<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;&gt;<br>

&gt;&gt;&gt;<br>

&gt;&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;<br>

&gt;<br>

<br>

</div></div></blockquote></div><br>

</blockquote></div><br></div></div></div></blockquote></div><br>

</div></div><span>&lt;GAMG_kspview.txt&gt;</span><span>&lt;ML_kspview.txt&gt;</span><span>&lt;HYPRE_kspview.txt&gt;</span></blockquote></div><br></div></div></blockquote></div><br>

</div></div><span>&lt;GAMG_kspview.txt&gt;</span></blockquote></div><br></div></div></div></blockquote></div><br></div>

</blockquote></div><br></div></div></div></div></blockquote></div><br>

</div></div><span>&lt;GAMG_kspview.txt&gt;</span><span>&lt;GAMG_kspview_nosmooth.txt&gt;</span></blockquote></div><br></div></div></div></div></blockquote></div><br>

</blockquote></div><br></div></div></div></div></blockquote></div><br>