<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Sep 29, 2014 at 3:47 PM, Filippo Leonardi <span dir="ltr"><<a href="mailto:filippo.leonardi@sam.math.ethz.ch" target="_blank">filippo.leonardi@sam.math.ethz.ch</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">@ Barry: It may be that I forgot to set the number of levels for the runs.<br>

<br>

New experiment with the following options:<br>

<br>

-da_refine 5 -pc_type mg -ksp_monitor -log_summary -pc_mg_type full -ksp_view -<br>

pc_mg_log -pc_mg_levels 5 -pc_mg_galerkin -ksp_monitor_true_residual -<br>

ksp_converged_reason<br>

<br>

on 128^3, and it looks nice:<br>

<br>

  0 KSP Residual norm 5.584601494955e+01<br>

  0 KSP preconditioned resid norm 5.584601494955e+01 true resid norm<br>

1.370259979011e+01 ||r(i)||/||b|| 1.000000000000e+00<br>

  1 KSP Residual norm 9.235021247277e+00<br>

  1 KSP preconditioned resid norm 9.235021247277e+00 true resid norm<br>

8.185195475443e-01 ||r(i)||/||b|| 5.973461679404e-02<br>

  2 KSP Residual norm 6.344253555076e-02<br>

  2 KSP preconditioned resid norm 6.344253555076e-02 true resid norm<br>

1.108015805992e-01 ||r(i)||/||b|| 8.086172134956e-03<br>

  3 KSP Residual norm 1.084530268454e-03<br>

  3 KSP preconditioned resid norm 1.084530268454e-03 true resid norm<br>

3.228589340041e-03 ||r(i)||/||b|| 2.356187431214e-04<br>

  4 KSP Residual norm 2.345341850850e-05<br>

  4 KSP preconditioned resid norm 2.345341850850e-05 true resid norm<br>

9.362117433445e-05 ||r(i)||/||b|| 6.832365811489e-06<br>

Linear solve converged due to CONVERGED_RTOL iterations 4<br>

<br>

I'll try on more processors. Is it correct what I am doing? If so, is there<br>

some tweak I am missing.<br>

<br>

Any suggestion on optimal number of levels VS number of processors?<br>

<br>

Btw, thanks a lot, you are always so helpful.<br>

<div><div class="h5"><br>

On Monday 29 September 2014 14:59:49 you wrote:<br>

> Filippo Leonardi <<a href="mailto:filippo.leonardi@sam.math.ethz.ch">filippo.leonardi@sam.math.ethz.ch</a>> writes:<br>

> > Thank you.<br>

> ><br>

> > Actually I had the feeling that it wasn't my problem with Bjacobi and CG.<br>

> ><br>

> > So I'll stick to MG. Problem with MG is that there are a lot of parameters<br>

> > to be tuned, so I leave the defaults (expect I select CG as Krylow<br>

> > method). I post just results for 64^3 and 128^3. Tell me if I'm missing<br>

> > some useful detail. (I get similar results with BoomerAMG).<br>

> ><br>

> > Time for one KSP iteration (-ksp_type cg -log_summary -pc_mg_galerkin<br>

> > -pc_type mg):<br>

> > 32^3 and 1 proc: 1.01e-1<br>

> > 64^3 and 8 proc: 6.56e-01<br>

> > 128^3 and 64 proc: 1.05e+00<br>

> > Number of PCSetup per KSPSolve:<br>

> > 15<br>

> > 39<br>

> > 65<br>

><br>

> Presumably you mean PCApply.  Something is wrong here because this<br>

> iteration count is way too high.  Perhaps your boundary conditions are<br>

> nonsymmetric or interpolation is not compatible with the discretization.<br>

><br>

> > With BoomerAMG:<br>

> > stable 8 iterations per KSP but time per iteration greater than PETSc MG<br>

> > and still increases:<br>

> > 64^3:  3.17e+00<br>

> > 128^3: 9.99e+00<br>

><br>

> > --> For instance with 64^3 (256 iterations):<br>

> In the first pass with geometric multigrid, don't worry about timing and<br>

> get the iterations figured out.  Are you using a cell-centered or<br>

> vertex-centered discretization.  When you say 128^3, is that counting<br>

> the number of elements or the number of vertices?  Note that if you have<br>

> a vertex-centered discretization, you will want a 129^3 grid.<br>

<br>

</div></div>Cell-centered, counting elements.<br>

<span class=""><br>

> With<br>

> PCMG, make sure you are getting the number of levels of refinement that<br>

> you expect.<br>

<br>

<br>

<br>

><br>

> You should see something like the following (this is 193^3).<br>

><br>

> $ mpiexec -n 4 ./ex45 -da_refine 5 -pc_type mg -ksp_monitor -pc_mg_type full<br>

> -mg_levels_ksp_type richardson -mg_levels_pc_type sor -ksp_type richardson<br>

> 0 KSP Residual norm 2.653722249919e+03<br>

>   1 KSP Residual norm 1.019366121923e+02<br>

>   2 KSP Residual norm 2.364558296616e-01<br>

>   3 KSP Residual norm 7.438761746501e-04<br>

> Residual norm 1.47939e-06<br>

<br>

><br>

> You can actually do better than this by using higher order FMG<br>

> interpolation, by going matrix-free, etc.  For example, HPGMG<br>

> (finite-element or finite-volume, see <a href="https://hpgmg.org" target="_blank">https://hpgmg.org</a>) will solve more<br>

> than a million equations/second per core.  Is your application really<br>

> solving the constant-coefficient Poisson problem on a Cartesian grid, or<br>

> is that just a test?<br>

<br>

</span>I actually just need a cell centered Poisson solver on cartesian grids.<br>

(various boundary conditions).<br>

<br>

Matrix free you mean AMG (like -pc_mg_galerkin)? Does it reach the same<br>

scalability as GMG?</blockquote><div><br></div><div>No, Jed means calculating the action of the operator matrix-free instead of using</div><div>an assembled sparse matrix.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>

><br>

> > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012<br>

><br>

> And a reminder to please upgrade to the current version of PETSc.<br>

<br>

</span>Sadly this is not in my power. I had actually had to rollback all the APIs to<br>

be able to do this test runs.</blockquote></div><br>   Matt<br clear="all"><div><br></div>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener

</div></div>