[petsc-users] Guidance on GAMG preconditioning

Justin Chang jychang48 at gmail.com
Sun Jun 7 21:08:28 CDT 2015

Matt (Knepley),

I see what you're saying and it makes perfect sense. The point of my work
isn't necessarily to compare CG/Jacobi with GAMG. Rather, I am trying to
compare both the numerical solution and the computational performance of my
"correction" methodology (through optimization) with just solving the FEM
problem normally. Of course this methodology is going to be more expensive,
but I think it would be nice to have some "benchmark" to compare against. I
have examples where the parallel efficiency of TAO overtakes that of
CG/Jacobi, and I also have arithmetic intensity (AI) results showing that
the AI of TAO is higher than that of CG/Jacobi and that both are invariant
with respect to problem size.

I ran some (smaller) experiments with GAMG and have noticed problems in
which GAMG wall-clock time is less than CG/Jacobi (though not by much).
However the problem is that it seems I cannot compute the arithmetic
intensity for GAMG.
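
For reference, the AI bookkeeping discussed in this thread can be sketched
as below. This is only a rough illustration, assuming AI is defined as flops
divided by bytes moved and that the attainable rate is bounded by
AI * STREAM bandwidth. The AI values 0.3 and 0.2 are the ones quoted further
down in this thread; the bandwidth and measured rates are made-up
placeholders, and the function names are hypothetical.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Flops performed per byte moved to/from memory."""
    return flops / bytes_moved


def roofline_bound(ai, stream_bw, peak=float("inf")):
    """Attainable FLOP/s: limited by memory bandwidth or by peak compute."""
    return min(peak, ai * stream_bw)


def roofline_fraction(measured_rate, ai, stream_bw, peak=float("inf")):
    """Ratio of the measured FLOP/s to the roofline bound."""
    return measured_rate / roofline_bound(ai, stream_bw, peak)


if __name__ == "__main__":
    stream_bw = 20e9  # bytes/s, hypothetical STREAM result
    # (solver, AI from this thread, hypothetical measured FLOP/s)
    cases = [("TAO/BLMVM", 0.3, 2.4e9), ("CG/Jacobi", 0.2, 2.0e9)]
    for name, ai, measured in cases:
        frac = roofline_fraction(measured, ai, stream_bw)
        print(f"{name}: AI={ai}, fraction of roofline={frac:.2f}")
```

With placeholder numbers like these, a solver can have the higher AI and
still reach a smaller fraction of its bandwidth-limited bound.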

The way I see it I have these four options:

1) Stick with what I have and acknowledge that GAMG can be better for
larger problems. Since I have compared TAO with CG/Jacobi, somebody else
can compare GAMG with CG/Jacobi.

2) Do strong scaling studies with GAMG and TAO and forget about the AI
stuff. If I do this, then IMHO the paper will lose much of its flavor.

3) Use a different performance model that can be used to measure GAMG. I
can only imagine that the complexity of applying any other model would
grow considerably for GAMG.

4) Simply report FLOP/s and the associated wall-clock times for each
solver. Yes, this is easily gamed, but I would think that it can at least
tell you something (i.e., if this metric drops for a given problem size, it
can be an indicator that the program is losing some efficiency).
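
Option 4 could be tabulated along these lines. This is a minimal sketch, not
anything used in the actual study: the run data are invented, the 10%
threshold is an arbitrary assumption, and the function names are
hypothetical. In practice the flop counts and times would come from the
solver's own logging output.

```python
def flop_rates(runs):
    """runs: list of (problem_size, flops, seconds) -> [(size, FLOP/s)]."""
    return [(size, flops / seconds) for size, flops, seconds in sorted(runs)]


def efficiency_drops(rates, tolerance=0.10):
    """Sizes whose FLOP/s fell more than `tolerance` below the previous size."""
    drops = []
    for (_, prev), (size, cur) in zip(rates, rates[1:]):
        if cur < (1.0 - tolerance) * prev:
            drops.append(size)
    return drops


if __name__ == "__main__":
    # Hypothetical measurements: (unknowns, total flops, wall-clock seconds)
    runs = [(10_000, 2.0e8, 0.10), (40_000, 8.0e8, 0.40), (160_000, 3.2e9, 2.0)]
    rates = flop_rates(runs)
    for size, rate in rates:
        print(f"{size:>8d} unknowns: {rate:.2e} FLOP/s")
    print("sizes with a drop:", efficiency_drops(rates))  # [160000]
```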



On Saturday, June 6, 2015, Matthew Knepley <knepley at gmail.com> wrote:

> On Sat, Jun 6, 2015 at 4:29 AM, Justin Chang <jychang48 at gmail.com> wrote:
>> Matt and Mark thank you guys for your responses.
>> The reason I brought up GAMG was because it seems to me that this is the
>> preconditioner to use for elliptic problems. However, I am using CG/Jacobi
>> for my larger problems and the solver converges (with -ksp_atol and
>> -ksp_rtol set to 1e-8). Using GAMG I get roughly the same wall-clock time,
>> but significantly fewer solver iterations.
>> As I also kind of mentioned in another mail, the ultimate purpose is to
>> compare how this "correction" methodology using the TAO solver (with
>> bounded constraints) performs compared to the original methodology using
>> the KSP solver (without constraints). I have the AI for BLMVM and CG/Jacobi
>> and they are roughly 0.3 and 0.2 respectively (do these sound about
>> right?). Although the AI is higher for TAO, the ratio of actual FLOP/s
>> over AI * STREAM bandwidth is smaller, though I am not sure what to make
>> of that. This was also partly why I wanted to see what kind of metrics
>> another KSP solver/preconditioner produces.
>> Point being, if I were to draw such comparisons between TAO and KSP,
>> would I get crucified if people find out I am using CG/Jacobi and not GAMG?
> Here is what someone like me reviewing your paper would say first. I can
> believe that a well-conditioned problem would
> converge using CG/Jacobi. However, if the highest order derivative looks
> like the Laplacian, then the condition number of
> the equations will be O(h^-2), and even with CG the iteration count will
> grow like its square root, O(h^-1), so the
> number of iterations should increase as the square root
> of the problem size (in 2D), whereas with GAMG it should be constant. Thus
> at some size GAMG will be more efficient. I would want
> to see where the crossover is for your problem. If you do not get the
> O(h^-1) dependence, I would think that there is a problem
> in the formulation.
>   Thanks,
>      Matt
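
The scaling argument above can be checked on a model problem. The sketch
below is an illustration only, not the problem discussed in this thread: it
assumes a 5-point finite-difference Laplacian on an n x n grid with
Dirichlet boundaries, a random right-hand side, and a hand-written
Jacobi-preconditioned CG in NumPy rather than PETSc's KSP. For this stencil
the diagonal is constant, so Jacobi does not change the iteration count. All
function names are hypothetical.

```python
import numpy as np


def apply_laplacian_2d(u_flat, n):
    """Apply the 5-point Laplacian stencil (Dirichlet BCs) on an n x n grid."""
    u = u_flat.reshape(n, n)
    out = 4.0 * u
    out[1:, :] -= u[:-1, :]
    out[:-1, :] -= u[1:, :]
    out[:, 1:] -= u[:, :-1]
    out[:, :-1] -= u[:, 1:]
    return out.ravel()


def jacobi_cg_iterations(n, rtol=1e-8, max_it=10000, seed=0):
    """Iterations for Jacobi-preconditioned CG to cut the residual by rtol."""
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(n * n)
    diag = 4.0  # constant diagonal of the stencil (the Jacobi preconditioner)
    x = np.zeros_like(b)
    r = b.copy()
    z = r / diag
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, max_it + 1):
        Ap = apply_laplacian_2d(p, n)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= rtol * b_norm:
            return k
        z = r / diag
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    raise RuntimeError("CG did not converge")


if __name__ == "__main__":
    for n in (16, 32, 64):  # each step halves h and quadruples the unknowns
        print(f"n={n:3d}  unknowns={n * n:5d}  iterations={jacobi_cg_iterations(n)}")
```

If the O(h^-1) dependence holds, the iteration count should roughly double
each time n doubles, i.e. grow like the square root of the number of
unknowns.
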
>> Thanks,
>> Justin
>> On Fri, Jun 5, 2015 at 2:02 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>>> The overwhelming cost of AMG is the Galerkin triple-product RAP.
>>> That is overstating it a bit.  It can be if you have a hard 3D operator
>>> and coarsening slowly is best.
>>> Rule of thumb is you spend 50% of the time in the solver and 50% in the setup,
>>> which is often mostly RAP (in 3D, 2D is much faster).  That way you are
>>> within 2x of optimal and it often works out that way anyway.
>>> Mark
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener