[petsc-users] Guidance on GAMG preconditioning

Matthew Knepley knepley at gmail.com
Mon Jun 8 05:52:16 CDT 2015


On Sun, Jun 7, 2015 at 9:08 PM, Justin Chang <jychang48 at gmail.com> wrote:

> Matt (Knepley),
>
> I see what you're saying and it makes perfect sense. The point of my work
> isn't necessarily to compare CG/Jacobi with GAMG. Rather, I am trying to
> compare both the numerical solution and the computational performance of my
> "correction" methodology (through optimization) with just solving the FEM
> problem normally. Of course this methodology is going to be more expensive
> but I think it would be nice to have some "benchmark" to compare against. I
> have examples that show where the parallel efficiency of TAO overtakes
> CG/Jacobi, and I also have the AI numbers, which show that TAO's is higher
> than CG/Jacobi's and that both are invariant with respect to problem size.
>
> I ran some (smaller) experiments with GAMG and noticed problems for
> which GAMG's wall-clock time is less than CG/Jacobi's (though not by
> much). However, the trouble is that I cannot seem to compute the
> arithmetic intensity for GAMG.
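>
> As a rough sketch of what I mean (this is my own back-of-envelope
> model, not anything PETSc reports directly):
>
>   AI ~= total flops / total bytes moved
>
> where the flop count comes from -log_summary and the bytes are
> estimated by hand, e.g. roughly 12 bytes per stored nonzero for an AIJ
> MatMult (8-byte value + 4-byte column index, assuming 32-bit indices)
> plus the vector traffic. For GAMG the operator is a whole hierarchy
> (smoothers, restriction/prolongation, coarse-grid operators), so I
> have no comparably clean bytes-moved model for it.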
>
> The way I see it, I have these four options:
>
> 1) Stick with what I have and acknowledge that GAMG can be better for
> larger problems. Since I have compared TAO with CG/Jacobi, somebody else
> can compare GAMG with CG/Jacobi.
>

Do this. Include the performance data you do have with GAMG. It is
perfectly acceptable to say "I care about these problem sizes for my
problems, and I have done careful performance analysis", and then
acknowledge that at some point GAMG will likely win.

  Thanks,

     Matt


> 2) Do strong scaling studies with GAMG and TAO and forget about the AI
> stuff. If I do this, then IMHO the paper will lose much of its flavor.
>
> 3) Use a different performance model, one that can measure GAMG. I can
> only imagine that the complexity of applying any other model would
> balloon for GAMG.
>
> 4) Simply report FLOPS/s and the associated wall-clock times with respect
> to each solver. Yes, this is easily gamed, but I would think that it can at
> least tell you something (i.e., if this metric drops for a given problem
> size, it can be an indicator that the program is losing some efficiency).
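>
> For (4), a minimal sketch of where I would pull those numbers from,
> using PETSc's built-in logging ("./myapp" is just a stand-in for the
> actual executable):
>
>   ./myapp -ksp_type cg -pc_type jacobi -log_summary
>   ./myapp -ksp_type cg -pc_type gamg -log_summary
>
> since -log_summary reports wall-clock time, flop counts, and FLOPS/s
> both per event and in total for each run.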
>
> Thoughts?
>
> Justin
>
>
> On Saturday, June 6, 2015, Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Sat, Jun 6, 2015 at 4:29 AM, Justin Chang <jychang48 at gmail.com> wrote:
>>
>>> Matt and Mark thank you guys for your responses.
>>>
>>> The reason I brought up GAMG is that it seems to me to be the
>>> preconditioner to use for elliptic problems. However, I am using CG/Jacobi
>>> for my larger problems and the solver converges (with -ksp_atol and
>>> -ksp_rtol set to 1e-8). Using GAMG I get roughly the same wall-clock time,
>>> but significantly fewer solver iterations.
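>>>
>>> (For reference, the two configurations I am comparing are roughly
>>>
>>>   -ksp_type cg -pc_type jacobi -ksp_rtol 1e-8 -ksp_atol 1e-8
>>>   -ksp_type cg -pc_type gamg -ksp_rtol 1e-8 -ksp_atol 1e-8
>>>
>>> modulo whatever GAMG options I end up tuning.)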
>>>
>>> As I also kind of mentioned in another mail, the ultimate purpose is to
>>> compare how this "correction" methodology using the TAO solver (with
>>> bounded constraints) performs compared to the original methodology using
>>> the KSP solver (without constraints). I have the AI for BLMVM and CG/Jacobi,
>>> and they are roughly 0.3 and 0.2 respectively (do these sound about
>>> right?). Although the AI is higher for TAO, the ratio of actual FLOPS/s
>>> to AI*STREAMS BW is smaller, though I am not sure what conclusions to
>>> draw from that. This was also partly why I wanted to see what kind of
>>> metrics another KSP solver/preconditioner produces.
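>>>
>>> (To spell out that ratio: I am taking the roofline bound
>>>
>>>   attainable FLOPS/s ~= min(peak FLOPS/s, AI * STREAM BW)
>>>
>>> and dividing the measured FLOPS/s by AI * STREAM BW. For example,
>>> with AI ~= 0.2 and a hypothetical STREAM bandwidth of 50 GB/s, the
>>> bound would be ~10 GFLOPS/s; those numbers are purely illustrative.)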
>>>
>>> Point being, if I were to draw such comparisons between TAO and KSP,
>>> would I get crucified if people find out I am using CG/Jacobi and not GAMG?
>>>
>>
>> Here is what someone like me reviewing your paper would say first. I can
>> believe that a well-conditioned problem would
>> converge using CG/Jacobi. However, if the highest order derivative looks
>> like the Laplacian, then the condition number of the equations will be
>> O(h^{-2}), and even with CG the effective conditioning is O(h^{-1}), so
>> the number of iterations should increase as the square root of the
>> problem size (in 2D), whereas for GAMG it should be constant. Thus at
>> some size GAMG will be more efficient. I would want to see where the
>> crossover is for your problem. If you do not get the O(h^{-1})
>> dependence, I would think that there is a problem in the formulation.
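>>
>> Concretely (standard estimates, nothing specific to your
>> discretization): in 2D with N unknowns, h ~ N^{-1/2}, so
>> kappa ~ h^{-2} ~ N and the CG iteration count grows like
>> sqrt(kappa) ~ N^{1/2}, while GAMG's iteration count should stay
>> roughly flat. The crossover is where GAMG's fixed iteration count
>> amortizes its extra setup cost.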
>>
>>   Thanks,
>>
>>      Matt
>>
>>
>>> Thanks,
>>> Justin
>>>
>>> On Fri, Jun 5, 2015 at 2:02 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>>
>>>>
>>>>>>
>>>>> The overwhelming cost of AMG is the Galerkin triple-product RAP.
>>>>>
>>>>>
>>>> That is overstating it a bit. It can be, if you have a hard 3D operator
>>>> for which coarsening slowly is best.
>>>>
>>>> Rule of thumb is that you spend 50% of the time in the solver and 50% in
>>>> the setup, which is often mostly RAP (in 3D; 2D is much faster). That way
>>>> you are within 2x of optimal, and it often works out that way anyway.
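>>>>
>>>> (If you want to check that split for a given run, -log_summary
>>>> breaks it out: compare the PCSetUp and KSPSolve event times, and
>>>> the MatPtAP events show the RAP cost specifically.)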
>>>>
>>>> Mark
>>>>
>>>
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener