[petsc-users] strong-scaling vs weak-scaling

Mark Adams mfadams at lbl.gov
Wed Aug 31 10:46:49 CDT 2016


And you can't get much more detail with hypre because it does not record
performance data in PETSc's logging. Or can you get hypre to print its own
performance data?

ML uses more PETSc infrastructure, so you can get the PtAP time, which
accounts for most of the matrix setup.

GAMG is native and has more timers. In addition to PtAP, there is P0
smoothing, which is listed along with some other parts of the GAMG (mesh)
setup.
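As a rough illustration of this split (a minimal sketch, not PETSc API: it just
plugs in the timings from the -log_view output quoted below and applies Matt's
setup/solve rule):

```python
# Timings in seconds, copied from the -log_view output quoted below.
# This dict is only an illustrative container, not anything PETSc provides.
t = {
    "PCSetUp":   2.2858e+00,
    "PCApply":   1.4102e+01,
    "KSPSetUp":  9.9111e-04,
    "KSPSolve":  1.7529e+01,
    "SNESSolve": 2.1056e+01,
}

# Setup phase: KSPSetUp + PCSetUp (PCSetUp was apparently called
# explicitly here, so its time is not folded into KSPSetUp).
setup = t["KSPSetUp"] + t["PCSetUp"]

# Solve phase: all of SNESSolve. The part outside the linear solve
# (function/Jacobian evaluations, line search, convergence checks)
# is SNESSolve - KSPSolve.
linear = t["KSPSolve"]
nonlinear_overhead = t["SNESSolve"] - t["KSPSolve"]

print(f"setup:              {setup:6.3f} s")
print(f"linear solve:       {linear:6.3f} s")
print(f"nonlinear overhead: {nonlinear_overhead:6.3f} s")
# Sanity check: PCApply should account for most of KSPSolve.
print(f"PCApply / KSPSolve: {t['PCApply'] / t['KSPSolve']:.0%}")
```

For these numbers the setup phase is about 2.3 s against a roughly 17.5 s
linear solve, and PCApply is about 80% of KSPSolve, consistent with point 4
in Matt's reply below.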



On Wed, Aug 31, 2016 at 6:28 AM, Matthew Knepley <knepley at gmail.com> wrote:

> On Wed, Aug 31, 2016 at 5:23 AM, Justin Chang <jychang48 at gmail.com> wrote:
>
>> Matt,
>>
>> So is the "solve phase" going to be KSPSolve() - PCSetUp()?
>>
>
> Setup Phase: KSPSetUp + PCSetUp
>
> Solve Phase:  SNESSolve
>   This contains SNESFunctionEval, SNESJacobianEval, KSPSolve
>
>    Matt
>
> In other words, if I want to look at time/iterations, should it just be
>> over KSPSolve or should I exclude the PC setup?
>>
>> Justin
>>
>>
>>
>> On Wed, Aug 31, 2016 at 5:13 AM, Matthew Knepley <knepley at gmail.com>
>> wrote:
>>
>>> On Wed, Aug 31, 2016 at 2:01 AM, Justin Chang <jychang48 at gmail.com>
>>> wrote:
>>>
>>>> Attached is the -log_view output (from firedrake). Event Stage 1:
>>>> Linear_solver is where I assemble and solve the linear system of equations.
>>>>
>>>> I am using the HYPRE BoomerAMG preconditioner so log_view cannot "see
>>>> into" the exact steps, but based on what it can see, how do I distinguish
>>>> between these various setup and timing phases?
>>>>
>>>> For example, when I look at these lines:
>>>>
>>>> PCSetUp                1 1.0 2.2858e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  9  0  0  0  0  11  0  0  0  0     0
>>>> PCApply               38 1.0 1.4102e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 56  0  0  0  0  66  0  0  0  0     0
>>>> KSPSetUp               1 1.0 9.9111e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>> KSPSolve               1 1.0 1.7529e+01 1.0 2.44e+09 1.0 0.0e+00 0.0e+00 0.0e+00 70  7  0  0  0  82  7  0  0  0   139
>>>> SNESSolve              1 1.0 2.1056e+01 1.0 3.75e+10 1.0 0.0e+00 0.0e+00 0.0e+00 84100  0  0  0  99100  0  0  0  1781
>>>> SNESFunctionEval       1 1.0 1.0763e+00 1.0 1.07e+10 1.0 0.0e+00 0.0e+00 0.0e+00  4 29  0  0  0   5 29  0  0  0  9954
>>>> SNESJacobianEval       1 1.0 2.4495e+00 1.0 2.43e+10 1.0 0.0e+00 0.0e+00 0.0e+00 10 65  0  0  0  12 65  0  0  0  9937
>>>>
>>>> So how do I break down "mesh setup", "matrix setup", and "solve time"
>>>> phases? I am guessing "PCSetUp" has to do with one of the first two phases,
>>>> but how would I categorize the rest of the events? I see that HYPRE doesn't
>>>> have as much information as the other PCs like GAMG and ML, but can one
>>>> still break down the timing phases through log_view alone?
>>>>
>>>
>>> 1) It looks like you call PCSetUp() yourself, since otherwise KSPSetUp()
>>> would contain that time. Notice that you can ignore KSPSetUp() here.
>>>
>>> 2) The setup time is usually KSPSetUp(), but here you should add
>>> PCSetUp() to it since you called it yourself.
>>>
>>> 3) The solve time for SNES can be split into
>>>
>>>   a) KSPSolve() for the update calculation
>>>
>>>   b) SNESFunctionEval, SNESJacobianEval for everything else (conv check,
>>> line search, J calc, etc.) or you can just take SNESSolve() - KSPSolve()
>>>
>>>  4) Note that PCApply() is most of KSPSolve(), which is generally good
>>>
>>>   Thanks,
>>>
>>>      Matt
>>>
>>>
>>>> Thanks,
>>>> Justin
>>>>
>>>> On Tue, Aug 30, 2016 at 11:14 PM, Jed Brown <jed at jedbrown.org> wrote:
>>>>
>>>>> Mark Adams <mfadams at lbl.gov> writes:
>>>>>
>>>>> >>
>>>>> >>
>>>>> >> Anyway, what I really wanted to say is, it's good to know that these
>>>>> >> "dynamic range/performance spectrum/static scaling" plots are
>>>>> >> designed to go past the sweet spots. I also agree that it would be
>>>>> >> interesting to see a time vs dofs*iterations/time plot. Would it then
>>>>> >> also be useful to look at the step of setting up the preconditioner?
>>>>> >>
>>>>> >>
>>>>> > Yes, I generally split up timing between "mesh setup" (symbolic
>>>>> > factorization of LU), "matrix setup" (e.g., numeric factorization),
>>>>> > and solve time. The degree of amortization that you get for the two
>>>>> > setup phases depends on your problem, so it is useful to separate them.
>>>>>
>>>>> Right, there is nothing wrong with splitting up the phases, but if you
>>>>> never show a spectrum for the total, then I will be suspicious.  And if
>>>>> you only show "per iteration" instead of for a complete solve, then I
>>>>> will assume that you're only doing that because convergence is unusably
>>>>> slow.
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>
>>
>
>

