[petsc-users] Performance of Fieldsplit PC

Matthew Knepley knepley at gmail.com
Tue Nov 7 13:10:10 CST 2017


On Tue, Nov 7, 2017 at 10:55 AM, Bernardo Rocha <
bernardomartinsrocha at gmail.com> wrote:

> Thanks for the reply.
>
>> 1) This is block-Jacobi, why not use PCBJACOBI? Is it because you want to
>> select rows?
>
> I'm only using it to understand the performance behavior of PCFieldSplit,
> since I'm also having the same issue in a larger and more complex problem.
>
>> 2) We cannot tell anything without knowing how many iterates were used
>>
>>   -ksp_monitor_true_residual -ksp_converged_reason
>>   -pc_fieldsplit_[0,1]_ksp_monitor_true_residual
>>
>> 3) We cannot say anything about performance without seeing the log for
>> both runs
>>   -log_view
>
> I'm sending to you the log files with the recommended command line
> arguments for the three cases.
>

You did not print out the iterates for the field split solves:

  -pc_fieldsplit_[0,1]_ksp_monitor_true_residual
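
The bracket in that option is shorthand for one flag per split. A minimal sketch of the expansion follows; the helper name is hypothetical. Note that the message spells the inner prefix as `-pc_fieldsplit_`, while the PCFIELDSPLIT manual page documents per-split options in the form `-fieldsplit_<n>_`. The sketch uses the latter as an assumption, so check it against your PETSc version (running with `-options_left` reports options that were never used).

```python
# Hypothetical helper: expand a per-split option into one flag per split.
# The "fieldsplit_" prefix is an assumption based on the PCFIELDSPLIT manual
# page; change it if your solver sets its own options prefix.

def per_split_flags(option, splits=(0, 1), prefix="fieldsplit_"):
    """Return one command-line flag per split for the given inner option."""
    return [f"-{prefix}{s}_{option}" for s in splits]


# Outer monitors and logging requested in this thread, plus the inner monitors.
flags = [
    "-ksp_monitor_true_residual",
    "-ksp_converged_reason",
    "-log_view",
] + per_split_flags("ksp_monitor_true_residual")

print(" ".join(flags))
```

Passing a different `prefix` reproduces whatever spelling your setup actually accepts.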


> 1-scalar case
> 2-PCFieldSplit (as we were initially running)
> 3-PCFieldSplit with Preonly/Jacobi in each block, as suggested by Patrick.
>
> As Patrick pointed out, with Preonly/Jacobi the behavior is closer to what
> I expected.
>
> Please note that the log was taken for 100 calls to KSPSolve, I just
> simplified it.
>
> What would be the proper way of creating this block preconditioner?
>
> As you can see, the timing with PCFieldSplit is bigger for case 3.
> For case 2 it is nearly 2x, as I expected (I don't know if this idea makes
> sense).
>
> So for case 2, is the large timing due to the inner/outer solver?
>
> Does the "machinery" behind the PCFieldSplit for a block preconditioner
> result in some performance overhead (neglecting the efficiency of the PC
> itself)?
>

No.

What you sent makes little sense to me. How do you have 29 iterates for the
solve, but 5900 MatMults in the log?

The number of MatMults in 2) is 4x, not 2x. I suspect that is because the
Krylov solver on each block takes the same number of iterates to converge as
your global Krylov solver. That gives you 2x, but you do 2 outer iterates,
which is 4x. Thus you get your 2x time.
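
The counting argument can be written out as a toy model. This is only a sketch of the reasoning above, not a measurement: it assumes one MatMult per Krylov iterate, two blocks, two outer iterates, and inner solves that take as many iterates as the global solve. The function name is hypothetical, and the value 29 is used only as an example since it cancels in the ratio.

```python
# Toy count model of the argument above; every input is an assumption.

def inner_matmults(outer_its, n_blocks, inner_its_per_block):
    """MatMults spent in the block solves, assuming one per inner iterate."""
    return outer_its * n_blocks * inner_its_per_block


k = 29  # example iterate count for the global Krylov solve (cancels below)

scalar = k  # single solve: roughly one MatMult per iterate
split = inner_matmults(outer_its=2, n_blocks=2, inner_its_per_block=k)

print(split / scalar)  # ratio of MatMult counts: 2 blocks x 2 outer iterates
```

Changing `outer_its` or `inner_its_per_block` shows how quickly the count grows when the inner solves are run to a tight tolerance.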

You could try to play games with the inner tolerances (say, run the blocks
only to 10^-4). However, the fact remains that this is not even a credible
solver for the problem.

  Thanks,

     Matt

> Best regards,
> Bernardo
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.caam.rice.edu/~mk51/>