[petsc-users] Profile a matrix-free solver.

Barry Smith bsmith at mcs.anl.gov
Fri Jan 15 13:42:34 CST 2016


> On Jan 15, 2016, at 10:52 AM, Song Gao <song.gao2 at mail.mcgill.ca> wrote:
> 
> Hello, Barry,
> 
> Thanks for your prompt reply.  I ran the matrix-based solver with the matrix-based SGS preconditioner. I see your point. The profiling table is below and attached. 
> 
> So MatMult takes 4% of the time and PCApply takes 43% of the time. 
> 
>  MatMult              636 1.0 9.0361e+00 1.0 9.21e+09 1.0 7.6e+03 1.1e+04 0.0e+00  4 85 52 17  0   4 85 52 17  0  3980
>  PCApply              636 1.0 8.7006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.9e+03 43  0  0  0 24  43  0  0  0 24     0
> 
> 
> The way I see it, the matrix-free solver spends most of its time (70%) on MatMult, or equivalently on rhs evaluation. Each KSP iteration performs one rhs evaluation, which is much more costly than a matrix-vector product in a matrix-based solver. 

  Sure, but if the matrix-free SGS mimics all the work of the right-hand-side function evaluation (which it has to if it truly is an SGS sweep and not some approximation where you drop certain terms in the right-hand-side function when you compute the SGS), then the matrix-free SGS should be at least as expensive as the right-hand-side evaluation.
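
  As background for the point above, here is a minimal sketch of why each matrix-free multiply costs one right-hand-side evaluation. It is plain Python, not PETSc: toy_residual is a hypothetical stand-in for compute_rhs, and the step size h = eps / ||v|| is a simplification chosen for this sketch (PETSc's MFFD picks its differencing parameter more carefully).

```python
import math


class CountingResidual:
    """Wrap a residual function F(u) and count how many times it is called."""

    def __init__(self, func):
        self.func = func
        self.calls = 0

    def __call__(self, u):
        self.calls += 1
        return self.func(u)


def toy_residual(u):
    # Hypothetical stand-in for compute_rhs: F_i = u_i^2 + u_{i+1} - 1 (periodic).
    n = len(u)
    return [u[i] ** 2 + u[(i + 1) % n] - 1.0 for i in range(n)]


def mffd_matvec(F, u, Fu, v, eps=1e-7):
    """Approximate J(u) v by (F(u + h v) - F(u)) / h.

    Fu is the already computed F(u), so each product costs one new call to F.
    The step h = eps / ||v|| is an assumption made for this sketch.
    """
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(v)
    h = eps / norm_v
    perturbed = [ui + h * vi for ui, vi in zip(u, v)]
    F_perturbed = F(perturbed)
    return [(a - b) / h for a, b in zip(F_perturbed, Fu)]


F = CountingResidual(toy_residual)
u = [1.0, 2.0, 3.0]
Fu = F(u)  # one evaluation for the base state
unit_vectors = ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])
products = [mffd_matvec(F, u, Fu, v) for v in unit_vectors]
print("residual evaluations:", F.calls)
```

  The counter grows by one for every product, which is why the time attributed to the matrix-free MatMult tracks the time of the right-hand-side routine so closely.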

   Barry


My guess is that your SGS drops some terms, so it is only an approximation, but it is still good enough as a preconditioner.
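
One way to compare the two runs is cost per call rather than percent of total time. The sketch below is a rough illustration in plain Python; the call counts and times are copied from the log lines quoted in this thread, while the dictionary layout and the helper name seconds_per_call are made up for the example.

```python
# (call count, max time in seconds) copied from the log summaries quoted in this thread.
matrix_free = {
    "compute_rhs": (1823, 4.2119e+02),
    "LU-SGS": (1647, 1.3590e+02),
}
matrix_based = {
    "MatMult": (636, 9.0361e+00),
    "PCApply": (636, 8.7006e+01),
}


def seconds_per_call(events):
    """Return the average time per call for each logged event."""
    return {name: time / count for name, (count, time) in events.items()}


free = seconds_per_call(matrix_free)
based = seconds_per_call(matrix_based)

for label, table in (("matrix-free", free), ("matrix-based", based)):
    for name, t in table.items():
        print(f"{label:13s} {name:12s} {t:.4f} s/call")

# Ratio of "multiply" cost to preconditioner cost in each run.
print("matrix-free  rhs / SGS      :", free["compute_rhs"] / free["LU-SGS"])
print("matrix-based MatMult/PCApply:", based["MatMult"] / based["PCApply"])
```

In the matrix-free run the multiply costs more per call than the preconditioner, while in the matrix-based run it is the other way around, which is the asymmetry discussed above.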

> Perhaps this is expected in a matrix-free solver.
> 
> I will start by looking at the rhs evaluation since it takes the most time. 
> 
> Thanks.
> Song Gao
> 
>  
> 
> 2016-01-14 16:24 GMT-05:00 Barry Smith <bsmith at mcs.anl.gov>:
> 
>    So
> 
>     KSPSolve is 96 % and MatMult is 70 % + PCApply 24 % = 94 % so this makes sense; the solver time is essentially the
> multiply time plus the PCApply time.
> 
> compute_rhs         1823 1.0 4.2119e+02 1.0 0.00e+00 0.0 4.4e+04 5.4e+04 1.1e+04 71  0100100 39  71  0100100 39     0
> LU-SGS              1647 1.0 1.3590e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23  0  0  0  0  23  0  0  0  0     0
> SURFINT             1823 1.0 1.0647e+02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17  0  0  0  0  17  0  0  0  0     0
> VOLINT              1823 1.0 2.2373e+02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 35  0  0  0  0  35  0  0  0  0     0
> 
>    Depending on the "quality" of the preconditioner (if it is really good) one expects the preconditioner time to be larger than the MatMult(). Only for simple preconditioners (like Jacobi) does one see it being much less than the MatMult().  For matrix based solvers the amount of work in  SGS is as large as the amount of work in the MatMult() if not more, so I would expect the time of the preconditioner to be higher than the time of the multiply.
> 
>   So, based on knowing almost nothing, I think the MatMult_ is taking more time than it should, unless you are ignoring (skipping) a lot of the terms in your matrix-free SGS; then it is probably reasonable.
> 
>   Barry
> 
> 
> 
> > On Jan 14, 2016, at 3:01 PM, Song Gao <song.gao2 at mail.mcgill.ca> wrote:
> >
> > Hello,
> >
> > I am profiling a finite element Navier-Stokes solver. It uses the Jacobian-free Newton-Krylov method and a custom preconditioner, LU-SGS (a matrix-free version of Symmetric Gauss-Seidel). The log summary is attached. Four events are registered.  compute_rhs computes the rhs (used by MatMult_MFFD). SURFINT and VOLINT are parts of compute_rhs. LU-SGS is the custom preconditioner. I didn't call PetscLogFlops, so the flop counts are zero.
> >
> >  I'm wondering whether the percentage of time for each event in the table is reasonable. I see that 69% of the time is spent in MatMult_MFFD. Is this expected for a matrix-free method? What might be a good starting point for profiling this solver? Thank you in advance.
> >
> >
> > Song Gao
> > <log_summary>
> 
> 
> <log_summary_matrix_based_version>


