<div dir="ltr"><div><div>Hello, Barry,<br><br></div>Thanks for your prompt reply. I ran the matrix-based solver with the matrix-based SGS preconditioner, and I see your point. The profiling table is below and attached. <br><br></div>So MatMult takes 4% of the time and PCApply takes 43%. <br><div><br> MatMult 636 1.0 9.0361e+00 1.0 9.21e+09 1.0 7.6e+03 1.1e+04 0.0e+00 <span style="color:rgb(255,0,0)">4</span> 85 52 17 0 4 85 52 17 0 3980<br> PCApply 636 1.0 8.7006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.9e+03 <span style="color:rgb(255,0,0)">43</span> 0 0 0 24 43 0 0 0 24 0<br><br><br></div><div>The way I see it, the matrix-free solver spends most of its time (70%) in MatMult, or equivalently in the rhs evaluation. Every KSP iteration performs one rhs evaluation, which is much more costly than a matrix-vector product in a matrix-based solver. Perhaps this is expected in a matrix-free solver.<br><br></div><div>I will start by looking at the rhs evaluation, since it takes the most time. <br><br></div><div>Thanks.<br></div><div>Song Gao<br></div><div><br> <br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2016-01-14 16:24 GMT-05:00 Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
So<br>
<br>
KSPSolve is 96% of the time, and MatMult at 70% + PCApply at 24% = 94%, so this makes sense; the solver time is essentially the<br>
multiply time plus the PCApply time.<br>
<br>
compute_rhs 1823 1.0 4.2119e+02 1.0 0.00e+00 0.0 4.4e+04 5.4e+04 1.1e+04 71 0100100 39 71 0100100 39 0<br>
LU-SGS 1647 1.0 1.3590e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23 0 0 0 0 23 0 0 0 0 0<br>
SURFINT 1823 1.0 1.0647e+02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17 0 0 0 0 17 0 0 0 0 0<br>
VOLINT 1823 1.0 2.2373e+02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 35 0 0 0 0 35 0 0 0 0 0<br>
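<br>
As a rough cross-check of the rows above, the following Python sketch divides the logged times to see how much of compute_rhs the two registered integrals account for. The times are copied from the table; they are maxima over ranks (max/min ratio up to 1.1), so the shares are approximate, and the name share_of is made up for this illustration.<br>

```python
# Max time in seconds per event, copied from the log rows above.
times = {
    "compute_rhs": 4.2119e+02,
    "LU-SGS": 1.3590e+02,
    "SURFINT": 1.0647e+02,
    "VOLINT": 2.2373e+02,
}


def share_of(event, parent="compute_rhs"):
    """Return the time of `event` as a fraction of the time of `parent`."""
    return times[event] / times[parent]


# SURFINT and VOLINT are parts of compute_rhs, so their shares can be added.
integrals = share_of("SURFINT") + share_of("VOLINT")
unaccounted = 1.0 - integrals

for name in ("SURFINT", "VOLINT", "LU-SGS"):
    print(f"{name:8s} {100 * share_of(name):5.1f}% of compute_rhs time")
print(f"integrals together: {100 * integrals:5.1f}%")
print(f"rest of compute_rhs: {100 * unaccounted:5.1f}%")
```

Whatever share is left over is time spent inside compute_rhs but outside the two integral events, which may be worth wrapping in its own event.<br>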
<br>
Depending on the "quality" of the preconditioner: if it is really good, one expects the preconditioner time to be larger than the MatMult() time. Only for simple preconditioners (like Jacobi) does one see it being much less than the MatMult(). For matrix-based solvers the amount of work in SGS is as large as the amount of work in the MatMult(), if not more, so I would expect the time of the preconditioner to be higher than the time of the multiply.<br>
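<br>
To illustrate the work comparison, here is a minimal Python sketch that counts multiplications and divisions for one matrix-vector product and for one symmetric Gauss-Seidel application z = M^{-1} r with M = (D + L) D^{-1} (D + U). It uses a small dense matrix for simplicity, which is an assumption; a real solver works on sparse storage, where the same argument applies per stored nonzero. The function names are hypothetical and this is not PETSc code.<br>

```python
def matvec(A, x):
    """Dense y = A x. Returns (y, number of multiplications)."""
    n = len(A)
    y = [0.0] * n
    ops = 0
    for i in range(n):
        for j in range(n):
            y[i] += A[i][j] * x[j]
            ops += 1
    return y, ops


def sgs_apply(A, r):
    """Apply z = M^{-1} r, M = (D + L) D^{-1} (D + U).

    Returns (z, number of multiplications and divisions).
    """
    n = len(A)
    ops = 0
    # Forward sweep: solve (D + L) y = r.
    y = [0.0] * n
    for i in range(n):
        s = r[i]
        for j in range(i):
            s -= A[i][j] * y[j]
            ops += 1
        y[i] = s / A[i][i]
        ops += 1
    # Backward sweep: solve (D + U) z = D y.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = A[i][i] * y[i]
        ops += 1
        for j in range(i + 1, n):
            s -= A[i][j] * z[j]
            ops += 1
        z[i] = s / A[i][i]
        ops += 1
    return z, ops


A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
r = [1.0, 2.0, 3.0]
_, mv_ops = matvec(A, r)
_, sgs_ops = sgs_apply(A, r)
print("matvec ops:", mv_ops, "SGS ops:", sgs_ops)
```

The two triangular sweeps together touch every off-diagonal entry once, plus extra work on the diagonal, so the count for SGS is never below that of the product. The sweeps are also sequential in the row index, which tends to make them harder to run fast than a multiply.<br>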
<br>
So based on knowing almost nothing, I think the MatMult_ is taking more time than it should, unless you are ignoring (skipping) a lot of the terms in your matrix-free SGS; then it is probably reasonable.<br>
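<br>
For context on why a matrix-free multiply costs a full residual evaluation, here is a minimal Python sketch of a finite-difference Jacobian-vector product, J(u) v ≈ (F(u + h v) - F(u)) / h. The fixed step h, the toy residual, and all names are assumptions for illustration; PETSc's MFFD chooses the differencing parameter by its own rules, and a real compute_rhs is far more expensive than this toy function.<br>

```python
class CountingResidual:
    """Wrap a residual function and count how often it is evaluated."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, u):
        self.calls += 1
        return self.fn(u)


def toy_residual(u):
    """Hypothetical nonlinear residual F(u) with two unknowns."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 2 - 5.0]


def mffd_matvec(residual, u, v, f_u, h=1e-7):
    """Approximate J(u) v by a forward difference.

    f_u is F(u), computed once per Newton step and reused, so each
    product needs only one new residual evaluation.
    """
    u_pert = [ui + h * vi for ui, vi in zip(u, v)]
    f_pert = residual(u_pert)
    return [(a - b) / h for a, b in zip(f_pert, f_u)]


F = CountingResidual(toy_residual)
u = [1.0, 2.0]
f_u = F(u)                      # base residual, one evaluation
for _ in range(5):              # stand-in for five Krylov iterations
    jv = mffd_matvec(F, u, [1.0, 1.0], f_u)
print("residual evaluations:", F.calls)
print("J v approx:", jv)
```

Every product calls the residual once, so the multiply time follows the cost of the residual routine rather than the number of nonzeros in a stored matrix.<br>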
<br>
Barry<br>
<div><div class="h5"><br>
<br>
<br>
> On Jan 14, 2016, at 3:01 PM, Song Gao <<a href="mailto:song.gao2@mail.mcgill.ca">song.gao2@mail.mcgill.ca</a>> wrote:<br>
><br>
> Hello,<br>
><br>
> I am profiling a finite element Navier-Stokes solver. It uses the Jacobian-free Newton-Krylov method and a custom preconditioner, LU-SGS (a matrix-free version of Symmetric Gauss-Seidel). The log summary is attached. Four events are registered. compute_rhs computes the rhs (it is used by MatMult_MFFD). SURFINT and VOLINT are parts of compute_rhs. LU-SGS is the custom preconditioner. I didn't call PetscLogFlops, so the flop counts for these events are zero.<br>
><br>
> I'm wondering, are the time percentages of the events in the table reasonable? I see that 69% of the time is spent in MatMult_MFFD. Is that expected for a matrix-free method? What might be a good starting point for profiling this solver? Thank you in advance.<br>
><br>
><br>
> Song Gao<br>
</div></div>> <log_summary><br>
<br>
</blockquote></div><br></div>