Hi all,<div><br></div><div>I am testing the performance of snes_mf_operator against snes_fd.</div><div><br></div><div>I know snes_fd is meant for testing/debugging and is extremely slow, which is fine for my testing purposes. I compared the code performance using snes_mf_operator against snes_fd. As expected, snes_mf_operator uses far less computing time than snes_fd; however, the nonlinear solver performance with snes_mf_operator is worse than with snes_fd, in terms of the number of nonlinear iterations in each time step.</div>
<div><br></div><div>Here are the PETSc Option Table entries taken from the log_summary output when using <font color="#cc0000" style="background-color:rgb(255,255,153)">snes_mf_operator</font>:</div><div><div>#PETSc Option Table entries:</div>
<div>-ksp_converged_reason</div><div>-ksp_gmres_restart 300</div><div>-ksp_monitor_true_residual</div><div>-log_summary</div><div>-m pipe_7eqn_2phase_step7_ps.i</div><div>-mat_fd_type ds</div><div>-pc_type lu</div><div><font color="#660000" style="background-color:rgb(153,255,255)">-snes_mf_operator</font></div>
<div>-snes_monitor</div><div>#End of PETSc Option Table entries</div></div><div><br></div><div>Here are the PETSc Option Table entries taken from the log_summary output when using <font color="#990000" style="background-color:rgb(255,255,153)">snes_fd</font>:</div>
<div><div>#PETSc Option Table entries:</div><div>-ksp_converged_reason</div><div>-ksp_gmres_restart 300</div><div>-ksp_monitor_true_residual</div><div>-log_summary</div><div>-m pipe_7eqn_2phase_step7_ps.i</div><div>-mat_fd_type ds</div>
<div>-pc_type lu</div><div><span style="background-color:rgb(102,255,255)">-snes_fd</span></div><div>-snes_monitor</div><div>#End of PETSc Option Table entries</div></div><div><br></div><div><font color="#990000" style="background-color:rgb(255,255,51)">The full code output, along with the log_summary, is attached.</font></div>
<div><br></div><div>I've noticed that when using <font color="#990000" style="background-color:rgb(255,255,153)">snes_fd</font>, the nonlinear convergence is consistently good in each time step: around 3-4 nonlinear steps with an almost quadratic convergence rate. Each nonlinear step needs only 1 linear iteration to converge, which is expected since I used '-pc_type lu'. Here is a piece of the code output (very nice nonlinear and linear performance, but of course very expensive):</div>
<div><br></div><div><div>DT: 1.234568e-05</div><div> Solving time step 7, time=4.34568e-05...</div><div> Initial |residual|_2 = 3.547156e+00</div><div> NL step 0, |residual|_2 = 3.547156e+00</div><div> 0 SNES Function norm 3.547155872103e+00 </div>
<div> 0 KSP unpreconditioned resid norm 3.547155872103e+00 true resid norm 3.547155872103e+00 ||r(i)||/||b|| 1.000000000000e+00</div><div> 1 KSP unpreconditioned resid norm 3.128472759493e-15 true resid norm 2.343197746412e-15 ||r(i)||/||b|| 6.605849392864e-16</div>
<div> Linear solve converged due to CONVERGED_RTOL iterations 1</div><div> NL step 1, |residual|_2 = 4.900005e-04</div><div> 1 SNES Function norm 4.900004596844e-04 </div><div> 0 KSP unpreconditioned resid norm 4.900004596844e-04 true resid norm 4.900004596844e-04 ||r(i)||/||b|| 1.000000000000e+00</div>
<div> 1 KSP unpreconditioned resid norm 5.026229113909e-18 true resid norm 1.400595243895e-17 ||r(i)||/||b|| 2.858354959089e-14</div><div> Linear solve converged due to CONVERGED_RTOL iterations 1</div><div> NL step 2, |residual|_2 = 1.171419e-06</div>
<div> 2 SNES Function norm 1.171419468770e-06 </div><div> 0 KSP unpreconditioned resid norm 1.171419468770e-06 true resid norm 1.171419468770e-06 ||r(i)||/||b|| 1.000000000000e+00</div><div> 1 KSP unpreconditioned resid norm 5.679448617332e-21 true resid norm 4.763172202015e-21 ||r(i)||/||b|| 4.066154207782e-15</div>
<div> Linear solve converged due to CONVERGED_RTOL iterations 1</div><div> NL step 3, |residual|_2 = 1.860041e-08</div><div> 3 SNES Function norm 1.860041398803e-08 </div><div>Converged:1</div></div><div><br></div><div>
Back to the <font color="#990000" style="background-color:rgb(255,255,51)">snes_mf_operator</font> option: it behaves differently. It generally takes more nonlinear and linear steps. The 'KSP unpreconditioned resid norm' drops nicely; however, the 'true resid norm' looks a bit weird to me: it drops and then increases.</div>
<div><br></div><div><div>DT: 1.524158e-05</div><div> Solving time step 9, time=7.24158e-05...</div><div> Initial |residual|_2 = 3.601003e+00</div><div> NL step 0, |residual|_2 = 3.601003e+00</div><div> 0 SNES Function norm 3.601003423006e+00 </div>
<div> 0 KSP unpreconditioned resid norm 3.601003423006e+00 true resid norm 3.601003423006e+00 ||r(i)||/||b|| 1.000000000000e+00</div><div> 1 KSP unpreconditioned resid norm 5.931429724028e-02 true resid norm 5.931429724028e-02 ||r(i)||/||b|| 1.647160257092e-02</div>
<div> 2 KSP unpreconditioned resid norm 1.379343811770e-05 true resid norm 5.203950797327e+00 ||r(i)||/||b|| 1.445139086534e+00</div><div> 3 KSP unpreconditioned resid norm 4.432805478482e-08 true resid norm 5.203984109211e+00 ||r(i)||/||b|| 1.445148337256e+00</div>
<div> Linear solve converged due to CONVERGED_RTOL iterations 3</div><div> NL step 1, |residual|_2 = 5.928815e-02</div><div> 1 SNES Function norm 5.928815267199e-02 </div><div> 0 KSP unpreconditioned resid norm 5.928815267199e-02 true resid norm 5.928815267199e-02 ||r(i)||/||b|| 1.000000000000e+00</div>
<div> 1 KSP unpreconditioned resid norm 3.276993782949e-06 true resid norm 3.276993782949e-06 ||r(i)||/||b|| 5.527232061148e-05</div><div> 2 KSP unpreconditioned resid norm 2.082083269186e-08 true resid norm 1.551766076370e-05 ||r(i)||/||b|| 2.617329106129e-04</div>
<div> Linear solve converged due to CONVERGED_RTOL iterations 2</div><div> NL step 2, |residual|_2 = 3.340603e-05</div><div> 2 SNES Function norm 3.340603450829e-05 </div><div> 0 KSP unpreconditioned resid norm 3.340603450829e-05 true resid norm 3.340603450829e-05 ||r(i)||/||b|| 1.000000000000e+00</div>
<div> 1 KSP unpreconditioned resid norm 6.659426858789e-07 true resid norm 6.659426858789e-07 ||r(i)||/||b|| 1.993480207037e-02</div><div> 2 KSP unpreconditioned resid norm 6.115119674466e-07 true resid norm 2.887921320245e-06 ||r(i)||/||b|| 8.644909109246e-02</div>
<div> 3 KSP unpreconditioned resid norm 1.907116539439e-09 true resid norm 1.000874623281e-06 ||r(i)||/||b|| 2.996089293486e-02</div><div> 4 KSP unpreconditioned resid norm 3.383211446515e-12 true resid norm 1.005586686459e-06 ||r(i)||/||b|| 3.010194718591e-02</div>
<div> Linear solve converged due to CONVERGED_RTOL iterations 4</div><div> NL step 3, |residual|_2 = 2.126180e-05</div><div> 3 SNES Function norm 2.126179867301e-05 </div><div> 0 KSP unpreconditioned resid norm 2.126179867301e-05 true resid norm 2.126179867301e-05 ||r(i)||/||b|| 1.000000000000e+00</div>
<div> 1 KSP unpreconditioned resid norm 2.724944027954e-06 true resid norm 2.724944027954e-06 ||r(i)||/||b|| 1.281615008147e-01</div><div> 2 KSP unpreconditioned resid norm 7.933800605616e-10 true resid norm 2.776823963042e-06 ||r(i)||/||b|| 1.306015547295e-01</div>
<div> 3 KSP unpreconditioned resid norm 6.130449965920e-11 true resid norm 2.777694372634e-06 ||r(i)||/||b|| 1.306424924510e-01</div><div> 4 KSP unpreconditioned resid norm 2.090637685604e-13 true resid norm 2.777696567814e-06 ||r(i)||/||b|| 1.306425956963e-01</div>
<div> Linear solve converged due to CONVERGED_RTOL iterations 4</div><div> NL step 4, |residual|_2 = 2.863517e-06</div><div> 4 SNES Function norm 2.863517221239e-06 </div><div> 0 KSP unpreconditioned resid norm 2.863517221239e-06 true resid norm 2.863517221239e-06 ||r(i)||/||b|| 1.000000000000e+00</div>
<div> 1 KSP unpreconditioned resid norm 2.518692933040e-10 true resid norm 2.518692933039e-10 ||r(i)||/||b|| 8.795801590987e-05</div><div> 2 KSP unpreconditioned resid norm 2.165272180327e-12 true resid norm 1.136392813468e-09 ||r(i)||/||b|| 3.968520967987e-04</div>
<div> Linear solve converged due to CONVERGED_RTOL iterations 2</div><div> NL step 5, |residual|_2 = 9.132390e-08</div><div> 5 SNES Function norm 9.132390063388e-08 </div><div>Converged:1</div></div><div><br></div><div>
<br></div><div>My questions:</div><div>1. Is the following true? When using snes_fd, the actual Jacobian matrix, say J, is explicitly constructed. When combined with -pc_type lu, the problem</div><div>J (du) = -R</div><div>is solved directly as (du) = J^{-1} * (-R)</div>
<div>where J^{-1} is calculated from this explicitly constructed matrix J, using LU factorization.</div><div><br></div><div>2. What is the difference between snes_mf_operator and snes_fd?</div><div>My understanding (which might be wrong) is that snes_mf_operator does not *explicitly construct* the matrix J, since it is a matrix-free method. Is the finite-differencing method behind the matrix-free operator in snes_mf_operator the same as the one used for the matrix construction in snes_fd?</div>
<div><br></div><div>3. It seems that snes_mf_operator is preconditioned, while snes_fd is not. Why does the output say 'KSP unpreconditioned resid norm' when I am expecting 'KSP <span style="background-color:rgb(51,255,51)">preconditioned</span> resid norm'? Also, if it is 'unpreconditioned', shouldn't it be identical to the 'true resid norm'? Is it my fault? For example, could a bad preconditioning matrix be what keeps the KSP from working well?</div>
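<div><br></div><div>To make question 2 concrete, here is a minimal sketch of what I understand the two approaches to be. It is plain Python, not PETSc code: the residual function, the step size h, and all function names are hypothetical and chosen only for illustration, and PETSc's actual differencing parameters may well differ. The first routine assembles J column by column with forward differences (my picture of snes_fd); the second only approximates the product J*v from residual evaluations (my picture of the matrix-free operator).</div>

```python
def residual(u):
    # Hypothetical 2x2 nonlinear residual, used only for illustration.
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]


def fd_jacobian(F, u, h=1e-7):
    """Assemble the dense Jacobian column by column with forward differences."""
    n = len(u)
    f0 = F(u)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(u)
        up[j] += h  # perturb one unknown at a time
        fj = F(up)
        for i in range(n):
            J[i][j] = (fj[i] - f0[i]) / h
    return J


def mf_jac_times_vec(F, u, v, h=1e-7):
    """Approximate J(u) * v from residual evaluations only, never forming J."""
    f0 = F(u)
    up = [ui + h * vi for ui, vi in zip(u, v)]  # perturb along direction v
    fp = F(up)
    return [(a - b) / h for a, b in zip(fp, f0)]


def matvec(J, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(Jij * vj for Jij, vj in zip(row, v)) for row in J]


u = [1.0, 2.0]
v = [0.5, -1.5]
explicit = matvec(fd_jacobian(residual, u), v)    # J built first, then applied
matrix_free = mf_jac_times_vec(residual, u, v)    # J never built
print(explicit, matrix_free)
```

<div>In this toy case both products agree up to the finite-difference truncation error, so my question is whether the same holds inside PETSc, or whether the two options choose the differencing step differently.</div>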
<div><br></div><div>I'd appreciate your help. I know there are a lot of (maybe bad) questions today. Please let me know if you need more information.</div><div><br></div><div>Best,</div><div><br></div><div>Ling</div>