[petsc-users] interpreting data from SNESSolve profiling
Matteo Semplice
matteo.semplice at uninsubria.it
Wed Feb 8 06:56:23 CST 2023
Dear all,
I am trying to optimize the nonlinear solvers in a code of mine,
but I am having a hard time interpreting the profiling data from the
SNES. In particular, if I run with -snesCorr_snes_lag_jacobian 5
-snesCorr_snes_linesearch_monitor -snesCorr_snes_monitor
-snesCorr_snes_linesearch_type basic -snesCorr_snes_view I get, for all
timesteps, output like
0 SNES Function norm 2.204257292307e+00
1 SNES Function norm 5.156376709750e-03
2 SNES Function norm 9.399026338316e-05
3 SNES Function norm 1.700505246874e-06
4 SNES Function norm 2.938127043559e-08
SNES Object: snesCorr (snesCorr_) 1 MPI process
type: newtonls
maximum iterations=50, maximum function evaluations=10000
tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
total number of linear solver iterations=4
total number of function evaluations=5
norm schedule ALWAYS
Jacobian is rebuilt every 5 SNES iterations
SNESLineSearch Object: (snesCorr_) 1 MPI process
type: basic
maxstep=1.000000e+08, minlambda=1.000000e-12
tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
maximum iterations=40
KSP Object: (snesCorr_) 1 MPI process
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (snesCorr_) 1 MPI process
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: (snesCorr_) 1 MPI process
type: seqaij
rows=1200, cols=1200
package used to perform factorization: petsc
total: nonzeros=17946, allocated nonzeros=17946
using I-node routines: found 400 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1 MPI process
type: seqaij
rows=1200, cols=1200
total: nonzeros=17946, allocated nonzeros=17946
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 400 nodes, limit used is 5
I guess that this means that no linesearch is performed and the full
Newton step is always performed (I did not report the full output, but
all timesteps are alike). Also, with the default (bt) LineSearch, the
total CPU time does not change, which seems in line with this.
However, I'd have expected that the time spent in SNESLineSearch would
be negligible, but the flamegraph is showing that about 38% of the time
spent by SNESSolve is actually spent in SNESLineSearch. Furthermore,
SNESLineSearch seems to cause more SNESFunction evaluations (in terms of
CPU time) than the SNESSolve itself. The flamegraph is attached.
Could some expert help me understand these data? Is the LineSearch
actually performing the Newton step? Given that the full step is always
taken, can the SNESFunction evaluations from the LineSearch be skipped?
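To make the question concrete, here is a toy Python sketch (not PETSc code) of a Newton iteration with a full-step line search. It assumes, and this is only an assumption about how newtonls is organized, that the residual at the new iterate is evaluated inside the line-search routine. Under that assumption every evaluation after the initial one is attributed to the line search even when no backtracking happens, which would be consistent with the 5 function evaluations for 4 iterations in the output above. The names CountingResidual, basic_linesearch and newton, and the scalar test problem, are hypothetical and chosen only for illustration.

```python
class CountingResidual:
    """Wrap a residual function and count calls by caller label."""

    def __init__(self, func):
        self.func = func
        self.calls = {"solve": 0, "linesearch": 0}

    def __call__(self, x, caller):
        self.calls[caller] += 1
        return self.func(x)


def residual(x):
    # Hypothetical scalar problem: find x with x^2 = 2.
    return x * x - 2.0


def derivative(x):
    # Exact Jacobian (a scalar here) of the residual above.
    return 2.0 * x


def basic_linesearch(F, x, step):
    """Full-step 'line search': no search is done, but the residual
    at the new iterate is evaluated here (assumed structure)."""
    x_new = x + step
    f_new = F(x_new, "linesearch")
    return x_new, f_new


def newton(F, x, rtol=1e-8, max_it=50):
    f = F(x, "solve")  # the only evaluation made outside the line search
    f0 = abs(f)
    its = 0
    while abs(f) > rtol * f0 and its < max_it:
        step = -f / derivative(x)              # "linear solve"
        x, f = basic_linesearch(F, x, step)    # update + new residual
        its += 1
    return x, its


F = CountingResidual(residual)
x, its = newton(F, 1.0)
print("iterations:", its)
print("evaluations in solve:", F.calls["solve"])
print("evaluations in line search:", F.calls["linesearch"])
```

If this picture applies to SNES, the evaluation charged to the line search would be the same residual that the convergence test and the next Newton step need, so it would be a matter of where the time is attributed rather than extra work that can be skipped.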
Thanks a lot!
Matteo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: flame.svg
Type: image/svg+xml
Size: 35538 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20230208/14f71c48/attachment-0001.svg>