Weird (?) results obtained from experiments with the SNESMF module
Rafael Santos Coelho
rafaelsantoscoelho at gmail.com
Fri Dec 19 12:27:20 CST 2008
Hello to everyone,
Weeks ago I did some experiments with PETSc's SNESMF module using the 2D Bratu
nonlinear PDE example
<http://www.mcs.anl.gov/petsc/petsc-2/snapshots/petsc-current/src/snes/examples/tutorials/ex5.c.html>.
I ran a bunch of tests (varying the number of processors and the
mesh size) with the "-log_summary" command-line option to collect several
metrics, namely: max runtime, total memory usage, floating-point
operations, linear iteration count and MPI messages sent. My intention was
to compare SNESMF (the Jacobian-free Newton-Krylov method, which I will call
JFNK) with SNES (the "simple" Newton-Krylov method, which I will call NK). In
theory, we basically know that:
1. JFNK is supposed to consume less memory than NK, since the true
entries of the Jacobian matrix are never actually stored;
2. JFNK is supposed to perform fewer (or roughly the same number of) floating-point
operations as NK, since it does not compute the Jacobian matrix
entries at every Newton iteration (see the sketch right after this list).
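Just to make sure we are talking about the same thing, here is a minimal sketch
(plain C, my own illustration with made-up names, not PETSc's actual code) of the
matrix-free Jacobian-vector product I have in mind: the Krylov solver only ever
needs J(x)*a, which is approximated by a forward difference of the residual F, so
the Jacobian entries are never formed or stored.

/* Illustrative sketch only (not PETSc code): matrix-free approximation of
 * the Jacobian-vector product  Ja ~= (F(x + h*a) - F(x)) / h,
 * where "h" is the differencing parameter. */
#include <stdlib.h>

typedef void (*ResidualFn)(const double *x, double *f, int n);

void jfnk_matvec(ResidualFn F, const double *x, const double *a,
                 double h, double *Ja, int n)
{
    double *xp  = malloc(sizeof(double) * n);
    double *fx  = malloc(sizeof(double) * n);
    double *fxp = malloc(sizeof(double) * n);
    for (int i = 0; i < n; i++) xp[i] = x[i] + h * a[i];
    F(x,  fx,  n);   /* residual at the current Newton iterate          */
    F(xp, fxp, n);   /* one extra residual evaluation per matvec        */
    for (int i = 0; i < n; i++) Ja[i] = (fxp[i] - fx[i]) / h;
    free(xp); free(fx); free(fxp);
}

(In practice F(x) is already available from the current Newton step, so the real
extra cost is one residual evaluation F(x + h*a) per Krylov iteration.)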
Well, I have to admit that I was pretty surprised by the results. In short,
except in a few cases, NK outperformed JFNK with respect to every one of the
metrics mentioned above. Before anything else, some clarifications:
1. Each test was repeated 5 times and I summarized the results using the
arithmetic mean in order to attenuate possible fluctuations;
2. The tests were run on a Beowulf cluster with 30 processing nodes;
each node is an Intel(R) Core(TM)2 CPU 4300 at 1.80 GHz with 2 MB of cache and 2 GB of
RAM, running Fedora Core 6 GNU/Linux (kernel 2.6.26.6-49), PETSc version
2.3.3-p15 and LAM/MPI version 6.41;
3. The Walker-Pernice formula <http://www.dcsc.sdu.dk/docs/PETSC/src/snes/mf/wp.c.html>
was chosen to compute the differencing parameter "h" used by the
finite-difference-based matrix-free Jacobian (my reading of that formula is
sketched right after this list);
4. No preconditioners or matrix reorderings were employed.
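For reference, my reading of the wp.c source linked above is that the
Walker-Pernice choice computes the differencing parameter from two vector norms,
roughly

    h = e_rel * sqrt(1 + ||u||) / ||a||

where u is the current Newton iterate, a is the vector being multiplied, and
e_rel is the relative error parameter (the square root of the machine precision
by default, if I am not mistaken). Please correct me if I have misread the source.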
Now here come my questions:
- How come JFNK used more memory than NK? Looking at the log files at the
end of this message, there were 4 matrix creations in JFNK. Why 4?! And why
did JFNK create one more vector (46) than NK (45)? Where does that
particular vector come from? Is it the history of the "h" parameter? How can
I figure that out?
- Why did JFNK perform worse than NK in terms of max runtime, total
linear iteration count, MPI messages sent, MPI reductions and flops?
My answer: JFNK had a much higher linear iteration count than NK, and
that is probably why it turned out worse than NK in all of those respects.
As for the MPI reductions, I believe JFNK topped NK because of all the
vector norm operations needed to compute the "h" parameter (see the sketch
right after these questions).
- Why did JFNK's KSP solver (GMRES) iterate far more per Newton iteration
than NK?
My answer: That probably has to do with the fact that JFNK only approximately
computes the product J(x)a. If the precision of that approximation is poor
(and that is closely tied to how the differencing parameter is calculated,
I think), then the linear solver must iterate more to produce a "good"
Newton correction.
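To make the MPI reductions point above concrete: every evaluation of the
Walker-Pernice "h" needs global vector norms, and in parallel each 2-norm costs
an MPI_Allreduce across all processes, on top of the reductions GMRES already
performs for its orthogonalization. Here is a minimal sketch (plain MPI C, my
own illustration with made-up names, not PETSc code) of what a single
distributed norm involves:

/* Minimal illustration: the 2-norm of a distributed vector requires one
 * global reduction, and JFNK performs such norms inside every matrix-free
 * matvec to compute "h". */
#include <math.h>
#include <mpi.h>
#include <stdio.h>

static double parallel_norm2(const double *xlocal, int n, MPI_Comm comm)
{
    double local = 0.0, global = 0.0;
    for (int i = 0; i < n; i++) local += xlocal[i] * xlocal[i];
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm); /* one global reduction */
    return sqrt(global);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    double u[3] = {1.0, 2.0, 3.0};   /* toy local part of a distributed vector */
    double nrm = parallel_norm2(u, 3, MPI_COMM_WORLD);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) printf("||u|| = %g\n", nrm);
    MPI_Finalize();
    return 0;
}

With tens of thousands of GMRES iterations in these runs, those extra reductions
add up quickly, which seems consistent with the gap between roughly 6.7e+03 (NK)
and 9.9e+03 (JFNK) reductions in the logs below.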
I'd really appreciate it if you could comment on my observations and
conclusions and help me out with my questions. I've pasted below some
excerpts from the two log files generated by the "-log_summary" option.
Thanks in advance,
Rafael
--------------------------------------------------------------------------------------------------------
NK
KSP SOLVER: GMRES
MESH SIZE : 512 x 512 unknowns
NUMBER OF PROCESSORS : 24
Linear solve converged due to CONVERGED_RTOL iterations 25038
Linear solve converged due to CONVERGED_RTOL iterations 25995
Linear solve converged due to CONVERGED_RTOL iterations 26769
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
                         Max       Max/Min      Avg        Total
Time (sec):           4.597e+02   1.00008    4.597e+02
Objects:              6.200e+01   1.00000    6.200e+01
Flops:                6.553e+10   1.01230    6.501e+10   1.560e+12
Flops/sec:            1.425e+08   1.01233    1.414e+08   3.394e+09
Memory:               4.989e+06   1.01590               1.186e+08
MPI Messages:         3.216e+05   2.00000    2.546e+05   6.111e+06
MPI Message Lengths:  2.753e+08   2.00939    8.623e+02   5.269e+09
MPI Reductions:       6.704e+03   1.00000
Object Type          Creations   Destructions     Memory   Descendants' Mem.
--- Event Stage 0: Main Stage
SNES                      1           1               124          0
Krylov Solver             1           1             16880          0
Preconditioner            1           1                 0          0
Distributed array         1           1             46568          0
Index Set                 6           6            135976          0
Vec                      45          45           3812684          0
Vec Scatter               3           3                 0          0
IS L to G Mapping         1           1             45092          0
Matrix                    3           3           1011036          0
--------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------
JFNK
KSP SOLVER: GMRES
MESH SIZE : 512 x 512 unknowns
NUMBER OF PROCESSORS : 24
Linear solve converged due to CONVERGED_RTOL iterations 25042
Linear solve converged due to CONVERGED_RTOL iterations 33804
Linear solve converged due to CONVERGED_RTOL iterations 33047
Linear solve converged due to CONVERGED_RTOL iterations 21219
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
                         Max       Max/Min      Avg        Total
Time (sec):           1.076e+03   1.00009    1.076e+03
Objects:              6.500e+01   1.00000    6.500e+01
Flops:                1.044e+11   1.01176    1.036e+11   2.485e+12
Flops/sec:            9.702e+07   1.01185    9.626e+07   2.310e+09
Memory:               5.076e+06   1.01530               1.207e+08
MPI Messages:         4.676e+05   2.00000    3.702e+05   8.884e+06
MPI Message Lengths:  4.002e+08   2.00939    8.623e+02   7.661e+09
MPI Reductions:       9.901e+03   1.00000
Object Type          Creations   Destructions     Memory   Descendants' Mem.
--- Event Stage 0: Main Stage
SNES                      1           1               124          0
Krylov Solver             1           1             16880          0
Preconditioner            1           1                 0          0
Distributed array         1           1             46568          0
Index Set                 6           6            135976          0
Vec                      46          46           3901252          0
Vec Scatter               3           3                 0          0
IS L to G Mapping         1           1             45092          0
MatMFFD                   1           1                 0          0
Matrix                    4           4           1011036          0
--------------------------------------------------------------------------------------------------------