Weird (?) results obtained from experiments with the SNESMF module
Barry Smith
bsmith at mcs.anl.gov
Fri Dec 19 14:21:54 CST 2008
Send the output from using -snes_view -options_table for each run; it
may be that you are not running what you think you are running.
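For example (the executable name ex5 and the grid options below are
just placeholders for however you actually launch the Bratu runs):

   mpirun -np 24 ./ex5 -da_grid_x 512 -da_grid_y 512 -snes_mf \
          -snes_view -options_table -log_summary

with -snes_mf dropped for the plain NK run.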
Barry
On Dec 19, 2008, at 12:27 PM, Rafael Santos Coelho wrote:
> Hello to everyone,
>
> Weeks ago I did some experiments with PETSc's SNESMF module using
> the 2D Bratu nonlinear PDE example. I ran a bunch of tests (varying
> the number of processors and the mesh size) with the "-log_summary"
> command-line option to collect several measures, namely: max runtime,
> total memory usage, floating-point operations, linear iteration count
> and MPI messages sent. My intention was to compare SNESMF (the
> Jacobian-free Newton-Krylov method, which I will call JFNK) with
> SNES (the "simple" Newton-Krylov method, which I will call NK). In
> theory, we basically know that:
>
> • JFNK is supposed to consume less memory than NK, since the true
> entries of the Jacobian matrix are never actually stored;
> • JFNK is supposed to perform fewer floating-point operations than
> NK, or roughly the same number, since it does not recompute the
> Jacobian matrix entries at every Newton iteration (see the sketch
> right after this list).
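>
> To make the comparison concrete, here is a minimal sketch (my own
> illustration, not PETSc's actual MatMFFD code) of the matrix-free
> product that JFNK relies on, J(u)a ~= (F(u + h*a) - F(u)) / h:
>
>   /* Minimal sketch of a finite-difference Jacobian-vector product:
>    * one extra residual evaluation replaces storing J's entries.   */
>   #include <stdlib.h>
>
>   typedef void (*ResidualFn)(int n, const double *u, double *f); /* f = F(u) */
>
>   void jfnk_matvec(ResidualFn F, int n, const double *u, const double *a,
>                    double h, double *Ja)
>   {
>       double *up = malloc(n * sizeof(double));  /* u + h*a    */
>       double *fu = malloc(n * sizeof(double));  /* F(u)       */
>       double *fp = malloc(n * sizeof(double));  /* F(u + h*a) */
>       int     i;
>       for (i = 0; i < n; i++) up[i] = u[i] + h * a[i];
>       F(n, u, fu);
>       F(n, up, fp);
>       for (i = 0; i < n; i++) Ja[i] = (fp[i] - fu[i]) / h;
>       free(up); free(fu); free(fp);
>   }
>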
> Well, I have to admit that I was pretty surprised by the results. In
> short, except in a few cases, NK outperformed JFNK with regard to
> every one of the metrics mentioned above. Before anything, some
> clarifications:
>
> • Each test was repeated 5 times and I summarized the results using
> the arithmetic mean, in order to attenuate possible fluctuations;
> • The tests were run on a Beowulf cluster with 30 processing nodes;
> each node is an Intel(R) Core(TM)2 CPU 4300 @ 1.80GHz with 2MB of
> cache and 2GB of RAM, running Fedora Core 6 GNU/Linux (kernel version
> 2.6.26.6-49), PETSc version 2.3.3-p15 and LAM/MPI version 6.41;
> • The Walker-Pernice formula (recalled right after this list) was
> chosen to compute the differencing parameter "h" used with the
> finite-difference-based matrix-free Jacobian;
> • No preconditioners or matrix reorderings were employed.
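>
> (For reference, if I recall the PETSc documentation correctly, the
> Walker-Pernice choice computes the differencing parameter roughly as
>
>   h = error_rel * sqrt(1 + ||u||) / ||a||
>
> where u is the current iterate, a is the direction being differenced
> along, and error_rel defaults to about the square root of machine
> precision.)
>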
> Now here come my questions:
>
> • How come JFNK used more memory than NK? Looking at the log files
> at the end of this message, there were 4 matrix creations in JFNK.
> Why 4?! And also, why did JFNK create one more vector (46) than NK
> (45)? Where does that particular vector come from? Is that vector
> the history of the h parameter? How can I figure that out?
> • Why did JFNK perform worse than NK in terms of max runtime, total
> linear iteration count, MPI messages sent, MPI reductions and flops?
> My answer: JFNK had a much higher linear iteration count than NK,
> and that is probably why it turned out to be worse than NK in all
> those respects. As for the MPI reductions, I believe JFNK topped NK
> because of all the vector norm operations needed to compute the
> "h" parameter.
>
> • Why did JFNK's KSP solver (GMRES) iterate far more per Newton
> iteration than NK's?
> My answer: that probably has to do with the fact that JFNK only
> approximates the product J(x)a. If the precision of that
> approximation is poor (and that is closely related to the approach
> used to calculate the differencing parameter, I think), then the
> linear solver must iterate more to produce a "good" Newton correction.
>
> I'd really appreciate it if you guys could comment on my observations
> and conclusions and help me out with my questions. I've pasted below
> some excerpts from two log files generated by the "-log_summary" option.
>
> Thanks in advance,
>
> Rafael
>
> --------------------------------------------------------------------------------------------------------
> NK
> KSP SOLVER: GMRES
> MESH SIZE : 512 x 512 unknowns
> NUMBER OF PROCESSORS : 24
>
> Linear solve converged due to CONVERGED_RTOL iterations 25038
> Linear solve converged due to CONVERGED_RTOL iterations 25995
> Linear solve converged due to CONVERGED_RTOL iterations 26769
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
>
>                               Max       Max/Min       Avg        Total
> Time (sec):             4.597e+02      1.00008   4.597e+02
> Objects:                6.200e+01      1.00000   6.200e+01
> Flops:                  6.553e+10      1.01230   6.501e+10   1.560e+12
> Flops/sec:              1.425e+08      1.01233   1.414e+08   3.394e+09
> Memory:                 4.989e+06      1.01590               1.186e+08
> MPI Messages:           3.216e+05      2.00000   2.546e+05   6.111e+06
> MPI Message Lengths:    2.753e+08      2.00939   8.623e+02   5.269e+09
> MPI Reductions:         6.704e+03      1.00000
> Object Type          Creations   Destructions    Memory    Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
> SNES                      1            1              124          0
> Krylov Solver             1            1            16880          0
> Preconditioner            1            1                0          0
> Distributed array         1            1            46568          0
> Index Set                 6            6           135976          0
> Vec                      45           45          3812684          0
> Vec Scatter               3            3                0          0
> IS L to G Mapping         1            1            45092          0
> Matrix                    3            3          1011036          0
> --------------------------------------------------------------------------------------------------------
>
> --------------------------------------------------------------------------------------------------------
> JFNK
> KSP SOLVER: GMRES
> MESH SIZE : 512 x 512 unknowns
> NUMBER OF PROCESSORS : 24
>
> Linear solve converged due to CONVERGED_RTOL iterations 25042
> Linear solve converged due to CONVERGED_RTOL iterations 33804
> Linear solve converged due to CONVERGED_RTOL iterations 33047
> Linear solve converged due to CONVERGED_RTOL iterations 21219
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
>
>                               Max       Max/Min       Avg        Total
> Time (sec):             1.076e+03      1.00009   1.076e+03
> Objects:                6.500e+01      1.00000   6.500e+01
> Flops:                  1.044e+11      1.01176   1.036e+11   2.485e+12
> Flops/sec:              9.702e+07      1.01185   9.626e+07   2.310e+09
> Memory:                 5.076e+06      1.01530               1.207e+08
> MPI Messages:           4.676e+05      2.00000   3.702e+05   8.884e+06
> MPI Message Lengths:    4.002e+08      2.00939   8.623e+02   7.661e+09
> MPI Reductions:         9.901e+03      1.00000
>
> Object Type          Creations   Destructions    Memory    Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
> SNES                      1            1              124          0
> Krylov Solver             1            1            16880          0
> Preconditioner            1            1                0          0
> Distributed array         1            1            46568          0
> Index Set                 6            6           135976          0
> Vec                      46           46          3901252          0
> Vec Scatter               3            3                0          0
> IS L to G Mapping         1            1            45092          0
> MatMFFD                   1            1                0          0
> Matrix                    4            4          1011036          0
> --------------------------------------------------------------------------------------------------------
>
>