Weird (?) results obtained from experiments with the SNESMF module
Barry Smith
bsmith at mcs.anl.gov
Fri Dec 19 17:51:44 CST 2008
When I was a kid I was guided to be fair and to not exaggerate.
When I was
a teenager I found out that adults don't actually act that way, it was
a disappointment to
me. What was worse was once I became an adult I learned that
scientists don't act
that way: when they advocate something they exaggerate its positive
features and
minimize its negative features. This is what the great priests of
Jacobian-Free Newton Krylov
do, after you see one of their presentations you know that JFNK is
better in every possible
way from any other approach. In the practical world outside of their
lecture halls there is
no such thing as "one algorithm being better than another" it is only
"one algorithm is better
than another in this specific circumstance." This is why PETSc was
written the way it
is, to provide a toolbox that allows the selection of many
possibilities, NOT to have the
library writers decide what algorithm is better.
Their is actually nothing weird about the results you see, they are
actually what a
practitioner would expect :-)
Run both for 40 iterations with -ksp_monitor_true_residual
>My answer: That probably has to do with the fact that JFNK
approximately computes the product J(x)a. If the precision
>of that approximation is poor (and that's closely related to which
approach was used to calculate the differencing parameter,
>I think), then the linear solver must iterate more to produce a
"good" Newton correction.
This is kind of right, but the answer really has nothing to do with
Newton. It is only a
problem with the linear solve. Krylov methods depend fundamentally on
the fact that
they are used for LINEAR operators; when you do the differencing to
apply J this is
no longer exactly a linear operator thus the "Krylov" space being
generated starts
to become garbage. So matrix-free and non-matrix-free for the first
few iterations will
be generally similar but after some iterations the matrix-free will
start to drift off into
near garbage. Monkeying around with the h that is used can help
reduce the garbage to some
degree, but I doubt there is one "great" way to do it that always
works well. In fact, when
people publish on "my way of computing h is better than yours", they
will, of course, exaggerate
the benefits of theirs and minimize the problems with it.
Because of this "drifting problems" JFNK rarely makes sense without a
very good preconditioner,
that reduces the number of iterations to a handful.
With regard to why particular things happen in one code and not in
another (like an extra vector), you
will have to look at the code and run in the debugger to answer the
questions. That is how we
would answer these questions. BTW: the extra vector is because one
applies J(u)*a, the extra vector
is where the u values are stored.
Barry
If you poke around with the code you will sometimes find there are
better ways to do things then
we have done them, this is why PETSc is open source, we don't expect
that we can do things better
than everyone else.
On Dec 19, 2008, at 4:39 PM, Rafael Santos Coelho wrote:
> Hey, Barry,
>
> here's what you asked:
>
> --------------------------------------------------------------------------------------------------------
> JFNK
> KSP SOLVER: GMRES
> MESH SIZE : 512 x 512 unknows
> NUMBER OF PROCESSORS : 24
>
> Linear solve converged due to CONVERGED_RTOL iterations 25042
> Linear solve converged due to CONVERGED_RTOL iterations 33804
> Linear solve converged due to CONVERGED_RTOL iterations 33047
> Linear solve converged due to CONVERGED_RTOL iterations 21219
> SNES Object:
> type: ls
> line search variant: SNESLineSearchCubic
> alpha=0.0001, maxstep=1e+08, steptol=1e-12
> maximum iterations=100, maximum function evaluations=1000000
> tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
> total number of linear solver iterations=113112
> total number of function evaluations=116893
> KSP Object:
> type: gmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=1000000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:
> type: none
> linear system matrix = precond matrix:
> Matrix Object:
> type=mffd, rows=262144, cols=262144
> matrix-free approximation:
> err=1e-07 (relative error in function evaluation)
> Using wp compute h routine
> Computes normA
> Does not compute normU
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
>
> Max Max/
> Min Avg Total
> Time (sec): 8.080e+02 1.00009 8.080e+02
> Objects: 6.600e+01 1.00000 6.600e+01
> Flops: 1.044e+11 1.01176 1.036e
> +11 2.485e+12
> Flops/sec: 1.292e+08 1.01184 1.282e
> +08 3.076e+09
> Memory: 5.076e+06
> 1.01530 1.207e+08
> MPI Messages: 4.676e+05 2.00000 3.702e+05
> 8.884e+06
> MPI Message Lengths: 4.002e+08 2.00939 8.623e+02 7.661e
> +09
> MPI Reductions: 9.901e+03 1.00000
>
> Object Type Creations Destructions Memory Descendants'
> Mem.
>
> --- Event Stage 0: Main Stage
>
> SNES 1 1
> 124 0
> Krylov Solver 1 1
> 16880 0
> Preconditioner 1 1
> 0 0
> Distributed array 1 1
> 46568 0
> Index Set 6 6
> 135976 0
> Vec 46 46
> 3901252 0
> Vec Scatter 3 3
> 0 0
> IS L to G Mapping 1 1
> 45092 0
> MatMFFD 1 1
> 0 0
> Matrix 4 4
> 1011036 0
> Viewer 1 1
> 0 0
>
> OptionTable: -ksp_converged_reason
> OptionTable: -ksp_max_it 1000000
> OptionTable: -ksp_type gmres
> OptionTable: -log_summary
> OptionTable: -option_table
> OptionTable: -par 4.0
> OptionTable: -pc_type none
> OptionTable: -snes_converged_reason
> OptionTable: -snes_max_funcs 1000000
> OptionTable: -snes_max_it 100
> OptionTable: -snes_mf
> OptionTable: -snes_view
> OptionTable: -xdiv 512
> OptionTable: -ydiv 512
> --------------------------------------------------------------------------------------------------------
>
> --------------------------------------------------------------------------------------------------------
> NK
> KSP SOLVER: GMRES
> MESH SIZE : 512 x 512 unknows
> NUMBER OF PROCESSORS : 24
>
> Linear solve converged due to CONVERGED_RTOL iterations 25038
> Linear solve converged due to CONVERGED_RTOL iterations 25995
> Linear solve converged due to CONVERGED_RTOL iterations 26769
> SNES Object:
> type: ls
> line search variant: SNESLineSearchCubic
> alpha=0.0001, maxstep=1e+08, steptol=1e-12
> maximum iterations=100, maximum function evaluations=1000000
> tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
> total number of linear solver iterations=77802
> total number of function evaluations=4
> KSP Object:
> type: gmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=1000000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:
> type: none
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=262144, cols=262144
> total: nonzeros=1308672, allocated nonzeros=1308672
> not using I-node (on process 0) routines
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
>
> Max Max/Min
> Avg Total
> Time (sec): 5.329e+02 1.00008 5.329e+02
> Objects: 6.300e+01 1.00000 6.300e+01
> Flops: 6.553e+10 1.01230 6.501e
> +10 1.560e+12
> Flops/sec: 1.230e+08 1.01230 1.220e+08
> 2.928e+09
> Memory: 4.989e+06
> 1.01589 1.186e+08
> MPI Messages: 3.216e+05 1.99999 2.546e+05
> 6.111e+06
> MPI Message Lengths: 2.753e+08 2.00939 8.623e+02 5.269e+09
> MPI Reductions: 6.704e+03 1.00000
>
> Object Type Creations Destructions Memory Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
> SNES 1 1
> 124 0
> Krylov Solver 1 1
> 16880 0
> Preconditioner 1 1
> 0 0
> Distributed array 1 1
> 46568 0
> Index Set 6 6
> 135976 0
> Vec 45 45
> 3812684 0
> Vec Scatter 3 3
> 0 0
> IS L to G Mapping 1 1
> 45092 0
> Matrix 3 3
> 1011036 0
> Viewer 1 1
> 0 0
>
> OptionTable: -ksp_converged_reason
> OptionTable: -ksp_max_it 1000000
> OptionTable: -ksp_type gmres
> OptionTable: -log_summary
> OptionTable: -option_table
> OptionTable: -par 4.0
> OptionTable: -pc_type none
> OptionTable: -snes_converged_reason
> OptionTable: -snes_max_funcs 1000000
> OptionTable: -snes_max_it 100
> OptionTable: -snes_view
> OptionTable: -xdiv 512
> OptionTable: -ydiv 512
> --------------------------------------------------------------------------------------------------------
>
More information about the petsc-users
mailing list