Weird (?) results obtained from experiments with the SNESMF module
Lisandro Dalcin
dalcinl at gmail.com
Fri Dec 19 18:20:45 CST 2008
As Barry said (please correct me if I'm wrong), GMRES is sometimes
sensitive to roundoff error, and VERY sensitive when that error is the
noise introduced by computing an approximate linear operator via finite
differences; this can get even worse for large problems if you use
GMRES(30), which is the default restart value in PETSc...

Rafael, please follow my advice: give plain CG a try, and please come
back with your comments...
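For example, something along these lines (a sketch; ./app stands in for
your executable, and the other options mirror the ones in the logs below):

    mpiexec -n 24 ./app -par 4.0 -xdiv 512 -ydiv 512 -snes_mf \
        -pc_type none -ksp_type cg -ksp_converged_reason \
        -snes_converged_reason -snes_view
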
On Fri, Dec 19, 2008 at 8:51 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> When I was a kid I was guided to be fair and not to exaggerate. When I
> was a teenager I found out that adults don't actually act that way; it
> was a disappointment to me. What was worse, once I became an adult I
> learned that scientists don't act that way either: when they advocate
> something, they exaggerate its positive features and minimize its
> negative features. This is what the great priests of Jacobian-Free
> Newton Krylov do; after you see one of their presentations you know that
> JFNK is better in every possible way than any other approach. In the
> practical world outside of their lecture halls there is no such thing as
> "one algorithm being better than another", only "one algorithm is better
> than another in this specific circumstance." This is why PETSc was
> written the way it is: to provide a toolbox that allows the selection of
> many possibilities, NOT to have the library writers decide what
> algorithm is better.
>
> There is actually nothing weird about the results you see; they are
> what a practitioner would expect :-)
>
>
> Run both for 40 iterations with -ksp_monitor_true_residual
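> For example (a sketch; ./app stands in for the executable, options
> taken from the logs below):
>
>    mpiexec -n 24 ./app -par 4.0 -xdiv 512 -ydiv 512 -snes_mf \
>        -pc_type none -ksp_max_it 40 -ksp_monitor_true_residual
>
> and the same command without -snes_mf for the assembled-Jacobian run;
> then compare the two true residual histories iteration by iteration.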
>
>> My answer: That probably has to do with the fact that JFNK
>> approximately computes the product J(x)a. If the precision of that
>> approximation is poor (and that's closely related to which approach
>> was used to calculate the differencing parameter, I think), then the
>> linear solver must iterate more to produce a "good" Newton correction.
>
> This is kind of right, but the answer really has nothing to do with
> Newton; it is only a problem with the linear solve. Krylov methods
> depend fundamentally on the fact that they are applied to LINEAR
> operators; when you do the differencing to apply J, the operator is no
> longer exactly linear, thus the "Krylov" space being generated starts to
> become garbage. So matrix-free and non-matrix-free will generally be
> similar for the first few iterations, but after some iterations the
> matrix-free one will start to drift off into near garbage. Monkeying
> around with the h that is used can help reduce the garbage to some
> degree, but I doubt there is one "great" way to do it that always works
> well. In fact, when people publish on "my way of computing h is better
> than yours", they will, of course, exaggerate the benefits of theirs and
> minimize its problems.
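> (For reference, the differencing in question forms the product as
>
>    J(u)*a ~= (F(u + h*a) - F(u)) / h
>
> which is exact only in the limit h -> 0; for any finite h, the
> "operator" the Krylov method sees is slightly different on every
> application.)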
>
> Because of this "drifting" problem, JFNK rarely makes sense without a
> very good preconditioner that reduces the number of iterations to a
> handful.
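> For instance, one common remedy is -snes_mf_operator, which still
> applies J matrix-free but builds the preconditioner from an assembled
> matrix (a sketch; it assumes the code also supplies a possibly
> approximate Jacobian via SNESSetJacobian, and the PC is a matter of
> taste):
>
>    ./app -snes_mf_operator -pc_type bjacobi -sub_pc_type ilu ...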
>
> With regard to why particular things happen in one code and not in
> another (like an extra vector), you will have to look at the code and
> run in the debugger to answer the questions; that is how we would answer
> them. BTW: the extra vector exists because one applies J(u)*a, and the
> extra vector is where the u values are stored.
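> In PETSc-style C, the matrix-free product looks roughly like this (a
> conceptual sketch of what the mffd matrix type does, not the actual
> source; u is the stored base vector, h comes from the compute-h routine):
>
>    SNES      snes;                   /* assumed already set up          */
>    Vec       u, a, w, Fu, Fw, y;     /* w, Fw, y via VecDuplicate(u,..) */
>    PetscReal h;                      /* step from the compute-h routine */
>
>    VecWAXPY(w, h, a, u);             /* w  = u + h*a                    */
>    SNESComputeFunction(snes, w, Fw); /* Fw = F(u + h*a)                 */
>    VecWAXPY(y, -1.0, Fu, Fw);        /* y  = Fw - Fu                    */
>    VecScale(y, 1.0/h);               /* y  = (F(u+h*a) - F(u)) / h      */
>
> Note the extra vector holding u; that appears to match the Vec counts in
> the logs below (46 for JFNK vs 45 for NK).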
>
> Barry
>
> If you poke around with the code you will sometimes find there are
> better ways to do things than we have done them; this is why PETSc is
> open source: we don't expect that we can do things better than everyone
> else.
>
>
>
> On Dec 19, 2008, at 4:39 PM, Rafael Santos Coelho wrote:
>
>> Hey, Barry,
>>
>> here's what you asked:
>>
>>
>> --------------------------------------------------------------------------------------------------------
>> JFNK
>> KSP SOLVER: GMRES
>> MESH SIZE : 512 x 512 unknowns
>> NUMBER OF PROCESSORS : 24
>>
>> Linear solve converged due to CONVERGED_RTOL iterations 25042
>> Linear solve converged due to CONVERGED_RTOL iterations 33804
>> Linear solve converged due to CONVERGED_RTOL iterations 33047
>> Linear solve converged due to CONVERGED_RTOL iterations 21219
>> SNES Object:
>>   type: ls
>>   line search variant: SNESLineSearchCubic
>>   alpha=0.0001, maxstep=1e+08, steptol=1e-12
>>   maximum iterations=100, maximum function evaluations=1000000
>>   tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>>   total number of linear solver iterations=113112
>>   total number of function evaluations=116893
>>   KSP Object:
>>     type: gmres
>>       GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>>         Orthogonalization with no iterative refinement
>>       GMRES: happy breakdown tolerance 1e-30
>>     maximum iterations=1000000, initial guess is zero
>>     tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>     left preconditioning
>>   PC Object:
>>     type: none
>>     linear system matrix = precond matrix:
>>     Matrix Object:
>>       type=mffd, rows=262144, cols=262144
>>       matrix-free approximation:
>>         err=1e-07 (relative error in function evaluation)
>>         Using wp compute h routine
>>           Computes normA
>>           Does not compute normU
>> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
>>
>>                          Max       Max/Min     Avg        Total
>> Time (sec):           8.080e+02   1.00009   8.080e+02
>> Objects:              6.600e+01   1.00000   6.600e+01
>> Flops:                1.044e+11   1.01176   1.036e+11   2.485e+12
>> Flops/sec:            1.292e+08   1.01184   1.282e+08   3.076e+09
>> Memory:               5.076e+06   1.01530               1.207e+08
>> MPI Messages:         4.676e+05   2.00000   3.702e+05   8.884e+06
>> MPI Message Lengths:  4.002e+08   2.00939   8.623e+02   7.661e+09
>> MPI Reductions:       9.901e+03   1.00000
>>
>> Object Type          Creations   Destructions    Memory   Descendants' Mem.
>>
>> --- Event Stage 0: Main Stage
>>
>> SNES                      1            1             124   0
>> Krylov Solver             1            1           16880   0
>> Preconditioner            1            1               0   0
>> Distributed array         1            1           46568   0
>> Index Set                 6            6          135976   0
>> Vec                      46           46         3901252   0
>> Vec Scatter               3            3               0   0
>> IS L to G Mapping         1            1           45092   0
>> MatMFFD                   1            1               0   0
>> Matrix                    4            4         1011036   0
>> Viewer                    1            1               0   0
>>
>> OptionTable: -ksp_converged_reason
>> OptionTable: -ksp_max_it 1000000
>> OptionTable: -ksp_type gmres
>> OptionTable: -log_summary
>> OptionTable: -option_table
>> OptionTable: -par 4.0
>> OptionTable: -pc_type none
>> OptionTable: -snes_converged_reason
>> OptionTable: -snes_max_funcs 1000000
>> OptionTable: -snes_max_it 100
>> OptionTable: -snes_mf
>> OptionTable: -snes_view
>> OptionTable: -xdiv 512
>> OptionTable: -ydiv 512
>>
>> --------------------------------------------------------------------------------------------------------
>>
>>
>> --------------------------------------------------------------------------------------------------------
>> NK
>> KSP SOLVER: GMRES
>> MESH SIZE : 512 x 512 unknowns
>> NUMBER OF PROCESSORS : 24
>>
>> Linear solve converged due to CONVERGED_RTOL iterations 25038
>> Linear solve converged due to CONVERGED_RTOL iterations 25995
>> Linear solve converged due to CONVERGED_RTOL iterations 26769
>> SNES Object:
>>   type: ls
>>   line search variant: SNESLineSearchCubic
>>   alpha=0.0001, maxstep=1e+08, steptol=1e-12
>>   maximum iterations=100, maximum function evaluations=1000000
>>   tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>>   total number of linear solver iterations=77802
>>   total number of function evaluations=4
>>   KSP Object:
>>     type: gmres
>>       GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>>         Orthogonalization with no iterative refinement
>>       GMRES: happy breakdown tolerance 1e-30
>>     maximum iterations=1000000, initial guess is zero
>>     tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>     left preconditioning
>>   PC Object:
>>     type: none
>>     linear system matrix = precond matrix:
>>     Matrix Object:
>>       type=mpiaij, rows=262144, cols=262144
>>       total: nonzeros=1308672, allocated nonzeros=1308672
>>         not using I-node (on process 0) routines
>> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
>>
>>                          Max       Max/Min     Avg        Total
>> Time (sec):           5.329e+02   1.00008   5.329e+02
>> Objects:              6.300e+01   1.00000   6.300e+01
>> Flops:                6.553e+10   1.01230   6.501e+10   1.560e+12
>> Flops/sec:            1.230e+08   1.01230   1.220e+08   2.928e+09
>> Memory:               4.989e+06   1.01589               1.186e+08
>> MPI Messages:         3.216e+05   1.99999   2.546e+05   6.111e+06
>> MPI Message Lengths:  2.753e+08   2.00939   8.623e+02   5.269e+09
>> MPI Reductions:       6.704e+03   1.00000
>>
>> Object Type          Creations   Destructions    Memory   Descendants' Mem.
>>
>> --- Event Stage 0: Main Stage
>>
>> SNES                      1            1             124   0
>> Krylov Solver             1            1           16880   0
>> Preconditioner            1            1               0   0
>> Distributed array         1            1           46568   0
>> Index Set                 6            6          135976   0
>> Vec                      45           45         3812684   0
>> Vec Scatter               3            3               0   0
>> IS L to G Mapping         1            1           45092   0
>> Matrix                    3            3         1011036   0
>> Viewer                    1            1               0   0
>>
>> OptionTable: -ksp_converged_reason
>> OptionTable: -ksp_max_it 1000000
>> OptionTable: -ksp_type gmres
>> OptionTable: -log_summary
>> OptionTable: -option_table
>> OptionTable: -par 4.0
>> OptionTable: -pc_type none
>> OptionTable: -snes_converged_reason
>> OptionTable: -snes_max_funcs 1000000
>> OptionTable: -snes_max_it 100
>> OptionTable: -snes_view
>> OptionTable: -xdiv 512
>> OptionTable: -ydiv 512
>>
>> --------------------------------------------------------------------------------------------------------
>>
>
>
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594