Weird (?) results obtained from experiments with the SNESMF module

Fri Dec 19 17:51:44 CST 2008

   When I was a kid I was guided to be fair and  to not exaggerate.   
When I was
a teenager I found out that adults don't actually act that way, it was  
a disappointment to
me. What was worse was once I became an adult I learned that  
scientists don't act
that way: when they advocate something they exaggerate its positive  
features and
minimize its negative features. This is what the great priests of  
Jacobian-Free Newton Krylov
do, after you see one of their presentations you know that JFNK is  
better in every possible
way from any other approach. In the practical world outside of their  
lecture halls there is
no such thing as "one algorithm being better than another" it is only  
"one algorithm is better
than another in this specific circumstance." This is why PETSc was  
written the way it
is, to provide a toolbox that allows the selection of many  
possibilities, NOT to have the
library writers decide what algorithm is better.

   Their is actually nothing weird about the results you see, they are  
actually what a
practitioner would expect :-)

   Run both for 40 iterations  with -ksp_monitor_true_residual

 >My answer: That probably has to do with the fact that JFNK  
approximately computes the product J(x)a. If the precision
 >of that approximation is poor (and that's closely related to which  
approach was used to calculate the differencing parameter,
 >I think), then the linear solver must iterate more to produce a  
"good" Newton correction.

   This is kind of right, but the answer really has nothing to do with  
Newton. It is only a
problem with the linear solve. Krylov methods depend fundamentally on  
the fact that
they are used for LINEAR operators; when you do the differencing to  
apply J this is
no longer exactly a linear operator thus the "Krylov" space being  
generated starts
to become garbage. So matrix-free and non-matrix-free for the first  
few iterations will
be generally similar but after some iterations the matrix-free will  
start to drift off into
near garbage. Monkeying around with the  h that is used can help  
reduce the garbage to some
degree, but I doubt there is one "great" way to do it that always  
works well. In fact, when
people publish on "my way of computing h is better than yours", they  
will, of course, exaggerate
the benefits of theirs and minimize the problems with it.

Because of this "drifting problems" JFNK rarely makes sense without a  
very good preconditioner,
that reduces the number of iterations to a handful.

With regard to why particular things happen in one code and not in  
another (like an extra vector), you
will have to look at the code and run in the debugger to answer the  
questions. That is how we
would answer these questions. BTW: the extra vector is because one  
applies J(u)*a, the extra vector
is where the u values are stored.

    Barry

If you poke around with the code you will sometimes find there are  
better ways to do things then
we have done them, this is why PETSc is open source, we don't expect  
that we can do things better
than everyone else.

On Dec 19, 2008, at 4:39 PM, Rafael Santos Coelho wrote:

> Hey, Barry,
>
> here's what you asked:
>
> --------------------------------------------------------------------------------------------------------
> JFNK
> KSP SOLVER: GMRES
> MESH SIZE : 512 x 512 unknows
> NUMBER OF PROCESSORS : 24
>
> Linear solve converged due to CONVERGED_RTOL iterations 25042
> Linear solve converged due to CONVERGED_RTOL iterations 33804
> Linear solve converged due to CONVERGED_RTOL iterations 33047
> Linear solve converged due to CONVERGED_RTOL iterations 21219
> SNES Object:
>   type: ls
>     line search variant: SNESLineSearchCubic
>     alpha=0.0001, maxstep=1e+08, steptol=1e-12
>   maximum iterations=100, maximum function evaluations=1000000
>   tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>   total number of linear solver iterations=113112
>   total number of function evaluations=116893
>   KSP Object:
>     type: gmres
>       GMRES: restart=30, using Classical (unmodified) Gram-Schmidt  
> Orthogonalization with no iterative refinement
>       GMRES: happy breakdown tolerance 1e-30
>     maximum iterations=1000000, initial guess is zero
>     tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>     left preconditioning
>   PC Object:
>     type: none
>     linear system matrix = precond matrix:
>     Matrix Object:
>       type=mffd, rows=262144, cols=262144
>           matrix-free approximation:
>             err=1e-07 (relative error in function evaluation)
>             Using wp compute h routine
>             Computes normA
>             Does not compute normU
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
>
>                                          Max           Max/ 
> Min        Avg             Total
> Time (sec):                    8.080e+02      1.00009     8.080e+02
> Objects:                        6.600e+01      1.00000     6.600e+01
> Flops:                           1.044e+11      1.01176     1.036e 
> +11    2.485e+12
> Flops/sec:                     1.292e+08      1.01184     1.282e 
> +08    3.076e+09
> Memory:                       5.076e+06       
> 1.01530                        1.207e+08
> MPI Messages:             4.676e+05      2.00000     3.702e+05     
> 8.884e+06
> MPI Message Lengths:  4.002e+08      2.00939     8.623e+02    7.661e 
> +09
> MPI Reductions:            9.901e+03      1.00000
>
> Object Type          Creations   Destructions   Memory  Descendants'  
> Mem.
>
> --- Event Stage 0: Main Stage
>
> SNES                        1              1                  
> 124            0
> Krylov Solver              1              1                 
> 16880          0
> Preconditioner            1              1                    
> 0              0
> Distributed array         1              1                
> 46568           0
> Index Set                   6              6                
> 135976         0
> Vec                          46             46             
> 3901252         0
> Vec Scatter                3              3                   
> 0               0
> IS L to G Mapping       1              1               
> 45092            0
> MatMFFD                   1              1                   
> 0              0
> Matrix                        4              4               
> 1011036        0
> Viewer                       1              1                    
> 0              0
>
> OptionTable: -ksp_converged_reason
> OptionTable: -ksp_max_it 1000000
> OptionTable: -ksp_type gmres
> OptionTable: -log_summary
> OptionTable: -option_table
> OptionTable: -par 4.0
> OptionTable: -pc_type none
> OptionTable: -snes_converged_reason
> OptionTable: -snes_max_funcs 1000000
> OptionTable: -snes_max_it 100
> OptionTable: -snes_mf
> OptionTable: -snes_view
> OptionTable: -xdiv 512
> OptionTable: -ydiv 512
> --------------------------------------------------------------------------------------------------------
>
> --------------------------------------------------------------------------------------------------------
> NK
> KSP SOLVER: GMRES
> MESH SIZE : 512 x 512 unknows
> NUMBER OF PROCESSORS : 24
>
> Linear solve converged due to CONVERGED_RTOL iterations 25038
> Linear solve converged due to CONVERGED_RTOL iterations 25995
> Linear solve converged due to CONVERGED_RTOL iterations 26769
> SNES Object:
>   type: ls
>     line search variant: SNESLineSearchCubic
>     alpha=0.0001, maxstep=1e+08, steptol=1e-12
>   maximum iterations=100, maximum function evaluations=1000000
>   tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>   total number of linear solver iterations=77802
>   total number of function evaluations=4
>   KSP Object:
>     type: gmres
>       GMRES: restart=30, using Classical (unmodified) Gram-Schmidt  
> Orthogonalization with no iterative refinement
>       GMRES: happy breakdown tolerance 1e-30
>     maximum iterations=1000000, initial guess is zero
>     tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>     left preconditioning
>   PC Object:
>     type: none
>     linear system matrix = precond matrix:
>     Matrix Object:
>       type=mpiaij, rows=262144, cols=262144
>       total: nonzeros=1308672, allocated nonzeros=1308672
>         not using I-node (on process 0) routines
> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE
>
>                                          Max          Max/Min         
> Avg           Total
> Time (sec):                   5.329e+02      1.00008   5.329e+02
> Objects:                       6.300e+01      1.00000   6.300e+01
> Flops:                          6.553e+10      1.01230   6.501e 
> +10    1.560e+12
> Flops/sec:                    1.230e+08      1.01230   1.220e+08     
> 2.928e+09
> Memory:                       4.989e+06       
> 1.01589                      1.186e+08
> MPI Messages:             3.216e+05      1.99999   2.546e+05     
> 6.111e+06
> MPI Message Lengths:  2.753e+08      2.00939   8.623e+02    5.269e+09
> MPI Reductions:           6.704e+03      1.00000
>
> Object Type      Creations   Destructions   Memory  Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
> SNES                        1              1               
> 124                 0
> Krylov Solver               1              1             
> 16880               0
> Preconditioner             1              1                
> 0                   0
> Distributed array          1              1            
> 46568                0
> Index Set                    6              6            
> 135976              0
> Vec                           45             45          
> 3812684             0
> Vec Scatter                3              3                 
> 0                   0
> IS L to G Mapping       1              1              
> 45092               0
> Matrix                        3              3            
> 1011036              0
> Viewer                       1              1                  
> 0                   0
>
> OptionTable: -ksp_converged_reason
> OptionTable: -ksp_max_it 1000000
> OptionTable: -ksp_type gmres
> OptionTable: -log_summary
> OptionTable: -option_table
> OptionTable: -par 4.0
> OptionTable: -pc_type none
> OptionTable: -snes_converged_reason
> OptionTable: -snes_max_funcs 1000000
> OptionTable: -snes_max_it 100
> OptionTable: -snes_view
> OptionTable: -xdiv 512
> OptionTable: -ydiv 512
> --------------------------------------------------------------------------------------------------------
>