[petsc-users] snes failures

Wed May 18 17:16:38 CDT 2016

   Send the code and I can play with it.

  Barry

> On May 18, 2016, at 4:17 PM, Juha Jaykka <juhaj at iki.fi> wrote:
> 
> On Wednesday 18 May 2016 13:48:52 Matthew Knepley wrote:
>> On Wed, May 18, 2016 at 1:38 PM, Juha Jaykka <juhaj at iki.fi> wrote:
>>> Dear list,
>>> 
>>> I'm designing a short training course on HPC, and decided to use PETSc as
>>> an example of a good way of getting things done quickly, easily, and with
>>> good performance, without needing to write one's own code for things like
>>> linear or non-linear solvers.
>>> 
>>> However, my SNES example turned out to be problematic: I chose the
>>> (static) sine-Gordon equation for my example, mostly because its exact
>>> solution is known so it is easy to compare with numerics and also because
>>> it is, after all, a dead simple equation. Yet my code refuses to converge
>>> most of the time!
>>> 
>>> Using -snes_type ngs always succeeds, but is also very slow. Any other
>>> type will fail once I increase the domain size beyond ~100 points (the
>>> actual number depends on the type). I always keep the lattice spacing at
>>> 0.1. The failure is also always the same: DIVERGED_LINE_SEARCH. Some types
>>> manage to take one step and get stuck; some manage to decrease the norm
>>> once and then continue forever without decreasing the norm further, but
>>> without complaining about divergence either (unless they hit one of the
>>> max_it-type limits); and ncg is the worst of all: it always (with any
>>> lattice size!) fails at the very first step.
>>> 
>>> I've checked the Jacobian, and I suspect it is ok as ngs converges and the
>>> other types except ncg also converge nicely unless the domain is too big.
>> 
>> Nope, ngs does not use the Jacobian, and small problems can converge with
>> wrong Jacobians.
>> 
>>> Any ideas of where this could go wrong?
>> 
>> 
>> 1) Just run with
>>      -snes_fd_color -snes_fd_color_use_mat -mat_coloring_type greedy
>>    and see if it converges.
> 
> It does not. And I should have mentioned earlier that I tried -snes_mf,
> -snes_mf_operator, -snes_fd and -snes_fd_color already and none of those
> converges. Your suggested options result in
> 
>  0 SNES Function norm 1.002496882788e+00 
>      Line search: lambdas = [1., 0.], ftys = [1.01105, 1.005]
>      Line search terminated: lambda = 168.018, fnorms = 1.58978
>  1 SNES Function norm 1.589779063742e+00 
>      Line search: lambdas = [1., 0.], ftys = [5.57144, 4.11598]
>      Line search terminated: lambda = 4.82796, fnorms = 8.93164
>  2 SNES Function norm 8.931639387159e+00 
>      Line search: lambdas = [1., 0.], ftys = [2504.72, 385.612]
>      Line search terminated: lambda = 2.18197, fnorms = 157.043
>  3 SNES Function norm 1.570434892800e+02 
>      Line search: lambdas = [1., 0.], ftys = [1.89092e+08, 1.48956e+06]
>      Line search terminated: lambda = 2.00794, fnorms = 40941.5
>  4 SNES Function norm 4.094149042511e+04 
>      Line search: lambdas = [1., 0.], ftys = [8.60081e+17, 2.56063e+13]
>      Line search terminated: lambda = 2.00003, fnorms = 2.75067e+09
>  5 SNES Function norm 2.750671622274e+09 
>      Line search: lambdas = [1., 0.], ftys = [1.75232e+37, 7.76449e+27]
>      Line search terminated: lambda = 2., fnorms = 1.24157e+19
>  6 SNES Function norm 1.241565256983e+19 
>      Line search: lambdas = [1., 0.], ftys = [7.27339e+75, 7.14012e+56]
>      Line search terminated: lambda = 2., fnorms = 2.52948e+38
>  7 SNES Function norm 2.529479470902e+38 
>      Line search: lambdas = [1., 0.], ftys = [1.25309e+153, 6.03796e+114]
>      Line search terminated: lambda = 2., fnorms = 1.04992e+77
>  8 SNES Function norm 1.049915566775e+77 
>      Line search: lambdas = [1., 0.], ftys = [3.71943e+307, 4.31777e+230]
>      Line search terminated: lambda = 2., fnorms = inf.
>  9 SNES Function norm            inf 
> 
> Which is very similar (perhaps even identical) to what ncg does with cp 
> linesearch even without your suggestions. And yes, I also forgot to say, all 
> the results I referred to were with -snes_linesearch_type bt.
> 
> While testing a bit more, though, I noticed that when using -snes_type ngs the 
> norm first goes UP before starting to decrease:
> 
>  0 SNES Function norm 1.002496882788e+00 
>  1 SNES Function norm 1.264791228033e+00 
>  2 SNES Function norm 1.296062264876e+00 
>  3 SNES Function norm 1.290207363235e+00 
>  4 SNES Function norm 1.289395207346e+00 
> etc until
> 1952 SNES Function norm 9.975720236748e-09 
> 
> 
>> http://scicomp.stackexchange.com/questions/30/why-is-newtons-method-not-converging
> 
> None of this flags up any problems and -snes_check_jacobian consistently gives 
> something like
> 
> 9.55762e-09 = ||J - Jfd||/||J|| 3.97595e-06  = ||J - Jfd||
> 
> and looking at the values themselves with -snes_check_jacobian_view does not 
> flag any odd points which might be wrong but not show up in the above norm.
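
For readers who want to reproduce the kind of comparison -snes_check_jacobian reports without PETSc, here is a minimal sketch in plain Python. It is not the code from this thread. It assumes the static sine-Gordon equation is written as u'' = sin(u), discretized with the standard three-point stencil and Dirichlet boundary values; the names residual, jacobian and jacobian_check are made up for illustration.

```python
import math

def residual(u, h, left, right):
    """F_i = (u[i-1] - 2 u[i] + u[i+1]) / h^2 - sin(u[i]) (assumed form)."""
    n = len(u)
    f = []
    for i in range(n):
        um = u[i - 1] if i > 0 else left        # Dirichlet value on the left
        up = u[i + 1] if i < n - 1 else right   # Dirichlet value on the right
        f.append((um - 2.0 * u[i] + up) / h**2 - math.sin(u[i]))
    return f

def jacobian(u, h):
    """Hand-coded tridiagonal Jacobian dF_i/du_j, stored densely."""
    n = len(u)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        J[i][i] = -2.0 / h**2 - math.cos(u[i])
        if i > 0:
            J[i][i - 1] = 1.0 / h**2
        if i < n - 1:
            J[i][i + 1] = 1.0 / h**2
    return J

def jacobian_check(u, h, left, right, eps=1e-7):
    """Return ||J - Jfd||_F / ||J||_F using a forward-difference Jfd."""
    n = len(u)
    J = jacobian(u, h)
    f0 = residual(u, h, left, right)
    diff2 = 0.0
    norm2 = 0.0
    for j in range(n):
        u_pert = list(u)
        u_pert[j] += eps                        # perturb one unknown at a time
        fj = residual(u_pert, h, left, right)
        for i in range(n):
            jfd = (fj[i] - f0[i]) / eps         # column j of the FD Jacobian
            diff2 += (J[i][j] - jfd) ** 2
            norm2 += J[i][j] ** 2
    return math.sqrt(diff2 / norm2)

# Arbitrary smooth test state on 50 interior points with spacing 0.1.
h = 0.1
u = [math.sin(0.3 * i) for i in range(50)]
ratio = jacobian_check(u, h, left=0.0, right=2.0 * math.pi)
print(f"||J - Jfd||/||J|| = {ratio:.3e}")
```

A small ratio only says that the hand-coded Jacobian matches the residual as coded. It cannot tell whether the residual itself is the intended one.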
> 
> There is just one thing I found in all this testing. An otherwise normal
> run with -mat_mffd_type ds added fails with
> 
>  Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2
> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0
> 
> 
> instead of failing the line search. Where did the indefinite PC suddenly come 
> from?
> 
> Another point perhaps worth noting is that at a particular grid size, all the 
> failing solves always produce the same result with the same function norm 
> (which at 200 points equals 4.6458600451067145e-01), so at least they are 
> failing somewhat consistently. The exception is the mffd run above, of
> course. The resulting iterate in the failing cases is oscillatory, and the
> number of oscillations grows with the domain size: if my domain is smaller
> than about -6 to +6, all the methods converge. If the domain is about
> -13 to +13, the "solution" starts to pick up another oscillation, and so on.
> 
> Could there be something hairy in the sin() term of the sine-Gordon, somehow? 
> An oscillatory solution seems to point the finger towards an oscillatory term 
> in the equation, but I cannot see how or why it should cause oscillations.
> 
> This is also irrespective of whether my Jacobian gets called, so I think I can 
> be pretty confident the problem is not in the Jacobian, but someplace else 
> instead. (That said, the Jacobian may still of course have some other 
> problem.)
> 
> Cheers,
> Juha
>
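
Since the exact solution is known, one more check that involves no solver at all is to plug it into the discrete residual. The sketch below does that in plain Python under assumptions that are not taken from the thread: the equation is written as u'' = sin(u), the exact solution is the kink u(x) = 4 atan(exp(x)), the stencil is the three-point one, and the boundary values are the exact ones. The function names are hypothetical. If the residual is coded consistently, its maximum should shrink by roughly a factor of four when the spacing is halved.

```python
import math

def kink(x):
    """Kink solution of u'' = sin(u): u(x) = 4 atan(exp(x))."""
    return 4.0 * math.atan(math.exp(x))

def max_residual_at_exact(half_width, h):
    """Max |F_i| of the three-point residual evaluated at the exact kink."""
    n = int(round(2.0 * half_width / h)) + 1          # grid points incl. ends
    x = [-half_width + i * h for i in range(n)]
    u = [kink(xi) for xi in x]
    worst = 0.0
    for i in range(1, n - 1):                         # interior points only
        f = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / h**2 - math.sin(u[i])
        worst = max(worst, abs(f))
    return worst

r_coarse = max_residual_at_exact(10.0, 0.1)    # lattice spacing 0.1, as in the thread
r_fine = max_residual_at_exact(10.0, 0.05)     # halved spacing for comparison
print(f"h=0.10: {r_coarse:.3e}  h=0.05: {r_fine:.3e}  "
      f"ratio: {r_coarse / r_fine:.2f}")
```

A residual at the exact solution that does not shrink under refinement would point at the residual function (sign of the sin term, scaling by the spacing, boundary rows) rather than at the Jacobian or the line search.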