[petsc-users] snes failures
Barry Smith
bsmith at mcs.anl.gov
Wed May 18 17:16:38 CDT 2016
Send the code and I can play with it.
Barry
> On May 18, 2016, at 4:17 PM, Juha Jaykka <juhaj at iki.fi> wrote:
>
> On Wednesday 18 May 2016 13:48:52 Matthew Knepley wrote:
>> On Wed, May 18, 2016 at 1:38 PM, Juha Jaykka <juhaj at iki.fi> wrote:
>>> Dear list,
>>>
>>> I'm designing a short training course on HPC, and decided to use PETSc as an
>>> example of a good way of getting things done quickly, easily, and with good
>>> performance, without needing to write one's own code for things like
>>> linear or non-linear solvers.
>>>
>>> However, my SNES example turned out to be problematic: I chose the (static)
>>> sine-Gordon equation for my example, mostly because its exact solution is
>>> known, so it is easy to compare with numerics, and also because it is, after
>>> all, a dead simple equation. Yet my code refuses to converge most of the
>>> time!
>>>
>>> Using -snes_type ngs always succeeds, but is also very slow. Any other type
>>> will fail once I increase the domain size from ~100 points (the actual number
>>> depends on the type). I always keep the lattice spacing at 0.1. The failure is
>>> also always the same: DIVERGED_LINE_SEARCH. Some types manage to take one step
>>> and get stuck, some types manage to decrease the norm once and then continue
>>> forever without decreasing the norm but not complaining about divergence
>>> either (unless they hit one of the max_it-type limits), and ncg is the worst
>>> of all: it always (with any lattice size!) fails at the very first step.
>>>
>>> I've checked the Jacobian, and I suspect it is ok as ngs converges and the
>>> other types except ncg also converge nicely unless the domain is too big.
>>
>> Nope, ngs does not use the Jacobian, and small problems can converge with
>> wrong Jacobians.
>>
>>> Any ideas of where this could go wrong?
>>
>>
>> 1) Just run with -snes_fd_color -snes_fd_color_use_mat -mat_coloring_type greedy
>> and see if it converges.
>
> It does not. And I should have mentioned earlier that I tried -snes_mf,
> -snes_mf_operator, -snes_fd and -snes_fd_color already, and none of those
> converges. Your suggested options result in
>
> 0 SNES Function norm 1.002496882788e+00
> Line search: lambdas = [1., 0.], ftys = [1.01105, 1.005]
> Line search terminated: lambda = 168.018, fnorms = 1.58978
> 1 SNES Function norm 1.589779063742e+00
> Line search: lambdas = [1., 0.], ftys = [5.57144, 4.11598]
> Line search terminated: lambda = 4.82796, fnorms = 8.93164
> 2 SNES Function norm 8.931639387159e+00
> Line search: lambdas = [1., 0.], ftys = [2504.72, 385.612]
> Line search terminated: lambda = 2.18197, fnorms = 157.043
> 3 SNES Function norm 1.570434892800e+02
> Line search: lambdas = [1., 0.], ftys = [1.89092e+08, 1.48956e+06]
> Line search terminated: lambda = 2.00794, fnorms = 40941.5
> 4 SNES Function norm 4.094149042511e+04
> Line search: lambdas = [1., 0.], ftys = [8.60081e+17, 2.56063e+13]
> Line search terminated: lambda = 2.00003, fnorms = 2.75067e+09
> 5 SNES Function norm 2.750671622274e+09
> Line search: lambdas = [1., 0.], ftys = [1.75232e+37, 7.76449e+27]
> Line search terminated: lambda = 2., fnorms = 1.24157e+19
> 6 SNES Function norm 1.241565256983e+19
> Line search: lambdas = [1., 0.], ftys = [7.27339e+75, 7.14012e+56]
> Line search terminated: lambda = 2., fnorms = 2.52948e+38
> 7 SNES Function norm 2.529479470902e+38
> Line search: lambdas = [1., 0.], ftys = [1.25309e+153, 6.03796e+114]
> Line search terminated: lambda = 2., fnorms = 1.04992e+77
> 8 SNES Function norm 1.049915566775e+77
> Line search: lambdas = [1., 0.], ftys = [3.71943e+307, 4.31777e+230]
> Line search terminated: lambda = 2., fnorms = inf.
> 9 SNES Function norm inf
>
> Which is very similar (perhaps even identical) to what ncg does with cp
> linesearch even without your suggestions. And yes, I also forgot to say, all
> the results I referred to were with -snes_linesearch_type bt.
>
> While testing a bit more, though, I noticed that when using -snes_type ngs the
> norm first goes UP before starting to decrease:
>
> 0 SNES Function norm 1.002496882788e+00
> 1 SNES Function norm 1.264791228033e+00
> 2 SNES Function norm 1.296062264876e+00
> 3 SNES Function norm 1.290207363235e+00
> 4 SNES Function norm 1.289395207346e+00
> etc until
> 1952 SNES Function norm 9.975720236748e-09
>
>
>> http://scicomp.stackexchange.com/questions/30/why-is-newtons-method-not-converging
>
> None of this flags up any problems and -snes_check_jacobian consistently gives
> something like
>
> 9.55762e-09 = ||J - Jfd||/||J|| 3.97595e-06 = ||J - Jfd||
>
> and looking at the values themselves with -snes_check_jacobian_view does not
> flag any odd points which might be wrong but not show up in the above norm.
>
> There is just one thing I found in all this testing: a normal run with
> -mat_mffd_type ds added fails with
>
> Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2
> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0
>
>
> instead of failing the line search. Where did the indefinite PC suddenly come
> from?
>
> Another point perhaps worth noting is that at a particular grid size, all the
> failing solves always produce the same result with the same function norm
> (which at 200 points equals 4.6458600451067145e-01), so at least they are
> failing somewhat consistently. The exception is the mffd run above, of course.
> The resulting iterate in the failing cases is oscillatory, with the number of
> oscillations increasing with the domain size: if my domain is smaller than
> about -6 to +6, all the methods converge. If the domain is about -13 to +13,
> the "solution" starts to pick up another oscillation, and so on.
>
> Could there be something hairy in the sin() term of the sine-Gordon, somehow?
> An oscillatory solution seems to point the finger towards an oscillatory term
> in the equation, but I cannot see how or why it should cause oscillations.
>
> This is also irrespective of whether my Jacobian gets called, so I think I can
> be pretty confident the problem is not in the Jacobian, but someplace else
> instead. (That said, the Jacobian may still of course have some other
> problem.)
>
> Cheers,
> Juha
>
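
For readers following the -snes_check_jacobian numbers quoted above, here is a
minimal standalone sketch of what such a check compares: a hand-coded Jacobian
against a finite-difference one. It is not the poster's code and does not use
PETSc. Everything in it is an assumption made for illustration: the residual
F_i = (u[i-1] - 2 u[i] + u[i+1]) / h^2 - sin(u[i]) for the static sine-Gordon
equation u'' = sin(u), Dirichlet ends taken from the kink 4*atan(exp(x)), the
Frobenius norm, the step eps, and all function names.

```python
import math


def residual(u, h, left, right):
    """Assumed discretization: second difference minus sin(u), Dirichlet ends."""
    n = len(u)
    f = []
    for i in range(n):
        um = u[i - 1] if i > 0 else left
        up = u[i + 1] if i < n - 1 else right
        f.append((um - 2.0 * u[i] + up) / (h * h) - math.sin(u[i]))
    return f


def jacobian_analytic(u, h):
    """Hand-coded tridiagonal Jacobian of the residual above (dense storage)."""
    n = len(u)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        J[i][i] = -2.0 / (h * h) - math.cos(u[i])
        if i > 0:
            J[i][i - 1] = 1.0 / (h * h)
        if i < n - 1:
            J[i][i + 1] = 1.0 / (h * h)
    return J


def jacobian_fd(u, h, left, right, eps=1e-7):
    """Forward-difference Jacobian, one column per perturbed unknown."""
    n = len(u)
    f0 = residual(u, h, left, right)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(u)
        up[j] += eps
        f1 = residual(up, h, left, right)
        for i in range(n):
            J[i][j] = (f1[i] - f0[i]) / eps
    return J


def frobenius(A):
    return math.sqrt(sum(a * a for row in A for a in row))


def jacobian_mismatch(n=50, h=0.1):
    """Return (||J - Jfd|| / ||J||, ||J - Jfd||) at a perturbed kink profile."""
    xs = [(i - (n - 1) / 2.0) * h for i in range(n)]

    def kink(x):
        return 4.0 * math.atan(math.exp(x))

    left, right = kink(xs[0] - h), kink(xs[-1] + h)
    # Evaluate away from the exact solution so the sin/cos terms are exercised.
    u = [kink(x) + 0.1 * math.sin(3.0 * x) for x in xs]
    J = jacobian_analytic(u, h)
    Jfd = jacobian_fd(u, h, left, right)
    diff = [[J[i][j] - Jfd[i][j] for j in range(n)] for i in range(n)]
    abs_err = frobenius(diff)
    return abs_err / frobenius(J), abs_err
```

Calling jacobian_mismatch() returns the relative and absolute differences in
the same order as the quoted log line. A small relative value only says the two
Jacobians agree with each other at that point; it says nothing about whether
Newton will converge from a given initial guess.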