[petsc-users] Debugging failed solve (what's an acceptable upper bound to the condition number?)

Mon Nov 23 10:46:18 CST 2015

On 11/20/2015 08:36 PM, Barry Smith wrote:
>    Always make sure that when you reply it goes to everyone on the mailing list; otherwise you're stuck with only stupid old me trying to understand what is going on.
Oops, I'm used to just hitting ctrl-r for the Moose list.
>
>     Can you run with everything else the same but use equal permittivities? Do all the huge condition numbers and no convergence of the nonlinear solve go away?
With equal permittivities, I still see the opposite electric field signs 
with Newton. When I run with Jacobian-free, I still have non-convergence 
issues and a large condition number, so I definitely still have more 
work to do.
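
As a rough sanity check on the condition number in the console output quoted below, here is a minimal Python sketch. It recomputes the 2-norm condition number from the extreme singular values printed in the "0 Nonlinear" block, then applies the common rule of thumb (a heuristic, not a hard bound) that a linear solve can lose about log10(cond) significant digits. The variable names are illustrative only.

```python
import math

# Extreme singular values copied from the "0 Nonlinear" SVD lines below.
sigma_min = 5.637144317564e-09
sigma_max = 8.213813311188e+03

# 2-norm condition number = largest / smallest singular value.
cond = sigma_max / sigma_min

# Rule of thumb (assumption, not a guarantee): about log10(cond) digits
# of accuracy can be lost when solving with this matrix.
digits_lost = math.log10(cond)

# Double precision carries roughly 15.65 decimal digits (2**-52).
digits_double = -math.log10(2.0 ** -52)
digits_left = digits_double - digits_lost

print(f"condition number ~ {cond:.6e}")
print(f"digits lost ~ {digits_lost:.1f}, digits left ~ {digits_left:.1f}")
```

By this estimate a condition number near 1e12 leaves only a few trustworthy digits in double precision, so whether it is "acceptable" depends on how much accuracy the Newton update needs.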
>
>
>     Barry
>
>
>> On Nov 20, 2015, at 7:09 PM, Alex Lindsay <adlinds3 at ncsu.edu> wrote:
>>
>> I think I may be homing in on what's causing my problems. I have an interface where I am coupling two different subdomains. Among the physics at the interface is a jump discontinuity in the gradient of the electrical potential (i.e., a jump discontinuity in the electric field), governed by the ratio of the permittivities on either side of the interface. This is implemented in my code like this:
>>
>> Real
>> DGMatDiffusionInt::computeQpResidual(Moose::DGResidualType type)
>> {
>>   if (_D_neighbor[_qp] < std::numeric_limits<double>::epsilon())
>>     mooseError("It doesn't appear that DG material properties got passed.");
>>
>>   Real r = 0;
>>
>>   switch (type)
>>   {
>>   case Moose::Element:
>>     r += 0.5 * (-_D[_qp] * _grad_u[_qp] * _normals[_qp] + -_D_neighbor[_qp] * _grad_neighbor_value[_qp] * _normals[_qp]) * _test[_i][_qp];
>>     break;
>>
>>   case Moose::Neighbor:
>>     r += 0.5 * (_D[_qp] * _grad_u[_qp] * _normals[_qp] + _D_neighbor[_qp] * _grad_neighbor_value[_qp] * _normals[_qp]) * _test_neighbor[_i][_qp];
>>     break;
>>   }
>>
>>   return r;
>> }
>>
>> where _D and _D_neighbor are the permittivities on either side of the interface. Attached are pictures showing the solution using Newton, and the solution using a Jacobian-free method. Newton's method yields electric fields with opposite signs on either side of the interface, which is physically impossible. The Jacobian-free solution yields electric fields with the same sign, and with the proper ratio (a ratio of 5, equivalent to the ratio of the permittivities). I'm sure if I had the proper numerical analysis background, I might know why Newton's method has a hard time here, but I don't. Could someone explain why?
>>
>> Alex
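
For reference, the "same sign, ratio of 5" expectation above follows from the standard interface condition on the normal component of the displacement field. This is a sketch that assumes no free surface charge on the interface. The symbols \varepsilon_1, \varepsilon_2 (standing for _D and _D_neighbor), \phi for the potential, and E_{i,n} = -\nabla\phi_i \cdot \mathbf{n} are labels introduced here, not names from the code:

```latex
\varepsilon_1 \,\nabla\phi_1 \cdot \mathbf{n}
  = \varepsilon_2 \,\nabla\phi_2 \cdot \mathbf{n}
\quad\Longrightarrow\quad
\frac{E_{1,n}}{E_{2,n}} = \frac{\varepsilon_2}{\varepsilon_1} > 0 .
```

Because both permittivities are positive, the normal fields on the two sides must share a sign, and their magnitudes differ by the permittivity ratio.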
>>
>> On 11/20/2015 03:24 PM, Barry Smith wrote:
>>>     Do you really only have 851 variables?
>>>
>>>   SVD: condition number 1.457087640207e+12, 0 of 851 singular values are (nearly) zero
>>>
>>> if so you can use -snes_fd  and -ksp_view_pmat  binary:filename to save the small matrix and then load it up into
>>> MATLAB or a similar tool to fully analyze its eigenstructure and see the distribution from the tiny values to the large values; is it just a small number of tiny ones, etc.?
>>>
>>>    Note that with such a large condition number the fact that the linear system "converges" quickly may be meaningless, since a small residual doesn't always mean a small error. The error could still be huge.
>>>
>>>
>>>    Barry
>>>
>>>
>>>
>>>> On Nov 20, 2015, at 12:40 PM, Alex Lindsay <adlinds3 at ncsu.edu> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I have an application built on top of the Moose framework, and I'm trying to debug a solve that is not converging. My linear solve converges very nicely. However, my non-linear solve does not, and the problem appears to be in the line search. Reading the PETSc FAQ, I see that the most common cause of poor line searches is a bad Jacobian. However, I'm using a finite-differenced Jacobian; if I run -snes_type=test, I get "norm of matrix ratios" < 1e-15. Thus in this case the Jacobian should be accurate. I'm wondering then if my problem might be one of these (taken from the FAQ page):
>>>>
>>>> 	• The matrix is very ill-conditioned. Check the condition number.
>>>> 		• Try to improve it by choosing the relative scaling of components/boundary conditions.
>>>> 		• Try -ksp_diagonal_scale -ksp_diagonal_scale_fix.
>>>> 		• Perhaps change the formulation of the problem to produce more friendly algebraic equations.
>>>> 	• The matrix is nonlinear (e.g. evaluated using finite differencing of a nonlinear function). Try different differencing parameters, ./configure --with-precision=__float128 --download-f2cblaslapack, check if it converges in "easier" parameter regimes.
>>>> I'm almost ashamed to share my condition number because I'm sure it must be absurdly high. Without applying -ksp_diagonal_scale and -ksp_diagonal_scale_fix, the condition number is around 1e25. When I do apply those two parameters, the condition number is reduced to 1e17. Even after scaling all my variable residuals so that they were all on the order of unity (a suggestion on the Moose list), I still have a condition number of 1e12. I have no experience with condition numbers, but knowing that a perfect condition number is unity, 1e12 seems unacceptable. What's an acceptable upper limit on the condition number? Is it problem dependent? Having already tried scaling the individual variable residuals, I'm not exactly sure what my next step would be for trying to reduce the condition number.
>>>>
>>>> I definitely have a nonlinear problem. Could I be having problems because I'm finite differencing non-linear residuals to form my Jacobian? I can see about using a different differencing parameter. I'm also going to consider trying quad precision. However, my hypothesis is that my condition number is the fundamental problem. Is that a reasonable hypothesis?
>>>>
>>>> If it's useful, below is console output with -pc_type=svd
>>>>
>>>> Time Step  1, time = 1e-10
>>>>                  dt = 1e-10
>>>>      |residual|_2 of individual variables:
>>>>                 potential:    8.12402e+07
>>>>                 potentialliq: 0.000819748
>>>>                 em:           49.206
>>>>                 emliq:        3.08187e-11
>>>>                 Arp:          2375.94
>>>>
>>>>   0 Nonlinear |R| = 8.124020e+07
>>>>        SVD: condition number 1.457087640207e+12, 0 of 851 singular values are (nearly) zero
>>>>        SVD: smallest singular values: 5.637144317564e-09 9.345415388433e-08 4.106132915572e-05 1.017339655185e-04 1.147649477723e-04
>>>>        SVD: largest singular values : 1.498505466947e+03 1.577560767570e+03 1.719172328193e+03 2.344218235296e+03 8.213813311188e+03
>>>>      0 KSP unpreconditioned resid norm 3.185019606208e+05 true resid norm 3.185019606208e+05 ||r(i)||/||b|| 1.000000000000e+00
>>>>      1 KSP unpreconditioned resid norm 6.382886902896e-07 true resid norm 6.382761808414e-07 ||r(i)||/||b|| 2.003994511046e-12
>>>>    Linear solve converged due to CONVERGED_RTOL iterations 1
>>>>        Line search: Using full step: fnorm 8.124020470169e+07 gnorm 1.097605946684e+01
>>>>      |residual|_2 of individual variables:
>>>>                 potential:    8.60047
>>>>                 potentialliq: 0.335436
>>>>                 em:           2.26472
>>>>                 emliq:        0.642578
>>>>                 Arp:          6.39151
>>>>
>>>>   1 Nonlinear |R| = 1.097606e+01
>>>>        SVD: condition number 1.457473763066e+12, 0 of 851 singular values are (nearly) zero
>>>>        SVD: smallest singular values: 5.637185516434e-09 9.347128557672e-08 1.017339655587e-04 1.146760266781e-04 4.064422034774e-04
>>>>        SVD: largest singular values : 1.498505466944e+03 1.577544976882e+03 1.718956369043e+03 2.343692402876e+03 8.216049987736e+03
>>>>      0 KSP unpreconditioned resid norm 2.653715381459e+01 true resid norm 2.653715381459e+01 ||r(i)||/||b|| 1.000000000000e+00
>>>>      1 KSP unpreconditioned resid norm 6.031179341420e-05 true resid norm 6.031183387732e-05 ||r(i)||/||b|| 2.272731819648e-06
>>>>    Linear solve converged due to CONVERGED_RTOL iterations 1
>>>>        Line search: gnorm after quadratic fit 2.485190757827e+11
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 2.632996340352e+10 lambda=5.0000000000000003e-02
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 4.290675557416e+09 lambda=2.5000000000000001e-02
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 4.332980055153e+08 lambda=1.2500000000000001e-02
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.677118626669e+07 lambda=6.2500000000000003e-03
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.024469780306e+05 lambda=3.1250000000000002e-03
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 7.011543252988e+03 lambda=1.5625000000000001e-03
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.750171277470e+03 lambda=7.8125000000000004e-04
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 3.486970625406e+02 lambda=3.4794637057251714e-04
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 7.830624839582e+01 lambda=1.5977866967992950e-04
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 2.147529381328e+01 lambda=6.8049915671999093e-05
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.138950943123e+01 lambda=1.7575203052774536e-05
>>>>        Line search: Cubically determined step, current gnorm 1.095195976135e+01 lambda=1.7575203052774537e-06
>>>>      |residual|_2 of individual variables:
>>>>                 potential:    8.59984
>>>>                 potentialliq: 0.395753
>>>>                 em:           2.26492
>>>>                 emliq:        0.642578
>>>>                 Arp:          6.34735
>>>>
>>>>   2 Nonlinear |R| = 1.095196e+01
>>>>        SVD: condition number 1.457459214030e+12, 0 of 851 singular values are (nearly) zero
>>>>        SVD: smallest singular values: 5.637295371943e-09 9.347057884198e-08 1.017339655949e-04 1.146738253493e-04 4.064421554132e-04
>>>>        SVD: largest singular values : 1.498505466946e+03 1.577543742603e+03 1.718948052797e+03 2.343672206864e+03 8.216128082047e+03
>>>>      0 KSP unpreconditioned resid norm 2.653244141805e+01 true resid norm 2.653244141805e+01 ||r(i)||/||b|| 1.000000000000e+00
>>>>      1 KSP unpreconditioned resid norm 4.480869560737e-05 true resid norm 4.480686665183e-05 ||r(i)||/||b|| 1.688757771886e-06
>>>>    Linear solve converged due to CONVERGED_RTOL iterations 1
>>>>        Line search: gnorm after quadratic fit 2.481752147885e+11
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 2.631959989642e+10 lambda=5.0000000000000003e-02
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 4.289110800463e+09 lambda=2.5000000000000001e-02
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 4.332043942482e+08 lambda=1.2500000000000001e-02
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.677933337886e+07 lambda=6.2500000000000003e-03
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.027980597206e+05 lambda=3.1250000000000002e-03
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 7.054113639063e+03 lambda=1.5625000000000001e-03
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.771258630210e+03 lambda=7.8125000000000004e-04
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 3.517070127496e+02 lambda=3.4519087020105563e-04
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 7.844350966118e+01 lambda=1.5664532891249369e-04
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 2.114833995101e+01 lambda=6.5367917100814859e-05
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.144636844292e+01 lambda=1.6044984646715980e-05
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.095640770627e+01 lambda=1.6044984646715980e-06
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.095196729511e+01 lambda=1.6044984646715980e-07
>>>>        Line search: Cubically determined step, current gnorm 1.095195451041e+01 lambda=2.3994454223607641e-08
>>>>      |residual|_2 of individual variables:
>>>>                 potential:    8.59983
>>>>                 potentialliq: 0.396107
>>>>                 em:           2.26492
>>>>                 emliq:        0.642578
>>>>                 Arp:          6.34733
>>>>
>>>>   3 Nonlinear |R| = 1.095195e+01
>>>>        SVD: condition number 1.457474387942e+12, 0 of 851 singular values are (nearly) zero
>>>>        SVD: smallest singular values: 5.637237413167e-09 9.347057670885e-08 1.017339654798e-04 1.146737961973e-04 4.064420550524e-04
>>>>        SVD: largest singular values : 1.498505466946e+03 1.577543716995e+03 1.718947893048e+03 2.343671853830e+03 8.216129148438e+03
>>>>      0 KSP unpreconditioned resid norm 2.653237816527e+01 true resid norm 2.653237816527e+01 ||r(i)||/||b|| 1.000000000000e+00
>>>>      1 KSP unpreconditioned resid norm 8.525213442515e-05 true resid norm 8.527696332776e-05 ||r(i)||/||b|| 3.214071607022e-06
>>>>    Linear solve converged due to CONVERGED_RTOL iterations 1
>>>>        Line search: gnorm after quadratic fit 2.481576195523e+11
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 2.632005412624e+10 lambda=5.0000000000000003e-02
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 4.289212002697e+09 lambda=2.5000000000000001e-02
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 4.332196637845e+08 lambda=1.2500000000000001e-02
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.678040222943e+07 lambda=6.2500000000000003e-03
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.027868984884e+05 lambda=3.1250000000000002e-03
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 7.010733464460e+03 lambda=1.5625000000000001e-03
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.751519860441e+03 lambda=7.8125000000000004e-04
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 3.497889916171e+02 lambda=3.4753778542938795e-04
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 7.932631084466e+01 lambda=1.5879606741873878e-04
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 2.194608479634e+01 lambda=6.5716583192912669e-05
>>>>        Line search: Cubic step no good, shrinking lambda, current gnorm 1.117190149691e+01 lambda=1.1541218569257328e-05
>>>>        Line search: Cubically determined step, current gnorm 1.093879875464e+01 lambda=1.1541218569257329e-06
>>>>      |residual|_2 of individual variables:
>>>>                 potential:    8.59942
>>>>                 potentialliq: 0.403326
>>>>                 em:           2.26505
>>>>                 emliq:        0.714844
>>>>                 Arp:          6.3169
>>>>
>>>>   4 Nonlinear |R| = 1.093880e+01
>>>>
>> <EFields_NEWTON.png><EFields_PJFNK.png>
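
Barry's remark above, that a small residual doesn't always mean a small error, can be illustrated with a toy problem. The following is a minimal pure-Python sketch using a made-up 2x2 matrix (not the Jacobian from this thread) whose condition number is roughly 4e10. The "bad" solution is chosen by hand to make the effect obvious.

```python
import math

# Hypothetical, nearly singular 2x2 matrix; condition number is about 4/eps.
eps = 1e-10
A = [[1.0, 1.0],
     [1.0, 1.0 + eps]]

def matvec(M, v):
    """Dense matrix-vector product for small lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def norm2(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))

x_true = [1.0, 1.0]
b = matvec(A, x_true)      # right-hand side for the known solution
x_bad = [2.0, 0.0]         # wrong by 100% in every component

residual = [bi - yi for bi, yi in zip(b, matvec(A, x_bad))]
error = [xt - xb for xt, xb in zip(x_true, x_bad)]

rel_residual = norm2(residual) / norm2(b)
rel_error = norm2(error) / norm2(x_true)

print(f"relative residual = {rel_residual:.2e}")
print(f"relative error    = {rel_error:.2e}")
```

A linear solver that only monitors the relative residual would report this wrong answer as converged, which is why the one-iteration KSP convergence in the log does not by itself show that the Newton step is accurate.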