[petsc-users] SNES Line search: fixing the non zero solution values of dirichlet nodes during line search

Mon Jun 22 19:48:58 CDT 2020

Thanks for the suggestions Barry and Matt. DMAddBoundary surely seems
interesting for further improvement. Currently we settled on setting up the
problem as DAE and using the time available from the function evaluation
routine of TSIFunction for Dirichlet_value evaluation. Also the residual
for nodes adjacent to Dirichlet nodes are now updated using the Dirichlet
values, instead of the solution value which is updated only after the first
SNES iteration, This change makes the line search method more robust for
us.

Thanks
Ashish

On Mon, Jun 22, 2020 at 10:19 AM Matthew Knepley <knepley at gmail.com> wrote:

> On Mon, Jun 22, 2020 at 1:13 PM Barry Smith <bsmith at petsc.dev> wrote:
>
>>
>>
>> On Jun 22, 2020, at 11:56 AM, Ashish Patel <ashish.patel at onscale.com>
>> wrote:
>>
>> Thanks Stefano, Barry and Hong for your feedback.
>> Stefano, the option that you suggested is helpful to freeze the dirichlet
>> values. However, it did not completely solve the convergence issue for
>> nonlinear problems using linesearch since the direction vector which is
>> calculated before any Solve uses the residual which is not yet updated for
>> the new Dirichlet values (u(t) =b(t)). The update happens only after the
>> solve at SNES iteration 0. To fix this I now update the residual for nodes
>> adjacent to Dirichlet nodes without depending on any Solve. The Solve stage
>> only updates the solution for dirichlet nodes. Calculating the residual
>> this way resolves the issue.
>>
>> Barry, Hong yes you are right, the dirichlet boundary conditions are time
>> dependent. We do solve the equations implicitly and use TSIFunction and
>> TSIJacobian to set up the problem. The dirichlet nodes are part of
>> computational nodes. The jacobian for dirchlet nodes has diagonal 1 and
>> rows & colums 0. For setting the residual in TSIFunction the residual for
>> dirichlet nodes are set to (solution_value-Dirichlet_value).
>>
>>
>>   Understood. The reason the line search cannot succeed well but
>> everything is fine with no line search is that you are putting into the
>> Dirichlet nodes (solution_value-Dirichlet_value) for the next time-step but
>> with multistage integrators and line searches the nonlinear function being
>> evaluated is not simply at the new time step. hence for all those other
>> function evaluations the solution you put in
>> (solution_value-Dirichlet_value) is meaningless. That is your "function" on
>> the Dirichlet nodes is not a function because it doesn't depend on the
>> current input.
>>
>>   As I indicated before there are two approaches to putting the
>> simulation on solid footing. Rewriting the boundary condition as an ODE if
>> possible so TS manages the handling of the boundary just like the interior
>> points or telling TS you are solving a DAE, this means when you provide the
>> IFunction and IJacobian you take into account that there is no U_t on the
>> left handside for the boundary nodes.
>>
>
> There is a third strategy, which is the one employed by DMPlex for all BC,
> including time dependent. It eliminates the dofs constrained by boundary
> conditions
> for the system. For a simple example, see TS ex45 or ex46. There,
> time-dependent boundary conditions are used.
>
>   Thanks,
>
>      Matt
>
>
>> Regards
>> Ashish
>>
>> On Sun, Jun 21, 2020 at 6:23 PM Zhang, Hong <hongzhang at anl.gov> wrote:
>>
>>> Can you elaborate a bit on how you "set the RHS as increment of
>>> (Dirichlet_value-solution_value)”?
>>>
>>> It seems that the boundary condition is imposed as algebraic equations
>>> (e.g. 0 = u(t)-b(t) ), and the boundary nodes are included in the
>>> computational domain. This approach should work theoretically and is used
>>> in some TS examples such as ex17.c, you might want to check if the
>>> algebraic equations are implemented properly in your IFunction() and
>>> IJacobian(). An alternative approach you can try is to remove the boundary
>>> nodes from the unknown variables, see the example ex25.c. In this way you
>>> solve ODEs instead of DAEs and do not need to worry about SNES modifying
>>> certain values.
>>>
>>> Hong (Mr.)
>>>
>>> On Jun 19, 2020, at 5:15 PM, Ashish Patel <ashish.patel at onscale.com>
>>> wrote:
>>>
>>> Dear PETSc users,
>>>
>>> We use PETSc as part of a finite element method program and we are
>>> trying to properly implement Dirichlet boundary conditions for non-linear,
>>> transient problems. We find that when we use a line search method it also
>>> changes the non-zero solution value of Dirichlet nodes as it steps through
>>> the line search iteration. I was wondering if  there is a way to freeze
>>> some index sets of a vector from changing during the line search operation?
>>>
>>> We are using the TS framework to setup the problem and use
>>> 'MatZeroRowsColumns' to set the diagonal of the jacobian to 1 for the
>>> dirichlet nodes and set the RHS as increment of
>>> (Dirichlet_value-solution_value). This works when the line search method is
>>> turned off by using '-snes_linesearch_type basic' however using the default
>>> 'bt' linesearch, the TS diverges with error shown below. In a separate
>>> implementation if we overwrote the dirichlet nodes of the solution vector
>>> in TS residual function with the Dirichlet values  then the 'bt' line
>>> search method converged to the right solution. However we would like to
>>> avoid modifying the internal PETSc vector in our implementation.
>>>
>>> 0 TS dt 1. time 0.
>>>     0 SNES Function norm 2.378549386020e+03
>>>         Line search: gnorm after quadratic fit 4.369235425165e+03
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.385369069060e+03 lambda=1.0000000000000002e-02
>>>         Line search: Cubically determined step, current gnorm
>>> 2.373925008934e+03 lambda=3.8846250444606093e-03
>>>     1 SNES Function norm 2.373925008934e+03
>>>         Line search: gnorm after quadratic fit 5.006914179995e+03
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.420957096780e+03 lambda=1.0000000000000002e-02
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.376034946750e+03 lambda=1.6129422079664700e-03
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.374313344729e+03 lambda=4.8465026740690043e-04
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373999473242e+03 lambda=1.5857251532828948e-04
>>>         Line search: Cubically determined step, current gnorm
>>> 2.373921668024e+03 lambda=5.8116507162753387e-05
>>>     2 SNES Function norm 2.373921668024e+03
>>>         Line search: gnorm after quadratic fit 4.771035112853e+03
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.410650718394e+03 lambda=1.0000000000000002e-02
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.375104094198e+03 lambda=1.8983783738011522e-03
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.374049151562e+03 lambda=7.0688528086485479e-04
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373935090907e+03 lambda=2.9132722794896019e-04
>>>         Line search: Cubically determined step, current gnorm
>>> 2.373921032081e+03 lambda=1.2527602265373028e-04
>>>     3 SNES Function norm 2.373921032081e+03
>>>         Line search: gnorm after quadratic fit 5.117914832660e+03
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.422635362094e+03 lambda=1.0000000000000002e-02
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.375923870970e+03 lambda=1.5769887508300913e-03
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.374287081592e+03 lambda=4.8018017100729705e-04
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373999131160e+03 lambda=1.5908966655977892e-04
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373930608274e+03 lambda=5.7603977147935371e-05
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373922022038e+03 lambda=2.3884787050507805e-05
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921396636e+03 lambda=1.0282405393471519e-05
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921173719e+03 lambda=4.3868704034012554e-06
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921091304e+03 lambda=1.8774991151727107e-06
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921057168e+03 lambda=8.0331628193813397e-07
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921042769e+03 lambda=3.4377811174560817e-07
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921036645e+03 lambda=1.4712421886880395e-07
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921034032e+03 lambda=6.2965363212312132e-08
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032916e+03 lambda=2.6947718250469261e-08
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032438e+03 lambda=1.1533043989179318e-08
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032233e+03 lambda=4.9359018284334258e-09
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032146e+03 lambda=2.1124641857409436e-09
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032109e+03 lambda=9.0409137304143975e-10
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032093e+03 lambda=3.8693264359674819e-10
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032086e+03 lambda=1.6559911204766632e-10
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032083e+03 lambda=7.0873187020534353e-11
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032081e+03 lambda=3.0332192901757729e-11
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032081e+03 lambda=1.2981600435512875e-11
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032081e+03 lambda=5.5559212521628090e-12
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032081e+03 lambda=2.3777380405336958e-12
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032081e+03 lambda=1.0176091649134191e-12
>>>         Line search: Cubic step no good, shrinking lambda, current gnorm
>>> 2.373921032081e+03 lambda=4.3555419789721080e-13
>>>         Line search: unable to find good step length! After 27 tries
>>>         Line search: fnorm=2.3739210320805191e+03,
>>> gnorm=2.3739210320805323e+03, ynorm=8.5698020038772756e+03,
>>> minlambda=9.9999999999999998e-13, lambda=4.3555419789721080e-13, initial
>>> slope=-5.6355010665542409e+06
>>> SNES Object: 1 MPI processes
>>>   type: newtonls
>>>   maximum iterations=50, maximum function evaluations=10000
>>>   tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>>>   total number of linear solver iterations=4
>>>   total number of function evaluations=48
>>>   norm schedule ALWAYS
>>>   SNESLineSearch Object: 1 MPI processes
>>>     type: bt
>>>       interpolation: cubic
>>>       alpha=1.000000e-04
>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>> lambda=1.000000e-08
>>>     maximum iterations=40
>>>   KSP Object: 1 MPI processes
>>>     type: preonly
>>>     maximum iterations=10000, initial guess is zero
>>>     tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>>     left preconditioning
>>>     using NONE norm type for convergence test
>>>   PC Object: 1 MPI processes
>>>     type: cholesky
>>>       out-of-place factorization
>>>       tolerance for zero pivot 2.22045e-14
>>>       matrix ordering: natural
>>>       factor fill ratio given 0., needed 0.
>>>         Factored matrix follows:
>>>           Mat Object: 1 MPI processes
>>>             type: mumps
>>>             rows=38154, cols=38154
>>>             package used to perform factorization: mumps
>>>             total: nonzeros=7080060, allocated nonzeros=7080060
>>>             total number of mallocs used during MatSetValues calls=0
>>>               MUMPS run parameters:
>>>                 SYM (matrix type):                   2
>>>                 PAR (host participation):            1
>>>                 ICNTL(1) (output for error):         6
>>>                 ICNTL(2) (output of diagnostic msg): 0
>>>                 ICNTL(3) (output for global info):   0
>>>                 ICNTL(4) (level of printing):        0
>>>                 ICNTL(5) (input mat struct):         0
>>>                 ICNTL(6) (matrix prescaling):        7
>>>                 ICNTL(7) (sequential matrix ordering):7
>>>                 ICNTL(8) (scaling strategy):        77
>>>                 ICNTL(10) (max num of refinements):  0
>>>                 ICNTL(11) (error analysis):          0
>>>                 ICNTL(12) (efficiency control):
>>> 0
>>>                 ICNTL(13) (efficiency control):
>>> 0
>>>                 ICNTL(14) (percentage of estimated workspace increase):
>>> 20
>>>                 ICNTL(18) (input mat struct):
>>> 0
>>>                 ICNTL(19) (Schur complement info):
>>>  0
>>>                 ICNTL(20) (rhs sparse pattern):
>>> 0
>>>                 ICNTL(21) (solution struct):
>>>  0
>>>                 ICNTL(22) (in-core/out-of-core facility):
>>> 0
>>>                 ICNTL(23) (max size of memory can be allocated
>>> locally):0
>>>                 ICNTL(24) (detection of null pivot rows):
>>> 0
>>>                 ICNTL(25) (computation of a null space basis):
>>>  0
>>>                 ICNTL(26) (Schur options for rhs or solution):
>>>  0
>>>                 ICNTL(27) (experimental parameter):
>>> -32
>>>                 ICNTL(28) (use parallel or sequential ordering):
>>>  1
>>>                 ICNTL(29) (parallel ordering):
>>>  0
>>>                 ICNTL(30) (user-specified set of entries in inv(A)):
>>>  0
>>>                 ICNTL(31) (factors is discarded in the solve phase):
>>>  0
>>>                 ICNTL(33) (compute determinant):
>>>  0
>>>                 ICNTL(35) (activate BLR based factorization):
>>> 0
>>>                 ICNTL(36) (choice of BLR factorization variant):
>>>  0
>>>                 ICNTL(38) (estimated compression rate of LU factors):
>>> 333
>>>                 CNTL(1) (relative pivoting threshold):      0.01
>>>                 CNTL(2) (stopping criterion of refinement): 1.49012e-08
>>>                 CNTL(3) (absolute pivoting threshold):      0.
>>>                 CNTL(4) (value of static pivoting):         -1.
>>>                 CNTL(5) (fixation for null pivots):         0.
>>>                 CNTL(7) (dropping parameter for BLR):       0.
>>>                 RINFO(1) (local estimated flops for the elimination
>>> after analysis):
>>>                   [0] 2.73979e+09
>>>                 RINFO(2) (local estimated flops for the assembly after
>>> factorization):
>>>                   [0]  1.08826e+07
>>>                 RINFO(3) (local estimated flops for the elimination
>>> after factorization):
>>>                   [0]  2.73979e+09
>>>                 INFO(15) (estimated size of (in MB) MUMPS internal data
>>> for running numerical factorization):
>>>                 [0] 94
>>>                 INFO(16) (size of (in MB) MUMPS internal data used
>>> during numerical factorization):
>>>                   [0] 94
>>>                 INFO(23) (num of pivots eliminated on this processor
>>> after factorization):
>>>                   [0] 38154
>>>                 RINFOG(1) (global estimated flops for the elimination
>>> after analysis): 2.73979e+09
>>>                 RINFOG(2) (global estimated flops for the assembly after
>>> factorization): 1.08826e+07
>>>                 RINFOG(3) (global estimated flops for the elimination
>>> after factorization): 2.73979e+09
>>>                 (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant):
>>> (0.,0.)*(2^0)
>>>                 INFOG(3) (estimated real workspace for factors on all
>>> processors after analysis): 8377336
>>>                 INFOG(4) (estimated integer workspace for factors on all
>>> processors after analysis): 447902
>>>                 INFOG(5) (estimated maximum front size in the complete
>>> tree): 990
>>>                 INFOG(6) (number of nodes in the complete tree): 2730
>>>                 INFOG(7) (ordering option effectively use after
>>> analysis): 5
>>>                 INFOG(8) (structural symmetry in percent of the permuted
>>> matrix after analysis): 100
>>>                 INFOG(9) (total real/complex workspace to store the
>>> matrix factors after factorization): 8377336
>>>                 INFOG(10) (total integer space store the matrix factors
>>> after factorization): 447902
>>>                 INFOG(11) (order of largest frontal matrix after
>>> factorization): 990
>>>                 INFOG(12) (number of off-diagonal pivots): 10
>>>                 INFOG(13) (number of delayed pivots after
>>> factorization): 0
>>>                 INFOG(14) (number of memory compress after
>>> factorization): 0
>>>                 INFOG(15) (number of steps of iterative refinement after
>>> solution): 0
>>>                 INFOG(16) (estimated size (in MB) of all MUMPS internal
>>> data for factorization after analysis: value on the most memory consuming
>>> processor): 94
>>>                 INFOG(17) (estimated size of all MUMPS internal data for
>>> factorization after analysis: sum over all processors): 94
>>>                 INFOG(18) (size of all MUMPS internal data allocated
>>> during factorization: value on the most memory consuming processor): 94
>>>                 INFOG(19) (size of all MUMPS internal data allocated
>>> during factorization: sum over all processors): 94
>>>                 INFOG(20) (estimated number of entries in the factors):
>>> 7080060
>>>                 INFOG(21) (size in MB of memory effectively used during
>>> factorization - value on the most memory consuming processor): 80
>>>                 INFOG(22) (size in MB of memory effectively used during
>>> factorization - sum over all processors): 80
>>>                 INFOG(23) (after analysis: value of ICNTL(6) effectively
>>> used): 0
>>>                 INFOG(24) (after analysis: value of ICNTL(12)
>>> effectively used): 1
>>>                 INFOG(25) (after factorization: number of pivots
>>> modified by static pivoting): 0
>>>                 INFOG(28) (after factorization: number of null pivots
>>> encountered): 0
>>>                 INFOG(29) (after factorization: effective number of
>>> entries in the factors (sum over all processors)): 7080060
>>>                 INFOG(30, 31) (after solution: size in Mbytes of memory
>>> used during solution phase): 92, 92
>>>                 INFOG(32) (after analysis: type of analysis done): 1
>>>                 INFOG(33) (value used for ICNTL(8)): 7
>>>                 INFOG(34) (exponent of the determinant if determinant is
>>> requested): 0
>>>                 INFOG(35) (after factorization: number of entries taking
>>> into account BLR factor compression - sum over all processors): 7080060
>>>                 INFOG(36) (after analysis: estimated size of all MUMPS
>>> internal data for running BLR in-core - value on the most memory consuming
>>> processor): 0
>>>                 INFOG(37) (after analysis: estimated size of all MUMPS
>>> internal data for running BLR in-core - sum over all processors): 0
>>>                 INFOG(38) (after analysis: estimated size of all MUMPS
>>> internal data for running BLR out-of-core - value on the most memory
>>> consuming processor): 0
>>>                 INFOG(39) (after analysis: estimated size of all MUMPS
>>> internal data for running BLR out-of-core - sum over all processors): 0
>>>     linear system matrix = precond matrix:
>>>     Mat Object: 1 MPI processes
>>>       type: seqaij
>>>       rows=38154, cols=38154
>>>       total: nonzeros=973446, allocated nonzeros=973446
>>>       total number of mallocs used during MatSetValues calls=0
>>>         not using I-node routines
>>> [0]PETSC ERROR: --------------------- Error Message
>>> --------------------------------------------------------------
>>> [0]PETSC ERROR:
>>> [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE,
>>> increase -ts_max_snes_failures or make negative to attempt recovery
>>> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
>>> for trouble shooting.
>>> [0]PETSC ERROR: Petsc Release Version 3.12.3, Jan, 03, 2020
>>>
>>> Thanks,
>>> Ashish
>>> Scientific Computing Division
>>> OnScale
>>> CA, USA
>>>
>>>
>>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20200622/fc47a4bf/attachment-0001.html>