[petsc-users] SNES Line search: fixing the non zero solution values of dirichlet nodes during line search

Zhang, Hong hongzhang at anl.gov
Sun Jun 21 20:23:04 CDT 2020


Can you elaborate a bit on how you "set the RHS as increment of (Dirichlet_value-solution_value)”?

It seems that the boundary condition is imposed as algebraic equations (e.g. 0 = u(t)-b(t) ), and the boundary nodes are included in the computational domain. This approach should work theoretically and is used in some TS examples such as ex17.c, you might want to check if the algebraic equations are implemented properly in your IFunction() and IJacobian(). An alternative approach you can try is to remove the boundary nodes from the unknown variables, see the example ex25.c. In this way you solve ODEs instead of DAEs and do not need to worry about SNES modifying certain values.

Hong (Mr.)

On Jun 19, 2020, at 5:15 PM, Ashish Patel <ashish.patel at onscale.com<mailto:ashish.patel at onscale.com>> wrote:

Dear PETSc users,

We use PETSc as part of a finite element method program and we are trying to properly implement Dirichlet boundary conditions for non-linear, transient problems. We find that when we use a line search method it also changes the non-zero solution value of Dirichlet nodes as it steps through the line search iteration. I was wondering if  there is a way to freeze some index sets of a vector from changing during the line search operation?

We are using the TS framework to setup the problem and use 'MatZeroRowsColumns' to set the diagonal of the jacobian to 1 for the dirichlet nodes and set the RHS as increment of (Dirichlet_value-solution_value). This works when the line search method is turned off by using '-snes_linesearch_type basic' however using the default 'bt' linesearch, the TS diverges with error shown below. In a separate implementation if we overwrote the dirichlet nodes of the solution vector in TS residual function with the Dirichlet values  then the 'bt' line search method converged to the right solution. However we would like to avoid modifying the internal PETSc vector in our implementation.

0 TS dt 1. time 0.
    0 SNES Function norm 2.378549386020e+03
        Line search: gnorm after quadratic fit 4.369235425165e+03
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.385369069060e+03 lambda=1.0000000000000002e-02
        Line search: Cubically determined step, current gnorm 2.373925008934e+03 lambda=3.8846250444606093e-03
    1 SNES Function norm 2.373925008934e+03
        Line search: gnorm after quadratic fit 5.006914179995e+03
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.420957096780e+03 lambda=1.0000000000000002e-02
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.376034946750e+03 lambda=1.6129422079664700e-03
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.374313344729e+03 lambda=4.8465026740690043e-04
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373999473242e+03 lambda=1.5857251532828948e-04
        Line search: Cubically determined step, current gnorm 2.373921668024e+03 lambda=5.8116507162753387e-05
    2 SNES Function norm 2.373921668024e+03
        Line search: gnorm after quadratic fit 4.771035112853e+03
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.410650718394e+03 lambda=1.0000000000000002e-02
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.375104094198e+03 lambda=1.8983783738011522e-03
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.374049151562e+03 lambda=7.0688528086485479e-04
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373935090907e+03 lambda=2.9132722794896019e-04
        Line search: Cubically determined step, current gnorm 2.373921032081e+03 lambda=1.2527602265373028e-04
    3 SNES Function norm 2.373921032081e+03
        Line search: gnorm after quadratic fit 5.117914832660e+03
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.422635362094e+03 lambda=1.0000000000000002e-02
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.375923870970e+03 lambda=1.5769887508300913e-03
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.374287081592e+03 lambda=4.8018017100729705e-04
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373999131160e+03 lambda=1.5908966655977892e-04
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373930608274e+03 lambda=5.7603977147935371e-05
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373922022038e+03 lambda=2.3884787050507805e-05
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921396636e+03 lambda=1.0282405393471519e-05
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921173719e+03 lambda=4.3868704034012554e-06
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921091304e+03 lambda=1.8774991151727107e-06
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921057168e+03 lambda=8.0331628193813397e-07
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921042769e+03 lambda=3.4377811174560817e-07
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921036645e+03 lambda=1.4712421886880395e-07
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921034032e+03 lambda=6.2965363212312132e-08
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032916e+03 lambda=2.6947718250469261e-08
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032438e+03 lambda=1.1533043989179318e-08
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032233e+03 lambda=4.9359018284334258e-09
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032146e+03 lambda=2.1124641857409436e-09
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032109e+03 lambda=9.0409137304143975e-10
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032093e+03 lambda=3.8693264359674819e-10
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032086e+03 lambda=1.6559911204766632e-10
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032083e+03 lambda=7.0873187020534353e-11
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032081e+03 lambda=3.0332192901757729e-11
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032081e+03 lambda=1.2981600435512875e-11
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032081e+03 lambda=5.5559212521628090e-12
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032081e+03 lambda=2.3777380405336958e-12
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032081e+03 lambda=1.0176091649134191e-12
        Line search: Cubic step no good, shrinking lambda, current gnorm 2.373921032081e+03 lambda=4.3555419789721080e-13
        Line search: unable to find good step length! After 27 tries
        Line search: fnorm=2.3739210320805191e+03, gnorm=2.3739210320805323e+03, ynorm=8.5698020038772756e+03, minlambda=9.9999999999999998e-13, lambda=4.3555419789721080e-13, initial slope=-5.6355010665542409e+06
SNES Object: 1 MPI processes
  type: newtonls
  maximum iterations=50, maximum function evaluations=10000
  tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
  total number of linear solver iterations=4
  total number of function evaluations=48
  norm schedule ALWAYS
  SNESLineSearch Object: 1 MPI processes
    type: bt
      interpolation: cubic
      alpha=1.000000e-04
    maxstep=1.000000e+08, minlambda=1.000000e-12
    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
    maximum iterations=40
  KSP Object: 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: 1 MPI processes
    type: cholesky
      out-of-place factorization
      tolerance for zero pivot 2.22045e-14
      matrix ordering: natural
      factor fill ratio given 0., needed 0.
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: mumps
            rows=38154, cols=38154
            package used to perform factorization: mumps
            total: nonzeros=7080060, allocated nonzeros=7080060
            total number of mallocs used during MatSetValues calls=0
              MUMPS run parameters:
                SYM (matrix type):                   2
                PAR (host participation):            1
                ICNTL(1) (output for error):         6
                ICNTL(2) (output of diagnostic msg): 0
                ICNTL(3) (output for global info):   0
                ICNTL(4) (level of printing):        0
                ICNTL(5) (input mat struct):         0
                ICNTL(6) (matrix prescaling):        7
                ICNTL(7) (sequential matrix ordering):7
                ICNTL(8) (scaling strategy):        77
                ICNTL(10) (max num of refinements):  0
                ICNTL(11) (error analysis):          0
                ICNTL(12) (efficiency control):                         0
                ICNTL(13) (efficiency control):                         0
                ICNTL(14) (percentage of estimated workspace increase): 20
                ICNTL(18) (input mat struct):                           0
                ICNTL(19) (Schur complement info):                      0
                ICNTL(20) (rhs sparse pattern):                         0
                ICNTL(21) (solution struct):                            0
                ICNTL(22) (in-core/out-of-core facility):               0
                ICNTL(23) (max size of memory can be allocated locally):0
                ICNTL(24) (detection of null pivot rows):               0
                ICNTL(25) (computation of a null space basis):          0
                ICNTL(26) (Schur options for rhs or solution):          0
                ICNTL(27) (experimental parameter):                     -32
                ICNTL(28) (use parallel or sequential ordering):        1
                ICNTL(29) (parallel ordering):                          0
                ICNTL(30) (user-specified set of entries in inv(A)):    0
                ICNTL(31) (factors is discarded in the solve phase):    0
                ICNTL(33) (compute determinant):                        0
                ICNTL(35) (activate BLR based factorization):           0
                ICNTL(36) (choice of BLR factorization variant):        0
                ICNTL(38) (estimated compression rate of LU factors):   333
                CNTL(1) (relative pivoting threshold):      0.01
                CNTL(2) (stopping criterion of refinement): 1.49012e-08
                CNTL(3) (absolute pivoting threshold):      0.
                CNTL(4) (value of static pivoting):         -1.
                CNTL(5) (fixation for null pivots):         0.
                CNTL(7) (dropping parameter for BLR):       0.
                RINFO(1) (local estimated flops for the elimination after analysis):
                  [0] 2.73979e+09
                RINFO(2) (local estimated flops for the assembly after factorization):
                  [0]  1.08826e+07
                RINFO(3) (local estimated flops for the elimination after factorization):
                  [0]  2.73979e+09
                INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization):
                [0] 94
                INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization):
                  [0] 94
                INFO(23) (num of pivots eliminated on this processor after factorization):
                  [0] 38154
                RINFOG(1) (global estimated flops for the elimination after analysis): 2.73979e+09
                RINFOG(2) (global estimated flops for the assembly after factorization): 1.08826e+07
                RINFOG(3) (global estimated flops for the elimination after factorization): 2.73979e+09
                (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)
                INFOG(3) (estimated real workspace for factors on all processors after analysis): 8377336
                INFOG(4) (estimated integer workspace for factors on all processors after analysis): 447902
                INFOG(5) (estimated maximum front size in the complete tree): 990
                INFOG(6) (number of nodes in the complete tree): 2730
                INFOG(7) (ordering option effectively use after analysis): 5
                INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
                INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 8377336
                INFOG(10) (total integer space store the matrix factors after factorization): 447902
                INFOG(11) (order of largest frontal matrix after factorization): 990
                INFOG(12) (number of off-diagonal pivots): 10
                INFOG(13) (number of delayed pivots after factorization): 0
                INFOG(14) (number of memory compress after factorization): 0
                INFOG(15) (number of steps of iterative refinement after solution): 0
                INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 94
                INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 94
                INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 94
                INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 94
                INFOG(20) (estimated number of entries in the factors): 7080060
                INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 80
                INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 80
                INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
                INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
                INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
                INFOG(28) (after factorization: number of null pivots encountered): 0
                INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 7080060
                INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 92, 92
                INFOG(32) (after analysis: type of analysis done): 1
                INFOG(33) (value used for ICNTL(8)): 7
                INFOG(34) (exponent of the determinant if determinant is requested): 0
                INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 7080060
                INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0
                INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0
                INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0
                INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0
    linear system matrix = precond matrix:
    Mat Object: 1 MPI processes
      type: seqaij
      rows=38154, cols=38154
      total: nonzeros=973446, allocated nonzeros=973446
      total number of mallocs used during MatSetValues calls=0
        not using I-node routines
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR:
[0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, increase -ts_max_snes_failures or make negative to attempt recovery
[0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.12.3, Jan, 03, 2020

Thanks,
Ashish
Scientific Computing Division
OnScale
CA, USA

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20200622/1b76add1/attachment-0001.html>


More information about the petsc-users mailing list