[petsc-users] Diagnosing Convergence Issue in Fieldsplit Problem
Barry Smith
bsmith at petsc.dev
Thu May 23 11:46:28 CDT 2024
Use -pc_fieldsplit_0_ksp_type preonly
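
For reference, a minimal sketch of what that looks like with the back_ prefix shown in your -ksp_view output (adjust the prefix if yours differs):

  -back_fieldsplit_0_ksp_type preonly
  -back_fieldsplit_0_pc_type lu

With preonly the A00 block is handled by a single application of the LU factorization instead of an inner GMRES iteration, so the GMRES restart check that produced the error below is no longer exercised for that block.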
> On May 23, 2024, at 12:43 PM, Colton Bryant <coltonbryant2021 at u.northwestern.edu> wrote:
>
> That produces the following error:
>
> [0]PETSC ERROR: Residual norm computed by GMRES recursion formula 2.79175e-07 is far from the computed residual norm 0.000113154 at restart, residual norm at start of cycle 2.83065e-07
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.21.0, unknown
> [0]PETSC ERROR: ./mainOversetLS_exe on a arch-linux-c-opt named glass by colton Thu May 23 10:41:09 2024
> [0]PETSC ERROR: Configure options --download-mpich --with-cc=gcc --with-cxx=g++ --with-debugging=no --with-fc=gfortran COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux-c-opt --download-sowing
> [0]PETSC ERROR: #1 KSPGMRESCycle() at /home/colton/petsc/src/ksp/ksp/impls/gmres/gmres.c:115
> [0]PETSC ERROR: #2 KSPSolve_GMRES() at /home/colton/petsc/src/ksp/ksp/impls/gmres/gmres.c:227
> [0]PETSC ERROR: #3 KSPSolve_Private() at /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:905
> [0]PETSC ERROR: #4 KSPSolve() at /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:1078
> [0]PETSC ERROR: #5 PCApply_FieldSplit_Schur() at /home/colton/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:1203
> [0]PETSC ERROR: #6 PCApply() at /home/colton/petsc/src/ksp/pc/interface/precon.c:497
> [0]PETSC ERROR: #7 KSP_PCApply() at /home/colton/petsc/include/petsc/private/kspimpl.h:409
> [0]PETSC ERROR: #8 KSPFGMRESCycle() at /home/colton/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:123
> [0]PETSC ERROR: #9 KSPSolve_FGMRES() at /home/colton/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:235
> [0]PETSC ERROR: #10 KSPSolve_Private() at /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:905
> [0]PETSC ERROR: #11 KSPSolve() at /home/colton/petsc/src/ksp/ksp/interface/itfunc.c:1078
> [0]PETSC ERROR: #12 solveStokes() at cartesianStokesGrid.cpp:1403
>
>
>
> On Thu, May 23, 2024 at 10:33 AM Barry Smith <bsmith at petsc.dev> wrote:
>>
>> Run the failing case also with -ksp_error_if_not_converged so we can see exactly where the problem is first detected.
>>
>>
>>
>>
>>> On May 23, 2024, at 11:51 AM, Colton Bryant <coltonbryant2021 at u.northwestern.edu> wrote:
>>>
>>> Hi Barry,
>>>
>>> Thanks for letting me know about the need to use fgmres in this case. I ran a smaller problem (1230 in the first block) and saw similar behavior in the true residual.
>>>
>>> I also ran the same problem with the options -fieldsplit_0_pc_type svd -fieldsplit_0_pc_svd_monitor and got the following output:
>>> SVD: condition number 1.933639985881e+03, 0 of 1230 singular values are (nearly) zero
>>> SVD: smallest singular values: 4.132036392141e-03 4.166444542385e-03 4.669534028645e-03 4.845532162256e-03 5.047038625390e-03
>>> SVD: largest singular values : 7.947990616611e+00 7.961437414477e+00 7.961851612473e+00 7.971335373142e+00 7.989870790960e+00
>>>
>>> I would be surprised if the A_{00} block is ill-conditioned, as it's just a standard discretization of the Laplacian with some rows replaced by ones on the diagonal to handle interpolations from the overset mesh. I'm wondering if I'm somehow violating a solvability condition of the problem?
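>>>
>>> As a sanity check on the solvability question, here is a minimal sketch of what I could run before the solve (assuming the attached null space seen in -ksp_view is the constant-pressure mode; A and b are placeholder names for my assembled operator and right-hand side):
>>>
>>>   MatNullSpace nsp;
>>>   PetscBool    isNull;
>>>   PetscCall(MatGetNullSpace(A, &nsp));
>>>   if (nsp) {
>>>     /* confirm the attached null space really is a null space of A */
>>>     PetscCall(MatNullSpaceTest(nsp, A, &isNull));
>>>     if (!isNull) PetscCall(PetscPrintf(PETSC_COMM_WORLD, "attached null space is not a null space of A\n"));
>>>     /* make the right-hand side consistent by projecting out the null-space component */
>>>     PetscCall(MatNullSpaceRemove(nsp, b));
>>>   }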
>>>
>>> Thanks for the help!
>>>
>>> -Colton
>>>
>>> On Wed, May 22, 2024 at 6:09 PM Barry Smith <bsmith at petsc.dev> wrote:
>>>>
>>>> Thanks for the info. I see you are using GMRES inside the Schur complement solver; this is fine, but when you do, you need to use fgmres as the outer solver. But this is unlikely to be the cause of the exact problem you are seeing.
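>>>>
>>>> For example, with the back_ prefix from your -ksp_view output (a sketch; adjust if your prefix differs), the outer solver would be selected with -back_ksp_type fgmres. Flexible GMRES tolerates a preconditioner that changes from one iteration to the next, which is effectively what an inner GMRES solve produces.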
>>>>
>>>> I'm not sure why the Schur complement KSP is suddenly seeing a large increase in the true residual norm. Is it possible the A_{00} block is ill-conditioned?
>>>>
>>>> Can you run with a smaller problem? Say 2,000 or so in the first block? Is there still a problem?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> On May 22, 2024, at 6:00 PM, Colton Bryant <coltonbryant2021 at u.northwestern.edu> wrote:
>>>>>
>>>>> Hi Barry,
>>>>>
>>>>> I have not hardwired any other solver parameters in the code, and the full set of solver-related command-line options is what I listed in the previous email.
>>>>>
>>>>> Below is the output from -ksp_view:
>>>>>
>>>>> KSP Object: (back_) 1 MPI process
>>>>>   type: gmres
>>>>>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>     happy breakdown tolerance 1e-30
>>>>>   maximum iterations=10000, initial guess is zero
>>>>>   tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
>>>>>   left preconditioning
>>>>>   using PRECONDITIONED norm type for convergence test
>>>>> PC Object: (back_) 1 MPI process
>>>>>   type: fieldsplit
>>>>>     FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL
>>>>>     Preconditioner for the Schur complement formed from S itself
>>>>>     Split info:
>>>>>     Split number 0 Defined by IS
>>>>>     Split number 1 Defined by IS
>>>>>     KSP solver for A00 block
>>>>>       KSP Object: (back_fieldsplit_0_) 1 MPI process
>>>>>         type: gmres
>>>>>           restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>           happy breakdown tolerance 1e-30
>>>>>         maximum iterations=10000, initial guess is zero
>>>>>         tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>         left preconditioning
>>>>>         using PRECONDITIONED norm type for convergence test
>>>>>       PC Object: (back_fieldsplit_0_) 1 MPI process
>>>>>         type: lu
>>>>>           out-of-place factorization
>>>>>           tolerance for zero pivot 2.22045e-14
>>>>>           matrix ordering: nd
>>>>>           factor fill ratio given 5., needed 8.83482
>>>>>             Factored matrix follows:
>>>>>               Mat Object: (back_fieldsplit_0_) 1 MPI process
>>>>>                 type: seqaij
>>>>>                 rows=30150, cols=30150
>>>>>                 package used to perform factorization: petsc
>>>>>                 total: nonzeros=2649120, allocated nonzeros=2649120
>>>>>                   using I-node routines: found 15019 nodes, limit used is 5
>>>>>         linear system matrix = precond matrix:
>>>>>         Mat Object: (back_fieldsplit_0_) 1 MPI process
>>>>>           type: seqaij
>>>>>           rows=30150, cols=30150
>>>>>           total: nonzeros=299850, allocated nonzeros=299850
>>>>>           total number of mallocs used during MatSetValues calls=0
>>>>>             using I-node routines: found 15150 nodes, limit used is 5
>>>>>     KSP solver for S = A11 - A10 inv(A00) A01
>>>>>       KSP Object: (back_fieldsplit_1_) 1 MPI process
>>>>>         type: gmres
>>>>>           restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>           happy breakdown tolerance 1e-30
>>>>>         maximum iterations=10000, initial guess is zero
>>>>>         tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
>>>>>         left preconditioning
>>>>>         using PRECONDITIONED norm type for convergence test
>>>>>       PC Object: (back_fieldsplit_1_) 1 MPI process
>>>>>         type: none
>>>>>         linear system matrix = precond matrix:
>>>>>         Mat Object: (back_fieldsplit_1_) 1 MPI process
>>>>>           type: schurcomplement
>>>>>           rows=15000, cols=15000
>>>>>             Schur complement A11 - A10 inv(A00) A01
>>>>>             A11
>>>>>               Mat Object: (back_fieldsplit_1_) 1 MPI process
>>>>>                 type: seqaij
>>>>>                 rows=15000, cols=15000
>>>>>                 total: nonzeros=74700, allocated nonzeros=74700
>>>>>                 total number of mallocs used during MatSetValues calls=0
>>>>>                   not using I-node routines
>>>>>             A10
>>>>>               Mat Object: 1 MPI process
>>>>>                 type: seqaij
>>>>>                 rows=15000, cols=30150
>>>>>                 total: nonzeros=149550, allocated nonzeros=149550
>>>>>                 total number of mallocs used during MatSetValues calls=0
>>>>>                   not using I-node routines
>>>>>             KSP solver for A00 block viewable with the additional option -back_fieldsplit_0_ksp_view
>>>>>             A01
>>>>>               Mat Object: 1 MPI process
>>>>>                 type: seqaij
>>>>>                 rows=30150, cols=15000
>>>>>                 total: nonzeros=149550, allocated nonzeros=149550
>>>>>                 total number of mallocs used during MatSetValues calls=0
>>>>>                   using I-node routines: found 15150 nodes, limit used is 5
>>>>>   linear system matrix = precond matrix:
>>>>>   Mat Object: (back_) 1 MPI process
>>>>>     type: seqaij
>>>>>     rows=45150, cols=45150
>>>>>     total: nonzeros=673650, allocated nonzeros=673650
>>>>>     total number of mallocs used during MatSetValues calls=0
>>>>>       has attached null space
>>>>>       using I-node routines: found 15150 nodes, limit used is 5
>>>>>
>>>>> Thanks again!
>>>>>
>>>>> -Colton
>>>>>
>>>>> On Wed, May 22, 2024 at 3:39 PM Barry Smith <bsmith at petsc.dev> wrote:
>>>>>>
>>>>>> Are you using any other command-line options, or did you hardwire any solver parameters in the code with, say, KSPSetXXX() or PCSetXXX()? Please send all of them.
>>>>>>
>>>>>> Something funky definitely happened when the true residual norms jumped up.
>>>>>>
>>>>>> Could you run the same thing with -ksp_view, and don't use anything like -ksp_error_if_not_converged, so we can see exactly what is being run.
>>>>>>
>>>>>> Barry
>>>>>>
>>>>>>
>>>>>>> On May 22, 2024, at 3:21 PM, Colton Bryant <coltonbryant2021 at u.northwestern.edu> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am solving the Stokes equations on a MAC grid discretized by finite differences using a DMSTAG object. I have tested the solver quite extensively on manufactured problems and it seems to work well. As I am still just trying to get things working and am not yet worried about speed, I am using the following solver options:
>>>>>>> -pc_type fieldsplit
>>>>>>> -pc_fieldsplit_detect_saddle_point
>>>>>>> -fieldsplit_0_pc_type lu
>>>>>>> -fieldsplit_1_ksp_rtol 1.e-8
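>>>>>>>
>>>>>>> (For context, a minimal sketch of how the solve is wired up so these options take effect; the variable names are placeholders, and I give this KSP the options prefix back_, so on the command line the options above carry that prefix, e.g. -back_pc_type fieldsplit:)
>>>>>>>
>>>>>>>   KSP ksp;
>>>>>>>   PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
>>>>>>>   PetscCall(KSPSetOptionsPrefix(ksp, "back_"));  /* options are read as -back_pc_type, -back_fieldsplit_0_pc_type, ... */
>>>>>>>   PetscCall(KSPSetOperators(ksp, A, A));         /* A assembled from the DMSTAG discretization */
>>>>>>>   PetscCall(KSPSetFromOptions(ksp));
>>>>>>>   PetscCall(KSPSolve(ksp, b, x));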
>>>>>>>
>>>>>>> However, I am now using this solver as an inner step of a larger code and have run into issues. The code repeatedly solves the Stokes equations with varying right-hand sides coming from changing problem geometry (the solver is part of an overset grid scheme coupled to a level set method evolving in time). After a couple of timesteps I observe the following output when running with -fieldsplit_1_ksp_converged_reason -fieldsplit_1_ksp_monitor_true_residual:
>>>>>>>
>>>>>>> Residual norms for back_fieldsplit_1_ solve.
>>>>>>> 0 KSP preconditioned resid norm 2.826514299465e-02 true resid norm 2.826514299465e-02 ||r(i)||/||b|| 1.000000000000e+00
>>>>>>> 1 KSP preconditioned resid norm 7.286621865915e-03 true resid norm 7.286621865915e-03 ||r(i)||/||b|| 2.577953300039e-01
>>>>>>> 2 KSP preconditioned resid norm 1.500598474492e-03 true resid norm 1.500598474492e-03 ||r(i)||/||b|| 5.309007192273e-02
>>>>>>> 3 KSP preconditioned resid norm 3.796396924978e-04 true resid norm 3.796396924978e-04 ||r(i)||/||b|| 1.343137349666e-02
>>>>>>> 4 KSP preconditioned resid norm 8.091057439816e-05 true resid norm 8.091057439816e-05 ||r(i)||/||b|| 2.862556697960e-03
>>>>>>> 5 KSP preconditioned resid norm 3.689113122359e-05 true resid norm 3.689113122359e-05 ||r(i)||/||b|| 1.305181128239e-03
>>>>>>> 6 KSP preconditioned resid norm 2.116450533352e-05 true resid norm 2.116450533352e-05 ||r(i)||/||b|| 7.487846545662e-04
>>>>>>> 7 KSP preconditioned resid norm 3.968234031201e-06 true resid norm 3.968234031200e-06 ||r(i)||/||b|| 1.403932055801e-04
>>>>>>> 8 KSP preconditioned resid norm 6.666949419511e-07 true resid norm 6.666949419506e-07 ||r(i)||/||b|| 2.358717739644e-05
>>>>>>> 9 KSP preconditioned resid norm 1.941522884928e-07 true resid norm 1.941522884931e-07 ||r(i)||/||b|| 6.868965372998e-06
>>>>>>> 10 KSP preconditioned resid norm 6.729545258682e-08 true resid norm 6.729545258626e-08 ||r(i)||/||b|| 2.380863687793e-06
>>>>>>> 11 KSP preconditioned resid norm 3.009070131709e-08 true resid norm 3.009070131735e-08 ||r(i)||/||b|| 1.064586912687e-06
>>>>>>> 12 KSP preconditioned resid norm 7.849353009588e-09 true resid norm 7.849353009903e-09 ||r(i)||/||b|| 2.777043445840e-07
>>>>>>> 13 KSP preconditioned resid norm 2.306283345754e-09 true resid norm 2.306283346677e-09 ||r(i)||/||b|| 8.159461097060e-08
>>>>>>> 14 KSP preconditioned resid norm 9.336302495083e-10 true resid norm 9.336302502503e-10 ||r(i)||/||b|| 3.303115255517e-08
>>>>>>> 15 KSP preconditioned resid norm 6.537456143401e-10 true resid norm 6.537456141617e-10 ||r(i)||/||b|| 2.312903968982e-08
>>>>>>> 16 KSP preconditioned resid norm 6.389159552788e-10 true resid norm 6.389159550304e-10 ||r(i)||/||b|| 2.260437724130e-08
>>>>>>> 17 KSP preconditioned resid norm 6.380905134246e-10 true resid norm 6.380905136023e-10 ||r(i)||/||b|| 2.257517372981e-08
>>>>>>> 18 KSP preconditioned resid norm 6.380440605992e-10 true resid norm 6.380440604688e-10 ||r(i)||/||b|| 2.257353025207e-08
>>>>>>> 19 KSP preconditioned resid norm 6.380427156582e-10 true resid norm 6.380427157894e-10 ||r(i)||/||b|| 2.257348267830e-08
>>>>>>> 20 KSP preconditioned resid norm 6.380426714897e-10 true resid norm 6.380426714004e-10 ||r(i)||/||b|| 2.257348110785e-08
>>>>>>> 21 KSP preconditioned resid norm 6.380426656970e-10 true resid norm 6.380426658839e-10 ||r(i)||/||b|| 2.257348091268e-08
>>>>>>> 22 KSP preconditioned resid norm 6.380426650538e-10 true resid norm 6.380426650287e-10 ||r(i)||/||b|| 2.257348088242e-08
>>>>>>> 23 KSP preconditioned resid norm 6.380426649918e-10 true resid norm 6.380426645888e-10 ||r(i)||/||b|| 2.257348086686e-08
>>>>>>> 24 KSP preconditioned resid norm 6.380426649803e-10 true resid norm 6.380426644294e-10 ||r(i)||/||b|| 2.257348086122e-08
>>>>>>> 25 KSP preconditioned resid norm 6.380426649796e-10 true resid norm 6.380426649774e-10 ||r(i)||/||b|| 2.257348088061e-08
>>>>>>> 26 KSP preconditioned resid norm 6.380426649795e-10 true resid norm 6.380426653788e-10 ||r(i)||/||b|| 2.257348089481e-08
>>>>>>> 27 KSP preconditioned resid norm 6.380426649795e-10 true resid norm 6.380426646744e-10 ||r(i)||/||b|| 2.257348086989e-08
>>>>>>> 28 KSP preconditioned resid norm 6.380426649795e-10 true resid norm 6.380426650818e-10 ||r(i)||/||b|| 2.257348088430e-08
>>>>>>> 29 KSP preconditioned resid norm 6.380426649795e-10 true resid norm 6.380426649518e-10 ||r(i)||/||b|| 2.257348087970e-08
>>>>>>> 30 KSP preconditioned resid norm 6.380426652142e-10 true resid norm 6.380426652142e-10 ||r(i)||/||b|| 2.257348088898e-08
>>>>>>> 31 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426646799e-10 ||r(i)||/||b|| 2.257348087008e-08
>>>>>>> 32 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426648077e-10 ||r(i)||/||b|| 2.257348087460e-08
>>>>>>> 33 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426649048e-10 ||r(i)||/||b|| 2.257348087804e-08
>>>>>>> 34 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426648142e-10 ||r(i)||/||b|| 2.257348087483e-08
>>>>>>> 35 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426651079e-10 ||r(i)||/||b|| 2.257348088522e-08
>>>>>>> 36 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426650433e-10 ||r(i)||/||b|| 2.257348088294e-08
>>>>>>> 37 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426649765e-10 ||r(i)||/||b|| 2.257348088057e-08
>>>>>>> 38 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426650364e-10 ||r(i)||/||b|| 2.257348088269e-08
>>>>>>> 39 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426650051e-10 ||r(i)||/||b|| 2.257348088159e-08
>>>>>>> 40 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426651154e-10 ||r(i)||/||b|| 2.257348088549e-08
>>>>>>> 41 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426650246e-10 ||r(i)||/||b|| 2.257348088227e-08
>>>>>>> 42 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426650702e-10 ||r(i)||/||b|| 2.257348088389e-08
>>>>>>> 43 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426651686e-10 ||r(i)||/||b|| 2.257348088737e-08
>>>>>>> 44 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426650870e-10 ||r(i)||/||b|| 2.257348088448e-08
>>>>>>> 45 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426651208e-10 ||r(i)||/||b|| 2.257348088568e-08
>>>>>>> 46 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426651441e-10 ||r(i)||/||b|| 2.257348088650e-08
>>>>>>> 47 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426650955e-10 ||r(i)||/||b|| 2.257348088478e-08
>>>>>>> 48 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426650877e-10 ||r(i)||/||b|| 2.257348088451e-08
>>>>>>> 49 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426651240e-10 ||r(i)||/||b|| 2.257348088579e-08
>>>>>>> 50 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426650534e-10 ||r(i)||/||b|| 2.257348088329e-08
>>>>>>> 51 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426648615e-10 ||r(i)||/||b|| 2.257348087651e-08
>>>>>>> 52 KSP preconditioned resid norm 6.380426652141e-10 true resid norm 6.380426649523e-10 ||r(i)||/||b|| 2.257348087972e-08
>>>>>>> 53 KSP preconditioned resid norm 6.380426652140e-10 true resid norm 6.380426652601e-10 ||r(i)||/||b|| 2.257348089061e-08
>>>>>>> 54 KSP preconditioned resid norm 6.380426652125e-10 true resid norm 6.380427512852e-10 ||r(i)||/||b|| 2.257348393411e-08
>>>>>>> 55 KSP preconditioned resid norm 6.380426651849e-10 true resid norm 6.380603444402e-10 ||r(i)||/||b|| 2.257410636701e-08
>>>>>>> 56 KSP preconditioned resid norm 6.380426646751e-10 true resid norm 6.439925413105e-10 ||r(i)||/||b|| 2.278398313542e-08
>>>>>>> 57 KSP preconditioned resid norm 6.380426514019e-10 true resid norm 2.674218007058e-09 ||r(i)||/||b|| 9.461186902765e-08
>>>>>>> 58 KSP preconditioned resid norm 6.380425077384e-10 true resid norm 2.406759314486e-08 ||r(i)||/||b|| 8.514937691775e-07
>>>>>>> 59 KSP preconditioned resid norm 6.380406171326e-10 true resid norm 3.100137288622e-07 ||r(i)||/||b|| 1.096805803957e-05
>>>>>>> Linear back_fieldsplit_1_ solve did not converge due to DIVERGED_BREAKDOWN iterations 60
>>>>>>>
>>>>>>> Any advice on steps I could take to elucidate the issue would be greatly appreciated. Thanks so much in advance for any help!
>>>>>>>
>>>>>>> Best,
>>>>>>> Colton Bryant
>>>>>>
>>>>
>>