# [petsc-users] [petsc-maint] Iterative Solver Problem

Mon Apr 28 13:21:35 CDT 2014

```
Hello Again;

I used -ksp_rtol 1.e-12. It took far longer to get the result for one
iteration, and it did not converge:

Linear solve did not converge due to DIVERGED_ITS iterations 10000
KSP Object: 8 MPI processes
type: gmres
GMRES: restart=300, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances:  relative=1e-12, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: asm
Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
Additive Schwarz: restriction/interpolation type - RESTRICT
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object:  (sub_)   1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object:  (sub_)   1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 1e-12
matrix ordering: nd
factor fill ratio given 5, needed 3.70575
Factored matrix follows:
Matrix Object:           1 MPI processes
type: seqaij
rows=5630, cols=5630
package used to perform factorization: petsc
total: nonzeros=877150, allocated nonzeros=877150
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1126 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object:     1 MPI processes
type: seqaij
rows=5630, cols=5630
total: nonzeros=236700, allocated nonzeros=236700
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1126 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object:   8 MPI processes
type: mpiaij
rows=41000, cols=41000
total: nonzeros=1817800, allocated nonzeros=2555700
total number of mallocs used during MatSetValues calls =121180
using I-node (on process 0) routines: found 1025 nodes, limit used is 5

Well, let me clarify everything. I am solving the whole system (air
and water) coupled at once. The original system is nonlinear, but I
linearized the equations, so I have some lagged terms. In addition,
the location of the interface (between the two phases) is wrong at the
beginning and has to be corrected in each iteration after getting the
solution. Therefore, I solve the whole domain, move the interface, and
solve the whole domain again. This continues until the interface
movement is on the order of 1E-12.

My problem arises after getting the converged solution. Restarting
from the converged solution, if I use Superlu, it gives me back the
converged solution and stops after one iteration. But if I use any
iterative solver, it does not give me back the converged solution and
starts moving the interface, because the wrong solution asks for a new
interface location. This leads to oscillation forever and, in some
cases, divergence.

--
With Best Regards;

Quoting Barry Smith <bsmith at mcs.anl.gov>:

>
> On Apr 28, 2014, at 12:59 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>>
>>  First try a much tighter tolerance on the linear solver. Use
>> -ksp_rtol 1.e-12
>>
>>  I don't fully understand. Is the coupled system nonlinear? Are you
>> solving a nonlinear system, how are you doing that since you seem
>> to be only solving a single linear system? Does the linear system
>> involve all unknowns in the fluid and air?
>>
>>  Barry
>>
>>
>>
>> <umhassa5 at cc.umanitoba.ca> wrote:
>>
>>> Hello PETSc team;
>>>
>>> The PETSc setup in my code is working now. I have issues with
>>> using an iterative solver instead of a direct solver.
>>>
>>> I am solving a 2D, two-phase flow. Two fluids (air and water) flow
>>> into a channel and there is interaction between the two phases. I
>>> am solving for the velocities in the x and y directions, pressure,
>>> and two scalars. They are all coupled together. I am looking for
>>> the steady-state solution. Since there is an interface between the
>>> phases which needs updating, many iterations are needed to reach
>>> the steady-state solution. "A" is a nine-banded non-symmetric
>>> matrix and each node has five unknowns. I am storing the non-zero
>>> coefficients and their locations in three separate vectors.
>>>
>>> I started using the direct solver. Superlu works fine and gives me
>>> good results compared to previous work. However, it is expensive
>>> and not practical for fine grids. The iterative solver did not
>>> work, though; here is what I did:
>>>
>>> I got the converged solution by using Superlu. After that I
>>> restarted from the converged solution and did one iteration using
>>> -pc_type lu -pc_factor_mat_solver_package superlu_dist
>>> -log_summary. Again, it gave me the same converged solution.
>>>
>>> After that I started from the converged solution once more and
>>> this time I tried different combinations of iterative solvers and
>>> preconditioners, such as the following:
>>> -ksp_type gmres -ksp_gmres_restart 300 -pc_type asm -sub_pc_type
>>> lu ksp_monitor_true_residual -ksp_converged_reason -ksp_view
>>> -log_summary
>>>
>>> and here is the report:
>>> Linear solve converged due to CONVERGED_RTOL iterations 41
>>> KSP Object: 8 MPI processes
>>> type: gmres
>>>   GMRES: restart=300, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>>   GMRES: happy breakdown tolerance 1e-30
>>> maximum iterations=10000, initial guess is zero
>>> tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 8 MPI processes
>>> type: asm
>>>   Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
>>>   Additive Schwarz: restriction/interpolation type - RESTRICT
>>>   Local solve is same for all blocks, in the following KSP and PC objects:
>>> KSP Object:  (sub_)   1 MPI processes
>>>   type: preonly
>>>   maximum iterations=10000, initial guess is zero
>>>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>   left preconditioning
>>>   using NONE norm type for convergence test
>>> PC Object:  (sub_)   1 MPI processes
>>>   type: lu
>>>     LU: out-of-place factorization
>>>     tolerance for zero pivot 1e-12
>>>     matrix ordering: nd
>>>     factor fill ratio given 5, needed 3.70575
>>>       Factored matrix follows:
>>>         Matrix Object:           1 MPI processes
>>>           type: seqaij
>>>           rows=5630, cols=5630
>>>           package used to perform factorization: petsc
>>>           total: nonzeros=877150, allocated nonzeros=877150
>>>           total number of mallocs used during MatSetValues calls =0
>>>             using I-node routines: found 1126 nodes, limit used is 5
>>>   linear system matrix = precond matrix:
>>>   Matrix Object:     1 MPI processes
>>>     type: seqaij
>>>     rows=5630, cols=5630
>>>     total: nonzeros=236700, allocated nonzeros=236700
>>>     total number of mallocs used during MatSetValues calls =0
>>>       using I-node routines: found 1126 nodes, limit used is 5
>>> linear system matrix = precond matrix:
>>> Matrix Object:   8 MPI processes
>>>   type: mpiaij
>>>   rows=41000, cols=41000
>>>   total: nonzeros=1817800, allocated nonzeros=2555700
>>>   total number of mallocs used during MatSetValues calls =121180
>>>     using I-node (on process 0) routines: found 1025 nodes, limit used is 5
>>>
>>> But, the results are far from the converged solution. For example
>>> two reference nodes for the pressure are compared:
>>>
>>> Based on Superlu
>>> Channel Inlet pressure (MIXTURE):      0.38890D-01
>>> Channel Inlet pressure (LIQUID):       0.38416D-01
>>>
>>> Based on Gmres
>>> Channel Inlet pressure (MIXTURE):     -0.87214D+00
>>> Channel Inlet pressure (LIQUID):      -0.87301D+00
>>>
>>>
>>> I also tried this:
>>> -ksp_type gcr -pc_type asm -ksp_diagonal_scale
>>> -ksp_diagonal_scale_fix -ksp_monitor_true_residual
>>> -ksp_converged_reason -ksp_view -log_summary
>>>
>>> and here is the report:
>>> 0 KSP unpreconditioned resid norm 2.248340888101e+05 true resid
>>> norm 2.248340888101e+05 ||r(i)||/||b|| 1.000000000000e+00
>>> 1 KSP unpreconditioned resid norm 4.900010460179e+04 true resid
>>> norm 4.900010460179e+04 ||r(i)||/||b|| 2.179389471637e-01
>>> 2 KSP unpreconditioned resid norm 4.267761572746e+04 true resid
>>> norm 4.267761572746e+04 ||r(i)||/||b|| 1.898182608933e-01
>>> 3 KSP unpreconditioned resid norm 2.041242251471e+03 true resid
>>> norm 2.041242251471e+03 ||r(i)||/||b|| 9.078882398457e-03
>>> 4 KSP unpreconditioned resid norm 1.852885420564e+03 true resid
>>> norm 1.852885420564e+03 ||r(i)||/||b|| 8.241123178296e-03
>>> 5 KSP unpreconditioned resid norm 1.748965594395e+02 true resid
>>> norm 1.748965594395e+02 ||r(i)||/||b|| 7.778916460804e-04
>>> 6 KSP unpreconditioned resid norm 5.664539353996e+01 true resid
>>> norm 5.664539353996e+01 ||r(i)||/||b|| 2.519430831852e-04
>>> 7 KSP unpreconditioned resid norm 3.607535692806e+01 true resid
>>> norm 3.607535692806e+01 ||r(i)||/||b|| 1.604532351788e-04
>>> 8 KSP unpreconditioned resid norm 1.041501303366e+01 true resid
>>> norm 1.041501303366e+01 ||r(i)||/||b|| 4.632310468924e-05
>>> 9 KSP unpreconditioned resid norm 3.089920380322e+00 true resid
>>> norm 3.089920380322e+00 ||r(i)||/||b|| 1.374311340720e-05
>>> 10 KSP unpreconditioned resid norm 1.456883209806e+00 true resid
>>> norm 1.456883209806e+00 ||r(i)||/||b|| 6.479814593583e-06
>>> 11 KSP unpreconditioned resid norm 5.566902714391e-01 true resid
>>> norm 5.566902714391e-01 ||r(i)||/||b|| 2.476004748147e-06
>>> 12 KSP unpreconditioned resid norm 2.403913756663e-01 true resid
>>> norm 2.403913756663e-01 ||r(i)||/||b|| 1.069194520006e-06
>>> 13 KSP unpreconditioned resid norm 1.650435118839e-01 true resid
>>> norm 1.650435118839e-01 ||r(i)||/||b|| 7.340680088032e-07
>>> Linear solve converged due to CONVERGED_RTOL iterations 13
>>> KSP Object: 8 MPI processes
>>> type: gcr
>>>   GCR: restart = 30
>>>   GCR: restarts performed = 1
>>> maximum iterations=10000, initial guess is zero
>>> tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
>>> right preconditioning
>>> diagonally scaled system
>>> using UNPRECONDITIONED norm type for convergence test
>>> PC Object: 8 MPI processes
>>> type: asm
>>>   Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
>>>   Additive Schwarz: restriction/interpolation type - RESTRICT
>>>   Local solve is same for all blocks, in the following KSP and PC objects:
>>> KSP Object:  (sub_)   1 MPI processes
>>>   type: preonly
>>>   maximum iterations=10000, initial guess is zero
>>>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>   left preconditioning
>>>   using NONE norm type for convergence test
>>> PC Object:  (sub_)   1 MPI processes
>>>   type: ilu
>>>     ILU: out-of-place factorization
>>>     0 levels of fill
>>>     tolerance for zero pivot 1e-12
>>>     using diagonal shift to prevent zero pivot
>>>     matrix ordering: natural
>>>     factor fill ratio given 1, needed 1
>>>       Factored matrix follows:
>>>         Matrix Object:           1 MPI processes
>>>           type: seqaij
>>>           rows=5630, cols=5630
>>>           package used to perform factorization: petsc
>>>           total: nonzeros=236700, allocated nonzeros=236700
>>>           total number of mallocs used during MatSetValues calls =0
>>>             using I-node routines: found 1126 nodes, limit used is 5
>>>   linear system matrix = precond matrix:
>>>   Matrix Object:     1 MPI processes
>>>     type: seqaij
>>>     rows=5630, cols=5630
>>>     total: nonzeros=236700, allocated nonzeros=236700
>>>     total number of mallocs used during MatSetValues calls =0
>>>       using I-node routines: found 1126 nodes, limit used is 5
>>> linear system matrix = precond matrix:
>>> Matrix Object:   8 MPI processes
>>>   type: mpiaij
>>>   rows=41000, cols=41000
>>>   total: nonzeros=1817800, allocated nonzeros=2555700
>>>   total number of mallocs used during MatSetValues calls =121180
>>>     using I-node (on process 0) routines: found 1025 nodes, limit used is 5
>>>
>>> Channel Inlet pressure (MIXTURE):      -0.90733D+00
>>> Channel Inlet pressure (LIQUID):      -0.10118D+01
>>>
>>>
>>> As you can see, these are completely different results, which are
>>> not close to the converged solution.
>>>
>>> Since I want to use fine grids, I need an iterative solver. I
>>> wonder if I am missing something or using the wrong
>>> solver/preconditioner/option. I would appreciate it if you could
>>> help me (as always).
>>>
>>> --
>>> With Best Regards;
>>>
>>>
>>>
>>>
>>
>
>
>

```
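
The reports above show KSP stopping with CONVERGED_RTOL at a relative tolerance of 1e-06 while the inlet pressures differ strongly from the Superlu values. One possible contributor, offered here as an assumption and not something the thread confirms, is that a small relative residual does not by itself guarantee a small error in the solution when the matrix is ill-conditioned. The sketch below illustrates that general point with a made-up 2x2 system (the matrix, `eps`, and the candidate vector `x_approx` are hypothetical and unrelated to the 41000-row system in the thread). It uses plain Python and no PETSc calls.

```python
import math


def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def norm2(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(v_i * v_i for v_i in v))


def relative_residual(A, x, b):
    """||b - A x|| / ||b||, the quantity a relative tolerance is tested against."""
    r = [b_i - ax_i for b_i, ax_i in zip(b, matvec(A, x))]
    return norm2(r) / norm2(b)


def relative_error(x, x_exact):
    """||x - x_exact|| / ||x_exact||, which the solver cannot see directly."""
    e = [x_i - xe_i for x_i, xe_i in zip(x, x_exact)]
    return norm2(e) / norm2(x_exact)


# Hypothetical, nearly singular matrix: the two rows are almost identical.
eps = 1.0e-8
A = [[1.0, 1.0],
     [1.0, 1.0 + eps]]

x_exact = [1.0, 1.0]        # chosen reference solution
b = matvec(A, x_exact)      # right-hand side consistent with x_exact

# A candidate "solution" that is far from x_exact but nearly satisfies A x = b.
x_approx = [2.0, 0.0]

rel_res = relative_residual(A, x_approx, b)
rel_err = relative_error(x_approx, x_exact)

print(f"relative residual = {rel_res:.3e}")
print(f"relative error    = {rel_err:.3e}")
```

In this toy case the candidate vector would pass a 1e-06 relative residual test even though it is nowhere near the reference solution. Whether the same effect explains the pressures reported in the thread is not established here. Note also that the GMRES run above tests the preconditioned residual norm, which is a different quantity from the true residual used in this sketch.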