[petsc-users] [petsc-maint] Iterative Solver Problem

Barry Smith bsmith at mcs.anl.gov
Mon Apr 28 13:33:39 CDT 2014


  Please run with the additional options -ksp_max_it 500 -ksp_gmres_restart 500 -ksp_monitor_true_residual -ksp_monitor_singular_value and send back all the output (that will include the 500 residual norms as the solver tries to converge).

  Barry

On Apr 28, 2014, at 1:21 PM, Foad Hassaninejadfarahani <umhassa5 at cc.umanitoba.ca> wrote:

> Hello Again;
> 
> I used -ksp_rtol 1.e-12 and it took far longer to get the result for one iteration, and it did not converge:
> 
> Linear solve did not converge due to DIVERGED_ITS iterations 10000
> KSP Object: 8 MPI processes
>  type: gmres
>    GMRES: restart=300, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>    GMRES: happy breakdown tolerance 1e-30
>  maximum iterations=10000, initial guess is zero
>  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000
>  left preconditioning
>  using PRECONDITIONED norm type for convergence test
> PC Object: 8 MPI processes
>  type: asm
>    Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
>    Additive Schwarz: restriction/interpolation type - RESTRICT
>    Local solve is same for all blocks, in the following KSP and PC objects:
>  KSP Object:  (sub_)   1 MPI processes
>    type: preonly
>    maximum iterations=10000, initial guess is zero
>    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>    left preconditioning
>    using NONE norm type for convergence test
>  PC Object:  (sub_)   1 MPI processes
>    type: lu
>      LU: out-of-place factorization
>      tolerance for zero pivot 1e-12
>      matrix ordering: nd
>      factor fill ratio given 5, needed 3.70575
>        Factored matrix follows:
>          Matrix Object:           1 MPI processes
>            type: seqaij
>            rows=5630, cols=5630
>            package used to perform factorization: petsc
>            total: nonzeros=877150, allocated nonzeros=877150
>            total number of mallocs used during MatSetValues calls =0
>              using I-node routines: found 1126 nodes, limit used is 5
>    linear system matrix = precond matrix:
>    Matrix Object:     1 MPI processes
>      type: seqaij
>      rows=5630, cols=5630
>      total: nonzeros=236700, allocated nonzeros=236700
>      total number of mallocs used during MatSetValues calls =0
>        using I-node routines: found 1126 nodes, limit used is 5
>  linear system matrix = precond matrix:
>  Matrix Object:   8 MPI processes
>    type: mpiaij
>    rows=41000, cols=41000
>    total: nonzeros=1817800, allocated nonzeros=2555700
>    total number of mallocs used during MatSetValues calls =121180
>      using I-node (on process 0) routines: found 1025 nodes, limit used is 5
> 
> 
> Well, let me clarify everything. I am solving the whole system (air and water) coupled at once. The original system is nonlinear, but I linearized the equations, so I have some lagged terms. In addition, the location of the interface (between the two phases) is wrong at the beginning and must be corrected in each iteration after obtaining the solution. Therefore, I solve the whole domain, move the interface, and solve the whole domain again. This continues until the interface movement is of the order of 1E-12.
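> 
> To make the outer loop concrete, here is a minimal sketch of what happens around the PETSc solve; AssembleSystem and MoveInterface are only placeholder names for my own routines, and the four-argument KSPSetOperators call is the form used through PETSc 3.4:
> 
> #include <petscksp.h>
> 
> /* Placeholder prototypes for the application routines (not PETSc calls) */
> extern PetscErrorCode AssembleSystem(Mat A, Vec b);
> extern PetscErrorCode MoveInterface(Vec x, PetscReal *movement);
> 
> /* Outer interface-correction loop around the linear solve */
> PetscErrorCode OuterLoop(KSP ksp, Mat A, Vec b, Vec x)
> {
>   PetscReal movement = 1.0;
>   while (movement > 1.0e-12) {            /* stop when the interface barely moves */
>     AssembleSystem(A, b);                 /* rebuild the linearized (lagged) system */
>     KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);
>     KSPSolve(ksp, b, x);                  /* solve the coupled air/water system */
>     MoveInterface(x, &movement);          /* relocate the interface, report its shift */
>   }
>   return 0;
> }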
> 
> My problem arises after obtaining the converged solution. Restarting from the converged solution, if I use SuperLU, it gives me back the converged solution and stops after one iteration. But if I use any iterative solver, it does not give me back the converged solution and starts moving the interface, because the wrong solution demands a new interface location. This leads to oscillation forever and, in some cases, divergence.
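> 
> To quantify "does not give me back the converged solution", one can compare the true residual norm ||b - A x|| of the restart solution with that of the iterative result; a minimal sketch, where A, b and x are assumed to be the assembled operator, right-hand side and candidate solution:
> 
> /* Measure how well a candidate solution x satisfies A x = b */
> PetscErrorCode TrueResidual(Mat A, Vec b, Vec x, PetscReal *norm)
> {
>   Vec r;
>   VecDuplicate(b, &r);        /* work vector for the residual */
>   MatMult(A, x, r);           /* r = A x                      */
>   VecAYPX(r, -1.0, b);        /* r = b - A x                  */
>   VecNorm(r, NORM_2, norm);   /* 2-norm of the true residual  */
>   VecDestroy(&r);
>   return 0;
> }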
> 
> -- 
> With Best Regards;
> Foad
> 
> 
> Quoting Barry Smith <bsmith at mcs.anl.gov>:
> 
>> 
>> On Apr 28, 2014, at 12:59 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>> 
>>> 
>>> First try a much tighter tolerance on the linear solver. Use -ksp_rtol 1.e-12
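>>> 
>>> Equivalently, in code, the relative tolerance can be set with KSPSetTolerances; a minimal sketch, assuming the solver object is named ksp:
>>> 
>>>   KSPSetTolerances(ksp, 1.e-12, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);  /* rtol, abstol, dtol, maxits */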
>>> 
>>> I don't fully understand. Is the coupled system nonlinear? If you are solving a nonlinear system, how are you doing that, since you seem to be solving only a single linear system? Does the linear system involve all unknowns in the fluid and air?
>>> 
>>> Barry
>>> 
>>> 
>>> 
>>> On Apr 28, 2014, at 11:19 AM, Foad Hassaninejadfarahani <umhassa5 at cc.umanitoba.ca> wrote:
>>> 
>>>> Hello PETSc team;
>>>> 
>>>> The PETSc setup in my code is working now. I have issues with using the iterative solver instead of direct solver.
>>>> 
>>>> I am solving a 2D, two-phase flow. Two fluids (air and water) flow into a channel and there is interaction between the two phases. I am solving for the velocities in the x and y directions, the pressure, and two scalars; they are all coupled together. I am looking for the steady-state solution. Since there is an interface between the phases that needs updating, many iterations are required to reach the steady-state solution. "A" is a nine-banded, non-symmetric matrix and each node has five unknowns. I am storing the non-zero coefficients and their locations in three separate vectors.
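>>>> 
>>>> In PETSc's C interface, feeding that triplet storage to the matrix looks roughly like the sketch below; row_idx, col_idx, coef and nnz are placeholder names for my three vectors and their length:
>>>> 
>>>> /* Insert the stored triplets into the (preallocated) matrix A */
>>>> for (PetscInt k = 0; k < nnz; k++) {
>>>>   MatSetValue(A, row_idx[k], col_idx[k], coef[k], INSERT_VALUES);
>>>> }
>>>> MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);   /* complete assembly before the solve */
>>>> MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);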
>>>> 
>>>> I started with the direct solver. SuperLU works fine and gives me good results compared to previous work. However, it is expensive and not practical for fine grids. The iterative solvers did not work, and here is what I did:
>>>> 
>>>> I got the converged solution using SuperLU. After that I restarted from the converged solution and did one iteration using -pc_type lu -pc_factor_mat_solver_package superlu_dist -log_summary. Again, it gave me the same converged solution.
>>>> 
>>>> After that I started from the converged solution once more, and this time I tried different combinations of iterative solvers and preconditioners, like the following:
>>>> -ksp_type gmres -ksp_gmres_restart 300 -pc_type asm -sub_pc_type lu -ksp_monitor_true_residual -ksp_converged_reason -ksp_view -log_summary
>>>> 
>>>> and here is the report:
>>>> Linear solve converged due to CONVERGED_RTOL iterations 41
>>>> KSP Object: 8 MPI processes
>>>> type: gmres
>>>>  GMRES: restart=300, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>  GMRES: happy breakdown tolerance 1e-30
>>>> maximum iterations=10000, initial guess is zero
>>>> tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
>>>> left preconditioning
>>>> using PRECONDITIONED norm type for convergence test
>>>> PC Object: 8 MPI processes
>>>> type: asm
>>>>  Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
>>>>  Additive Schwarz: restriction/interpolation type - RESTRICT
>>>>  Local solve is same for all blocks, in the following KSP and PC objects:
>>>> KSP Object:  (sub_)   1 MPI processes
>>>>  type: preonly
>>>>  maximum iterations=10000, initial guess is zero
>>>>  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>  left preconditioning
>>>>  using NONE norm type for convergence test
>>>> PC Object:  (sub_)   1 MPI processes
>>>>  type: lu
>>>>    LU: out-of-place factorization
>>>>    tolerance for zero pivot 1e-12
>>>>    matrix ordering: nd
>>>>    factor fill ratio given 5, needed 3.70575
>>>>      Factored matrix follows:
>>>>        Matrix Object:           1 MPI processes
>>>>          type: seqaij
>>>>          rows=5630, cols=5630
>>>>          package used to perform factorization: petsc
>>>>          total: nonzeros=877150, allocated nonzeros=877150
>>>>          total number of mallocs used during MatSetValues calls =0
>>>>            using I-node routines: found 1126 nodes, limit used is 5
>>>>  linear system matrix = precond matrix:
>>>>  Matrix Object:     1 MPI processes
>>>>    type: seqaij
>>>>    rows=5630, cols=5630
>>>>    total: nonzeros=236700, allocated nonzeros=236700
>>>>    total number of mallocs used during MatSetValues calls =0
>>>>      using I-node routines: found 1126 nodes, limit used is 5
>>>> linear system matrix = precond matrix:
>>>> Matrix Object:   8 MPI processes
>>>>  type: mpiaij
>>>>  rows=41000, cols=41000
>>>>  total: nonzeros=1817800, allocated nonzeros=2555700
>>>>  total number of mallocs used during MatSetValues calls =121180
>>>>    using I-node (on process 0) routines: found 1025 nodes, limit used is 5
>>>> 
>>>> But the results are far from the converged solution. For example, the pressure at two reference nodes is compared below:
>>>> 
>>>> Based on Superlu
>>>> Channel Inlet pressure (MIXTURE):      0.38890D-01
>>>> Channel Inlet pressure (LIQUID):       0.38416D-01
>>>> 
>>>> Based on Gmres
>>>> Channel Inlet pressure (MIXTURE):     -0.87214D+00
>>>> Channel Inlet pressure (LIQUID):      -0.87301D+00
>>>> 
>>>> 
>>>> I also tried this:
>>>> -ksp_type gcr -pc_type asm -ksp_diagonal_scale -ksp_diagonal_scale_fix -ksp_monitor_true_residual -ksp_converged_reason -ksp_view -log_summary
>>>> 
>>>> and here is the report:
>>>> 0 KSP unpreconditioned resid norm 2.248340888101e+05 true resid norm 2.248340888101e+05 ||r(i)||/||b|| 1.000000000000e+00
>>>> 1 KSP unpreconditioned resid norm 4.900010460179e+04 true resid norm 4.900010460179e+04 ||r(i)||/||b|| 2.179389471637e-01
>>>> 2 KSP unpreconditioned resid norm 4.267761572746e+04 true resid norm 4.267761572746e+04 ||r(i)||/||b|| 1.898182608933e-01
>>>> 3 KSP unpreconditioned resid norm 2.041242251471e+03 true resid norm 2.041242251471e+03 ||r(i)||/||b|| 9.078882398457e-03
>>>> 4 KSP unpreconditioned resid norm 1.852885420564e+03 true resid norm 1.852885420564e+03 ||r(i)||/||b|| 8.241123178296e-03
>>>> 5 KSP unpreconditioned resid norm 1.748965594395e+02 true resid norm 1.748965594395e+02 ||r(i)||/||b|| 7.778916460804e-04
>>>> 6 KSP unpreconditioned resid norm 5.664539353996e+01 true resid norm 5.664539353996e+01 ||r(i)||/||b|| 2.519430831852e-04
>>>> 7 KSP unpreconditioned resid norm 3.607535692806e+01 true resid norm 3.607535692806e+01 ||r(i)||/||b|| 1.604532351788e-04
>>>> 8 KSP unpreconditioned resid norm 1.041501303366e+01 true resid norm 1.041501303366e+01 ||r(i)||/||b|| 4.632310468924e-05
>>>> 9 KSP unpreconditioned resid norm 3.089920380322e+00 true resid norm 3.089920380322e+00 ||r(i)||/||b|| 1.374311340720e-05
>>>> 10 KSP unpreconditioned resid norm 1.456883209806e+00 true resid norm 1.456883209806e+00 ||r(i)||/||b|| 6.479814593583e-06
>>>> 11 KSP unpreconditioned resid norm 5.566902714391e-01 true resid norm 5.566902714391e-01 ||r(i)||/||b|| 2.476004748147e-06
>>>> 12 KSP unpreconditioned resid norm 2.403913756663e-01 true resid norm 2.403913756663e-01 ||r(i)||/||b|| 1.069194520006e-06
>>>> 13 KSP unpreconditioned resid norm 1.650435118839e-01 true resid norm 1.650435118839e-01 ||r(i)||/||b|| 7.340680088032e-07
>>>> Linear solve converged due to CONVERGED_RTOL iterations 13
>>>> KSP Object: 8 MPI processes
>>>> type: gcr
>>>>  GCR: restart = 30
>>>>  GCR: restarts performed = 1
>>>> maximum iterations=10000, initial guess is zero
>>>> tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
>>>> right preconditioning
>>>> diagonally scaled system
>>>> using UNPRECONDITIONED norm type for convergence test
>>>> PC Object: 8 MPI processes
>>>> type: asm
>>>>  Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
>>>>  Additive Schwarz: restriction/interpolation type - RESTRICT
>>>>  Local solve is same for all blocks, in the following KSP and PC objects:
>>>> KSP Object:  (sub_)   1 MPI processes
>>>>  type: preonly
>>>>  maximum iterations=10000, initial guess is zero
>>>>  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>  left preconditioning
>>>>  using NONE norm type for convergence test
>>>> PC Object:  (sub_)   1 MPI processes
>>>>  type: ilu
>>>>    ILU: out-of-place factorization
>>>>    0 levels of fill
>>>>    tolerance for zero pivot 1e-12
>>>>    using diagonal shift to prevent zero pivot
>>>>    matrix ordering: natural
>>>>    factor fill ratio given 1, needed 1
>>>>      Factored matrix follows:
>>>>        Matrix Object:           1 MPI processes
>>>>          type: seqaij
>>>>          rows=5630, cols=5630
>>>>          package used to perform factorization: petsc
>>>>          total: nonzeros=236700, allocated nonzeros=236700
>>>>          total number of mallocs used during MatSetValues calls =0
>>>>            using I-node routines: found 1126 nodes, limit used is 5
>>>>  linear system matrix = precond matrix:
>>>>  Matrix Object:     1 MPI processes
>>>>    type: seqaij
>>>>    rows=5630, cols=5630
>>>>    total: nonzeros=236700, allocated nonzeros=236700
>>>>    total number of mallocs used during MatSetValues calls =0
>>>>      using I-node routines: found 1126 nodes, limit used is 5
>>>> linear system matrix = precond matrix:
>>>> Matrix Object:   8 MPI processes
>>>>  type: mpiaij
>>>>  rows=41000, cols=41000
>>>>  total: nonzeros=1817800, allocated nonzeros=2555700
>>>>  total number of mallocs used during MatSetValues calls =121180
>>>>    using I-node (on process 0) routines: found 1025 nodes, limit used is 5
>>>> 
>>>> Channel Inlet pressure (MIXTURE):      -0.90733D+00
>>>> Channel Inlet pressure (LIQUID):      -0.10118D+01
>>>> 
>>>> 
>>>> As you can see, these are completely different results, nowhere near the converged solution.
>>>> 
>>>> Since I want to use fine grids, I need an iterative solver. I wonder if I am missing something or using the wrong solver/preconditioner/options. I would appreciate it if you could help me (as always).
>>>> 
>>>> --
>>>> With Best Regards;
>>>> Foad
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
>> 
> 
> 


