[petsc-users] [petsc-maint] Iterative Solver Problem

Barry Smith bsmith at mcs.anl.gov
Mon Apr 28 13:00:25 CDT 2014


On Apr 28, 2014, at 12:59 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:

> 
>  First try a much tighter tolerance on the linear solver. Use -ksp_rtol 1.e-12
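> 
>  For example, combined with the ASM/LU options you are already using, the tighter-tolerance run might look like the following (a sketch only; ./mysolver stands in for your application binary):
> 
>    mpiexec -n 8 ./mysolver -ksp_type gmres -ksp_gmres_restart 300 \
>        -pc_type asm -sub_pc_type lu -ksp_rtol 1.e-12 \
>        -ksp_monitor_true_residual -ksp_converged_reason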
> 
>  I don't fully understand. Is the coupled system nonlinear? If you are solving a nonlinear system, how are you doing that, since you seem to be solving only a single linear system? Does the linear system involve all of the unknowns in both the fluid and the air?
> 
>  Barry
> 
> 
> 
> On Apr 28, 2014, at 11:19 AM, Foad Hassaninejadfarahani <umhassa5 at cc.umanitoba.ca> wrote:
> 
>> Hello PETSc team;
>> 
>> The PETSc setup in my code is working now, but I am having trouble using an iterative solver instead of a direct solver.
>> 
>> I am solving a 2D, two-phase flow. Two fluids (air and water) flow into a channel, and there is interaction between the two phases. I am solving for the velocities in the x and y directions, the pressure, and two scalars, all of which are coupled together. I am looking for the steady-state solution. Since there is an interface between the phases that needs updating, many iterations are required to reach the steady-state solution. "A" is a nine-banded, non-symmetric matrix, and each node has five unknowns. I am storing the non-zero coefficients and their locations in three separate vectors.
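>> 
>> (For reference, a minimal C sketch of how such triplet storage -- three arrays of row indices, column indices, and values -- might be fed into a PETSc MPIAIJ matrix. The function and array names are illustrative placeholders, not the actual code; indices are assumed to be global and 0-based.)
>> 
>> #include <petscmat.h>
>> 
>> /* Illustrative assembly of a parallel AIJ matrix from triplet (COO) storage. */
>> PetscErrorCode AssembleFromTriplets(MPI_Comm comm, PetscInt nlocal, PetscInt nglobal,
>>                                     PetscInt nnz, const PetscInt rows[],
>>                                     const PetscInt cols[], const PetscScalar vals[], Mat *A)
>> {
>>   PetscErrorCode ierr;
>>   PetscInt       k;
>>   PetscFunctionBeginUser;
>>   ierr = MatCreate(comm, A); CHKERRQ(ierr);
>>   ierr = MatSetSizes(*A, nlocal, nlocal, nglobal, nglobal); CHKERRQ(ierr);
>>   ierr = MatSetFromOptions(*A); CHKERRQ(ierr);
>>   ierr = MatSetUp(*A); CHKERRQ(ierr);
>>   for (k = 0; k < nnz; k++) {   /* insert one coefficient at a time from the three arrays */
>>     ierr = MatSetValues(*A, 1, &rows[k], 1, &cols[k], &vals[k], INSERT_VALUES); CHKERRQ(ierr);
>>   }
>>   ierr = MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>>   ierr = MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>>   PetscFunctionReturn(0);
>> }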
>> 
>> I started with the direct solver. SuperLU works fine and gives good results compared to previous work. However, it is expensive and not practical for fine grids. The iterative solvers, on the other hand, did not work, and here is what I did:
>> 
>> I first obtained the converged solution using SuperLU. I then restarted from that converged solution and did one iteration using -pc_type lu -pc_factor_mat_solver_package superlu_dist -log_summary. Again, it gave me the same converged solution.
>> 
>> After that I started from the converged solution once more, and this time I tried different combinations of iterative solvers and preconditioners, such as the following:
>> -ksp_type gmres -ksp_gmres_restart 300 -pc_type asm -sub_pc_type lu -ksp_monitor_true_residual -ksp_converged_reason -ksp_view -log_summary
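>> 
>> (As an aside, roughly how these options map onto the KSP/PC API in C, in case that clarifies what is being requested -- a sketch only, not the actual code; A, b, and x are the assembled matrix and vectors:)
>> 
>>   KSP ksp;
>>   PC  pc;
>>   ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
>>   ierr = KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN); CHKERRQ(ierr); /* newer PETSc (>= 3.5) drops the MatStructure argument */
>>   ierr = KSPSetType(ksp, KSPGMRES); CHKERRQ(ierr);
>>   ierr = KSPGMRESSetRestart(ksp, 300); CHKERRQ(ierr);
>>   ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
>>   ierr = PCSetType(pc, PCASM); CHKERRQ(ierr);
>>   /* -sub_pc_type lu is applied to the subdomain solvers when the options
>>      database is processed; KSPSetFromOptions also picks up -ksp_rtol etc. */
>>   ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);
>>   ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);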
>> 
>> and here is the report:
>> Linear solve converged due to CONVERGED_RTOL iterations 41
>> KSP Object: 8 MPI processes
>> type: gmres
>>   GMRES: restart=300, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>   GMRES: happy breakdown tolerance 1e-30
>> maximum iterations=10000, initial guess is zero
>> tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
>> left preconditioning
>> using PRECONDITIONED norm type for convergence test
>> PC Object: 8 MPI processes
>> type: asm
>>   Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
>>   Additive Schwarz: restriction/interpolation type - RESTRICT
>>   Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:  (sub_)   1 MPI processes
>>   type: preonly
>>   maximum iterations=10000, initial guess is zero
>>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>   left preconditioning
>>   using NONE norm type for convergence test
>> PC Object:  (sub_)   1 MPI processes
>>   type: lu
>>     LU: out-of-place factorization
>>     tolerance for zero pivot 1e-12
>>     matrix ordering: nd
>>     factor fill ratio given 5, needed 3.70575
>>       Factored matrix follows:
>>         Matrix Object:           1 MPI processes
>>           type: seqaij
>>           rows=5630, cols=5630
>>           package used to perform factorization: petsc
>>           total: nonzeros=877150, allocated nonzeros=877150
>>           total number of mallocs used during MatSetValues calls =0
>>             using I-node routines: found 1126 nodes, limit used is 5
>>   linear system matrix = precond matrix:
>>   Matrix Object:     1 MPI processes
>>     type: seqaij
>>     rows=5630, cols=5630
>>     total: nonzeros=236700, allocated nonzeros=236700
>>     total number of mallocs used during MatSetValues calls =0
>>       using I-node routines: found 1126 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Matrix Object:   8 MPI processes
>>   type: mpiaij
>>   rows=41000, cols=41000
>>   total: nonzeros=1817800, allocated nonzeros=2555700
>>   total number of mallocs used during MatSetValues calls =121180
>>     using I-node (on process 0) routines: found 1025 nodes, limit used is 5
>> 
>> But the results are far from the converged solution. For example, the pressure at two reference nodes is compared:
>> 
>> Based on SuperLU
>> Channel Inlet pressure (MIXTURE):      0.38890D-01
>> Channel Inlet pressure (LIQUID):       0.38416D-01
>> 
>> Based on GMRES
>> Channel Inlet pressure (MIXTURE):     -0.87214D+00
>> Channel Inlet pressure (LIQUID):      -0.87301D+00
>> 
>> 
>> I also tried this:
>> -ksp_type gcr -pc_type asm -ksp_diagonal_scale -ksp_diagonal_scale_fix -ksp_monitor_true_residual -ksp_converged_reason -ksp_view -log_summary
>> 
>> and here is the report:
>> 0 KSP unpreconditioned resid norm 2.248340888101e+05 true resid norm 2.248340888101e+05 ||r(i)||/||b|| 1.000000000000e+00
>> 1 KSP unpreconditioned resid norm 4.900010460179e+04 true resid norm 4.900010460179e+04 ||r(i)||/||b|| 2.179389471637e-01
>> 2 KSP unpreconditioned resid norm 4.267761572746e+04 true resid norm 4.267761572746e+04 ||r(i)||/||b|| 1.898182608933e-01
>> 3 KSP unpreconditioned resid norm 2.041242251471e+03 true resid norm 2.041242251471e+03 ||r(i)||/||b|| 9.078882398457e-03
>> 4 KSP unpreconditioned resid norm 1.852885420564e+03 true resid norm 1.852885420564e+03 ||r(i)||/||b|| 8.241123178296e-03
>> 5 KSP unpreconditioned resid norm 1.748965594395e+02 true resid norm 1.748965594395e+02 ||r(i)||/||b|| 7.778916460804e-04
>> 6 KSP unpreconditioned resid norm 5.664539353996e+01 true resid norm 5.664539353996e+01 ||r(i)||/||b|| 2.519430831852e-04
>> 7 KSP unpreconditioned resid norm 3.607535692806e+01 true resid norm 3.607535692806e+01 ||r(i)||/||b|| 1.604532351788e-04
>> 8 KSP unpreconditioned resid norm 1.041501303366e+01 true resid norm 1.041501303366e+01 ||r(i)||/||b|| 4.632310468924e-05
>> 9 KSP unpreconditioned resid norm 3.089920380322e+00 true resid norm 3.089920380322e+00 ||r(i)||/||b|| 1.374311340720e-05
>> 10 KSP unpreconditioned resid norm 1.456883209806e+00 true resid norm 1.456883209806e+00 ||r(i)||/||b|| 6.479814593583e-06
>> 11 KSP unpreconditioned resid norm 5.566902714391e-01 true resid norm 5.566902714391e-01 ||r(i)||/||b|| 2.476004748147e-06
>> 12 KSP unpreconditioned resid norm 2.403913756663e-01 true resid norm 2.403913756663e-01 ||r(i)||/||b|| 1.069194520006e-06
>> 13 KSP unpreconditioned resid norm 1.650435118839e-01 true resid norm 1.650435118839e-01 ||r(i)||/||b|| 7.340680088032e-07
>> Linear solve converged due to CONVERGED_RTOL iterations 13
>> KSP Object: 8 MPI processes
>> type: gcr
>>   GCR: restart = 30
>>   GCR: restarts performed = 1
>> maximum iterations=10000, initial guess is zero
>> tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
>> right preconditioning
>> diagonally scaled system
>> using UNPRECONDITIONED norm type for convergence test
>> PC Object: 8 MPI processes
>> type: asm
>>   Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
>>   Additive Schwarz: restriction/interpolation type - RESTRICT
>>   Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:  (sub_)   1 MPI processes
>>   type: preonly
>>   maximum iterations=10000, initial guess is zero
>>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>   left preconditioning
>>   using NONE norm type for convergence test
>> PC Object:  (sub_)   1 MPI processes
>>   type: ilu
>>     ILU: out-of-place factorization
>>     0 levels of fill
>>     tolerance for zero pivot 1e-12
>>     using diagonal shift to prevent zero pivot
>>     matrix ordering: natural
>>     factor fill ratio given 1, needed 1
>>       Factored matrix follows:
>>         Matrix Object:           1 MPI processes
>>           type: seqaij
>>           rows=5630, cols=5630
>>           package used to perform factorization: petsc
>>           total: nonzeros=236700, allocated nonzeros=236700
>>           total number of mallocs used during MatSetValues calls =0
>>             using I-node routines: found 1126 nodes, limit used is 5
>>   linear system matrix = precond matrix:
>>   Matrix Object:     1 MPI processes
>>     type: seqaij
>>     rows=5630, cols=5630
>>     total: nonzeros=236700, allocated nonzeros=236700
>>     total number of mallocs used during MatSetValues calls =0
>>       using I-node routines: found 1126 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Matrix Object:   8 MPI processes
>>   type: mpiaij
>>   rows=41000, cols=41000
>>   total: nonzeros=1817800, allocated nonzeros=2555700
>>   total number of mallocs used during MatSetValues calls =121180
>>     using I-node (on process 0) routines: found 1025 nodes, limit used is 5
>> 
>> Channel Inlet pressure (MIXTURE):      -0.90733D+00
>> Channel Inlet pressure (LIQUID):      -0.10118D+01
>> 
>> 
>> As you can see, these are completely different results, nowhere near the converged solution.
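>> 
>> (To quantify the difference, a small sketch -- x_direct and x_iter are placeholder Vec names for the SuperLU and GMRES solutions:)
>> 
>>   Vec       diff;
>>   PetscReal nrm_diff, nrm_ref;
>>   ierr = VecDuplicate(x_direct, &diff); CHKERRQ(ierr);
>>   ierr = VecCopy(x_iter, diff); CHKERRQ(ierr);
>>   ierr = VecAXPY(diff, -1.0, x_direct); CHKERRQ(ierr);   /* diff = x_iter - x_direct */
>>   ierr = VecNorm(diff, NORM_2, &nrm_diff); CHKERRQ(ierr);
>>   ierr = VecNorm(x_direct, NORM_2, &nrm_ref); CHKERRQ(ierr);
>>   ierr = PetscPrintf(PETSC_COMM_WORLD, "relative difference = %g\n", (double)(nrm_diff/nrm_ref)); CHKERRQ(ierr);
>>   ierr = VecDestroy(&diff); CHKERRQ(ierr);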
>> 
>> Since I want to use fine grids, I need an iterative solver. I wonder if I am missing something or using the wrong solver/preconditioner/options. I would appreciate it if you could help me (as always).
>> 
>> -- 
>> With Best Regards;
>> Foad
>> 