[petsc-users] ASM vs GASM
Smith, Barry F.
bsmith at mcs.anl.gov
Tue Apr 30 23:49:53 CDT 2019
1) Run under valgrind.
2) Confirm the index sets (IS) are identical for ASM and GASM (this is easy on one process). Confirm the KSP and PC for the subdomains are the same.
3) Run with convergence monitoring for the two subdomains (with ASM and GASM); use Richardson for the inner solvers so they print their convergence histories. Do they start with the same residual? If not, that shows they are being set up differently.
4) Conclude it is probably a bug in GASM (or GASM's error checking). Send the code to petsc-maint at mcs.anl.gov if possible.
Barry
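
For step 3, a sketch of the options one might compare side by side. These are standard PETSc options, but the `./main -d 3 -n 10` part is taken from the runs later in this thread and the `sub_` prefix assumes the default subdomain-solver prefix shown in the -ksp_view output:

```
./main -d 3 -n 10 -ksp_type fgmres -pc_type asm  \
    -sub_ksp_type richardson -sub_ksp_monitor -ksp_monitor_true_residual
./main -d 3 -n 10 -ksp_type fgmres -pc_type gasm \
    -sub_ksp_type richardson -sub_ksp_monitor -ksp_monitor_true_residual
```

If the two runs print different subdomain residual histories at iteration 0, the subproblems are being set up differently before any solver difference comes into play.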
> On Apr 30, 2019, at 11:06 PM, Boyce Griffith <boyceg at gmail.com> wrote:
>
>
>> On Apr 30, 2019, at 11:41 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>>
>>
>> Preconditioned residual falling nicely while true residual gets stuck usually indicates 1) null space not properly handled 2) nonlinearity (unintentional) inside PC 3) very small pivots in factorization
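
A toy illustration (not the poster's actual problem) of why a singular preconditioner produces exactly this symptom, using a hypothetical 2x2 system: the preconditioned residual can vanish while the true residual stays large, because the singular operator annihilates the unresolved component.

```python
import numpy as np

# A is nonsingular, but the "preconditioner" M is singular:
# it annihilates the second component of any residual.
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
b = np.array([1.0, 1.0])
M = np.array([[0.5, 0.0],
              [0.0, 0.0]])  # rank 1, i.e. singular

x = np.array([0.5, 0.0])  # satisfies row 1 of Ax = b only
r_true = b - A @ x        # true residual: [0, 1]
r_prec = M @ r_true       # preconditioned residual: [0, 0]

print(np.linalg.norm(r_true))  # stuck in the true norm
print(np.linalg.norm(r_prec))  # "converged" in the preconditioned norm
```

Any Krylov method monitoring only the preconditioned norm would declare this converged, which is why -ksp_monitor_true_residual is the right diagnostic here.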
>
> Sure.
>
> Regarding (1), in the problem, one of the subdomains is just Laplace, whereas the other subdomain is a coupled parabolic/elliptic system. The elliptic equations are coupled at the subdomain interface. ILU and LU are both able to factor the submatrices without complaint and without special nullspace handling. I will add an extra Dirichlet boundary condition and see if it makes any difference.
>
> Avoiding (2) is why I usually use FGMRES as the outer solver.
>
> However, if I set up the “same” preconditioner using ASM instead of GASM, everything appears to work (e.g., the solver converges quickly with reasonable overlaps and relatively good subdomain solves). I am doing everything in serial — shouldn’t ASM and GASM do (nearly) the same thing?
>
> Also, FWIW, I can do the same problem just using FIELDSPLIT, and that seems to converge pretty well.
>
>> What happens if you use -ksp_type fgmres ? Same behavior?
>
> With FGMRES, the residual stalls out (around 1e-4).
>
>> -ksp_pc_side right (with gmres)?
>
> With GMRES and right preconditioning, the residual stalls out (around 1e-2).
>
>> Could also be using uninitialized data inside GASM, which would mimic unintentional nonlinearity in the PC; run under valgrind.
>
> Well, I actually get a seg fault from the code when I run with valgrind, but I am not sure it is real. (I usually use MPICH with valgrind, but right now I am running Open MPI on macOS — I recently got an updated MPICH from Homebrew that was breaking everything!) I will try this on a different system to see if I can get more information.
>
>> Barry
>>
>>
>>> On Apr 30, 2019, at 10:33 PM, Boyce Griffith <boyceg at gmail.com> wrote:
>>>
>>>
>>>> On Apr 30, 2019, at 6:22 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>>>>
>>>>
>>>> Boyce,
>>>>
>>>> I noticed that in the KSPView you sent the solver inside GASM was fgmres, I don't know why!
>>>
>>> Thanks for checking! I don’t know why either. I often use FGMRES for outer KSPs, and probably used it for the inner ones without thinking. However, switching to preonly does not appear to help here:
>>>
>>> ==========
>>>
>>> Using GMRES as the outer KSP with ILU for the subdomain PC:
>>>
>>> ./main -d 3 -n 10 -ksp_type gmres -ksp_max_it 20 -ksp_monitor_true_residual -ksp_view
>>>
>>> 0 KSP preconditioned resid norm 7.956172961039e-01 true resid norm 5.249964981356e+02 ||r(i)||/||b|| 1.000000000000e+00
>>> 1 KSP preconditioned resid norm 1.790924766864e-01 true resid norm 6.415624322140e+01 ||r(i)||/||b|| 1.222031831626e-01
>>> 2 KSP preconditioned resid norm 1.021248567184e-02 true resid norm 1.531971188768e+00 ||r(i)||/||b|| 2.918059823653e-03
>>> 3 KSP preconditioned resid norm 1.245290521533e-03 true resid norm 1.344228387002e-01 ||r(i)||/||b|| 2.560452101634e-04
>>> 4 KSP preconditioned resid norm 1.090553693795e-04 true resid norm 1.055832655005e-01 ||r(i)||/||b|| 2.011123233687e-04
>>> 5 KSP preconditioned resid norm 5.774209753120e-06 true resid norm 1.045883881261e-01 ||r(i)||/||b|| 1.992173062059e-04
>>> 6 KSP preconditioned resid norm 4.856074176119e-07 true resid norm 1.047048288204e-01 ||r(i)||/||b|| 1.994390994840e-04
>>> 7 KSP preconditioned resid norm 3.569834574103e-08 true resid norm 1.046981572164e-01 ||r(i)||/||b|| 1.994263915821e-04
>>> 8 KSP preconditioned resid norm 1.914112077836e-09 true resid norm 1.046980716404e-01 ||r(i)||/||b|| 1.994262285791e-04
>>> 9 KSP preconditioned resid norm 9.380275080687e-11 true resid norm 1.046980942278e-01 ||r(i)||/||b|| 1.994262716030e-04
>>> 10 KSP preconditioned resid norm 3.490998066884e-12 true resid norm 1.046980929565e-01 ||r(i)||/||b|| 1.994262691815e-04
>>> 11 KSP preconditioned resid norm 2.275544655754e-13 true resid norm 1.046980929905e-01 ||r(i)||/||b|| 1.994262692463e-04
>>> KSP Object: 1 MPI processes
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, nonzero initial guess
>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI processes
>>> type: gasm
>>> Restriction/interpolation type: RESTRICT
>>> requested amount of overlap = 0
>>> total number of subdomains = 2
>>> max number of local subdomains = 2
>>> [0|1] number of locally-supported subdomains = 2
>>> Subdomain solver info is as follows:
>>> - - - - - - - - - - - - - - - - - -
>>> [0|1] (subcomm [0|1]) local subdomain number 0, local size = 484
>>> KSP Object: (sub_) 1 MPI processes
>>> type: preonly
>>> maximum iterations=10000, initial guess is zero
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>> left preconditioning
>>> using NONE norm type for convergence test
>>> PC Object: (sub_) 1 MPI processes
>>> type: ilu
>>> out-of-place factorization
>>> 0 levels of fill
>>> tolerance for zero pivot 2.22045e-14
>>> matrix ordering: natural
>>> factor fill ratio given 1., needed 1.
>>> Factored matrix follows:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=484, cols=484
>>> package used to perform factorization: petsc
>>> total: nonzeros=15376, allocated nonzeros=15376
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> linear system matrix = precond matrix:
>>> Mat Object: () 1 MPI processes
>>> type: seqaij
>>> rows=484, cols=484
>>> total: nonzeros=15376, allocated nonzeros=15376
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> - - - - - - - - - - - - - - - - - -
>>> [0|1] (subcomm [0|1]) local subdomain number 1, local size = 121
>>> KSP Object: (sub_) 1 MPI processes
>>> type: preonly
>>> maximum iterations=10000, initial guess is zero
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>> left preconditioning
>>> using NONE norm type for convergence test
>>> PC Object: (sub_) 1 MPI processes
>>> type: ilu
>>> out-of-place factorization
>>> 0 levels of fill
>>> tolerance for zero pivot 2.22045e-14
>>> matrix ordering: natural
>>> factor fill ratio given 1., needed 1.
>>> Factored matrix follows:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=121, cols=121
>>> package used to perform factorization: petsc
>>> total: nonzeros=961, allocated nonzeros=961
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> linear system matrix = precond matrix:
>>> Mat Object: () 1 MPI processes
>>> type: seqaij
>>> rows=121, cols=121
>>> total: nonzeros=961, allocated nonzeros=961
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> - - - - - - - - - - - - - - - - - -
>>>
>>> This shows convergence in the preconditioned norm but not in the true norm, and the solution appears nonphysical.
>>>
>>> ==========
>>>
>>> Using GMRES for the outer KSP with LU for the subdomain PC:
>>>
>>> ./main -d 3 -n 10 -ksp_type gmres -sub_pc_type lu -ksp_monitor_true_residual -ksp_view
>>>
>>> 0 KSP preconditioned resid norm 7.587523490344e-01 true resid norm 5.249964981356e+02 ||r(i)||/||b|| 1.000000000000e+00
>>> 1 KSP preconditioned resid norm 1.452294080360e-15 true resid norm 1.026159563147e-01 ||r(i)||/||b|| 1.954602681716e-04
>>> KSP Object: 1 MPI processes
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=5000, nonzero initial guess
>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI processes
>>> type: gasm
>>> Restriction/interpolation type: RESTRICT
>>> requested amount of overlap = 0
>>> total number of subdomains = 2
>>> max number of local subdomains = 2
>>> [0|1] number of locally-supported subdomains = 2
>>> Subdomain solver info is as follows:
>>> - - - - - - - - - - - - - - - - - -
>>> [0|1] (subcomm [0|1]) local subdomain number 0, local size = 484
>>> KSP Object: (sub_) 1 MPI processes
>>> type: preonly
>>> maximum iterations=10000, initial guess is zero
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>> left preconditioning
>>> using NONE norm type for convergence test
>>> PC Object: (sub_) 1 MPI processes
>>> type: lu
>>> out-of-place factorization
>>> tolerance for zero pivot 2.22045e-14
>>> matrix ordering: nd
>>> factor fill ratio given 5., needed 2.69979
>>> Factored matrix follows:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=484, cols=484
>>> package used to perform factorization: petsc
>>> total: nonzeros=41512, allocated nonzeros=41512
>>> total number of mallocs used during MatSetValues calls =0
>>> using I-node routines: found 194 nodes, limit used is 5
>>> linear system matrix = precond matrix:
>>> Mat Object: () 1 MPI processes
>>> type: seqaij
>>> rows=484, cols=484
>>> total: nonzeros=15376, allocated nonzeros=15376
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> - - - - - - - - - - - - - - - - - -
>>> [0|1] (subcomm [0|1]) local subdomain number 1, local size = 121
>>> KSP Object: (sub_) 1 MPI processes
>>> type: preonly
>>> maximum iterations=10000, initial guess is zero
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>> left preconditioning
>>> using NONE norm type for convergence test
>>> PC Object: (sub_) 1 MPI processes
>>> type: lu
>>> out-of-place factorization
>>> tolerance for zero pivot 2.22045e-14
>>> matrix ordering: nd
>>> factor fill ratio given 5., needed 2.57128
>>> Factored matrix follows:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=121, cols=121
>>> package used to perform factorization: petsc
>>> total: nonzeros=2471, allocated nonzeros=2471
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> linear system matrix = precond matrix:
>>> Mat Object: () 1 MPI processes
>>> type: seqaij
>>> rows=121, cols=121
>>> total: nonzeros=961, allocated nonzeros=961
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> - - - - - - - - - - - - - - - - - -
>>>
>>>
>>>
>>> ==========
>>>
>>> Using FGMRES as the outer KSP:
>>>
>>> ./main -d 3 -n 10 -ksp_type fgmres -ksp_max_it 20 -ksp_monitor_true_residual -ksp_view
>>>
>>> 0 KSP unpreconditioned resid norm 5.249964981356e+02 true resid norm 5.249964981356e+02 ||r(i)||/||b|| 1.000000000000e+00
>>> 1 KSP unpreconditioned resid norm 9.908062627364e+00 true resid norm 9.908062627364e+00 ||r(i)||/||b|| 1.887262612713e-02
>>> 2 KSP unpreconditioned resid norm 1.198244246143e+00 true resid norm 1.198244246143e+00 ||r(i)||/||b|| 2.282385216660e-03
>>> 3 KSP unpreconditioned resid norm 1.011712741080e-01 true resid norm 1.011712741080e-01 ||r(i)||/||b|| 1.927084741847e-04
>>> 4 KSP unpreconditioned resid norm 9.890789240724e-02 true resid norm 9.890789240713e-02 ||r(i)||/||b|| 1.883972421881e-04
>>> 5 KSP unpreconditioned resid norm 9.790912262081e-02 true resid norm 9.790912262142e-02 ||r(i)||/||b|| 1.864948108590e-04
>>> 6 KSP unpreconditioned resid norm 9.758733533827e-02 true resid norm 9.758733534329e-02 ||r(i)||/||b|| 1.858818786218e-04
>>> 7 KSP unpreconditioned resid norm 9.754239526505e-02 true resid norm 9.754239528968e-02 ||r(i)||/||b|| 1.857962779487e-04
>>> 8 KSP unpreconditioned resid norm 9.732735871034e-02 true resid norm 9.732735925876e-02 ||r(i)||/||b|| 1.853866827767e-04
>>> 9 KSP unpreconditioned resid norm 9.681037606662e-02 true resid norm 9.681036910553e-02 ||r(i)||/||b|| 1.844019330592e-04
>>> 10 KSP unpreconditioned resid norm 9.672261014893e-02 true resid norm 9.672255386871e-02 ||r(i)||/||b|| 1.842346648258e-04
>>> 11 KSP unpreconditioned resid norm 9.636675313899e-02 true resid norm 9.636858548398e-02 ||r(i)||/||b|| 1.835604348338e-04
>>> 12 KSP unpreconditioned resid norm 9.626809760271e-02 true resid norm 9.624997274413e-02 ||r(i)||/||b|| 1.833345042985e-04
>>> 13 KSP unpreconditioned resid norm 9.538127889520e-02 true resid norm 9.646040937528e-02 ||r(i)||/||b|| 1.837353386505e-04
>>> 14 KSP unpreconditioned resid norm 7.698037210447e-02 true resid norm 1.004972547621e-01 ||r(i)||/||b|| 1.914246192480e-04
>>> 15 KSP unpreconditioned resid norm 6.963471194663e-02 true resid norm 1.021051124617e-01 ||r(i)||/||b|| 1.944872257707e-04
>>> 16 KSP unpreconditioned resid norm 6.288795988348e-02 true resid norm 1.048077408458e-01 ||r(i)||/||b|| 1.996351236971e-04
>>> 17 KSP unpreconditioned resid norm 5.576252453977e-02 true resid norm 1.071287580389e-01 ||r(i)||/||b|| 2.040561383159e-04
>>> 18 KSP unpreconditioned resid norm 5.096717927451e-02 true resid norm 1.092140137471e-01 ||r(i)||/||b|| 2.080280804443e-04
>>> 19 KSP unpreconditioned resid norm 4.698350635435e-02 true resid norm 1.104283980783e-01 ||r(i)||/||b|| 2.103412088852e-04
>>> 20 KSP unpreconditioned resid norm 4.387914680239e-02 true resid norm 1.116698560164e-01 ||r(i)||/||b|| 2.127059064451e-04
>>> KSP Object: 1 MPI processes
>>> type: fgmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, nonzero initial guess
>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
>>> right preconditioning
>>> using UNPRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI processes
>>> type: gasm
>>> Restriction/interpolation type: RESTRICT
>>> requested amount of overlap = 0
>>> total number of subdomains = 2
>>> max number of local subdomains = 2
>>> [0|1] number of locally-supported subdomains = 2
>>> Subdomain solver info is as follows:
>>> - - - - - - - - - - - - - - - - - -
>>> [0|1] (subcomm [0|1]) local subdomain number 0, local size = 484
>>> KSP Object: (sub_) 1 MPI processes
>>> type: preonly
>>> maximum iterations=10000, initial guess is zero
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>> left preconditioning
>>> using NONE norm type for convergence test
>>> PC Object: (sub_) 1 MPI processes
>>> type: ilu
>>> out-of-place factorization
>>> 0 levels of fill
>>> tolerance for zero pivot 2.22045e-14
>>> matrix ordering: natural
>>> factor fill ratio given 1., needed 1.
>>> Factored matrix follows:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=484, cols=484
>>> package used to perform factorization: petsc
>>> total: nonzeros=15376, allocated nonzeros=15376
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> linear system matrix = precond matrix:
>>> Mat Object: () 1 MPI processes
>>> type: seqaij
>>> rows=484, cols=484
>>> total: nonzeros=15376, allocated nonzeros=15376
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> - - - - - - - - - - - - - - - - - -
>>> [0|1] (subcomm [0|1]) local subdomain number 1, local size = 121
>>> KSP Object: (sub_) 1 MPI processes
>>> type: preonly
>>> maximum iterations=10000, initial guess is zero
>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>> left preconditioning
>>> using NONE norm type for convergence test
>>> PC Object: (sub_) 1 MPI processes
>>> type: ilu
>>> out-of-place factorization
>>> 0 levels of fill
>>> tolerance for zero pivot 2.22045e-14
>>> matrix ordering: natural
>>> factor fill ratio given 1., needed 1.
>>> Factored matrix follows:
>>> Mat Object: 1 MPI processes
>>> type: seqaij
>>> rows=121, cols=121
>>> package used to perform factorization: petsc
>>> total: nonzeros=961, allocated nonzeros=961
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> linear system matrix = precond matrix:
>>> Mat Object: () 1 MPI processes
>>> type: seqaij
>>> rows=121, cols=121
>>> total: nonzeros=961, allocated nonzeros=961
>>> total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> - - - - - - - - - - - - - - - - - -
>>>
>>> ==========
>>>
>>> Again, note that using the same subdomains, ASM appears to work fine.
>>>
>>>> This would explain why the outer GMRES had inconsistent residuals. If you switch the inner solver to preonly + LU for GASM, what happens?
>>>>
>>>>> On Apr 30, 2019, at 11:36 AM, Boyce Griffith via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> On Apr 30, 2019, at 12:31 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>>>>>
>>>>>> When I said it was singular I was looking at "preconditioned residual norm to an rtol of 1e-12. If I look at the true residual norm, however, it stagnates around 1e-4."
>>>>>>
>>>>>> This is not what I am seeing in this output.
>>>>>
>>>>> Whoops, I switched the outer KSP from GMRES to FGMRES.
>>>>>
>>>>> With GMRES, the preconditioned residual norm drops nicely (but, of course, the solution is no good):
>>>>>
>>>>> $ ./main -d 3 -n 10 -ksp_type gmres -ksp_monitor_true_residual
>>>>> Running ./main -d 3 -n 10 -ksp_type gmres -ksp_monitor_true_residual
>>>>>
>>>>> 0 KSP preconditioned resid norm 7.954859454640e-01 true resid norm 5.249964981356e+02 ||r(i)||/||b|| 1.000000000000e+00
>>>>> 1 KSP preconditioned resid norm 1.791745669837e-01 true resid norm 6.420515097608e+01 ||r(i)||/||b|| 1.222963414120e-01
>>>>> 2 KSP preconditioned resid norm 1.018932536518e-02 true resid norm 1.538149013353e+00 ||r(i)||/||b|| 2.929827187068e-03
>>>>> 3 KSP preconditioned resid norm 1.247250041620e-03 true resid norm 1.231074134137e-01 ||r(i)||/||b|| 2.344918753761e-04
>>>>> 4 KSP preconditioned resid norm 1.090687825399e-04 true resid norm 9.214204251786e-02 ||r(i)||/||b|| 1.755098231037e-04
>>>>> 5 KSP preconditioned resid norm 5.773017036638e-06 true resid norm 9.199655161085e-02 ||r(i)||/||b|| 1.752326957181e-04
>>>>> 6 KSP preconditioned resid norm 4.880868222010e-07 true resid norm 9.199488147685e-02 ||r(i)||/||b|| 1.752295144892e-04
>>>>> 7 KSP preconditioned resid norm 3.528569945380e-08 true resid norm 9.199485972669e-02 ||r(i)||/||b|| 1.752294730601e-04
>>>>> 8 KSP preconditioned resid norm 1.875782938387e-09 true resid norm 9.199486015879e-02 ||r(i)||/||b|| 1.752294738832e-04
>>>>> 9 KSP preconditioned resid norm 8.952213054230e-11 true resid norm 9.199486012037e-02 ||r(i)||/||b|| 1.752294738100e-04
>>>>> 10 KSP preconditioned resid norm 3.450175063457e-12 true resid norm 9.199486011997e-02 ||r(i)||/||b|| 1.752294738092e-04
>>>>> 11 KSP preconditioned resid norm 2.186653062508e-13 true resid norm 9.199486012016e-02 ||r(i)||/||b|| 1.752294738096e-04
>>>>>
>>>>>> It is just a poor PC. The big drop in the residual at the beginning is suspicious. When you solve this problem well, have you checked that you are getting a good solution?
>>>>>
>>>>> Yes indeed, the solution appears to be no good with this preconditioner.
>>>>>
>>>>> Note that all that the PC is doing is applying GASM to split the problem onto two subdomains.
>>>>>
>>>>>> That is, have you checked that your model is not messed up, e.g., a bad mesh?
>>>>>
>>>>> This is a uniform HEX mesh automatically generated by libMesh.
>>>>>
>>>>>> On Tue, Apr 30, 2019 at 11:35 AM Boyce Griffith via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>>>>
>>>>>>
>>>>>>> On Apr 30, 2019, at 11:23 AM, Fande Kong <fdkong.jd at gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Apr 30, 2019 at 7:40 AM Boyce Griffith via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Apr 30, 2019, at 9:06 AM, Mark Adams <mfadams at lbl.gov> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Allowing GASM to construct the "outer" subdomains from the non-overlapping "inner" subdomains, and using "exact" subdomain solvers (subdomain KSPs are using FGMRES+ILU with an rtol of 1e-12), I get convergence in ~2 iterations in the preconditioned residual norm to an rtol of 1e-12. If I look at the true residual norm, however, it stagnates around 1e-4.
>>>>>>>>
>>>>>>>>
>>>>>>>> That PC is singular.
>>>>>>>
>>>>>>> Yes. I am confused about why GASM is giving a singular PC but ASM is not with the same subdomains.
>>>>>>>
>>>>>>> We could not tell much without detailed solver setup information. The overlapping function was implemented by me and Dmitry a couple of years ago, and it is tricky when a subdomain is shared by multiple cores. Would you mind starting from our example, or setting up an example for us that demonstrates your issue?
>>>>>>>
>>>>>>> At least please use "-ksp_view" (linear solver) or "-snes_view" (nonlinear solver) to print more information; that will help us a bit.
>>>>>>
>>>>>> Here you go:
>>>>>>
>>>>>> $ ./main -d 3 -n 10 -ksp_type fgmres -sub_ksp_type fgmres -sub_pc_type ilu -sub_ksp_rtol 1.0e-12 -ksp_converged_reason -ksp_monitor_true_residual -ksp_max_it 10 -ksp_view
>>>>>> make: `main' is up to date.
>>>>>> Running ./main -d 3 -n 10 -ksp_type fgmres -sub_ksp_type fgmres -sub_pc_type ilu -sub_ksp_rtol 1.0e-12 -ksp_converged_reason -ksp_monitor_true_residual -ksp_max_it 10 -ksp_view
>>>>>>
>>>>>> 0 KSP unpreconditioned resid norm 5.249964981356e+02 true resid norm 5.249964981356e+02 ||r(i)||/||b|| 1.000000000000e+00
>>>>>> 1 KSP unpreconditioned resid norm 9.316296223724e-02 true resid norm 9.316296223724e-02 ||r(i)||/||b|| 1.774544450641e-04
>>>>>> 2 KSP unpreconditioned resid norm 9.314881028141e-02 true resid norm 9.314881028141e-02 ||r(i)||/||b|| 1.774274887779e-04
>>>>>> 3 KSP unpreconditioned resid norm 9.299990517422e-02 true resid norm 9.299918556770e-02 ||r(i)||/||b|| 1.771424874222e-04
>>>>>> 4 KSP unpreconditioned resid norm 9.224468272306e-02 true resid norm 9.393543403858e-02 ||r(i)||/||b|| 1.789258297382e-04
>>>>>> 5 KSP unpreconditioned resid norm 9.150828598034e-02 true resid norm 9.511673987375e-02 ||r(i)||/||b|| 1.811759510997e-04
>>>>>> 6 KSP unpreconditioned resid norm 9.078924839691e-02 true resid norm 1.013093335976e-01 ||r(i)||/||b|| 1.929714463951e-04
>>>>>> 7 KSP unpreconditioned resid norm 9.008689850931e-02 true resid norm 1.011099594157e-01 ||r(i)||/||b|| 1.925916835155e-04
>>>>>> 8 KSP unpreconditioned resid norm 8.940060065590e-02 true resid norm 1.090779251949e-01 ||r(i)||/||b|| 2.077688624253e-04
>>>>>> 9 KSP unpreconditioned resid norm 8.872975256529e-02 true resid norm 1.102873098599e-01 ||r(i)||/||b|| 2.100724676289e-04
>>>>>> 10 KSP unpreconditioned resid norm 8.807378313465e-02 true resid norm 1.071996745064e-01 ||r(i)||/||b|| 2.041912182026e-04
>>>>>> Linear solve did not converge due to DIVERGED_ITS iterations 10
>>>>>> KSP Object: 1 MPI processes
>>>>>> type: fgmres
>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>> happy breakdown tolerance 1e-30
>>>>>> maximum iterations=10, nonzero initial guess
>>>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
>>>>>> right preconditioning
>>>>>> using UNPRECONDITIONED norm type for convergence test
>>>>>> PC Object: 1 MPI processes
>>>>>> type: gasm
>>>>>> Restriction/interpolation type: RESTRICT
>>>>>> requested amount of overlap = 0
>>>>>> total number of subdomains = 2
>>>>>> max number of local subdomains = 2
>>>>>> [0|1] number of locally-supported subdomains = 2
>>>>>> Subdomain solver info is as follows:
>>>>>> - - - - - - - - - - - - - - - - - -
>>>>>> [0|1] (subcomm [0|1]) local subdomain number 0, local size = 484
>>>>>> KSP Object: (sub_) 1 MPI processes
>>>>>> type: fgmres
>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>> happy breakdown tolerance 1e-30
>>>>>> maximum iterations=10000, initial guess is zero
>>>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
>>>>>> right preconditioning
>>>>>> using UNPRECONDITIONED norm type for convergence test
>>>>>> PC Object: (sub_) 1 MPI processes
>>>>>> type: ilu
>>>>>> out-of-place factorization
>>>>>> 0 levels of fill
>>>>>> tolerance for zero pivot 2.22045e-14
>>>>>> matrix ordering: natural
>>>>>> factor fill ratio given 1., needed 1.
>>>>>> Factored matrix follows:
>>>>>> Mat Object: 1 MPI processes
>>>>>> type: seqaij
>>>>>> rows=484, cols=484
>>>>>> package used to perform factorization: petsc
>>>>>> total: nonzeros=15376, allocated nonzeros=15376
>>>>>> total number of mallocs used during MatSetValues calls =0
>>>>>> not using I-node routines
>>>>>> linear system matrix = precond matrix:
>>>>>> Mat Object: () 1 MPI processes
>>>>>> type: seqaij
>>>>>> rows=484, cols=484
>>>>>> total: nonzeros=15376, allocated nonzeros=15376
>>>>>> total number of mallocs used during MatSetValues calls =0
>>>>>> not using I-node routines
>>>>>> - - - - - - - - - - - - - - - - - -
>>>>>> [0|1] (subcomm [0|1]) local subdomain number 1, local size = 121
>>>>>> KSP Object: (sub_) 1 MPI processes
>>>>>> type: fgmres
>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>> happy breakdown tolerance 1e-30
>>>>>> maximum iterations=10000, initial guess is zero
>>>>>> tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
>>>>>> right preconditioning
>>>>>> using UNPRECONDITIONED norm type for convergence test
>>>>>> PC Object: (sub_) 1 MPI processes
>>>>>> type: ilu
>>>>>> out-of-place factorization
>>>>>> 0 levels of fill
>>>>>> tolerance for zero pivot 2.22045e-14
>>>>>> matrix ordering: natural
>>>>>> factor fill ratio given 1., needed 1.
>>>>>> Factored matrix follows:
>>>>>> Mat Object: 1 MPI processes
>>>>>> type: seqaij
>>>>>> rows=121, cols=121
>>>>>> package used to perform factorization: petsc
>>>>>> total: nonzeros=961, allocated nonzeros=961
>>>>>> total number of mallocs used during MatSetValues calls =0
>>>>>> not using I-node routines
>>>>>> linear system matrix = precond matrix:
>>>>>> Mat Object: () 1 MPI processes
>>>>>> type: seqaij
>>>>>> rows=121, cols=121
>>>>>> total: nonzeros=961, allocated nonzeros=961
>>>>>> total number of mallocs used during MatSetValues calls =0
>>>>>> not using I-node routines
>>>>>> - - - - - - - - - - - - - - - - - -
>>>>>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>>>>> [0]PETSC ERROR: Petsc has generated inconsistent data
>>>>>> [0]PETSC ERROR: Called more times than PetscViewerASCIIPushSynchronized()
>>>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>>>>>> [0]PETSC ERROR: Petsc Release Version 3.11.1, unknown
>>>>>> [0]PETSC ERROR: ./main on a darwin-dbg named boycesimacwork.dhcp.unc.edu by boyceg Tue Apr 30 11:29:02 2019
>>>>>> [0]PETSC ERROR: Configure options --CC=mpicc --CXX=mpicxx --FC=mpif90 --PETSC_ARCH=darwin-dbg --with-debugging=1 --with-c++-support=1 --with-hypre=1 --download-hypre=1 --with-hdf5=1 --with-hdf5-dir=/usr/local
>>>>>> [0]PETSC ERROR: #1 PetscViewerASCIIPopSynchronized() line 438 in /Users/boyceg/sfw/petsc/petsc-maint/src/sys/classes/viewer/impls/ascii/filev.c
>>>>>> [0]PETSC ERROR: #2 PCView_GASM() line 251 in /Users/boyceg/sfw/petsc/petsc-maint/src/ksp/pc/impls/gasm/gasm.c
>>>>>> [0]PETSC ERROR: #3 PCView() line 1651 in /Users/boyceg/sfw/petsc/petsc-maint/src/ksp/pc/interface/precon.c
>>>>>> [0]PETSC ERROR: #4 KSPView() line 213 in /Users/boyceg/sfw/petsc/petsc-maint/src/ksp/ksp/interface/itcreate.c
>>>>>> [0]PETSC ERROR: #5 PetscObjectView() line 100 in /Users/boyceg/sfw/petsc/petsc-maint/src/sys/objects/destroy.c
>>>>>> [0]PETSC ERROR: #6 ObjectView() line 14 in /Users/boyceg/sfw/petsc/petsc-maint/src/ksp/ksp/interface/itfunc.c
>>>>>> [0]PETSC ERROR: #7 KSPSolve() line 831 in /Users/boyceg/sfw/petsc/petsc-maint/src/ksp/ksp/interface/itfunc.c
>>>>>>
>>>>>>> One difference between them is that the default GASM overlap is 0 (https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGASMSetOverlap.html), but the default ASM overlap is 1 (https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCASMSetOverlap.html).
>>>>>>>
>>>>>>> For your particular application, you may not need any overlap, since there are two different PDEs on the two subdomains. It may be fine to run the PC without overlap. It is definitely an interesting application for GASM.
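
For reference, the overlap can be set explicitly on either PC so the comparison is apples-to-apples. These are the standard PETSc options (the value 1 is just an example; it matches ASM's default):

```
-pc_type asm  -pc_asm_overlap 1
-pc_type gasm -pc_gasm_overlap 1
```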
>>>>>>>
>>>>>>> Fande,
>>>>>>>
>>>>>>>
>>>>>>> However, changing the GASM overlap does not make any difference in the convergence history.
>>>>>>>
>>>>>>> -- Boyce
>>>>>>
>>>>>
>>>>
>>>
>>
>