[petsc-users] PCASMType

Boyce Griffith griffith at cims.nyu.edu
Thu Aug 4 20:26:25 CDT 2016


> On Aug 4, 2016, at 9:01 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
>> 
>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith <griffith at cims.nyu.edu> wrote:
>> 
>>> 
>>> On Aug 4, 2016, at 8:42 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>> 
>>> 
>>> History,
>>> 
>>> 1) I originally implemented the ASM with one subdomain per process
>>> 2) easily extended it to support multiple subdomains per process
>>> 3) added -pc_asm_type restrict etc., but it only worked for one subdomain per process because it took advantage of the fact that restrict etc. could be achieved by simply dropping the parallel communication in the vector scatters
>>> 4) Matt didn't like the restriction to one subdomain per process, so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local, even though it has nothing to do with locality), so that restrict etc. could be handled.
>>> 
>>> Unfortunately, IMHO Matt made a mess of things, because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 it does not handle -pc_asm_type restrict, since it cannot track is vs. is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 track the is vs. is_local index sets properly when -pc_asm_type is set. Also, the name is_local needs to be changed to something meaningful like is_nonoverlapping. This refactoring would also result in easier, cleaner code than is currently there.
>>> 
>>> So basically, until PCASM is refactored properly to handle restrict etc., you are stuck with being able to use restrict etc. ONLY if you specifically supply the overlapping and non-overlapping domains yourself with PCASMSetLocalSubdomains, and curse at Matt every day like we all do.
>> 
>> OK, got it. The reason I’m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains.
> 
>    But are you setting different is and is_local (stupid name) index sets, and not having PETSc compute the overlap, in your custom code? If you are setting them differently and not having PETSc compute the overlap but still getting identical convergence, then something is wrong and you likely have to run in the debugger to ensure that restrict etc. is properly being set and used.

Yes, we are computing both overlapping and non-overlapping ISes.
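
(For concreteness, here is a minimal sketch of the kind of setup I mean, not our actual smoother code: the two-block stride layout, the overlap width, and the helper name are purely illustrative.)

/* Minimal sketch: supply both the overlapping index sets (is) and the
 * non-overlapping ones (is_local) so that PC_ASM_RESTRICT / PC_ASM_INTERPOLATE
 * have something to restrict to. */
#include <petscksp.h>

PetscErrorCode SetupTwoBlockASM(PC pc, PetscInt n, PetscInt overlap)
{
  IS             is[2], is_local[2];
  PetscInt       half = n / 2;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* overlapping subdomains: each block extends "overlap" rows into its neighbor */
  ierr = ISCreateStride(PETSC_COMM_SELF, half + overlap, 0, 1, &is[0]);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_SELF, n - half + overlap, half - overlap, 1, &is[1]);CHKERRQ(ierr);
  /* non-overlapping subdomains (the badly named is_local argument) */
  ierr = ISCreateStride(PETSC_COMM_SELF, half, 0, 1, &is_local[0]);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_SELF, n - half, half, 1, &is_local[1]);CHKERRQ(ierr);

  ierr = PCASMSetLocalSubdomains(pc, 2, is, is_local);CHKERRQ(ierr);

  /* PCASM keeps its own references to the index sets, so drop ours */
  ierr = ISDestroy(&is[0]);CHKERRQ(ierr);
  ierr = ISDestroy(&is[1]);CHKERRQ(ierr);
  ierr = ISDestroy(&is_local[0]);CHKERRQ(ierr);
  ierr = ISDestroy(&is_local[1]);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}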

I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration — sorry, I should have checked this more carefully before emailing the list. (I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.)
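
(One thing we may try, sketched below with the same illustrative-only caveat, is to set the ASM type directly in code rather than relying on the options database; the helper name and the "smoother_" prefix are hypothetical.)

#include <petscksp.h>

/* Sketch only: force the ASM variant in code instead of relying on
 * -pc_asm_type, in case the command-line options are not reaching this PC. */
static PetscErrorCode ForceASMRestrict(PC pc)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PCASMSetType(pc, PC_ASM_RESTRICT);CHKERRQ(ierr);
  /* Alternatively, give the PC a prefix and re-read the options database so
   * that, e.g., -smoother_pc_asm_type restrict is picked up: */
  /* ierr = PCSetOptionsPrefix(pc, "smoother_");CHKERRQ(ierr); */
  /* ierr = PCSetFromOptions(pc);CHKERRQ(ierr); */
  PetscFunctionReturn(0);
}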

>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories):
> 
>  Yeah, better one question per email or we will miss them.
> 
>   There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to.

OK, so it's a similar story to the one above: we have a custom multiplicative Schwarz method (MSM) that, when used as an MG smoother, gives convergence rates that are about 2x those obtained with PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn't seem to help.
 
However, now I am questioning whether the settings are getting propagated into PCASM… I’ll need to take another look.
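
(If it turns out the options are not propagating, a sketch of setting the local composition type directly, again illustrative only; the result can be confirmed from the -ksp_view output.)

#include <petscksp.h>

/* Sketch only: select multiplicative composition of the local solves in code,
 * then check that -ksp_view reports
 * "local solve composition type - MULTIPLICATIVE". */
static PetscErrorCode UseMultiplicativeASM(PC pc)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PCASMSetLocalType(pc, PC_COMPOSITE_MULTIPLICATIVE);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}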

Thanks,

— Boyce

> 
>   Barry
> 
>> 
>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect. For this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly:
>>>> 
>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE
>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00
>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01
>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01
>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01
>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02
>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02
>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02
>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02
>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02
>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02
>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02
>>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03
>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03
>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03
>>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03
>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03
>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04
>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04
>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04
>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04
>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05
>>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05
>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05
>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06
>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06
>>>> KSP Object: 1 MPI processes
>>>> type: gmres
>>>>  GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>  GMRES: happy breakdown tolerance 1e-30
>>>> maximum iterations=10000, initial guess is zero
>>>> tolerances:  relative=9.18274e-06, absolute=1e-50, divergence=10000.
>>>> left preconditioning
>>>> using PRECONDITIONED norm type for convergence test
>>>> PC Object: 1 MPI processes
>>>> type: asm
>>>>  Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
>>>>  Additive Schwarz: restriction/interpolation type - BASIC
>>>>  Additive Schwarz: local solve composition type - MULTIPLICATIVE
>>>>  Local solve is same for all blocks, in the following KSP and PC objects:
>>>>  KSP Object:    (sub_)     1 MPI processes
>>>>    type: preonly
>>>>    maximum iterations=10000, initial guess is zero
>>>>    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>>>    left preconditioning
>>>>    using NONE norm type for convergence test
>>>>  PC Object:    (sub_)     1 MPI processes
>>>>    type: icc
>>>>      0 levels of fill
>>>>      tolerance for zero pivot 2.22045e-14
>>>>      using Manteuffel shift [POSITIVE_DEFINITE]
>>>>      matrix ordering: natural
>>>>      factor fill ratio given 1., needed 1.
>>>>        Factored matrix follows:
>>>>          Mat Object:             1 MPI processes
>>>>            type: seqsbaij
>>>>            rows=160, cols=160
>>>>            package used to perform factorization: petsc
>>>>            total: nonzeros=443, allocated nonzeros=443
>>>>            total number of mallocs used during MatSetValues calls =0
>>>>                block size is 1
>>>>    linear system matrix = precond matrix:
>>>>    Mat Object:       1 MPI processes
>>>>      type: seqaij
>>>>      rows=160, cols=160
>>>>      total: nonzeros=726, allocated nonzeros=726
>>>>      total number of mallocs used during MatSetValues calls =0
>>>>        not using I-node routines
>>>> linear system matrix = precond matrix:
>>>> Mat Object:   1 MPI processes
>>>>  type: seqaij
>>>>  rows=1024, cols=1024
>>>>  total: nonzeros=4992, allocated nonzeros=5120
>>>>  total number of mallocs used during MatSetValues calls =0
>>>>    not using I-node routines
>>>> Norm of error 0.000292304 iterations 24
>>>> 
>>>> Thanks,
>>>> 
>>>> -- Boyce
