[petsc-users] solving system with 2x2 block size
Barry Smith
bsmith at mcs.anl.gov
Tue Nov 15 18:50:23 CST 2016
Glad to hear it.
> On Nov 15, 2016, at 5:16 PM, Manav Bhatia <bhatiamanav at gmail.com> wrote:
>
> I tried the option "-mat_type baij" and that changed the matrix type to BAIJ.
>
> I am now experimenting with various preconditioners (ILU, ASM, etc.), and things seem to be working fine so far.
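>
> For reference, a representative command line that should reproduce the solver configuration shown below; the values are read off the -ksp_view output, and as before I have omitted the option prefixes from my actual command:
>
>   -mat_type baij -ksp_type gmres -ksp_rtol 1e-10 -pc_type asm -sub_ksp_type gmres -sub_pc_type ilu -sub_pc_factor_levels 3 -ksp_view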
>
> KSP Object:(fluid_complex_) 4 MPI processes
>   type: gmres
>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>     GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=10000, initial guess is zero
>   tolerances: relative=1e-10, absolute=1e-50, divergence=10000.
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object:(fluid_complex_) 4 MPI processes
>   type: asm
>     Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1
>     Additive Schwarz: restriction/interpolation type - RESTRICT
>   Local solve is same for all blocks, in the following KSP and PC objects:
>   KSP Object: (fluid_complex_sub_) 1 MPI processes
>     type: gmres
>       GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>       GMRES: happy breakdown tolerance 1e-30
>     maximum iterations=10000, initial guess is zero
>     tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>     left preconditioning
>     using PRECONDITIONED norm type for convergence test
>   PC Object: (fluid_complex_sub_) 1 MPI processes
>     type: ilu
>       ILU: out-of-place factorization
>       3 levels of fill
>       tolerance for zero pivot 2.22045e-14
>       matrix ordering: natural
>       factor fill ratio given 1., needed 2.32913
>       Factored matrix follows:
>         Mat Object: 1 MPI processes
>           type: seqbaij
>           rows=294752, cols=294752, bs=2
>           package used to perform factorization: petsc
>           total: nonzeros=49050496, allocated nonzeros=49050496
>           total number of mallocs used during MatSetValues calls =0
>           block size is 2
>     linear system matrix = precond matrix:
>     Mat Object: (fluid_complex_) 1 MPI processes
>       type: seqbaij
>       rows=294752, cols=294752, bs=2
>       total: nonzeros=21059584, allocated nonzeros=21059584
>       total number of mallocs used during MatSetValues calls =0
>       block size is 2
>   linear system matrix = precond matrix:
>   Mat Object: (fluid_complex_) 4 MPI processes
>     type: mpibaij
>     rows=1158728, cols=1158728, bs=2
>     total: nonzeros=83105344, allocated nonzeros=83266816
>     total number of mallocs used during MatSetValues calls =0
>     block size is 2
>
>
> -Manav
>
>
>> On Nov 15, 2016, at 3:34 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>
>>>
>>> On Nov 15, 2016, at 3:23 PM, Manav Bhatia <bhatiamanav at gmail.com> wrote:
>>>
>>> I have a complex system, (A + i B) (x + i y) = (f + ig), that I am trying to solve using real matrices:
>>>
>>> [A -B; B A ] [x; y] = [f; g]
>>>
>>> So, the 2x2 block is made of the real and imaginary component of each entry in the complex matrix.
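>>>
>>> To make the structure explicit: when the real and imaginary parts are interleaved (rather than stacked as in the block form above), each complex entry c_jk = a_jk + i b_jk of C = A + iB turns into the 2x2 real block
>>>
>>> [ a_jk  -b_jk ]
>>> [ b_jk   a_jk ]
>>>
>>> and the unknowns are ordered as [x_1, y_1, x_2, y_2, ...]. This interleaved ordering is what gives the matrix its natural 2x2 block structure.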
>>>
>>> I am following the discussion in the following paper:
>>>
>>> Day, D. & Heroux, M. A. 2001. Solving complex-valued linear systems via equivalent real formulations. SIAM Journal on Scientific Computing 23: 480–498.
>>>
>>> Following is an excerpt.
>>>
>>> **********************************************************************************
>>>
>>> The matrix K in the K formulation has a natural 2-by-2 block structure that can be exploited by using block entry data structures. Using the block entry features of these packages has the following benefits.
>>>
>>> • Applying 2-by-2 block Jacobi scaling to K corresponds exactly to applying point Jacobi scaling to C.
>>>
>>> • The block sparsity pattern of K exactly matches the point sparsity pattern of C. Thus any pattern-based preconditioners such as block ILU(l) applied to K correspond exactly to ILU(l) applied to C. See section 4 for definitions of block ILU(l) and ILU(l).
>>>
>>> • Any drop tolerance-based complex preconditioner has a straightforward K formulation since the absolute value of a complex entry equals the scaled Frobenius norm of the corresponding block entry in K.
>>>
>>> **********************************************************************************
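>>>
>>> (Regarding the last point above: the 2x2 block corresponding to a complex entry c = a + ib has Frobenius norm sqrt(2(a^2 + b^2)) = sqrt(2) |c|, so |c| is recovered from the block as its Frobenius norm divided by sqrt(2).)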
>>>
>>> The paper additionally outlines the challenges posed by the poor spectral properties of the equivalent real system.
>>>
>>> So, I am assembling the system with a 2x2 block size, but I am not sure how best to pick the right solver options in PETSc.
>>>
>>> I agree that I am getting confused by the “block” nomenclature. In particular, I am not sure how to reconcile the different notions of “block” with the first two points from the paper excerpt above.
>>
>> In PETSc we call this 2x2 block Jacobi "point-block Jacobi"; you can use the option -pc_type pbjacobi. The ILU(l) in PETSc can also be "point block"; this is obtained with the usual -pc_type ilu (that is, there is no separate preconditioner name for point-block ILU). To use all of these you need to make your matrix a BAIJ matrix (not an AIJ) and set its block size to 2.
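>>
>> As a concrete illustration (a minimal sketch, not code taken from your application or from a PETSc example; the sizes and values are placeholders), assembling K as a BAIJ matrix with block size 2 might look like:
>>
>>   #include <petscmat.h>
>>
>>   int main(int argc, char **argv)
>>   {
>>     Mat      K;
>>     PetscInt n = 3, bs = 2, j;            /* n = number of complex unknowns (placeholder) */
>>
>>     PetscInitialize(&argc, &argv, NULL, NULL);
>>     MatCreate(PETSC_COMM_WORLD, &K);
>>     MatSetSizes(K, PETSC_DECIDE, PETSC_DECIDE, bs*n, bs*n);
>>     MatSetType(K, MATBAIJ);               /* becomes SEQBAIJ or MPIBAIJ as appropriate */
>>     MatSetBlockSize(K, bs);               /* the 2x2 point-block structure */
>>     MatSetUp(K);
>>     for (j = 0; j < n; j++) {
>>       PetscScalar a = 2.0 + j, b = 1.0;   /* pretend the complex entry c_jj = a + i*b */
>>       PetscScalar block[4] = {a, -b,      /* each complex entry becomes the 2x2 block */
>>                               b,  a};     /* [a -b; b a], given in row-major order    */
>>       MatSetValuesBlocked(K, 1, &j, 1, &j, block, INSERT_VALUES);
>>     }
>>     MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY);
>>     MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY);
>>     /* a matrix assembled like this can be used with -pc_type pbjacobi or -pc_type ilu */
>>     MatDestroy(&K);
>>     PetscFinalize();
>>     return 0;
>>   }
>>
>> If your code calls MatSetFromOptions() on the matrix, the type can also be switched at run time with -mat_type baij instead of hard-coding MatSetType().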
>>
>> Have you tried solving the matrices as complex? Is there a reason you wish to reformulate them as real?
>>
>> The convergence of iterative methods (with either real or complex numbers) depends on the properties of the A and B (i.e., C) matrices. Where does the C matrix come from? There are many applications that result in complex matrices that are poorly conditioned for iterative methods.
>>
>> Barry
>>
>>
>>
>>
>>>
>>> Any guidance would be appreciated!
>>>
>>> Thanks,
>>> Manav
>>>
>>>
>>>> On Nov 15, 2016, at 3:12 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>
>>>>
>>>> We can help you if you provide more information about what the blocks represent in your problem.
>>>>
>>>> Do you have two degrees of freedom at each grid point? What, physically, are the two degrees of freedom? What equations are you solving?
>>>>
>>>> I think you may be mixing up the "matrix block size" of 2 with the blocks in "block Jacobi". Though both are called "block", they really don't have anything to do with each other.
>>>>
>>>> Barry
>>>>
>>>>> On Nov 15, 2016, at 3:03 PM, Manav Bhatia <bhatiamanav at gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am setting up a matrix with the following calls. The intent is to solve the system with a 2x2 block size.
>>>>>
>>>>> What combinations of KSP/PC will effectively translate to solving this block matrix system?
>>>>>
>>>>> I saw a discussion about bjacobi in the manual with the following calls (I omitted the prefixes from my actual command):
>>>>> -pc_type bjacobi -pc_bjacobi_blocks 2 -sub_ksp_type preonly -sub_pc_type lu -ksp_view
>>>>>
>>>>> which provides the following output:
>>>>> KSP Object:(fluid_complex_) 1 MPI processes
>>>>>   type: gmres
>>>>>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>     GMRES: happy breakdown tolerance 1e-30
>>>>>   maximum iterations=10000, initial guess is zero
>>>>>   tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>   left preconditioning
>>>>>   using PRECONDITIONED norm type for convergence test
>>>>> PC Object:(fluid_complex_) 1 MPI processes
>>>>>   type: bjacobi
>>>>>     block Jacobi: number of blocks = 2
>>>>>   Local solve is same for all blocks, in the following KSP and PC objects:
>>>>>   KSP Object: (fluid_complex_sub_) 1 MPI processes
>>>>>     type: preonly
>>>>>     maximum iterations=10000, initial guess is zero
>>>>>     tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>     left preconditioning
>>>>>     using NONE norm type for convergence test
>>>>>   PC Object: (fluid_complex_sub_) 1 MPI processes
>>>>>     type: lu
>>>>>       LU: out-of-place factorization
>>>>>       tolerance for zero pivot 2.22045e-14
>>>>>       matrix ordering: nd
>>>>>       factor fill ratio given 5., needed 5.70941
>>>>>       Factored matrix follows:
>>>>>         Mat Object: 1 MPI processes
>>>>>           type: seqaij
>>>>>           rows=36844, cols=36844
>>>>>           package used to perform factorization: petsc
>>>>>           total: nonzeros=14748816, allocated nonzeros=14748816
>>>>>           total number of mallocs used during MatSetValues calls =0
>>>>>           using I-node routines: found 9211 nodes, limit used is 5
>>>>>     linear system matrix = precond matrix:
>>>>>     Mat Object: (fluid_complex_) 1 MPI processes
>>>>>       type: seqaij
>>>>>       rows=36844, cols=36844
>>>>>       total: nonzeros=2583248, allocated nonzeros=2583248
>>>>>       total number of mallocs used during MatSetValues calls =0
>>>>>       using I-node routines: found 9211 nodes, limit used is 5
>>>>>   linear system matrix = precond matrix:
>>>>>   Mat Object: (fluid_complex_) 1 MPI processes
>>>>>     type: seqaij
>>>>>     rows=73688, cols=73688, bs=2
>>>>>     total: nonzeros=5224384, allocated nonzeros=5224384
>>>>>     total number of mallocs used during MatSetValues calls =0
>>>>>     using I-node routines: found 18422 nodes, limit used is 5
>>>>>
>>>>>
>>>>> Likewise, I tried to use a more generic option:
>>>>> -mat_set_block_size 2 -ksp_type gmres -pc_type ilu -sub_ksp_type preonly -sub_pc_type lu -ksp_view
>>>>>
>>>>> with the following output:
>>>>> Linear fluid_complex_ solve converged due to CONVERGED_RTOL iterations 38
>>>>> KSP Object:(fluid_complex_) 1 MPI processes
>>>>>   type: gmres
>>>>>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>     GMRES: happy breakdown tolerance 1e-30
>>>>>   maximum iterations=10000, initial guess is zero
>>>>>   tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>   left preconditioning
>>>>>   using PRECONDITIONED norm type for convergence test
>>>>> PC Object:(fluid_complex_) 1 MPI processes
>>>>>   type: ilu
>>>>>     ILU: out-of-place factorization
>>>>>     0 levels of fill
>>>>>     tolerance for zero pivot 2.22045e-14
>>>>>     matrix ordering: natural
>>>>>     factor fill ratio given 1., needed 1.
>>>>>     Factored matrix follows:
>>>>>       Mat Object: 1 MPI processes
>>>>>         type: seqaij
>>>>>         rows=73688, cols=73688, bs=2
>>>>>         package used to perform factorization: petsc
>>>>>         total: nonzeros=5224384, allocated nonzeros=5224384
>>>>>         total number of mallocs used during MatSetValues calls =0
>>>>>         using I-node routines: found 18422 nodes, limit used is 5
>>>>>   linear system matrix = precond matrix:
>>>>>   Mat Object: (fluid_complex_) 1 MPI processes
>>>>>     type: seqaij
>>>>>     rows=73688, cols=73688, bs=2
>>>>>     total: nonzeros=5224384, allocated nonzeros=5224384
>>>>>     total number of mallocs used during MatSetValues calls =0
>>>>>     using I-node routines: found 18422 nodes, limit used is 5
>>>>>
>>>>> Are other PC types expected to translate to the block matrices?
>>>>>
>>>>> I would appreciate any guidance.
>>>>>
>>>>> Thanks,
>>>>> Manav
>