[petsc-users] Saddle point problem with nested matrix and a relatively small number of Lagrange multipliers

Olivier Jamond olivier.jamond at cea.fr
Thu Sep 3 10:43:13 CDT 2020


Hello,

I am working on a finite-element/finite-volume code whose distributed 
solver is based on PETSc. For FE, it relies on Lagrange multipliers for 
the imposition of various boundary conditions or interactions (simple 
Dirichlet, contact, ...). This results in saddle point problems:

[K C^t][U]=[F]
[C  0 ][L] [D]

Most of the time, the relations encoded in the matrix C apply to dofs on 
the boundary of the domain, so the size of L is much smaller than the 
size of U, and increasingly so as the mesh is refined.

The code constructs this matrix as a nested matrix (one reason is that 
for some interactions such as contact, the matrix C, while quite small, 
changes size constantly, and having it 'blended' into a monolithic 'big' 
matrix would require recomputing its profile / reallocating / ... each 
time), and uses a fieldsplit preconditioner of type PC_COMPOSITE_SCHUR. 
I would like to solve the system with iterative methods to obtain good 
scalability on a large number of subdomains.

Simple BCs such as Dirichlet conditions can be eliminated into K (and 
must be, in order to make K invertible).
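
(Concretely, this elimination can be done with something along the lines 
of MatZeroRowsColumns; just a sketch, where nrows, rows and the Vec of 
prescribed values uD are placeholders:)

  PetscErrorCode ierr;
  /* rows[] lists the global dof indices carrying Dirichlet conditions, uD the
     prescribed values: zero those rows/columns of K, put 1.0 on the diagonal,
     and move the known values into the right-hand side F */
  ierr = MatZeroRowsColumns(K, nrows, rows, 1.0, uD, F);CHKERRQ(ierr);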

My problem is the computational cost of the constraints treated with 
Lagrange multipliers, even though their number becomes more and more 
negligible as the mesh is refined. To give an idea, let's consider a 
simple elastic cube with Dirichlet BCs which are all eliminated (to 
ensure invertibility of K) except one, on a single dof, and the 
following options:

-ksp_type preonly
-pc_type fieldsplit
-pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type full
-pc_fieldsplit_schur_precondition selfp

-fieldsplit_u_ksp_type cg
-fieldsplit_u_pc_type bjacobi

-fieldsplit_l_ksp_type cg
-fieldsplit_l_pc_type bjacobi

With these options, the computation time seems to be multiplied by a 
factor of 3: 3 KSP solves of the big block K are needed to apply the 
Schur preconditioner (assuming that ksp(S,Sp) converges in 1 iteration; 
see the factorization sketched below). That seems expensive for a 
single-dof Dirichlet condition!
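
For reference, the count comes from the full Schur factorization that 
fieldsplit applies here, sketched in the same block notation as above:

[K C^t]   [I       0] [K 0] [I  K^-1 C^t]
[C  0 ] = [C K^-1  I] [0 S] [0  I       ],    with S = -C K^-1 C^t

Applying its inverse takes one solve with K to form the right-hand side 
of the Schur system, one solve with K inside each application of S during 
the ksp(S,Sp) iterations (selfp only changes the matrix Sp used to 
precondition that solve, not the operator S itself), and one final solve 
with K in the back-substitution: hence 2 + (iterations of ksp(S,Sp)) 
solves with K, i.e. 3 when ksp(S,Sp) converges in one iteration.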

And for some relations treated by Lagrange multipliers which involve 
many dofs, the number of KSP solves of the big block K is (2 + number 
of iterations of ksp(S,Sp)). To reduce this, one can think about solving 
ksp(S,Sp) with a direct solver, but then one must use 
"-pc_fieldsplit_schur_precondition self", which is advised against in 
the documentation...

To illustrate this, on a small elasticity case (a 32x32x32 cube on 8 
processors, Dirichlet conditions on the top and bottom faces):
* if all the Dirichlet conditions are eliminated (no C matrix, monolithic 
solve of the K block):
   - computation time for the solve: ~400ms
* if only the Dirichlet conditions of the bottom face are eliminated:
   - computation time for the solve: ~35000ms
   - number of iterations of ksp(S,Sp): 37
   - total number of iterations of ksp(K): 4939
* if only the Dirichlet conditions of the bottom face are eliminated, 
with these options:
-ksp_type fgmres
-pc_type fieldsplit
-pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type full
-pc_fieldsplit_schur_precondition selfp

-fieldsplit_u_ksp_type cg
-fieldsplit_u_pc_type bjacobi

-fieldsplit_l_ksp_type cg
-fieldsplit_l_pc_type bjacobi
-fieldsplit_l_ksp_rtol 1e-10
-fieldsplit_l_inner_ksp_type preonly
-fieldsplit_l_inner_pc_type jacobi
-fieldsplit_l_upper_ksp_type preonly
-fieldsplit_l_upper_pc_type jacobi

   - computation time for the solve: ~50000ms
   - total number of iterations of ksp(K): 7424
   - 'main' ksp number of iterations: 7424

So in the end, my question is: is there a smarter way to handle such 
'small' constraint matrices C, with the (maybe wrong) idea that a small 
number of extra dofs (the Lagrange multipliers) should result in a small 
extra computation time?

Thanks!


