[petsc-users] KSP solve time and iterations

Valerio Barnabei valerio.barnabei at gmail.com
Sun Jan 13 08:57:44 CST 2019


 Hello everybody,

I have some doubts about the parallel solution of a simple FEM KSP problem.
I searched the mailing list, but the information I found is not helping me.
I need to manually control the parallel assembly of the main matrix, but I
would like to use PETSc for the solution of the final linear system K*U=F.
The algorithm I am using loads an already decomposed FEM domain, and each
process contributes to the global matrix assembly using its own portion of
the domain decomposition (made in pre-processing). In the element loop, I
add each element submatrix into the global matrix via the MatSetValue
function (a sketch of this loop is included after the assembly calls below).
Then, outside the element loop, I write
/*
...matrix preallocation
...compute the element stiffness matrix (local in the FEM sense and local
   in the domain decomposition sense)
...MatSetValue(..., ADD_VALUES)
*/
and then
ierr=MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr=MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr=VecAssemblyBegin(F);CHKERRQ(ierr);
ierr=VecAssemblyEnd(F);CHKERRQ(ierr);
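
To be concrete, the element loop looks roughly like the sketch below
(nLocalElements, NPE, Ke, rows and ComputeElementStiffness are simplified
placeholder names, not my actual ones; here I insert the whole element block
with MatSetValues, which is equivalent to the per-entry MatSetValue calls I
described):

PetscInt    e, rows[NPE];   /* NPE = nodes per element * dofs per node (placeholder constant) */
PetscScalar Ke[NPE*NPE];    /* dense element stiffness block, row-major */
for (e = 0; e < nLocalElements; e++) {
    /* my own routine: fills Ke and the corresponding global row indices */
    ComputeElementStiffness(e, Ke, rows);
    ierr = MatSetValues(K, NPE, rows, NPE, rows, Ke, ADD_VALUES);CHKERRQ(ierr);
}
/* MatAssemblyBegin/End are called only once, after this loop, as shown above */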

This part scales perfectly and is sufficiently fast.
My K matrix is essentially a banded matrix; my preallocation is not perfect,
but it is good enough at this stage of my work.
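
For reference, the preallocation is currently just a rough per-row upper
bound, something like this (nLocalRows, d_nz and o_nz are placeholder names
for values I estimate by hand):

ierr = MatCreate(PETSC_COMM_WORLD, &K);CHKERRQ(ierr);
ierr = MatSetSizes(K, nLocalRows, nLocalRows, PETSC_DETERMINE, PETSC_DETERMINE);CHKERRQ(ierr);
ierr = MatSetType(K, MATMPIAIJ);CHKERRQ(ierr);
/* d_nz / o_nz: rough upper bounds on nonzeros per row in the diagonal
   and off-diagonal blocks of the local row range */
ierr = MatMPIAIJSetPreallocation(K, d_nz, NULL, o_nz, NULL);CHKERRQ(ierr);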
Finally, I solve the system with the KSP logic:

ierr=KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
ierr=KSPSetOperators(ksp,K,K);CHKERRQ(ierr);
ierr=KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr=PCSetType(pc,PCBJACOBI);CHKERRQ(ierr); // or PCNONE, or PCJACOBI; the
// behaviour is almost identical: the iteration count lowers but is still
// really high
ierr=KSPSetTolerances(ksp,1.e-7,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr);
ierr=KSPSetFromOptions(ksp);CHKERRQ(ierr);
ierr=KSPSolve(ksp,F,sol);CHKERRQ(ierr);
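
Since KSPSetFromOptions is called, I also drive the solver from the command
line; the logs mentioned below were produced with a run along these lines
(the executable name ./myfem is just a placeholder):

mpiexec -n 16 ./myfem -pc_type bjacobi -ksp_monitor -ksp_converged_reason -log_view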

When I run the code I notice some strange behaviour. The numerical solution
appears correct, and the code scales in a simple strong-scaling test.
However, the KSP log and the -log_view output (see the attached
log_weak16.out, for a 160k quad linear element uniform 2D mesh) show a huge
number of iterations, and the time spent in the solve is very high.
Furthermore, the more I increase the problem size while keeping a constant
load per process (weak scaling), the more iterations seem to be needed for
convergence. For larger problem sizes (e.g. 16M elements), convergence is
not reached before the maximum iteration count is hit.
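
To be precise about how I measure this: after KSPSolve I query the iteration
count and convergence status roughly as in the sketch below (equivalent to
reading the -ksp_converged_reason output):

KSPConvergedReason reason;
PetscInt           its;
ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD, "KSP converged reason %d, iterations %D\n",
                   (int)reason, its);CHKERRQ(ierr);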

Question 1: Am I doing this right? Or am I causing trouble with the MPI
communication management? (I'm aware that my domain decomposition does not
match the PETSc matrix row distribution; I expect this to hurt my algorithm
a bit.)
Question 2: Why does the solver take so much time and so many iterations?
The problem is really simple, and the solution appears correct when plotted.
Am I ill-conditioning the system?
Thanks in advance for your help. I can add any further information if
needed.

Valerio
-------------- next part --------------
[Attachment: log_weak16.out (application/octet-stream, 11143 bytes)]

