<div dir="ltr"><div>Thank you Stefano for your quick answer, I will try to change PC as you suggest, and I will read something about MATIS type. I will let you know if I still
have
doubts.<br></div><div><br></div><div>Valerio<br></div></div><br><div class="gmail_quote"><div dir="ltr">Il giorno dom 13 gen 2019 alle ore 16:08 Stefano Zampini <<a href="mailto:stefano.zampini@gmail.com">stefano.zampini@gmail.com</a>> ha scritto:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">What problem are you trying to solve?<div>For standard elliptic problems (e.g. poisson or elasticity) you should use a preconditioner that scales, for example PCGAMG or BoomerAMG from HYPRE (PCHYPRE, if you configured PETSc with support for HYPRE via --download-hypre)<div>Alternatively, since you are using FEM with local Assembly, you can use PCBDDC, but this requires few extra calls</div><div><br></div><div>- call MatSetType(K,MATIS)</div><div>- provide an ISLocalToGlobalMapping of local to global degrees of freedom before you set preallocation via MatSetLocalToGlobalMapping (if you provide this call, then you can use MatSetValuesLocal for both AIJ amd MATIS types)</div></div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr">Il giorno dom 13 gen 2019 alle ore 17:59 Valerio Barnabei via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> ha scritto:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">
<div>Hello everybody,<br></div><div><br></div><div>I have some doubts about the parallel solution of a simple FEM ksp problem. I searched in the mailing list but the information I found are not helping me.<br></div><div>I
need to control manually the parallel assemble of the main matrix, but I
would like to use PetSC for the solution of the final linear system
K*U=F. <br></div><div>The algorithm I am using, loads an
already decomposed FEM domain, and each process contributes to the
global matrix assembly using its portion of domain decomposition (made
in pre-processing). In the element loop, at each element I load the
element submatrix in the global matrix via MatSetValue function. Then,
out of the element loop, I write<br></div><div>/*</div><div>...matrix preallocation<br></div><div>...calculate element local stiffness matrix (local in a FEM sense and local in a domain decomposition sense)<br></div><div>MatSetValue (ADD)</div><div>*/<br></div><div>and then <br></div><div><span style="color:rgb(0,0,255)">ierr=MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);<br>ierr=MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);<br>ierr=VecAssemblyBegin(F);CHKERRQ(ierr);<br>ierr=VecAssemblyEnd(F);CHKERRQ(ierr);</span><br></div><div><br></div><div>This part scales perfectly and is sufficiently fast.</div><div>
My K matrix is a sort of banded diagonal matrix: my preallocation is not perfect, but it works at this stage of my work.
</div><div>Finally I solve the system with the ksp logic:</div><div><br></div><div><span style="color:rgb(0,0,255)">ierr=KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);<br>ierr=KSPSetOperators(ksp,K,K);CHKERRQ(ierr);<br>ierr=KSPGetPC(ksp,&pc);CHKERRQ(ierr);<br>ierr=PCSetType(pc,PCBJACOBI);CHKERRQ(ierr); //or PCNONE, or PCJACOBI, the behaviour is almost identical, the iteration number lowers but is still really high<br>ierr=KSPSetTolerances(ksp,1.e-7,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr);<br>ierr=KSPSetFromOptions(ksp);CHKERRQ(ierr);<br>ierr=KSPSolve(ksp,F,sol);<br></span></div><div><br></div><div>When
I run the code I am noticing some strange behaviours. Indeed, the
numerical solution appear right, and the code scales if I make a simple
strong-scaling test. However, the ksp_log and log_view (see below, logs for a 160k quad linear element uniform 2dmesh) shows me a huge number of iterations, and the
time spent in "solve" function is very high. Furthermore, the more I
increase the problem size keeping a constant load per process (weak
scaling) the more iterations seems to be needed for convergence. In general, when increasing the problem size (i.e. 16M elements) the convergence is not achieved before the maximum iteration number is achieved. <br></div><div><br></div><div>Question 1: Am I doing it right? Or am I making some trouble with the MPI communication management? (i'm aware that my domain decomposition does not match the petsc matrix decomposition, I'm expecting it to cripple my algorithm a bit)<br></div><div>Question 2: Why the solver takes so much time and so many iterations? the problem is really simple, and the solution appears correct when plotted. Am I ill conditioning the system?</div><div>Thanks in advance for your help. I can add any further information if needed.</div><div><br></div><div>Valerio<br></div>
</div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail-m_-5858931685459960259gmail_signature">Stefano</div>
</blockquote></div>