[petsc-users] KSP solve time and iterations

Valerio Barnabei valerio.barnabei at gmail.com
Fri Jan 18 08:53:53 CST 2019


Hello,
I'm trying to figure out how to change my program in order to use the MATIS
type and PCBDDC.
I managed to build my MATIS matrix successfully, but my code is now slower.
I think the main issue is that I'm using MatSetValues inside the element
loop, passing as values the e-th local element matrix. In a sort of
pseudocode:

for (int e = 0; e < nLocalElements; e++) {   // loop over the local elements
    // ke assumed zeroed at the start of each element
    for (int qp = 0; qp < ngp_tot; qp++) {
        for (int A = 0; A < Nnodesloc; A++) {
            for (int B = 0; B < Nnodesloc; B++) {
                // BaT dot temp (NB: ba_n0 = bat_0n)
                double prod = temp[0][A][qp]*BaT[B][0][qp] + temp[1][A][qp]*BaT[B][1][qp];
                // ke[A][B] += weights[qp]*detJ[qp]*prod;
                ke[A*Nnodesloc+B] += weights[qp]*detJ[qp]*prod;
            }
        }
    }
    // then scatter the e-th element matrix into the global matrix
    // (elemNodes is a placeholder for the global indices of element e)
    MatSetValues(K, Nnodesloc, elemNodes, Nnodesloc, elemNodes, ke, ADD_VALUES);
}
Actually I have lots of doubts:
- when using ISLocalToGlobalMappingCreate(), what's the blocksize?
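My current understanding, as a sketch: the blocksize bs is the number of
degrees of freedom per node when the indices are given per node (block
indices), and 1 when the mapping lists every individual dof. Assuming a
scalar problem (one dof per node) and a placeholder array localToGlobal
holding the global number of each local dof, it would look like:

ISLocalToGlobalMapping l2g;
/* localToGlobal[i] = global index of the i-th local dof (placeholder name) */
ierr=ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,nLocalDofs,localToGlobal,
                                  PETSC_COPY_VALUES,&l2g);CHKERRQ(ierr);
ierr=MatSetLocalToGlobalMapping(K,l2g,l2g);CHKERRQ(ierr);

(For 2D elasticity with 2 dofs per node, I guess bs would be 2 and the
indices would then be given per node.)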

On Mon, 14 Jan 2019 at 15:08, Valerio Barnabei <
valerio.barnabei at gmail.com> wrote:

> Thank you Stefano for your quick answer. I will try to change the PC as you
> suggest, and I will read up on the MATIS type. I will let you know if
> I still have doubts.
>
> Valerio
>
> On Sun, 13 Jan 2019 at 16:08, Stefano Zampini <
> stefano.zampini at gmail.com> wrote:
>
>> What problem are you trying to solve?
>> For standard elliptic problems (e.g. Poisson or elasticity) you should
>> use a preconditioner that scales, for example PCGAMG or BoomerAMG from
>> HYPRE (PCHYPRE, if you configured PETSc with support for HYPRE via
>> --download-hypre).
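>>
>> For example (a sketch; either choice can also be made at runtime with
>> -pc_type gamg or -pc_type hypre):
>>
>> ierr=KSPGetPC(ksp,&pc);CHKERRQ(ierr);
>> ierr=PCSetType(pc,PCGAMG);CHKERRQ(ierr);   /* algebraic multigrid */
>> /* or, with --download-hypre: */
>> ierr=PCSetType(pc,PCHYPRE);CHKERRQ(ierr);
>> ierr=PCHYPRESetType(pc,"boomeramg");CHKERRQ(ierr);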
>> Alternatively, since you are using FEM with local assembly, you can use
>> PCBDDC, but this requires a few extra calls:
>>
>> - call MatSetType(K,MATIS)
>> - provide an ISLocalToGlobalMapping of local to global degrees of freedom
>> before you set preallocation via MatSetLocalToGlobalMapping (if you provide
>> this call, then you can use MatSetValuesLocal for both AIJ and MATIS types);
>> a sketch of the sequence follows below
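>>
>> A minimal sketch of that sequence (names like nOwnedDofs, nLocalDofs,
>> localToGlobal, nen and elemLocalDofs are placeholders, and the
>> preallocation numbers are just rough bounds):
>>
>> Mat K;
>> ISLocalToGlobalMapping l2g;
>> ierr=MatCreate(PETSC_COMM_WORLD,&K);CHKERRQ(ierr);
>> ierr=MatSetSizes(K,nOwnedDofs,nOwnedDofs,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
>> ierr=MatSetType(K,MATIS);CHKERRQ(ierr);
>> /* localToGlobal[i] = global dof of local dof i, including the interface
>>    dofs shared with neighbouring subdomains */
>> ierr=ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,nLocalDofs,localToGlobal,
>>                                   PETSC_COPY_VALUES,&l2g);CHKERRQ(ierr);
>> ierr=MatSetLocalToGlobalMapping(K,l2g,l2g);CHKERRQ(ierr);
>> ierr=MatISSetPreallocation(K,maxNnzPerRow,NULL,maxNnzPerRow,NULL);CHKERRQ(ierr);
>> /* inside the element loop */
>> ierr=MatSetValuesLocal(K,nen,elemLocalDofs,nen,elemLocalDofs,ke,ADD_VALUES);CHKERRQ(ierr);
>>
>> At runtime you can then select the preconditioner with -pc_type bddc.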
>>
>>
>>
>> On Sun, 13 Jan 2019 at 17:59, Valerio Barnabei via petsc-users
>> <petsc-users at mcs.anl.gov> wrote:
>>
>>> Hello everybody,
>>>
>>> I have some doubts about the parallel solution of a simple FEM KSP
>>> problem. I searched the mailing list, but the information I found is not
>>> helping me.
>>> I need to control the parallel assembly of the main matrix manually, but
>>> I would like to use PETSc for the solution of the final linear system
>>> K*U=F.
>>> The algorithm I am using loads an already decomposed FEM domain, and
>>> each process contributes to the global matrix assembly using its portion of
>>> the domain decomposition (made in pre-processing). In the element loop, for
>>> each element I load the element submatrix into the global matrix via the
>>> MatSetValue function. Then, outside the element loop, I write
>>> /*
>>> ...matrix preallocation
>>> ...calculate element local stiffness matrix (local in a FEM sense and
>>> local in a domain decomposition sense)
>>> MatSetValue (ADD)
>>> */
>>> and then
>>> ierr=MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>> ierr=MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>> ierr=VecAssemblyBegin(F);CHKERRQ(ierr);
>>> ierr=VecAssemblyEnd(F);CHKERRQ(ierr);
>>>
>>> This part scales perfectly and is sufficiently fast.
>>> My K matrix is a sort of banded matrix: my preallocation is not
>>> perfect, but it works at this stage of my work.
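>>>
>>> (For illustration, a rough AIJ preallocation of this kind might look like
>>> the following, where maxBandNnz is a placeholder upper bound on the
>>> nonzeros per row estimated from the mesh connectivity:)
>>>
>>> ierr=MatSeqAIJSetPreallocation(K,maxBandNnz,NULL);CHKERRQ(ierr);
>>> ierr=MatMPIAIJSetPreallocation(K,maxBandNnz,NULL,maxBandNnz,NULL);CHKERRQ(ierr);
>>>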
>>> Finally, I solve the system with the KSP logic:
>>>
>>> ierr=KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
>>> ierr=KSPSetOperators(ksp,K,K);CHKERRQ(ierr);
>>> ierr=KSPGetPC(ksp,&pc);CHKERRQ(ierr);
>>> ierr=PCSetType(pc,PCBJACOBI);CHKERRQ(ierr);
>>> // or PCNONE, or PCJACOBI; the behaviour is almost identical: the
>>> // iteration number lowers but is still really high
>>>
>>> ierr=KSPSetTolerances(ksp,1.e-7,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr);
>>> ierr=KSPSetFromOptions(ksp);CHKERRQ(ierr);
>>> ierr=KSPSolve(ksp,F,sol);CHKERRQ(ierr);
>>>
>>> When I run the code I notice some strange behaviour. The numerical
>>> solution appears correct, and the code scales if I run a simple
>>> strong-scaling test. However, the KSP log and log_view output (see below,
>>> logs for a uniform 2D mesh of 160k linear quad elements) show a huge number
>>> of iterations, and the time spent in the "solve" function is very high.
>>> Furthermore, the more I increase the problem size while keeping a constant
>>> load per process (weak scaling), the more iterations seem to be needed for
>>> convergence. In general, when increasing the problem size (e.g. 16M
>>> elements), convergence is not achieved before the maximum iteration
>>> count is reached.
>>>
>>> Question 1: Am I doing it right? Or am I causing some trouble with the
>>> MPI communication management? (I'm aware that my domain decomposition does
>>> not match the PETSc matrix decomposition; I expect this to cripple my
>>> algorithm a bit.)
>>> Question 2: Why does the solver take so much time and so many iterations?
>>> The problem is really simple, and the solution appears correct when
>>> plotted. Am I ill-conditioning the system?
>>> Thanks in advance for your help. I can add any further information if
>>> needed.
>>>
>>> Valerio
>>>
>>
>>
>> --
>> Stefano
>>
>