[petsc-users] Poor weak scaling when solving successive linear systems

Junchao Zhang jczhang at mcs.anl.gov
Thu May 24 11:24:10 CDT 2018


I noticed you used PETSc 3.9.1.  You can give a try to the master branch. I
added some VecScatter optimizations recently. I don't know if it could help.

--Junchao Zhang

On Thu, May 24, 2018 at 12:24 AM, Michael Becker <
Michael.Becker at physik.uni-giessen.de> wrote:

> Hello,
>
> I added a PETSc solver class to our particle-in-cell simulation code and
> all calculations seem to be correct. However, some weak scaling tests I did
> are rather disappointing: the solver's runtime keeps increasing with
> system size even though the number of cores is scaled up accordingly. As a
> result, the solver's share of the total runtime becomes increasingly
> dominant, and the system sizes we aim for are infeasible.
>
> It's a simple 3D Poisson problem on a structured grid with Dirichlet
> boundaries inside the domain, for which I found the cg/gamg combo to work
> the fastest. Since KSPSolve() is called during every timestep of the
> simulation to solve the same system with a new rhs vector, assembling the
> matrix and other PETSc objects should not be a determining factor.
>
> What puzzles me is that the convergence rate is actually good (the
> residual decreases by an order of magnitude for every KSP iteration) and
> the number of KSP iterations remains constant over the course of a
> simulation and is equal for all tested systems.
>
> I even increased the (fixed) system size per processor to 30^3 unknowns
> (which is significantly more than the recommended 10,000), but runtime is
> still not even close to being constant.
>
> This leads me to the conclusion that either I configured PETSc wrong, I
> don't call the correct PETSc-related functions, or something goes terribly
> wrong with communication.
>
> Could you have a look at the attached log_view files and tell me if
> something is particularly odd? The system size per processor is 30^3 and
> the simulation ran over 1000 timesteps, which means KSPSolve() was called
> equally often. I introduced two new log stages: one for the first
> solve and the final setup, and one for the remaining solves.
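>
> For reference, the two extra stages were set up roughly like this (a
> minimal sketch; the stage variable names are illustrative):
>
> PetscLogStage stage_first, stage_rest;
> PetscLogStageRegister("First solve + setup", &stage_first);
> PetscLogStageRegister("Remaining solves", &stage_rest);
>
> PetscLogStagePush(stage_first);
> /* first KSPSolve() and remaining setup */
> PetscLogStagePop();
>
> PetscLogStagePush(stage_rest);
> /* timestep loop with all subsequent solves */
> PetscLogStagePop();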
>
> The repeatedly called code segment is
>
> PetscScalar *b_array;
> VecGetArray(b, &b_array);
> get_b(b_array);
> VecRestoreArray(b, &b_array);
>
> KSPSetTolerances(ksp,reltol,1E-50,1E5,1E4);
>
> PetscScalar *x_array;
> VecGetArray(x, &x_array);
> for (int i = 0; i < N_local; i++)
>   x_array[i] = x_array_prev[i];
> VecRestoreArray(x, &x_array);
>
> KSPSolve(ksp,b,x);
>
> KSPGetSolution(ksp,&x);
> VecGetArray(x, &x_array);
> for (int i = 0; i < N_local; i++)
>   x_array_prev[i] = x_array[i];
> set_x(x_array);
> VecRestoreArray(x, &x_array);
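>
> For completeness, the same warm start could presumably be written with
> VecCopy and a persistent work vector instead of the two copy loops
> (sketch only; x_prev is an illustrative name):
>
> Vec x_prev;
> VecDuplicate(x, &x_prev);   /* once, during setup */
>
> /* per timestep: warm-start from the previous solution, then save it;
>    KSPSetInitialGuessNonzero(ksp, PETSC_TRUE) is needed for the
>    initial guess in x to actually be used */
> VecCopy(x_prev, x);
> KSPSolve(ksp, b, x);
> VecCopy(x, x_prev);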
>
> I noticed that for every individual KSP iteration, six vector objects are
> created and destroyed (with CG; more with e.g. GMRES). This seems rather
> wasteful; is it supposed to be like this? Could this even be the reason for
> my problems? Apart from that, everything looks quite normal to me (but I'm
> no expert here).
>
>
> Thanks in advance.
>
> Michael
>
>
>
>
>