[petsc-users] DMPlex in Firedrake: scaling of mesh distribution

Matthew Knepley knepley at buffalo.edu
Fri Mar 5 21:04:39 CST 2021


On Fri, Mar 5, 2021 at 4:06 PM Alexei Colin <acolin at isi.edu> wrote:

> To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev:
>
> Is it expected for mesh distribution step to
> (A) take a share of 50-99% of total time-to-solution of an FEM problem, and
>

No


> (B) take an amount of time that increases with the number of ranks, and
>

See below.


> (C) take an amount of memory on rank 0 that does not decrease with the
> number of ranks
>

The problem here is that a serial mesh is being partitioned and sent to all
processes. This is fundamentally non-scalable, but it is easy and works well
for modest clusters, up to roughly 100 nodes. Above that, it takes
increasing amounts of time. There are a few techniques for mitigating this.

a) For simple domains, you can distribute a coarse grid, then regularly
refine that in parallel with DMRefine() or -dm_refine <k>.
    These steps can be repeated easily, and redistribution in parallel is
fast, as shown for example in [1].
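
A minimal petsc4py sketch of a), assuming a 2D simplex box mesh (the coarse
resolution and the number of refinement levels are placeholders, not tuned
values):

    from petsc4py import PETSc

    comm = PETSc.COMM_WORLD

    # Coarse mesh: small enough that building and partitioning it
    # serially is cheap
    dm = PETSc.DMPlex().createBoxMesh([64, 64], simplex=True, comm=comm)

    # Distribute the coarse mesh; petsc4py replaces the DM in place and
    # returns the point migration SF, which we do not need here
    dm.distribute()

    # Refine uniformly in parallel; each level quadruples the cell count
    # in 2D (the -dm_refine <k> option does the same when the DM is
    # configured from options)
    for _ in range(4):
        dm = dm.refine()

Firedrake exposes essentially this distribute-then-refine pattern through its
MeshHierarchy utility, starting from a coarse mesh such as UnitSquareMesh.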

b) For complex meshes, you can read them in parallel, and then repeat a).
This is done in [1]. It is a little more involved,
    but not much.
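
A rough petsc4py sketch of b), assuming a mesh file in a format PETSc can
read (the file name and refinement count are placeholders; whether the read
itself is parallel depends on the file format and PETSc version, and the
scalable HDF5 workflow is the one described in [1]):

    from petsc4py import PETSc

    comm = PETSc.COMM_WORLD

    # Read the mesh from disk
    dm = PETSc.DMPlex().createFromFile("mesh.h5", comm=comm)

    # Then proceed exactly as in a): distribute and refine in parallel
    dm.distribute()
    for _ in range(2):
        dm = dm.refine()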

c) You can do a multilevel partitioning, as they do in [2]. I cannot find
the paper in which they describe this right now. It is feasible,
     but definitely the most expert approach.

Does this make sense?

  Thanks,

    Matt

[1] Fully Parallel Mesh I/O using PETSc DMPlex with an Application to
Waveform Modeling, Hapla et al.
      https://arxiv.org/abs/2004.08729
[2] On the robustness and performance of entropy stable discontinuous
collocation methods for the compressible Navier-Stokes equations, Rojas
et al.
      https://arxiv.org/abs/1911.10966


> ?
>
> The attached plots suggest that (A), (B), and (C) are all happening for the
> Cahn-Hilliard problem (from the firedrake-bench repo) on a 2D 8Kx8K
> unit-square mesh. The implementation is here [1]. Versions: Firedrake and
> PyOp2 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3.
>
> Two questions, one on (A) and the other on (B)+(C):
>
> 1. Is result (A) expected? Given (A), any effort to improve the quality
> of the compiled assembly kernels (or anything else other than mesh
> distribution) appears futile, since it accounts for only about 1% of
> end-to-end execution time -- or am I missing something?
>
> 1a. Is mesh distribution fundamentally necessary for any FEM framework,
> or is it only needed by Firedrake? If the latter, how do other
> frameworks partition the mesh and execute in parallel with MPI while
> avoiding the non-scalable mesh distribution step?
>
> 2. Results (B) and (C) suggest that the mesh distribution step does
> not scale. Is it a fundamental property of the mesh distribution problem
> that it has a central bottleneck in the master process, or is it
> a limitation of the current implementation in PETSc-DMPlex?
>
> 2a. Our result (B) seems to agree with Figure 4 (left) of [2]. Fig 6 of [2]
> suggests a way to reduce the time spent in the sequential bottleneck by
> "parallel mesh refinement", which creates high-resolution meshes from an
> initial coarse mesh. Is this approach implemented in DMPlex? If so, any
> pointers on how to try it out with Firedrake? If not, any other
> directions for reducing this bottleneck?
>
> 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale well
> up to 96 cores -- is mesh distribution included in those times? Is anyone
> reading this aware of any other publications with evaluations of
> Firedrake that measure mesh distribution (or explain how to avoid or
> exclude it)?
>
> Thank you for your time and any info or tips.
>
>
> [1]
> https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py
>
> [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G.
> Knepley, Michael Lange, Gerard J. Gorman, 2015.
> https://arxiv.org/pdf/1506.06194.pdf
>
> [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael
> Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC,
> 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749
>