[petsc-dev] DMPlexInterpolate after DMPlexDistribute

Matthew Knepley knepley at gmail.com
Tue Apr 28 05:56:11 CDT 2020


On Tue, Apr 28, 2020 at 5:19 AM Pierre Jolivet <pierre.jolivet at enseeiht.fr>
wrote:

> On 14 Apr 2020, at 2:36 PM, Matthew Knepley <knepley at gmail.com> wrote:
>
> On Tue, Apr 14, 2020 at 6:36 AM Pierre Jolivet <pierre.jolivet at enseeiht.fr>
> wrote:
>
>> Hello,
>> I’d like to call DMPlexInterpolate after DMPlexDistribute and not the
>> other way around for performance reasons (please stop me here if this is
>> equivalent).
>> When there is no overlap in DMPlexDistribute, it goes through fine.
>> If there is overlap, I run into an error.
>> Is this the expected behavior?
>
>
> Sorry for taking so long to get back to this.
> I thought I had everything set up, but when I looked at the performance, I
> was quite surprised.
> I went back to src/dm/impls/plex/tutorials/ex2.c and just added the
> DMPlexDistribute step. I guess this is equivalent to your steps #1 and
> #2.
> diff --git a/src/dm/impls/plex/tutorials/ex2.c
> b/src/dm/impls/plex/tutorials/ex2.c
> index a069d922b2..5e5c4fb584 100644
> --- a/src/dm/impls/plex/tutorials/ex2.c
> +++ b/src/dm/impls/plex/tutorials/ex2.c
> @@ -65,2 +65,5 @@ static PetscErrorCode CreateMesh(MPI_Comm comm, AppCtx
> *user, DM *dm)
>      ierr = DMPlexCreateFromFile(comm, user->filename, user->interpolate,
> dm);CHKERRQ(ierr);
> +    DM dmParallel;
> +    ierr = DMPlexDistribute(*dm, 0, NULL, &dmParallel);CHKERRQ(ierr);
> +    ierr = DMDestroy(&dmParallel);CHKERRQ(ierr);
>    }
>
> With a cube of 741,000 nodes and 4,574,068 elements.
> $ cat untitled.geo && ~/gmsh-4.5.4-Linux64/bin/gmsh untitled.geo -bin -3
> //+
> SetFactory("OpenCASCADE");
> Box(1) = {-0.5, -0.5, 0, 1, 1, 1};
> Characteristic Length {:} = 0.01;
>
> I get the following timings (optimized build, scalar-type=real,
> 64-bit-indices=false, SKL cluster).
> $ mpirun  -n 120 ./ex2 -filename untitled.msh -log_view -interpolate true
> DMPlexInterp           1 1.0 1.0444e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 3.0e+00 73  0  0  0  5  73  0  0  0  6     0
> DMPlexDistribute       1 1.0 2.8793e+01 1.0 0.00e+00 0.0 3.6e+03 6.2e+05
> 3.3e+01 21  0100100 60  21  0100100 69     0
> $ mpirun  -n 120 ./ex2 -filename untitled.msh -log_view -interpolate false
> DMPlexDistribute       1 1.0 7.0265e+02 1.0 0.00e+00 0.0 3.1e+03 2.0e+05
> 2.6e+01 99  0100100 58  99  0100100 68     0
>
> Do you think I messed up something else?
> Our legacy implementation, on the same mesh, takes around 20 seconds
> (accounting for the ParMETIS call) to partition and distribute the mesh and
> generate the underlying structures for our FEM kernels (kd-tree,
> distributed numbering, and so on).
> Of course, DMPlex is much more versatile and generic, so I'd understand if
> it were a little slower, but right now it is quite a lot slower.
> Are there any options I could play with to speed things up?
>

Yes, something is messed up. Vaclav's latest paper has almost exactly the
setup for the cube benchmark (https://arxiv.org/pdf/2004.08729.pdf), and
you can see that the interpolation time is an order of magnitude smaller
for 128 procs and also scales linearly. What you show above in ex2 looks
serial, in that the mesh is loaded on 1 proc and then interpolated on 1
proc, which corresponds to the time he shows as "Serial Startup". I cannot
explain the poor performance of your second run; however, the interpolation
time in the first can be greatly reduced by interpolating in parallel. Thus,
the right answer is to load in parallel, interpolate, and redistribute. That
is what we do in that paper.
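
Schematically, the parallel startup looks roughly like this (an untested
sketch, assuming the mesh is already available in PETSc's HDF5 format as
described below; "untitled.h5" and the variable names are just examples):

  DM             dm, idm, rdm;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  /* Load the mesh in parallel from the HDF5 checkpoint */
  ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "untitled.h5", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = DMCreate(PETSC_COMM_WORLD, &dm);CHKERRQ(ierr);
  ierr = DMSetType(dm, DMPLEX);CHKERRQ(ierr);
  ierr = DMLoad(dm, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  /* Interpolate the already-distributed mesh, so the work is spread over all procs */
  ierr = DMPlexInterpolate(dm, &idm);CHKERRQ(ierr);
  ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = idm;
  /* Redistribute for load balance */
  ierr = DMPlexDistribute(dm, 0, NULL, &rdm);CHKERRQ(ierr);
  if (rdm) {ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = rdm;}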

Short answer: If you want scalable load and interpolation, I think you
should be using the parallel stuff that Vaclav just put in. I think that
means first converting your mesh to the HDF5 format: read it with
DMPlexCreateFromFile and then write it out with DMView to an HDF5 viewer.
Then you can load it in parallel. Vaclav, has everything been pushed for
this?
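
For reference, the one-time conversion could be as small as the following
(again an untested sketch; the file names are placeholders and the error
checking follows the usual PETSc pattern):

  #include <petscdmplex.h>
  #include <petscviewerhdf5.h>

  int main(int argc, char **argv)
  {
    DM             dm;
    PetscViewer    viewer;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
    /* Read the Gmsh mesh once (this part is still serial) */
    ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD, "untitled.msh", PETSC_TRUE, &dm);CHKERRQ(ierr);
    /* Write it back out in PETSc's HDF5 format for later parallel loading */
    ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "untitled.h5", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
    ierr = DMView(dm, viewer);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
    ierr = DMDestroy(&dm);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }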

  Thanks,

      Matt

> Thanks in advance for your help,
> Pierre
>
> Yes. What we want you to do is:
>
> 1) Load/generate mesh
>
> 2) Distribute (this can be done at the same time as load with parallel
> load)
>
> 3) Interpolate (this is also an option from parallel load)
>
> 4) If necessary, redistribute for load balance
>
> 5) Construct overlap
>
>     When you pass '1' below to DMPlexDistribute(), it distributes as normal
> and then calls
>
>
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexDistributeOverlap.html
>
>     at the end. So you just postpone calling that until you have
> interpolated.
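>
>     In code, that postponed sequence is roughly the following (an
> illustrative sketch only, assuming an existing DM dm and PetscErrorCode
> ierr):
>
>     DM pdm, idm, odm;
>     /* distribute with overlap = 0 */
>     ierr = DMPlexDistribute(dm, 0, NULL, &pdm);CHKERRQ(ierr);
>     if (pdm) {ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = pdm;}
>     /* interpolate the distributed mesh */
>     ierr = DMPlexInterpolate(dm, &idm);CHKERRQ(ierr);
>     ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = idm;
>     /* only now add the overlap you would otherwise have asked DMPlexDistribute for */
>     ierr = DMPlexDistributeOverlap(dm, 1, NULL, &odm);CHKERRQ(ierr);
>     if (odm) {ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = odm;}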
>
>   Thanks,
>
>     Matt
>
>
>> Here is a MWE.
>> $ patch -p1 < patch.txt
>> $ cd src/dm/impls/plex/tests/
>> $ make ex18
>> $ mpirun -n 2 ./ex18 -distribute -interpolate after_distribute
>> [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> [0]PETSC ERROR: Petsc has generated inconsistent data
>> [0]PETSC ERROR: Point SF contains 1 which is a cell
>>
>> Thanks,
>> Pierre
>>
>> diff --git a/src/dm/impls/plex/tests/ex18.c
>> b/src/dm/impls/plex/tests/ex18.c
>> index 07421b3522..dd62be58e5 100644
>> --- a/src/dm/impls/plex/tests/ex18.c
>> +++ b/src/dm/impls/plex/tests/ex18.c
>> @@ -806 +806 @@ static PetscErrorCode CreateMesh(MPI_Comm comm, AppCtx
>> *user, DM *dm)
>> -    ierr = DMPlexDistribute(*dm, 0, NULL, &pdm);CHKERRQ(ierr);
>> +    ierr = DMPlexDistribute(*dm, 1, NULL, &pdm);CHKERRQ(ierr);
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>
>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/