[petsc-dev] Memory problem with OpenMP and Fieldsplit sub solvers

Mark Adams mfadams at lbl.gov
Thu Jan 21 11:01:48 CST 2021


On Thu, Jan 21, 2021 at 11:25 AM Jed Brown <jed at jedbrown.org> wrote:

> Mark Adams <mfadams at lbl.gov> writes:
>
> > Yes, the problem is that each KSP solver is running in an OMP thread
>
> There can be more or fewer splits than OMP_NUM_THREADS. Each thread is
> still calling blocking operations.
>
> This is a concurrency problem, not a parallel efficiency problem. It can
> be solved with async interfaces


I don't know how to do that. I want a GPU solver, probably SuperLU, and am
starting with cuSparse ILU to get something running.

> or by making as many threads as splits and ensuring that you don't spin
> (lest contention kill performance).


I don't get correctness with Richardson with more than one OMP thread
currently. This is on IBM with GNU.
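
For concreteness, here is a minimal sketch of the thread-per-split arrangement
Jed describes above, in plain PETSc + OpenMP. It assumes the sub-KSPs and
sub-vectors for each split already exist; the function and array names
(SolveSplitsThreaded, subksp, subb, subx) are illustrative, not the actual
PCFIELDSPLIT internals.

#include <petscksp.h>
#include <omp.h>

/* One OpenMP thread per split; each thread drives its own sub-KSP with a
   blocking KSPSolve(). Errors are recorded rather than CHKERRQ'd because we
   cannot return from inside a parallel region. */
static PetscErrorCode SolveSplitsThreaded(PetscInt nsplits, KSP subksp[], Vec subb[], Vec subx[])
{
  PetscErrorCode ierr = 0;

  #pragma omp parallel for num_threads((int)nsplits) schedule(static,1)
  for (PetscInt i = 0; i < nsplits; i++) {
    PetscErrorCode t_ierr = KSPSolve(subksp[i], subb[i], subx[i]);
    if (t_ierr) {
      #pragma omp atomic write
      ierr = t_ierr;
    }
  }
  return ierr;
}

Each sub-KSP would still need to live on its own MPI communicator for the
reductions inside KSPSolve() to be legal; more on that below.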


> OpenMP is pretty orthogonal and probably not a good fit.
>

Do you have an alternative?


>
> > (So at this point it only works for SELF, and it's Landau, so it is all I
> > need.) It looks like MPI reductions called with a comm_self are not thread
> > safe (e.g., they could say: this is one proc, so just copy send --> recv;
> > but they don't).
> >
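
An aside on why that fails: even when MPI is initialized with
MPI_THREAD_MULTIPLE, the standard does not allow two threads to be inside a
collective on the same communicator at the same time, so concurrent reductions
on MPI_COMM_SELF from different OpenMP threads are already outside the rules.
Below is a hedged sketch of the usual fix, one duplicated communicator per
thread, created serially up front; the function name and arguments are made up.

#include <mpi.h>
#include <omp.h>
#include <stdlib.h>

/* Hypothetical illustration: give every thread its own duplicate of
   MPI_COMM_SELF so that no two threads ever reduce on the same communicator
   concurrently. The duplicates are created serially because MPI_Comm_dup is
   itself a collective on the parent communicator. */
void reductions_per_thread(int nthreads, double *results)
{
  MPI_Comm *comms = (MPI_Comm*)malloc((size_t)nthreads*sizeof(MPI_Comm));
  for (int t = 0; t < nthreads; t++) MPI_Comm_dup(MPI_COMM_SELF, &comms[t]);

  #pragma omp parallel num_threads(nthreads)
  {
    int    tid   = omp_get_thread_num();
    double local = (double)tid;
    /* Safe: each thread reduces on its own communicator. Reducing on a
       shared MPI_COMM_SELF here instead would be the unsafe pattern. */
    MPI_Allreduce(&local, &results[tid], 1, MPI_DOUBLE, MPI_SUM, comms[tid]);
  }

  for (int t = 0; t < nthreads; t++) MPI_Comm_free(&comms[t]);
  free(comms);
}
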
> > On Thu, Jan 21, 2021 at 10:46 AM Matthew Knepley <knepley at gmail.com>
> wrote:
> >
> >> On Thu, Jan 21, 2021 at 10:34 AM Mark Adams <mfadams at lbl.gov> wrote:
> >>
> >>> It looks like PETSc is just too clever for me. I am trying to get a
> >>> different MPI_Comm into each block, but PETSc is thwarting me:
> >>>
> >>
> >> It looks like you are using SELF. Is that what you want? Do you want a
> >> bunch of comms with the same group, but independent somehow? I am
> >> confused.
> >>
> >>    Matt
> >>
> >>
> >>>   if (jac->use_openmp) {
> >>>     ierr          = KSPCreate(MPI_COMM_SELF,&ilink->ksp);CHKERRQ(ierr);
> >>>     PetscPrintf(PETSC_COMM_SELF,"In PCFieldSplitSetFields_FieldSplit with -------------- link: %p. Comms %p %p\n",ilink,PetscObjectComm((PetscObject)pc),PetscObjectComm((PetscObject)ilink->ksp));
> >>>   } else {
> >>>     ierr          = KSPCreate(PetscObjectComm((PetscObject)pc),&ilink->ksp);CHKERRQ(ierr);
> >>>   }
> >>>
> >>> produces:
> >>>
> >>> In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x7e9cb4f0. Comms 0x660c6ad0 0x660c6ad0
> >>> In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x7e88f7d0. Comms 0x660c6ad0 0x660c6ad0
> >>>
> >>> How can I work around this?
> >>>
> >>>
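
One possible workaround, under the assumption that PETSc is caching a single
inner communicator on MPI_COMM_SELF and handing it to every object created
there: duplicate MPI_COMM_SELF yourself, once per split, and pass the distinct
duplicate into KSPCreate(). The user_comm field below is hypothetical, just a
place to remember the duplicate so it can be freed when the link is destroyed.

  if (jac->use_openmp) {
    MPI_Comm split_comm;
    ierr = MPI_Comm_dup(MPI_COMM_SELF,&split_comm);CHKERRQ(ierr);
    ierr = KSPCreate(split_comm,&ilink->ksp);CHKERRQ(ierr);
    ilink->user_comm = split_comm;  /* hypothetical field: MPI_Comm_free() it in the link destructor */
  } else {
    ierr = KSPCreate(PetscObjectComm((PetscObject)pc),&ilink->ksp);CHKERRQ(ierr);
  }
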
> >>> On Thu, Jan 21, 2021 at 7:41 AM Mark Adams <mfadams at lbl.gov> wrote:
> >>>
> >>>>
> >>>>
> >>>> On Wed, Jan 20, 2021 at 6:21 PM Barry Smith <bsmith at petsc.dev> wrote:
> >>>>
> >>>>>
> >>>>>
> >>>>> On Jan 20, 2021, at 3:09 PM, Mark Adams <mfadams at lbl.gov> wrote:
> >>>>>
> >>>>> So I put in a temporary hack to get the first Fieldsplit apply to NOT
> >>>>> use OMP and it sort of works.
> >>>>>
> >>>>> Preonly/lu is fine. GMRES calls vector creates/dups in every solve so
> >>>>> that is a big problem.
> >>>>>
> >>>>>
> >>>>>   It should definitely not be creating vectors in every solve. But it
> >>>>> does do lazy allocation of the needed restart vectors, which may make it
> >>>>> look like it is creating vectors in every solve. You can use
> >>>>> -ksp_gmres_preallocate to force it to create all the restart vectors up
> >>>>> front at KSPSetUp().
> >>>>>
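
For reference, the preallocation can be requested either on the command line,
using the sub-KSP's option prefix (e.g. something like
-fieldsplit_0_ksp_gmres_preallocate under PCFIELDSPLIT), or programmatically on
each sub-KSP; a one-line sketch of the latter, assuming subksp is the split's
KSP:

  ierr = KSPGMRESSetPreAllocateVectors(subksp);CHKERRQ(ierr);  /* allocate all restart vectors at KSPSetUp() */
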
> >>>>
> >>>> Well, I run the first solve w/o OMP and I see Vec dups in cuSparse Vecs
> >>>> in the 2nd solve.
> >>>>
> >>>>
> >>>>>
> >>>>>   Why is creating vectors "at every solve" a problem? It is not thread
> >>>>> safe I guess?
> >>>>>
> >>>>
> >>>> It dies when it looks at the options database, in a free inside the
> >>>> get-options method to be exact (see the stack below).
> >>>>
> >>>> ======= Backtrace: =========
> >>>> /lib64/libc.so.6(cfree+0x4a0)[0x200021839be0]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscFreeAlign+0x4c)[0x2000002a368c]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscOptionsEnd_Private+0xf4)[0x2000002e53f0]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x7c6c28)[0x2000008b6c28]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreate_SeqCUDA+0x11c)[0x20000052c510]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecSetType+0x670)[0x200000549664]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreateSeqCUDA+0x150)[0x20000052c0b0]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x43c198)[0x20000052c198]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicate+0x44)[0x200000542168]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs_Default+0x148)[0x200000543820]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs+0x54)[0x2000005425f4]
> >>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPCreateVecs+0x4b4)[0x2000016f0aec]
> >>>>
> >>>>>
> >>>>> Richardson works except that the convergence test gets confused,
> >>>>> presumably because MPI reductions with PETSC_COMM_SELF are not
> >>>>> threadsafe.
> >>>>>
> >>>>>
> >>>>>
> >>>>> One fix for the norms might be to create each subdomain solver with a
> >>>>> different communicator.
> >>>>>
> >>>>>
> >>>>>    Yes, you could do that. It might actually be the correct thing to do
> >>>>> anyway: if multiple threads call MPI reductions on the same
> >>>>> communicator, that would be a problem. Each KSP should get a new
> >>>>> MPI_Comm.
> >>>>>
> >>>>
> >>>> OK. I will just do that.
> >>>>
> >>>>
> >>
> >> --
> >> What most experimenters take for granted before they begin their
> >> experiments is infinitely more interesting than any results to which
> their
> >> experiments lead.
> >> -- Norbert Wiener
> >>
> >> https://www.cse.buffalo.edu/~knepley/
> >>
>