[petsc-dev] Memory problem with OpenMP and Fieldsplit sub solvers

Mark Adams mfadams at lbl.gov
Thu Jan 21 09:33:41 CST 2021


It looks like PETSc is just too clever for me. I am trying to get a
different MPI_Comm into each block, but PETSc is thwarting me:

  if (jac->use_openmp) {
    ierr = KSPCreate(MPI_COMM_SELF,&ilink->ksp);CHKERRQ(ierr);
    PetscPrintf(PETSC_COMM_SELF,"In PCFieldSplitSetFields_FieldSplit with -------------- link: %p. Comms %p %p\n",
                ilink,PetscObjectComm((PetscObject)pc),PetscObjectComm((PetscObject)ilink->ksp));
  } else {
    ierr = KSPCreate(PetscObjectComm((PetscObject)pc),&ilink->ksp);CHKERRQ(ierr);
  }

produces:

In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x7e9cb4f0. Comms 0x660c6ad0 0x660c6ad0
In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x7e88f7d0. Comms 0x660c6ad0 0x660c6ad0

How can I work around this?
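As far as I can tell, PetscCommDuplicate() caches a single inner communicator on MPI_COMM_SELF and hands that same inner comm to every object created there, which would explain why both prints show the same comm. A minimal sketch of one possible workaround, assuming the same jac/ilink context as above: duplicate MPI_COMM_SELF by hand so each sub-KSP sits on a distinct communicator.

  if (jac->use_openmp) {
    MPI_Comm scomm;
    /* Duplicate MPI_COMM_SELF explicitly so this sub-KSP gets its own
       communicator instead of PETSc's cached inner self-communicator;
       scomm would need an MPI_Comm_free() when the split is destroyed. */
    ierr = MPI_Comm_dup(MPI_COMM_SELF,&scomm);CHKERRQ(ierr);
    ierr = KSPCreate(scomm,&ilink->ksp);CHKERRQ(ierr);
  } else {
    ierr = KSPCreate(PetscObjectComm((PetscObject)pc),&ilink->ksp);CHKERRQ(ierr);
  }

A separate communicator per sub-KSP would also keep the MPI reductions in the sub-solves from stepping on each other, which is relevant to the convergence-test issue discussed further down the thread.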


On Thu, Jan 21, 2021 at 7:41 AM Mark Adams <mfadams at lbl.gov> wrote:

>
>
> On Wed, Jan 20, 2021 at 6:21 PM Barry Smith <bsmith at petsc.dev> wrote:
>
>>
>>
>> On Jan 20, 2021, at 3:09 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>
>> So I put in a temporary hack to get the first Fieldsplit apply to NOT use
>> OMP and it sort of works.
>>
>> Preonly/lu is fine. GMRES calls vector creates/dups in every solve so
>> that is a big problem.
>>
>>
>>   It should definitely not be creating vectors in every solve. But it
>> does do lazy allocation of the needed restart vectors, which may make it
>> look like it is creating vectors in every solve. You can use
>> -ksp_gmres_preallocate to force it to create all the restart vectors up
>> front at KSPSetUp().
>>
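For reference, a minimal sketch of the programmatic equivalent of -ksp_gmres_preallocate, assuming ksp is the GMRES sub-solver in question:

  /* Same effect as -ksp_gmres_preallocate: allocate all restart vectors
     during KSPSetUp() rather than lazily during the solves. */
  ierr = KSPGMRESSetPreAllocateVectors(ksp);CHKERRQ(ierr);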
>
> Well, I run the first solve w/o OMP and I see Vec dups in cuSparse Vecs in
> the 2nd solve.
>
>
>>
>>   Why is creating vectors "at every solve" a problem? It is not thread
>> safe I guess?
>>
>
> It dies when it looks at the options database, in a free inside
> PetscOptionsEnd_Private() to be exact (see the stack trace below).
>
> ======= Backtrace: =========
> /lib64/libc.so.6(cfree+0x4a0)[0x200021839be0]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscFreeAlign+0x4c)[0x2000002a368c]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscOptionsEnd_Private+0xf4)[0x2000002e53f0]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x7c6c28)[0x2000008b6c28]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreate_SeqCUDA+0x11c)[0x20000052c510]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecSetType+0x670)[0x200000549664]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreateSeqCUDA+0x150)[0x20000052c0b0]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x43c198)[0x20000052c198]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicate+0x44)[0x200000542168]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs_Default+0x148)[0x200000543820]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs+0x54)[0x2000005425f4]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPCreateVecs+0x4b4)[0x2000016f0aec]
>
>
>
>>
>> Richardson works except the convergence test gets confused, presumably
>> because MPI reductions on PETSC_COMM_SELF are not thread safe.
>>
>>
>>
>> One fix for the norms might be to create each subdomain solver with a
>> different communicator.
>>
>>
>>    Yes, you could do that. It might actually be the correct thing to do
>> anyway: if multiple threads call MPI reductions on the same communicator,
>> that would be a problem. Each KSP should get a new MPI_Comm.
>>
>
> OK. I will only do this.
>
>