[petsc-dev] Memory problem with OpenMP and Fieldsplit sub solvers

Mark Adams mfadams at lbl.gov
Thu Jan 21 13:26:53 CST 2021


On Thu, Jan 21, 2021 at 2:11 PM Matthew Knepley <knepley at gmail.com> wrote:

> On Thu, Jan 21, 2021 at 2:02 PM Mark Adams <mfadams at lbl.gov> wrote:
>
>> On Thu, Jan 21, 2021 at 1:44 PM Matthew Knepley <knepley at gmail.com>
>> wrote:
>>
>>> On Thu, Jan 21, 2021 at 11:16 AM Mark Adams <mfadams at lbl.gov> wrote:
>>>
>>>> Yes, the problem is that each KSP solver is running in an OMP thread
>>>> (so at this point it only works for SELF, and it's Landau, so that is all I
>>>> need). It looks like MPI reductions called with a comm_self are not thread
>>>> safe (e.g., they could say, this is one proc, thus just copy send --> recv,
>>>> but they don't).
>>>>
>>>
>>> Instead of using SELF, how about Comm_dup() for each thread?
>>>
>>
>> OK, raw MPI_Comm_dup. I tried PetscCommDuplicate. Let me try this.
>> Thanks,
>>
>
> You would have to dup them all outside the OMP section, since it is not
> threadsafe. Then each thread uses one I think.
>

Yea sure. I do it in SetUp.

Well, that finally worked to get *different Comms*, but I still get the same
problem: the iteration counts differ wildly. The first log below is two species
and two threads (13 SNES its, and it is not deterministic). Way below that is
one thread (8 its) with fairly uniform iteration counts.

Maybe this MPI is just not thread safe at all. Let me look into it.
Thanks anyway,

    0 SNES Function norm 4.974994975313e-03
In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x80017c60.
Comms pc=0x67ad27c0 ksp=*0x7ffe1600* newcomm=0x8014b6e0
In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x7ffdabc0.
Comms pc=0x67ad27c0 ksp=*0x7fff70d0* newcomm=0x7ffe9980
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 282
    1 SNES Function norm 1.836376279964e-05
      Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 19
    2 SNES Function norm 3.059930074740e-07
      Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 15
    3 SNES Function norm 4.744275398121e-08
      Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 4
    4 SNES Function norm 4.014828563316e-08
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 456
    5 SNES Function norm 5.670836337808e-09
      Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 2
    6 SNES Function norm 2.410421401323e-09
      Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 18
    7 SNES Function norm 6.533948191791e-10
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 458
    8 SNES Function norm 1.008133815842e-10
      Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 9
    9 SNES Function norm 1.690450876038e-11
      Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 4
   10 SNES Function norm 1.336383986009e-11
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 463
   11 SNES Function norm 1.873022410774e-12
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 113
   12 SNES Function norm 1.801834606518e-13
      Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 1
   13 SNES Function norm 1.004397317339e-13
  Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 13

    0 SNES Function norm 4.974994975313e-03
In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x6e265010.
Comms pc=0x56450340 ksp=0x6e2168d0 newcomm=0x6e265090
In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x6e25bc40.
Comms pc=0x56450340 ksp=0x6e22c1d0 newcomm=0x6e21e8f0
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 282
    1 SNES Function norm 1.836376279963e-05
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 380
    2 SNES Function norm 3.018499983019e-07
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 387
    3 SNES Function norm 1.826353175637e-08
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 391
    4 SNES Function norm 1.378600599548e-09
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 392
    5 SNES Function norm 1.077289085611e-10
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 394
    6 SNES Function norm 8.571891727748e-12
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 395
    7 SNES Function norm 6.897647643450e-13
      Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 395
    8 SNES Function norm 5.606434614114e-14
  Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 8
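
To put numbers on the difference, the inner fieldsplit_e_ iteration counts can
be copied from the two logs above and summarized. This is only a throwaway
sketch; the function names iteration_range and print_range and the array names
are made up here.

```c
#include <stdio.h>

/* Inner fieldsplit_e_ iteration counts copied from the logs above. */
static const int two_thread_its[] = {282, 19, 15, 4, 456, 2, 18, 458, 9, 4, 463, 113, 1};
static const int one_thread_its[] = {282, 380, 387, 391, 392, 394, 395, 395};

/* Store the smallest and largest of n > 0 counts in *lo and *hi. */
void iteration_range(const int *its, int n, int *lo, int *hi)
{
  *lo = *hi = its[0];
  for (int i = 1; i < n; i++) {
    if (its[i] < *lo) *lo = its[i];
    if (its[i] > *hi) *hi = its[i];
  }
}

/* Print one summary line for a run. */
void print_range(const char *label, const int *its, int n)
{
  int lo, hi;
  iteration_range(its, n, &lo, &hi);
  printf("%s: %d solves, iterations from %d to %d\n", label, n, lo, hi);
}
```

Calling print_range on each array shows the two-thread run ranging from 1 to
463 iterations per inner solve, against 282 to 395 for the single thread.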

>
>    Matt
>
>
>>   Matt
>>>
>>>
>>>> On Thu, Jan 21, 2021 at 10:46 AM Matthew Knepley <knepley at gmail.com>
>>>> wrote:
>>>>
>>>>> On Thu, Jan 21, 2021 at 10:34 AM Mark Adams <mfadams at lbl.gov> wrote:
>>>>>
>>>>>> It looks like PETSc is just too clever for me. I am trying to get a
>>>>>> different MPI_Comm into each block, but PETSc is thwarting me:
>>>>>>
>>>>>
>>>>> It looks like you are using SELF. Is that what you want? Do you want a
>>>>> bunch of comms with the same group, but independent somehow? I am confused.
>>>>>
>>>>>    Matt
>>>>>
>>>>>
>>>>>>   if (jac->use_openmp) {
>>>>>>     ierr = KSPCreate(MPI_COMM_SELF,&ilink->ksp);CHKERRQ(ierr);
>>>>>>     PetscPrintf(PETSC_COMM_SELF,
>>>>>>       "In PCFieldSplitSetFields_FieldSplit with -------------- link: %p. Comms %p %p\n",
>>>>>>       ilink,PetscObjectComm((PetscObject)pc),
>>>>>>       PetscObjectComm((PetscObject)ilink->ksp));
>>>>>>   } else {
>>>>>>     ierr = KSPCreate(PetscObjectComm((PetscObject)pc),&ilink->ksp);CHKERRQ(ierr);
>>>>>>   }
>>>>>>
>>>>>> produces:
>>>>>>
>>>>>> In PCFieldSplitSetFields_FieldSplit with -------------- link:
>>>>>> 0x7e9cb4f0. Comms 0x660c6ad0 0x660c6ad0
>>>>>> In PCFieldSplitSetFields_FieldSplit with -------------- link:
>>>>>> 0x7e88f7d0. Comms 0x660c6ad0 0x660c6ad0
>>>>>>
>>>>>> How can I work around this?
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 21, 2021 at 7:41 AM Mark Adams <mfadams at lbl.gov> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jan 20, 2021 at 6:21 PM Barry Smith <bsmith at petsc.dev>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 20, 2021, at 3:09 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>>>>>>>
>>>>>>>> So I put in a temporary hack to get the first Fieldsplit apply to
>>>>>>>> NOT use OMP and it sort of works.
>>>>>>>>
>>>>>>>> Preonly/lu is fine. GMRES calls vector creates/dups in every
>>>>>>>> solve so that is a big problem.
>>>>>>>>
>>>>>>>>
>>>>>>>>   It should definitely not be creating vectors "in every" solve.
>>>>>>>> But it does do lazy allocation of the needed restart vectors, which may
>>>>>>>> make it look like it is creating vectors in every solve.  You can
>>>>>>>> use -ksp_gmres_preallocate to force it to create all the restart vectors up
>>>>>>>> front at KSPSetUp().
>>>>>>>>
>>>>>>>
>>>>>>> Well, I run the first solve w/o OMP and I see Vec dups in cuSparse
>>>>>>> Vecs in the 2nd solve.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>   Why is creating vectors "at every solve" a problem? It is not
>>>>>>>> thread safe I guess?
>>>>>>>>
>>>>>>>
>>>>>>> It dies when it looks at the options database, in a Free in the
>>>>>>> get-options method to be exact (see stacks).
>>>>>>>
>>>>>>> ======= Backtrace: =========
>>>>>>> /lib64/libc.so.6(cfree+0x4a0)[0x200021839be0]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscFreeAlign+0x4c)[0x2000002a368c]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscOptionsEnd_Private+0xf4)[0x2000002e53f0]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x7c6c28)[0x2000008b6c28]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreate_SeqCUDA+0x11c)[0x20000052c510]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecSetType+0x670)[0x200000549664]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreateSeqCUDA+0x150)[0x20000052c0b0]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x43c198)[0x20000052c198]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicate+0x44)[0x200000542168]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs_Default+0x148)[0x200000543820]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs+0x54)[0x2000005425f4]
>>>>>>>
>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPCreateVecs+0x4b4)[0x2000016f0aec]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Richardson works except the convergence test gets confused,
>>>>>>>> presumably because MPI reductions with PETSC_COMM_SELF are not threadsafe.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> One fix for the norms might be to create each subdomain solver with
>>>>>>>> a different communicator.
>>>>>>>>
>>>>>>>>
>>>>>>>>    Yes, you could do that. It might actually be the correct thing to
>>>>>>>> do also: if multiple threads call MPI reductions on the same
>>>>>>>> communicator, that would be a problem. Each KSP should get a new MPI_Comm.
>>>>>>>>
>>>>>>>
>>>>>>> OK. I will only do this.
>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which their
>>>>> experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>
>>>>
>>>
>>>
>>
>
>