[petsc-dev] Memory problem with OpenMP and Fieldsplit sub solvers

Mark Adams mfadams at lbl.gov
Tue Jan 19 07:07:07 CST 2021


On Mon, Jan 18, 2021 at 11:06 PM Barry Smith <bsmith at petsc.dev> wrote:

>
>   Can valgrind run and help with OpenMP?
>

I am pretty sure it can. There is also cuda-memcheck, which has the same
semantics and works on GPU code, but I'm not sure how good it is for CPU code.


>
>   You can run in the debugger and find any calls to the options checking
> inside your code block and comment them all out to see if that eliminates
> the problem.
>

The stack trace does give me the method in which the fatal Free is called,
so I will try a breakpoint in there. DDT does work with threads but not with
GPU code.


>
>   Also generically how safe is CUDA inside OpenMP? That is with multiple
> threads calling CUDA stuff?
>

I recall that the XGC code, which has a lot of OMP, CUDA (and Kokkos), does
this. I am not 100% sure.

I know that they recently had to tear out some OMP loops that they had
Kokkos'ized, because they had a problem mixing Kokkos-OMP and CUDA, so
they reverted to pure OMP.


>
>
>   Barry
>
>
>
> On Jan 18, 2021, at 7:04 PM, Mark Adams <mfadams at lbl.gov> wrote:
>
>
> Added this w/o luck:
>
> #if defined(PETSC_HAVE_CUDA)
>   ierr = PetscOptionsCheckCUDA(logView);CHKERRQ(ierr);
> #if defined(PETSC_HAVE_THREADSAFETY)
>   ierr = PetscCUPMInitializeCheck();CHKERRQ(ierr);
> #endif
> #endif
>
> Do you think I should keep this in or take it out? It seems like a good
> idea, and once it all works we can see if we can make it lazy.
>
> 1)  Calling PetscOptions inside threads. I looked quickly at the code and
>> it seems like it should be ok but perhaps not. This is one reason why
>> having stuff like PetscOptionsBegin inside a low-level creation routine
>> such as VecCreate_SeqCUDA_Private is normally not done in PETSc.
>> Eventually this needs to be moved or reworked.
>>
>>
> I will try this next. It is hard to see the stack here. I think I will put
> it in ddt and put a breakpoint in PetscOptionsEnd_Private. Other ideas
> welcome.
>
> Mark
>
>
>> 2) PetscCUDAInitializeCheck is not thread safe. If it is being called for
>> the first time by multiple threads there can be trouble. So edit init.c and
>> under
>>
>> #if defined(PETSC_HAVE_CUDA)
>>   ierr = PetscOptionsCheckCUDA(logView);CHKERRQ(ierr);
>> #endif
>>
>> #if defined(PETSC_HAVE_HIP)
>>   ierr = PetscOptionsCheckHIP(logView);CHKERRQ(ierr);
>> #endif
>>
>> put in
>> #if defined(PETSC_HAVE_THREADSAFETY)
>>   ierr = PetscCUPMInitializeCheck();CHKERRQ(ierr);
>> #endif
>>
>> this will force the initialization to be done before any threads are used
>>
>
>>
>
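
A minimal sketch of the eager-initialization idea in item 2 above, in plain C
rather than PETSc. The names device_initialize_check, device_ready, init_calls
and run_threaded_work are hypothetical stand-ins, not PETSc routines. The point
is only that the non-thread-safe "initialize on first use" check is called once
on the main thread, so every later call from inside the OpenMP loop finds the
state already set and does not write to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for lazily initialized device state. */
static int device_ready = 0;
static int init_calls   = 0;

/* Not thread safe on first use: two threads could both see device_ready == 0
   and both run the initialization branch. */
static int device_initialize_check(void)
{
  if (!device_ready) {
    init_calls++;      /* the one-time setup would happen here */
    device_ready = 1;
  }
  return 0;
}

/* Force the initialization on the calling thread, then start the threads. */
static int run_threaded_work(int n, double *out)
{
  int i;
  int ierr;

  assert(n >= 0 && out != NULL);

  ierr = device_initialize_check(); /* eager call, before any threads exist */
  if (ierr) return ierr;

  #pragma omp parallel for
  for (i = 0; i < n; i++) {
    /* device_ready is already 1, so this call only reads shared state. */
    device_initialize_check();
    out[i] = 2.0 * i;
  }
  return 0;
}
```

Calling run_threaded_work(8, out) on an array of 8 doubles fills it and leaves
init_calls at 1. Without OpenMP enabled the pragma is ignored and the loop runs
serially, which does not change the result.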