[petsc-dev] Memory problem with OpenMP and Fieldsplit sub solvers

Mark Adams mfadams at lbl.gov
Fri Jan 22 14:11:25 CST 2021


On Fri, Jan 22, 2021 at 12:57 PM Barry Smith <bsmith at petsc.dev> wrote:

>
> The library is thread safe and its functions can be called from multiple
> host threads, even with the same handle. When multiple threads share the
> same handle, *extreme care needs to be taken when the handle configuration
> is changed because that change will affect potentially subsequent cuBLAS
> calls in all threads*. It is even more true for the destruction of the
> handle. So it is not recommended that multiple thread share the same
> cuBLAS handle.
>
> From my reading of this there should be absolutely no issue. The handle
> configuration is never being changed. It should be set in PetscInitialize()
> and destroyed in PetscFinalize(), since lazy configuration of cuBLAS is
> turned off (right?).
>

You had me add to init.c:

#if defined(PETSC_HAVE_THREADSAFETY)
ierr = PetscCUPMInitializeCheck();CHKERRQ(ierr);
#endif


because getHandle, which does lazy initialization, was not thread safe. But
I think it is now.
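
For reference, a minimal sketch of one way to make a lazily initialized
handle getter thread safe, using pthread_once. This is not PETSc's
getHandle; DemoHandle, demo_get_handle and demo_init_count are hypothetical
names used only for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a library handle (e.g. a cuBLAS handle). */
typedef struct {
  int configured;
} DemoHandle;

static DemoHandle    *demo_handle     = NULL;
static pthread_once_t demo_once       = PTHREAD_ONCE_INIT;
static int            demo_init_count = 0; /* how many times the initializer ran */

/* Runs at most once, no matter how many threads race to get the handle. */
static void demo_create_handle(void)
{
  demo_handle = (DemoHandle *)malloc(sizeof(*demo_handle));
  if (demo_handle) demo_handle->configured = 1;
  demo_init_count++;
}

/* Lazy getter: safe to call concurrently from any thread. */
DemoHandle *demo_get_handle(void)
{
  pthread_once(&demo_once, demo_create_handle);
  return demo_handle;
}
```

Every caller gets the same pointer, and the creation routine runs only
once. This only protects the creation step; it does nothing for concurrent
use of the handle afterwards.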

I am going to test without it and remove it if that's OK (I'm sure it will be).

And I don't know what they mean by "when the handle configuration is
changed". It makes sense to me that the handle would not be thread safe. It
is an object and it has state. The purpose of making an object, rather than
having one global object, is to make it thread safe...
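
To illustrate the point about objects and state, here is a rough sketch of
giving each thread its own handle with C11 thread-local storage, so that no
handle state is shared. ThreadHandle, thread_get_handle and
handles_are_distinct are hypothetical names, not PETSc or cuBLAS API.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handle carrying per-thread state. */
typedef struct {
  int stream_id;
  int initialized;
} ThreadHandle;

/* One zero-initialized instance per thread (C11 thread-local storage). */
static _Thread_local ThreadHandle tls_handle;

/* Returns the calling thread's own handle, setting it up on first use. */
ThreadHandle *thread_get_handle(void)
{
  if (!tls_handle.initialized) {
    tls_handle.stream_id   = 0;
    tls_handle.initialized = 1;
  }
  return &tls_handle;
}

/* Thread entry: returns 1 if this thread's handle differs from the caller's. */
static void *compare_handle(void *callers_handle)
{
  return (void *)(intptr_t)((void *)thread_get_handle() != callers_handle);
}

/* Returns 1 if a new thread sees a different handle, -1 on a pthread error. */
int handles_are_distinct(void)
{
  pthread_t t;
  void     *result = NULL;

  if (pthread_create(&t, NULL, compare_handle, thread_get_handle())) return -1;
  if (pthread_join(t, &result)) return -1;
  return (int)(intptr_t)result;
}
```

With one handle per thread there is no shared mutable state to race on, at
the cost of creating (and eventually destroying) a handle in every thread.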

I can see that it is not thread safe. There were clear non-deterministic
race conditions: the results differed from a serial run by a different,
seemingly random amount each time. A vector norm sometimes returned 0 even
though the first entry was != 0.

Mark