[petsc-users] Using OpenMP threads with PETSc

Jed Brown jed at jedbrown.org
Thu Apr 9 20:50:17 CDT 2015

Lucas Clemente Vella <lvella at gmail.com> writes:
> It will be no worse than separate MPI processes competing for the same
> cache slot.

The last-level cache (LLC) may be shared, but L1 and usually L2 are
private to each core, and they are used less efficiently when each
process works on non-contiguous blocks.

> At least with threads there is a chance that different tasks will hit
> the same cached memory. I do believe there should be a smart way to
> control and optimize thread work proximity on OMP for loops, i.e.:
> #pragma omp parallel for
> for(size_t i = 0; i < size; ++i) {
>     // something not dependent on previous steps
> }
> Any two threads working on adjacent ranges should run on the two
> hyperthreads of the same core, to maximize cache reuse. Because of
> this usage pattern of OpenMP, it seems to me that it is already
> unlikely that two threads will be working too far from each other,
> but if I wanted this level of control, I would have to hack some
> OpenMP implementation and/or the kernel.

Test this hypothesis.
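
A minimal sketch of one way to start, assuming a compiler with OpenMP
support (it also builds serially without it). It only records which
thread runs each iteration under schedule(static) and counts how often
neighboring iterations change owner; it does not measure cache reuse.
The helper names record_owners and count_owner_changes are made up for
this illustration and are not part of PETSc or OpenMP.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Record which thread executes each iteration under schedule(static).
 * Compiled without OpenMP, every iteration is attributed to thread 0. */
static void record_owners(int *owner, size_t n)
{
    assert(owner != NULL);
#pragma omp parallel for schedule(static)
    for (size_t i = 0; i < n; ++i) {
#ifdef _OPENMP
        owner[i] = omp_get_thread_num();
#else
        owner[i] = 0;
#endif
    }
}

/* Count positions where adjacent iterations ran on different threads.
 * A small count means each thread was given a contiguous range. */
static size_t count_owner_changes(const int *owner, size_t n)
{
    size_t changes = 0;
    for (size_t i = 1; i < n; ++i) {
        if (owner[i] != owner[i - 1]) {
            ++changes;
        }
    }
    return changes;
}
```

Call record_owners on an array of n ints, then compare
count_owner_changes against the number of threads. Where those threads
land on the hardware is a separate question: OpenMP 4.0 defines the
OMP_PLACES and OMP_PROC_BIND environment variables (for example
OMP_PLACES=threads with OMP_PROC_BIND=close), and whether neighboring
thread numbers then share a core depends on the implementation and the
machine, so it needs to be checked and timed on the target system.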

> On the other hand, the pthread interface favors more loosely coupled
> tasks, which may yield worse cache reuse, but I confess I didn't take
> the time to look at how each of the threading libraries is used
> inside PETSc.

Actually, the pthread threadcomm is not that much different from this
OpenMP strategy.  Either could be used at a higher level, but the user
interface would be different, and nobody has said they want that.