[petsc-users] Using OpenMP threads with PETSc

Jed Brown jed at jedbrown.org
Thu Apr 9 17:33:11 CDT 2015


Lucas Clemente Vella <lvella at gmail.com> writes:
> I suspect the optimal setup is to have one process for each NUMA node,
> one thread per logical core,

Why?  Are you packing buffers in parallel (extra OpenMP overhead) or
in serial (Amdahl's law limitations)?  The NIC most likely supports as
many hardware contexts as cores, so the critical path is shorter with
flat MPI.
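
To make that tradeoff concrete, here is a minimal sketch of
halo-exchange buffer packing, serial versus OpenMP-parallel (the
buffer, index list, and sizes are hypothetical, not PETSc internals):
the serial version keeps the whole gather on one thread's critical
path, while the parallel version pays fork/join overhead on every
exchange, which for a typical surface-sized buffer can exceed the copy
itself.

    #include <stdio.h>

    /* Hypothetical halo-exchange pack: gather ghost entries of x into
     * a contiguous send buffer before an MPI send.  idx[] lists the
     * entries to send; n is surface-sized, so usually small. */
    static void pack_serial(const double *x, const int *idx,
                            double *buf, int n)
    {
      for (int i = 0; i < n; i++)
        buf[i] = x[idx[i]];              /* whole gather on one thread */
    }

    static void pack_parallel(const double *x, const int *idx,
                              double *buf, int n)
    {
      /* Fork/join and scheduling cost is paid on every exchange; for
       * small n it can dominate the copy itself. */
      #pragma omp parallel for schedule(static)
      for (int i = 0; i < n; i++)
        buf[i] = x[idx[i]];
    }

    int main(void)
    {
      enum { M = 32, N = 8 };
      double x[M], buf[N];
      int idx[N];
      for (int i = 0; i < M; i++) x[i] = (double)i;
      for (int i = 0; i < N; i++) idx[i] = 4 * i;  /* strided ghosts */
      pack_serial(x, idx, buf, N);
      pack_parallel(x, idx, buf, N);
      for (int i = 0; i < N; i++) printf("%g ", buf[i]);
      printf("\n");
      return 0;
    }

Build with something like "cc -fopenmp"; with OpenMP disabled the
pragma is ignored and both routines are identical.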

> and affinity-lock each process (all threads) to the cores
> corresponding to its NUMA node. The article I mentioned showed
> improvements in almost all of the cases where they compared OpenMP
> threads with MPI processes.

This varies a lot by machine, network hardware and tuning, software
stack, and problem size and configuration.  It is not universal, and I
can show you plenty of examples where flat MPI is the way to go.
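
Whichever way it falls out on a given machine, it is worth verifying
the binding you actually got.  A minimal MPI+OpenMP sketch
(Linux-specific: sched_getcpu is a glibc extension, and the launch
flags below are only examples) that prints where each rank's threads
land:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sched.h>
    #include <mpi.h>
    #include <omp.h>

    /* Report the CPU each thread of each rank is running on, so a
     * "one rank per NUMA node, threads pinned to its cores" launch
     * can be checked against reality. */
    int main(int argc, char **argv)
    {
      int rank, provided;
      MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      #pragma omp parallel
      printf("rank %d thread %d on cpu %d\n",
             rank, omp_get_thread_num(), sched_getcpu());
      MPI_Finalize();
      return 0;
    }

Build with "mpicc -fopenmp" and run with something like
"OMP_PLACES=cores OMP_PROC_BIND=close mpiexec -n 2 ./affinity"; the
exact pinning flags vary by MPI implementation (e.g. "--map-by numa"
in Open MPI).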