[petsc-dev] Questions around benchmarking and data loading with PETSc
Victor Eijkhout
eijkhout at tacc.utexas.edu
Tue Dec 14 14:25:58 CST 2021
On Dec 11, 2021, at 17:56, Rohan Yadav <rohany at alumni.cmu.edu> wrote:
40 MPI ranks on a single node should give similar performance to 40 threads. Both PETSc and TACO use a row-based parallelism strategy, so the two should line up.
An MPI division of rows is static. PETSc divides strictly by number of rows, not by the number of nonzeros in them.
A thread-based system can do things like "schedule(guided)" (OpenMP) and get better load balancing if the rows have widely differing numbers of nonzeros.
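To make the contrast concrete, here is a minimal sketch in C, assuming a CSR matrix stored in plain arrays. The function names (spmv_csr_guided, static_block_nnz) are hypothetical and not part of PETSc or TACO. The first function is a row-parallel SpMV where OpenMP hands out rows in shrinking chunks at run time; the second counts the nonzeros one rank would own under an equal-row-count contiguous split, which I believe resembles PETSc's default layout but is written here only as an assumption.

```c
#include <assert.h>

/* y = A*x for a CSR matrix. With schedule(guided), threads grab chunks of
 * rows at run time, so a thread stuck on heavy rows simply takes fewer
 * chunks. Without OpenMP enabled the pragma is ignored and the loop runs
 * serially with the same result. */
void spmv_csr_guided(int nrows, const int *rowptr, const int *colidx,
                     const double *val, const double *x, double *y)
{
    #pragma omp parallel for schedule(guided)
    for (int i = 0; i < nrows; i++) {
        double sum = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            sum += val[k] * x[colidx[k]];
        y[i] = sum;
    }
}

/* Nonzeros owned by `rank` when rows are split into contiguous blocks of
 * (nearly) equal row count, ignoring how many nonzeros each row has.
 * Assumption: the first (nrows % nranks) ranks get one extra row. */
int static_block_nnz(int nrows, const int *rowptr, int nranks, int rank)
{
    assert(nranks > 0 && rank >= 0 && rank < nranks);
    int base  = nrows / nranks;
    int rem   = nrows % nranks;
    int first = rank * base + (rank < rem ? rank : rem);
    int last  = first + base + (rank < rem ? 1 : 0);
    return rowptr[last] - rowptr[first];
}
```

Calling static_block_nnz for every rank on your actual matrix shows how uneven the per-rank work is under a static row split. If the counts differ a lot, that alone could explain a gap between the MPI and threaded timings.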
Victor.