[petsc-dev] Hybrid MPI/OpenMP reflections
Barry Smith
bsmith at mcs.anl.gov
Thu Aug 8 13:35:29 CDT 2013
On Aug 8, 2013, at 9:14 AM, Karl Rupp <rupp at mcs.anl.gov> wrote:
> Hi Michael,
>
> > We have recently been trying to re-align our OpenMP fork
>
>> 2) Nonzero-based thread partitioning:
>> Rather than evenly dividing the number of rows among threads, we can
>> partition the thread ownership ranges according to the number of
>> non-zeros in each row. This balances the work load between threads and
>> thus increases strong scalability due to optimised bandwidth
>> utilisation. In general, this optimisation should integrate well with
>> threadcomms, since it only changes the thread ownership ranges, but it
>> does require some structural changes since nnz is currently not passed
>> to PetscLayoutSetUp. Any thoughts on whether people regard such a scheme
>> as useful would be greatly appreciated.
>
> This is a reasonable optimization; I used a similar strategy for sparse matrices on the GPU. Others should comment on whether the interface change to PetscLayoutSetUp is acceptable.
I don't think PetscLayoutSetUp() should be complicated in this fashion. Non-trivial parallel partitioning decisions of this kind, which depend on the mesh or graph of the problem, are made by DMs.
Barry
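
For reference, the nonzero-based partitioning proposed in item 2 of the quoted message can be sketched in plain C roughly as follows. This is only an illustration, not code from the thread or from PETSc: it assumes the matrix is stored in CSR form, and partition_by_nnz, ia and start are hypothetical names chosen for the example. It does not call or modify PetscLayoutSetUp().

```c
#include <assert.h>

/* Hypothetical helper (not part of PETSc): split nrows rows among
   nthreads threads so that each thread owns roughly the same number
   of nonzeros, rather than the same number of rows.

   ia is the CSR row-pointer array of length nrows+1 with ia[0] == 0,
   so ia[i] is the number of nonzeros in rows [0, i).
   On return, thread t owns rows [start[t], start[t+1]);
   start must have room for nthreads+1 entries. */
static void partition_by_nnz(int nrows, const int *ia, int nthreads, int *start)
{
  long nnz = ia[nrows];
  int  row = 0;

  assert(nthreads > 0);
  start[0] = 0;
  for (int t = 1; t < nthreads; t++) {
    /* Cumulative nonzero count that threads 0..t-1 should cover. */
    long target = (nnz * t) / nthreads;
    /* Advance to the first row boundary that reaches the target. */
    while (row < nrows && ia[row] < target) row++;
    start[t] = row;
  }
  start[nthreads] = nrows;
}
```

With one dense row followed by five sparse ones (ia = {0, 10, 11, 12, 13, 14, 16}) and two threads, this gives thread 0 only the dense row and thread 1 the remaining five, whereas an even split by rows would give the threads 12 and 4 nonzeros respectively. Ranges can come out empty when a single row holds more than one thread's share.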