[petsc-dev] Hybrid MPI/OpenMP reflections

Michael Lange michael.lange at imperial.ac.uk
Thu Aug 8 05:37:39 CDT 2013


Hi,

We have recently been trying to re-align our OpenMP fork 
(https://bitbucket.org/ggorman/petsc-3.3-omp) with petsc/master. Much of 
our early work has now been superseded by the threadcomm 
implementations. Nevertheless, there are still a few algorithmic 
differences between the two branches:

1) Enforcing MPI latency hiding by using task-based spMV:
If the MPI implementation does not actually provide truly asynchronous 
communication in hardware, performance can be increased by dedicating a 
single thread to overlapping MPI communication in PETSc. However, this 
is arguably a vendor-specific fix which requires significant code 
changes (i.e. the parallel section needs to be raised up by one level). 
So perhaps the strategy should be to give guilty vendors a hard time 
rather than mess up the current abstraction.
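
For illustration, here is a minimal sketch of the task-based pattern 
(not PETSc code; all names and the CSR storage are made up for the 
example): one OpenMP thread is dedicated to finishing the MPI ghost 
exchange while the remaining threads compute the purely local part of 
y = A*x, and after a barrier all threads apply the off-process part. 
It assumes at least two threads.

#include <mpi.h>
#include <omp.h>

/* Sketch only: ai/aj/aa hold the local (diagonal block) CSR matrix,
 * bi/bj/ba the off-process part; reqs are the receives for xghost
 * posted earlier (e.g. where the scatter would normally be started). */
void spmv_overlap(int nrows,
                  const int *ai, const int *aj, const double *aa,
                  const int *bi, const int *bj, const double *ba,
                  const double *x, const double *xghost, double *y,
                  MPI_Request *reqs, int nreqs)
{
#pragma omp parallel
  {
    int tid = omp_get_thread_num(), nt = omp_get_num_threads();
    if (tid == 0) {
      /* Dedicated communication thread: block until ghost values arrive. */
      MPI_Waitall(nreqs, reqs, MPI_STATUSES_IGNORE);
    } else {
      /* Compute threads work on the local part; rows are split evenly
         here, in practice the thread ownership ranges would be used. */
      int lo = (tid - 1) * nrows / (nt - 1), hi = tid * nrows / (nt - 1);
      for (int i = lo; i < hi; i++) {
        double sum = 0.0;
        for (int k = ai[i]; k < ai[i + 1]; k++) sum += aa[k] * x[aj[k]];
        y[i] = sum;
      }
    }
#pragma omp barrier
    /* Ghost values are now available: all threads add the off-process part. */
    int lo = tid * nrows / nt, hi = (tid + 1) * nrows / nt;
    for (int i = lo; i < hi; i++) {
      double sum = 0.0;
      for (int k = bi[i]; k < bi[i + 1]; k++) sum += ba[k] * xghost[bj[k]];
      y[i] += sum;
    }
  }
}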

2) Nonzero-based thread partitioning:
Rather than dividing the rows evenly among threads, we can partition 
the thread ownership ranges according to the number of non-zeros in 
each row. This balances the workload between threads and thus improves 
strong scalability through better bandwidth utilisation. In general, 
this optimisation should integrate well with threadcomms, since it only 
changes the thread ownership ranges, but it does require some 
structural changes, since nnz is currently not passed to 
PetscLayoutSetUp. Any thoughts on whether people regard such a scheme 
as useful would be greatly appreciated.
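
As a concrete illustration, here is a minimal sketch (not the PETSc 
API; the names are only borrowed for flavour) of computing 
nonzero-balanced ownership ranges from a CSR row pointer:

/* Compute thread ownership ranges so that each thread receives roughly
 * total_nnz/nthreads nonzeros instead of nrows/nthreads rows.
 * ai is the CSR row pointer (length nrows+1, so ai[nrows] = total nnz);
 * trstarts (length nthreads+1) receives the row ranges. Sketch only. */
void nnz_balanced_ranges(int nrows, const int *ai, int nthreads, int *trstarts)
{
  long total_nnz = ai[nrows];
  int  t = 0;
  trstarts[0] = 0;
  for (int row = 0; row < nrows && t < nthreads - 1; row++) {
    /* Start the next thread's range once the running nonzero count
       passes the next equal share of total_nnz. */
    if (ai[row + 1] >= (t + 1) * total_nnz / nthreads) trstarts[++t] = row + 1;
  }
  while (t < nthreads) trstarts[++t] = nrows; /* close any remaining ranges */
}

Thread t would then own rows [trstarts[t], trstarts[t+1]); the idea is 
that threadcomm could adopt such ranges if the nnz information were 
made available when the layout is set up.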

3) MatMult_SeqBAIJ not threaded:
Is there a reason why MatMult has not been threaded for BAIJ matrices, 
or is somebody already working on this? If not, I would like to prepare 
a pull request for this using the same approach as MatMult_SeqAIJ.
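
For reference, a generic sketch of what a threaded block-CSR 
(BAIJ-style) multiply looks like, with the OpenMP loop over block rows. 
This is purely illustrative and does not reproduce the actual 
MatMult_SeqBAIJ kernel or PETSc's block storage conventions:

#include <stddef.h>

/* Sketch only: bs is the block size, ai/aj index block rows/columns,
 * aa stores the bs*bs dense blocks contiguously in row-major order. */
void bcsr_matvec(int nblockrows, int bs,
                 const int *ai, const int *aj, const double *aa,
                 const double *x, double *y)
{
#pragma omp parallel for schedule(static)
  for (int ib = 0; ib < nblockrows; ib++) {
    for (int r = 0; r < bs; r++) y[ib * bs + r] = 0.0;
    for (int k = ai[ib]; k < ai[ib + 1]; k++) {
      const double *blk = aa + (size_t)k * bs * bs;   /* dense bs*bs block */
      const double *xb  = x + (size_t)aj[k] * bs;     /* matching x block  */
      for (int r = 0; r < bs; r++) {
        double sum = 0.0;
        for (int c = 0; c < bs; c++) sum += blk[r * bs + c] * xb[c];
        y[ib * bs + r] += sum;
      }
    }
  }
}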

We would welcome any suggestions or feedback on this, in particular on 
the second point. Up-to-date benchmarking results for the first two 
methods, including runs on BlueGene/Q, can be found at:
http://arxiv.org/abs/1307.4567

Kind regards,

Michael Lange


