[petsc-users] Pipelined CG (or Gropp's CG) and communication overlap

Jed Brown jed at jedbrown.org
Mon Mar 17 04:21:39 CDT 2014


Chao Yang <chao.yang at Colorado.EDU> writes:

> The pipelined CG (or Gropp's CG) recently implemented in PETSc is very
> attractive since it can hide the collective communication of the vector
> dot products by overlapping it with the application of the
> preconditioner and/or SpMV.
>
> However, there is an issue that may seriously degrade the
> performance. In the pipelined CG, the asynchronous MPI_Iallreduce is
> called before the application of the preconditioner and/or SpMV, and
> then completed by MPI_Wait. The application of the preconditioner
> and/or SpMV may itself require communication (such as halo updates),
> which I find is often slowed down by the unfinished MPI_Iallreduce
> running in the background.
>
> As far as I know, the current MPI standard doesn't provide prioritized
> communication.

No, and there is not much interest in adding it because it adds
complication and tends to create starvation situations in which raising
the priority actually makes it slower.

> Therefore, the pipelined CG may well perform even worse than the
> classic algorithm due to the slowdown of the preconditioner and SpMV.
> Is there a way to avoid this?

This is an MPI quality-of-implementation issue and there isn't much we
can do about it.  There may be MPI tuning parameters that can help, but
the nature of these methods is that, in exchange for making the
reduction latency-tolerant, the reduction now runs concurrently with
(and competes against) the neighbor communication in MatMult/PCApply.
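
For reference, the pattern under discussion looks roughly like the
sketch below.  This is plain MPI, not the actual PETSc code, and
apply_preconditioner_and_spmv is only a stand-in for the work whose
halo exchange shares the network with the in-flight reduction:

  #include <mpi.h>
  #include <stdio.h>

  /* Stand-in for PCApply/MatMult; in the real solver this is where the
   * neighbor (halo) messages are sent and received, contending with the
   * still-outstanding MPI_Iallreduce. */
  static void apply_preconditioner_and_spmv(double *x, int n)
  {
    for (int i = 0; i < n; i++) x[i] *= 0.5; /* local work only in this sketch */
  }

  int main(int argc, char **argv)
  {
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;

    int    n = 100;
    double x[100], local_dot = 0.0, global_dot = 0.0;
    for (int i = 0; i < n; i++) { x[i] = 1.0; local_dot += x[i] * x[i]; }

    /* Start the dot-product reduction without blocking. */
    MPI_Request req;
    MPI_Iallreduce(&local_dot, &global_dot, 1, MPI_DOUBLE, MPI_SUM, comm, &req);

    /* Preconditioner and SpMV run while the allreduce is in flight; their
     * point-to-point traffic and the reduction share the same network. */
    apply_preconditioner_and_spmv(x, n);

    /* Complete the reduction only when its result is actually needed. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    int rank;
    MPI_Comm_rank(comm, &rank);
    if (rank == 0) printf("global dot = %g\n", global_dot);

    MPI_Finalize();
    return 0;
  }

Everything between MPI_Iallreduce and MPI_Wait competes with the
reduction for network resources, which is exactly the contention
described above.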