[mpich-discuss] Why do predefined MPI_Ops function elementwise in MPI_Accumulate, but not in MPI-1 routines?

Tue Apr 24 17:27:38 CDT 2012

On Tue, Apr 24, 2012 at 16:40, Jim Dinan <dinan at mcs.anl.gov> wrote:

> I think I'm getting MPI 2.2 and 3.0 semantics mixed up.  MPI 2.2 only
> allows concurrent or same-epoch accumulates that use the same operation and
> have the same basic datatype.  So, this is fine in 2.2.  MPI 3.0 will relax
> the same operation restriction, which could make it challenging to have an
> efficient hardware implementation that maintains atomicity with an agent
> running on the CPU.

It would be interesting to understand why that is, and whether there is a
way around the potential inefficiency. Is any system seriously
going to implement *every* MPI predefined type and operation in hardware?
If there are any exceptions, doesn't the check for falling back to a
CPU-based implementation have to exist anyway?

It's obnoxious that MPI-1 could be used reliably with new types (e.g.
__float128 or a user-defined complex), but that newer features in MPI-2
cannot be, and that the standard is evolving to further cement the position
that user-defined types and operations are second-class.