[mpich-discuss] Why do predefined MPI_Ops function elementwise in MPI_Accumulate, but not in MPI-1 routines?

Jed Brown jedbrown at mcs.anl.gov
Tue Apr 24 09:37:29 CDT 2012


On Tue, Apr 24, 2012 at 08:30, Jim Dinan <dinan at mcs.anl.gov> wrote:

> I'm not sure if I'm following all the discussion, but here's my
> understanding:
>
> MPI_Reduce takes an array of elements and combines all elements from
> every process to produce a single element of the original datatype as a
> result.  It doesn't pick apart the datatype of each element to do a
> reduction on the individual basic units contained inside it.  The
> builtin operations only understand the builtin types; if you have a
> derived datatype, you have to tell MPI how to combine elements of that
> type and produce a new one of the same type.
>
> MPI_Accumulate, on the other hand, works at the level of the basic units
> which are described by the datatype and applies the operation at that
> level.
>
> So, the operation names are the same, but Accumulate and Reduce utilize
> datatypes in different ways.  Reduce applies the op at the level of the
> aggregate datatype and Accumulate applies the op at the level of the
> basic units in the datatype.
>
> It's an interesting suggestion to add the Accumulate interpretation to
> the Reduce operations.  We can't break backward compatibility, so we
> would probably have to define a new set of operations to do the
> reduction at the level of the basic datatypes.
>

It's not an API change to make something that was previously a run-time error
stop being a run-time error.

Let's set aside the implementation issues for a moment. Does anyone want to
claim that there shall be no (non-deprecated) way for a user to create an
environment in which they can write the following?

#include <mpi.h>
#include <complex>

typedef std::complex<double> Complex;
MPI_Op SUM;           // MPI_SUM or a new op
MPI_Datatype COMPLEX; // from MPI or created here
int rank;             // target rank, set elsewhere
MPI_Win window;       // window created elsewhere

Complex a, b;

MPI_Allreduce(&a, &b, 1, COMPLEX, SUM, MPI_COMM_WORLD);
MPI_Accumulate(&a, 1, COMPLEX, rank, 0, 1, COMPLEX, SUM, window);

If the above is a reasonable thing for a user to request, then either (a)
MPI_Accumulate must accept user-defined MPI_Ops, (b) MPI_SUM must operate on
the base elements of a derived type in collectives (instead of only in
MPI_Accumulate), or (c) the issue is merely postponed by adding a
non-deprecated predefined std::complex<double> type.
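
For reference, everything except the Accumulate line is already expressible
today. A rough sketch (names are illustrative and error checking is omitted)
of creating COMPLEX and SUM by hand:

#include <mpi.h>
#include <complex>

typedef std::complex<double> Complex;

// User-defined reduction: complex addition applied entry-by-entry.
static void complex_sum(void *in, void *inout, int *len, MPI_Datatype *dtype)
{
  Complex *x = static_cast<Complex*>(in), *y = static_cast<Complex*>(inout);
  for (int i = 0; i < *len; i++) y[i] += x[i];
}

// during setup, after MPI_Init:
MPI_Datatype COMPLEX;
MPI_Op SUM;
MPI_Type_contiguous(2, MPI_DOUBLE, &COMPLEX); // {re, im}
MPI_Type_commit(&COMPLEX);
MPI_Op_create(complex_sum, /*commute=*/1, &SUM);

The MPI_Allreduce call then works, but MPI_Accumulate rejects the
user-defined SUM because MPI-2.2 only allows predefined operations there;
that restriction is exactly what (a) would remove.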

I think that both (a) and (b) should be done because then we could do
__float128 or quaternions without having to change the standard. (I cannot
currently use __float128 with one-sided, so any time I use one-sided
because it's a better algorithmic fit, I will also have to implement the
algorithm using MPI-1 so that __float128 works.)
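(The __float128 fallback follows the same pattern as the complex sketch
above; __float128 is a GCC extension, and the names here are again only
illustrative:

#include <mpi.h>

// User-defined sum for __float128, usable with the MPI-1 collectives
// but not with MPI_Accumulate, which takes only predefined ops.
static void quad_sum(void *in, void *inout, int *len, MPI_Datatype *dtype)
{
  __float128 *x = (__float128*)in, *y = (__float128*)inout;
  for (int i = 0; i < *len; i++) y[i] += x[i];
}

// during setup:
MPI_Datatype QUAD;
MPI_Op QSUM;
MPI_Type_contiguous(sizeof(__float128), MPI_BYTE, &QUAD);
MPI_Type_commit(&QUAD);
MPI_Op_create(quad_sum, 1, &QSUM);
)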