[mpich-discuss] Fortran integer support 4/8-bytes

Sat Jun 9 03:23:11 CDT 2012

> However, the big problem that the dirac authors completely ignore
> is that all "count" variables as in <snip> are int in C, i.e., 32-bit.

One strategy is to use the MPI profiling interface (PMPI) to insert
the count check into all the routines that Dirac uses, without
touching the application code.  There are resources online explaining
how to use symbol interposition for this, and you can post to this
list if you get stuck.
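
The check itself is small.  What follows is a minimal sketch in C,
assuming the application passes 8-byte Fortran integers as counts.  The
helper name count_to_int is hypothetical (it is not part of MPI or
Dirac), and where exactly the interposed wrapper calls it before
forwarding to the matching PMPI_ routine depends on how the Fortran
bindings of the MPI library are built.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of MPI): convert a 64-bit count, as a
 * Fortran code built with 8-byte integers would pass it, to a C int.
 * Returns 0 on success, -1 if the value cannot be represented. */
int count_to_int(int64_t count, int *out)
{
  assert(out != NULL);
  if (count < 0 || count > INT_MAX)
    return -1;          /* would overflow the int count of the C binding */
  *out = (int)count;
  return 0;
}
```

A wrapper would abort, or fall back to multiple calls, whenever this
returns -1 instead of silently truncating the count.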

> ...in Fortran you should check first
> whether count is smaller than 2^31. Dirac does not do that and this
> has caused me tremendous grief. There are a bunch of MPI_Allreduce
> statements that I needed to wrap, because otherwise they would cause
> integer overflows. There is a potential of this happening all over
> the code.

Indeed, that's unfortunate, but the aforementioned solution is a
write-once-use-many solution.  If you don't want to go that route, I'm
confident that you can do something like this:

==================================
int MyMPI64_Allreduce(..)
{
  /* multiple calls to MPI_Allreduce */
}

#define MPI_Allreduce MyMPI64_Allreduce
==================================
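
The body of such a wrapper is mostly bookkeeping.  Here is a hedged
sketch of the chunking arithmetic, assuming element-wise reduction
operations (so that each chunk can be reduced independently).  The
helper chunk_len is a hypothetical name, and the MPI loop is shown only
as a comment so that the block compiles without an MPI installation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of elements to pass in the next call, given
 * that `done` of `total` elements have already been processed.  The result
 * never exceeds max_chunk, so it always fits in an int. */
int chunk_len(int64_t total, int64_t done, int max_chunk)
{
  assert(max_chunk > 0 && done >= 0 && done <= total);
  int64_t left = total - done;
  return (left > max_chunk) ? max_chunk : (int)left;
}

/* Intended use inside a wrapper such as MyMPI64_Allreduce (needs <mpi.h>
 * and <limits.h>; the buffers are assumed to hold doubles here):
 *
 *   double *in = sendbuf, *out = recvbuf;
 *   for (int64_t done = 0; done < count; ) {
 *     int n = chunk_len(count, done, INT_MAX);
 *     int rc = MPI_Allreduce(in + done, out + done, n,
 *                            MPI_DOUBLE, MPI_SUM, comm);
 *     if (rc != MPI_SUCCESS) return rc;
 *     done += n;
 *   }
 */
```

Because the #define comes after the wrapper definition, the calls inside
the wrapper still reach the real MPI_Allreduce.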

> I am not aware of the MPI-2 standard even addressing the problem of
> the size of the count variables.

You are correct that the MPI Forum has not done anything about this yet.

> If that would really be allowed then
> all MPI distributions would have to check the value of the Fortran count
> variable first and then call the corresponding C routine repeatedly, if
> necessary.

It should be possible to define a set of 64-bit count interface
functions using the same tools that MPICH2 uses to generate Fortran
and C++ interfaces on top of C implementation functions, but this
would be an extension and not something covered by the MPI standard.

> I am still hoping that MPI-3 will solve the problem and somehow make
> it possible to have "count" variables larger than 2^31 ...

I was present for a number of the discussions.  It was decided that
this was prohibitive and the MPI standard would not be extended to
address it.  Here are the reasons:

1. For backwards compatibility, the existing functions using 32-bit
integer counts cannot be changed.  All functions that support 64-bit
integer or longer (using MPI_Count) arguments would be new functions.
This would effectively double the number of functions in the MPI
standard, making life difficult for implementers and users.

2. There is no compelling reason to do this within the MPI standard.
Splitting a large transfer into multiple function calls, each with a
count below 2^31, adds no meaningful performance overhead, because at
that size latency is completely overwhelmed by bandwidth cost.

3. If function call overhead is believed to be an issue (it is not),
then the user can define a very simple derived datatype for sending
messages of more than 2^31 elements.

4. It is straightforward to implement wrapper functions that enable
the aforementioned solutions for users that do not want to implement
them on their own.  As noted previously, such extensions could be
auto-generated by implementations.
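
For the derived datatype route in point 3, the only arithmetic needed is
to express the large count as a number of fixed-size blocks plus a
remainder, each of which fits in an int.  The following is a sketch
under that assumption.  split_count is a hypothetical helper, the block
size is the caller's choice, and the MPI calls appear only in a comment
so that the block compiles without MPI.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: express `total` elements as `nblocks` blocks of
 * `block` elements plus `rem` leftover elements, so that both nblocks and
 * rem can be passed as int counts.  Returns 0 on success, -1 if nblocks
 * would not fit in an int (choose a larger block in that case). */
int split_count(int64_t total, int block, int *nblocks, int *rem)
{
  assert(total >= 0 && block > 0 && nblocks && rem);
  int64_t q = total / block;
  if (q > INT_MAX)
    return -1;
  *nblocks = (int)q;
  *rem = (int)(total % block);   /* always smaller than block */
  return 0;
}

/* Possible use for sending doubles (needs <mpi.h>):
 *
 *   MPI_Datatype blk;
 *   MPI_Type_contiguous(block, MPI_DOUBLE, &blk);
 *   MPI_Type_commit(&blk);
 *   MPI_Send(buf, nblocks, blk, dest, tag, comm);
 *   MPI_Send(buf + (int64_t)nblocks * block, rem, MPI_DOUBLE,
 *            dest, tag, comm);
 *   MPI_Type_free(&blk);
 */
```

The receiver would post the two matching receives in the same order.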

It is not a high priority for anyone I know, but I suppose someone
working on MPICH2 could autogenerate MPIX_Foo64 for all MPI_Foo that
take integer count arguments and have the former function implemented
in terms of multiple calls to the latter.

Best,

Jeff

-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond