[mpich2-dev] Bug in ch3 MPIDI_Accumulate()?

Rajeev Thakur thakur at mcs.anl.gov
Fri Apr 4 15:27:32 CDT 2008


David,

You are right. There is a bug. It has not been caught so far because the
test suite doesn't have a test for an accumulate with a derived datatype at
the origin, a basic datatype at the target, and origin = target.
 
Rajeev


  _____  

From: owner-mpich2-dev at mcs.anl.gov [mailto:owner-mpich2-dev at mcs.anl.gov] On
Behalf Of David Gingold
Sent: Wednesday, April 02, 2008 3:23 PM
To: mpich2-dev at mcs.anl.gov
Subject: [mpich2-dev] Bug in ch3 MPIDI_Accumulate()?


I ran into this problem in code that I cribbed from the ch3 implementation.
I've not verified that it happens in ch3, but it seems, by inspection, that
it probably does: 

In MPIDI_Accumulate() (ch3u_rma_ops.c), in the case where target_rank ==
rank and the origin and target datatypes are not both pre-defined, the code
does this:

MPID_Datatype_get_ptr(target_datatype, dtp);
vec_len = dtp->n_contig_blocks * target_count + 1; 

The trouble is that we also hit this case when the origin type is derived and
the target type is pre-defined, and MPID_Datatype_get_ptr() isn't workable
for pre-defined datatypes. So dtp->* will evaluate to zero, which causes
grief when we reference dtp->eltype a few lines down.

I think the simple fix is to add separate cases in here for when
target_datatype is predefined. But I also spotted the code in
MPIR_Datatype_builtin_fillin(), which, if called, would seem to make
MPID_Datatype_get_ptr() become valid for built-in datatypes. Is there a
reason why MPIR_Datatype_builtin_fillin() isn't simply called inside
MPI_Init() somewhere?
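
A minimal sketch of the first option (a separate case for a pre-defined
target_datatype) might look like the following. Every name in it
(sketch_datatype, sketch_dt_info, sketch_get_ptr, sketch_plan_for) is a
hypothetical stand-in, not MPICH2 code, and it assumes a pre-defined type can
be treated as one contiguous block per element:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the MPICH2 definitions. */
typedef int sketch_datatype;              /* plays the role of MPI_Datatype */
enum { SKETCH_INT = 1, SKETCH_DERIVED = 100 };

typedef struct {
    int n_contig_blocks;
    sketch_datatype eltype;
} sketch_dt_info;                         /* plays the role of MPID_Datatype */

typedef struct {
    int vec_len;
    sketch_datatype eltype;
} sketch_plan;

/* One made-up derived type: 3 contiguous blocks of SKETCH_INT. */
static const sketch_dt_info sketch_derived_info = { 3, SKETCH_INT };

static int sketch_is_builtin(sketch_datatype t)
{
    return t < SKETCH_DERIVED;
}

/* Assumed behavior: no usable descriptor exists for pre-defined types. */
static const sketch_dt_info *sketch_get_ptr(sketch_datatype t)
{
    return sketch_is_builtin(t) ? NULL : &sketch_derived_info;
}

sketch_plan sketch_plan_for(sketch_datatype target_datatype, int target_count)
{
    sketch_plan plan;

    if (sketch_is_builtin(target_datatype)) {
        /* Pre-defined target: never touch a descriptor. Assumption: each
         * element counts as one contiguous block of its own type. */
        plan.vec_len = 1 * target_count + 1;
        plan.eltype = target_datatype;
    } else {
        /* Derived target: the descriptor lookup is valid here. */
        const sketch_dt_info *dtp = sketch_get_ptr(target_datatype);
        assert(dtp != NULL);
        plan.vec_len = dtp->n_contig_blocks * target_count + 1;
        plan.eltype = dtp->eltype;
    }
    return plan;
}
```

The point of the branch is only that the pre-defined path derives vec_len and
the element type from the handle itself, so nothing reads through dtp unless
the target type is derived.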

-dg


--
David Gingold
Principal Software Engineer
SiCortex
Three Clock Tower Place, Suite 210
Maynard MA 01754
(978) 897-0214 x224



