[mpich2-dev] Hvector with Zero Blocks Asserts

Jeff Parker jjparker at us.ibm.com
Tue Mar 3 10:18:13 CST 2009


IBM Blue Gene/P has received a customer-reported problem that appears to be
in the stock MPICH2 code.  The application commits a datatype built from an
hvector with 0 blocks, which triggers an assertion requiring that value to
be positive.  The MPI specification says the following, specifically that
count is a non-negative integer, so a value of zero should be allowed:

Synopsis
#include "mpi.h"
int MPI_Type_hvector(
        int count,
        int blocklen,
        MPI_Aint stride,
        MPI_Datatype old_type,
        MPI_Datatype *newtype )

Input Parameters

    count         number of blocks (nonnegative integer)
    blocklength   number of elements in each block (nonnegative integer)
    stride        number of bytes between start of each block (integer)
    old_type      old datatype (handle)
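
In other words, a call like the following (illustrative only, not the
customer's actual code) is legal, since count may be zero:

    MPI_Datatype empty_vec;

    /* count == 0 is permitted by the standard; the resulting type
       simply describes no data */
    MPI_Type_hvector(0, 1, (MPI_Aint) 4, MPI_INT, &empty_vec);
    MPI_Type_commit(&empty_vec);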


A reproducer is included below.  It fails on Blue Gene/P (MPICH2 1.0.7) and
on Linux (MPICH2 1.0.7rc1), but works on Blue Gene/L (MPICH2 1.0.4p1).
This assertion did not exist in MPICH2 1.0.5p4, but appears in MPICH2 1.0.6
and later versions.

The assertion is in src/mpid/common/datatype/dataloop/segment_ops.c, in the
function DLOOP_Segment_contig_count_block.  If the assertion is changed from

    DLOOP_Assert(*blocks_p > 0);

to

    DLOOP_Assert(*blocks_p >= 0);

the reproducer passes.

There are other places with this assertion, and other similar assertions
that may need fixing too (a possible batch change is sketched after the
listing):

grep -r "*blocks_p >" *
src/mpi/romio/common/dataloop/segment_ops.c:    DLOOP_Assert(*blocks_p > 0);
src/mpi/romio/common/dataloop/segment_ops.c:    DLOOP_Assert(count > 0 && blksz > 0 && *blocks_p > 0);
src/mpi/romio/common/dataloop/segment_ops.c:    DLOOP_Assert(count > 0 && blksz > 0 && *blocks_p > 0);
src/mpi/romio/common/dataloop/segment_ops.c:    DLOOP_Assert(count > 0 && *blocks_p > 0);
src/mpid/common/datatype/dataloop/segment_ops.c:    DLOOP_Assert(*blocks_p >= 0);
src/mpid/common/datatype/dataloop/segment_ops.c:    DLOOP_Assert(count > 0 && blksz > 0 && *blocks_p > 0);
src/mpid/common/datatype/dataloop/segment_ops.c:    DLOOP_Assert(count > 0 && blksz > 0 && *blocks_p > 0);
src/mpid/common/datatype/dataloop/segment_ops.c:    DLOOP_Assert(count > 0 && *blocks_p > 0);
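
If the same relaxation turns out to be correct at every one of these sites,
it could be applied mechanically along the following lines (a sketch only;
each site should still be reviewed individually, in particular the count > 0
and blksz > 0 conditions, which this command does not touch):

    # hypothetical batch edit -- review every resulting change
    sed -i 's/\*blocks_p > 0/*blocks_p >= 0/g' \
        src/mpi/romio/common/dataloop/segment_ops.c \
        src/mpid/common/datatype/dataloop/segment_ops.c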

Reproducer:

#include <stdio.h>

#include <mpi.h>

int main(int argc, char *argv[])
{
   MPI_Datatype mystruct, vecs[3];
   MPI_Aint stride = 5, displs[3];
   int i=0, blockcount[3];

   MPI_Init(&argc, &argv);

   for(i=0;i<3;i++)
   {
      /* important point appears to be the i==0 vectors here */
      MPI_Type_hvector(i, 1, stride, MPI_INT, &vecs[i]);
      MPI_Type_commit(&vecs[i]);
      blockcount[i]=1;
   }
   displs[0]=0; displs[1]=-100; displs[2]=-200; /* irrelevant */

   MPI_Type_struct(3, blockcount, displs, vecs, &mystruct);
   fprintf(stderr,"Before commiting structure\n");
   MPI_Type_commit(&mystruct);   /* the assertion fires during this commit */
   fprintf(stderr,"After commiting structure\n");

   MPI_Finalize();


   return 0;
}
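
The reproducer was run on two ranks.  On Linux, assuming the program is
saved as hvec0.c (an illustrative file name), it can be built and launched
with the usual MPICH2 wrappers, for example:

    mpicc hvec0.c -o hvec0
    mpiexec -n 2 ./hvec0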

Output (MPICH2 1.0.6 and later):
Before commiting structure
Before commiting structure
Assertion failed in file /bglhome/usr6/bgbuild/V1R3M0_460_2008-081112P/ppc/bgp/comm/lib/dev/mpich2/src/mpid/common/datatype/dataloop/segment_ops.c at line 375: *blocks_p > 0
Assertion failed in file /bglhome/usr6/bgbuild/V1R3M0_460_2008-081112P/ppc/bgp/comm/lib/dev/mpich2/src/mpid/common/datatype/dataloop/segment_ops.c at line 375: *blocks_p > 0
Abort(1) on node 1: Internal error
Abort(1) on node 0: Internal error

Jeff Parker
Blue Gene Messaging
61L/030-2 A407    507-253-4208    TieLine: 553-4208
Notes email: Jeff Parker/Rochester/IBM
INTERNET: jjparker at us.ibm.com     AFS: jeff at rchland


