[mpich2-dev] Problem with MPI_Type_commit() and assert in segment_ops.c
Rob Ross
rross at mcs.anl.gov
Wed Jun 17 10:35:37 CDT 2009
No progress so far, but I haven't had a good day to concentrate on it
since you first identified the problem. I'm at a conference all week,
but I should be able to focus next week...
Rob
On Jun 17, 2009, at 10:27 AM, Joe Ratterman wrote:
> Rob,
>
> Have you (or your team) had any luck tracking this one down? We
> haven't been able to trace the cause ourselves.
>
> Thanks,
> Joe Ratterman
> jratt at us.ibm.com
>
>
> On Tue, Jun 9, 2009 at 3:50 PM, Rob Ross <rross at mcs.anl.gov> wrote:
> Hi,
>
> Those type casts to (size_t) should be to (MPI_Aint).
>
> That assertion is checking that a parameter being passed to
> Segment_mpi_flatten is > 0. The parameter is the length of the list
> of regions being passed in by reference to be filled in (the
> destination of the list of regions). So for some reason we're
> getting a zero (or possibly negative) value passed in as the length
> of the arrays.
>
> There's only one place in the struct creation where
> Segment_mpi_flatten() is called; it's line 666 (evil!) of
> dataloop_create_struct.c. This is in
> DLOOP_Dataloop_create_flattened_struct(), which is a function used
> to make a struct into an indexed type.
>
> The "pairtypes", such as MPI_SHORT_INT, are special cases in MPI in
> that some of them contain more than one "element type" (e.g.
> MPI_SHORT_INT contains both a short and an int). My guess is that
> there's an assumption in the DLOOP_Dataloop_create_flattened_struct()
> code path that is having trouble with the pairtype.
>
> I'm surprised that we might have introduced something between 1.0.7
> and 1.1; I can't recall anything in particular that has changed in
> this code path. Someone should check the repository logs to see
> whether something snuck in.
>
> Rob
>
>
> On Jun 9, 2009, at 3:13 PM, Joe Ratterman wrote:
>
> The specifics of this test come from an MPI exerciser that gathered
> (using MPIR_Gather) a variety of types, including MPI_SHORT_INT.
> The way gather is implemented, it created and then sent a struct
> datatype combining the tmp-data from the software tree with the
> local data. I pulled out the important bits and got this test case.
> It asserts on PPC32 Linux 1.1 and BGP 1.1rc0, but runs fine on 1.0.7.
> The addresses/displacements are fake, but were originally based on
> the actual values used inside MPIR_Gather. It does the type-create
> on the first two types just to show that it doesn't always fail.
>
>
> Error message:
>
> Creating addr=[0x1,0x2] types=[8c000003,4c00010d]
> struct_displs=[1,2] blocks=[256,256] MPI_BOTTOM=(nil)
> foo:25
> Assertion failed in file segment_ops.c at line 994: *lengthp > 0
> internal ABORT - process 0
>
>
> Code
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <mpi.h>
>
> void foo(void *sendbuf,
>          MPI_Datatype sendtype,
>          void *recvbuf,
>          MPI_Datatype recvtype)
> {
>     int          blocks[2];
>     MPI_Aint     struct_displs[2];
>     MPI_Datatype types[2], tmp_type;
>
>     blocks[0]        = 256;
>     struct_displs[0] = (size_t)sendbuf;
>     types[0]         = sendtype;
>     blocks[1]        = 256;
>     struct_displs[1] = (size_t)recvbuf;
>     types[1]         = MPI_BYTE;
>
>     printf("Creating addr=[%p,%p] types=[%x,%x] struct_displs=[%x,%x] "
>            "blocks=[%d,%d] MPI_BOTTOM=%p\n",
>            sendbuf, recvbuf, types[0], types[1], struct_displs[0],
>            struct_displs[1], blocks[0], blocks[1], MPI_BOTTOM);
>     MPI_Type_create_struct(2, blocks, struct_displs, types, &tmp_type);
>     printf("%s:%d\n", __func__, __LINE__);
>     MPI_Type_commit(&tmp_type);
>     printf("%s:%d\n", __func__, __LINE__);
>     MPI_Type_free(&tmp_type);
>     puts("Done");
> }
>
> int main()
> {
>     MPI_Init(NULL, NULL);
>
>     foo((void*)0x1, MPI_FLOAT_INT,  (void*)0x2, MPI_BYTE);
>     sleep(1);
>     foo((void*)0x1, MPI_DOUBLE_INT, (void*)0x2, MPI_BYTE);
>     sleep(1);
>     foo((void*)0x1, MPI_SHORT_INT,  (void*)0x2, MPI_BYTE);
>
>     MPI_Finalize();
>     return 0;
> }
>
>
>
> I don't know anything about how this might be fixed, but we are
> looking into it as well.
>
> Thanks,
> Joe Ratterman
> jratt at us.ibm.com
>
>