[mpich2-dev] Problem with MPI_Type_commit() and assert in segment_ops.c

Rob Ross rross at mcs.anl.gov
Tue Jun 30 15:38:10 CDT 2009


Hi all,

The bug is somewhere in DLOOP_Segment_index_mpi_flatten(). A quick fix
is to change the reference to DLOOP_Segment_index_mpi_flatten() around
line 1007 of segment_ops.c to NULL. This makes the test pass; however,
certain types will be flattened considerably more slowly than before.

I will look into a more permanent fix.

Rob

On Jun 30, 2009, at 2:47 PM, Rob Ross wrote:

> Hi all,
>
> Ok, so interestingly it appears that MPID_Segment_mpi_flatten() is  
> incorrectly flattening the 256 instances of the MPI_SHORT_INT type.  
> As a reminder, the MPI_SHORT_INT (at least on my Mac) is a 2-byte  
> integer followed by a skip of two bytes and then a 4-byte integer  
> (size = 6, extent = 8). A contiguous region of these, flattened,  
> should have block lengths (in bytes) of 2, 6, 6, ..., 6, 4. In other
> words, the 4-byte and 2-byte regions in the middle should all be
> combined into 6-byte regions.
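>
> For concreteness, here is a small stand-alone sketch (written for this
> mail, not taken from MPICH) that lays out N copies of the layout
> described above (a 2-byte field at offset 0 and a 4-byte field at
> offset 4 inside an assumed 8-byte extent), coalesces adjacent byte
> ranges the way a correct flatten would, and prints the resulting
> block lengths:
>
> #include <stdio.h>
>
> #define N      256   /* number of MPI_SHORT_INT elements      */
> #define EXTENT 8     /* extent of MPI_SHORT_INT (assumed here) */
>
> int main(void)
> {
>     /* per-element byte regions: {offset, length} */
>     long off[2 * N], len[2 * N], coff[2 * N], clen[2 * N];
>     int  i, nregions = 0;
>
>     for (i = 0; i < N; i++) {
>         off[2*i]   = (long)i * EXTENT;      len[2*i]   = 2;  /* the short */
>         off[2*i+1] = (long)i * EXTENT + 4;  len[2*i+1] = 4;  /* the int   */
>     }
>
>     /* coalesce regions that are adjacent in memory */
>     for (i = 0; i < 2 * N; i++) {
>         if (nregions > 0 && coff[nregions-1] + clen[nregions-1] == off[i]) {
>             clen[nregions-1] += len[i];
>         } else {
>             coff[nregions] = off[i];
>             clen[nregions] = len[i];
>             nregions++;
>         }
>     }
>
>     printf("%d regions: %ld, %ld, ..., %ld, %ld\n", nregions,
>            clen[0], clen[1], clen[nregions-2], clen[nregions-1]);
>     /* prints: 257 regions: 2, 6, ..., 6, 4 */
>     return 0;
> }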
>
> Looking at the resulting block lengths, I see something like this:
>
> (gdb) print *params.blklens@256
> $25 = {2, 6, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2,  
> 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 6, 4, 2, 4, 2,  
> 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 6, 4, 2, 4, 2, 4, 2, 4, 2,  
> 4, 2, 4, 2, 4, 2, 4, 2, 6, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2,  
> 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, ...
>
> Weird! Anyway, because the flattening code is unable to correctly
> combine those regions, processing the first type (the MPI_SHORT_INT)
> uses up all of the index entries we previously calculated we would
> need, leaving nothing to hold the one last region that the 256
> MPI_BYTE types would consume. So when we make the second call to
> MPID_Segment_mpi_flatten() to process the MPI_BYTE type, we hit that
> assert.
>
> I'm looking into the problem with the flatten code now.
>
> Rob
>
> On Jun 30, 2009, at 1:40 PM, Rob Ross wrote:
>
>> Just a note that I've gotten myself up to speed, have replicated  
>> the bug on my laptop, and will try to figure out what is going on  
>> today.
>>
>> Rob
>>
>> On Jun 9, 2009, at 4:11 PM, Joe Ratterman wrote:
>>
>>> I know it's not that important--and clearly not relevant--but BG/P
>>> will generate a compiler warning if I use an MPI_Aint cast there.
>>> We want to avoid any ambiguity that such a cast would involve
>>> (i.e., is it sign-extended?), so I use a cast that works correctly
>>> for this sort of micro-test. It is also correct on PPC32. This
>>> small example shows the warnings:
>>>
>>> $ cat -n size.c
>>>    1  #include <mpi.h>
>>>    2
>>>    3  extern void bar(MPI_Aint a, MPI_Aint b, MPI_Aint c);
>>>    4
>>>    5  void foo(void* p)
>>>    6  {
>>>    7    MPI_Aint aint = (MPI_Aint)p;
>>>    8
>>>    9    MPI_Aint one = (long long int)p;
>>>   10    MPI_Aint two = (int)p;
>>>   11
>>>   12    bar(aint, one, two);
>>>   13  }
>>>
>>> $ /bgsys/drivers/ppcfloor/comm/bin/mpicc -Wall -g -c size.c
>>> size.c: In function 'foo':
>>> size.c:7: warning: cast from pointer to integer of different size
>>> size.c:9: warning: cast from pointer to integer of different size
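>>>
>>> For comparison, a cast through size_t matches the pointer width on
>>> these targets and compiles without that warning. A one-line sketch
>>> (not part of size.c above):
>>>
>>>    MPI_Aint three = (size_t)p;  /* no size-mismatch warning: size_t is pointer-sized */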
>>>
>>>
>>> Thanks,
>>> Joe Ratterman
>>> jratt at us.ibm.com
>>>
>>> On Tue, Jun 9, 2009 at 3:50 PM, Rob Ross <rross at mcs.anl.gov> wrote:
>>> Hi,
>>>
>>> Those type casts to (size_t) should be to (MPI_Aint).
>>>
>>> That assertion is checking that a parameter being passed to
>>> Segment_mpi_flatten is > 0. The parameter is the length of the
>>> arrays that are passed in by reference to be filled in with the
>>> flattened regions (the destination of the list of regions). So for
>>> some reason we're getting a zero (or possibly negative) value
>>> passed in as the length of those arrays.
>>>
>>> There's only one place in the struct creation where  
>>> Segment_mpi_flatten() is called; it's line 666 (evil!) of  
>>> dataloop_create_struct.c. This is in  
>>> DLOOP_Dataloop_create_flattened_struct(), which is a function used  
>>> to make a struct into an indexed type.
>>>
>>> The "pairtypes", such as MPI_SHORT_INT, are special cases in MPI  
>>> in that some of them have more than one "element type" (e.g.  
>>> MPI_INT, MPI_SHORT_INT) in them. My guess is that there's an  
>>> assumption in the DLOOP_Dataloop_create_flattened_struct() code  
>>> path that is having trouble with the pairtype.
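>>>
>>> A quick way to see the pairtype layouts for yourself is to query
>>> them with the standard type-inquiry calls. A small sketch (not from
>>> the original mail; the printed values depend on the ABI, but common
>>> ABIs give MPI_SHORT_INT size 6 and extent 8):
>>>
>>> #include <stdio.h>
>>> #include <mpi.h>
>>>
>>> static void show(const char *name, MPI_Datatype t)
>>> {
>>>     int size;
>>>     MPI_Aint lb, extent;
>>>
>>>     MPI_Type_size(t, &size);
>>>     MPI_Type_get_extent(t, &lb, &extent);
>>>     printf("%-14s size=%d lb=%ld extent=%ld\n",
>>>            name, size, (long)lb, (long)extent);
>>> }
>>>
>>> int main(void)
>>> {
>>>     MPI_Init(NULL, NULL);
>>>     /* each pairtype carries two element types, e.g. a short and an int */
>>>     show("MPI_SHORT_INT",  MPI_SHORT_INT);
>>>     show("MPI_FLOAT_INT",  MPI_FLOAT_INT);
>>>     show("MPI_DOUBLE_INT", MPI_DOUBLE_INT);
>>>     MPI_Finalize();
>>>     return 0;
>>> }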
>>>
>>> I'm surprised that we might have introduced something between  
>>> 1.0.7 and 1.1; I can't recall anything in particular that has  
>>> changed in this code path. Someone should check the repo logs and  
>>> see if something snuck in?
>>>
>>> Rob
>>>
>>>
>>> On Jun 9, 2009, at 3:13 PM, Joe Ratterman wrote:
>>>
>>> The specifics of this test come from an MPI exerciser that
>>> gathered (using MPIR_Gather) a variety of types, including
>>> MPI_SHORT_INT. The way gather is implemented, it creates and then
>>> sends a struct datatype combining the tmp-data from the software
>>> tree and the local data. I pulled out the important bits and got
>>> this test case. It asserts on PPC32 Linux 1.1 and BGP 1.1rc0, but
>>> runs fine on 1.0.7. The addresses/displacements are fake, but were
>>> originally based on the actual values used inside MPIR_Gather. The
>>> test does the type-create on the first two types just to show that
>>> it doesn't always fail.
>>>
>>>
>>> Error message:
>>>
>>> Creating  addr=[0x1,0x2]  types=[8c000003,4c00010d]  struct_displs=[1,2]  blocks=[256,256]  MPI_BOTTOM=(nil)
>>> foo:25
>>> Assertion failed in file segment_ops.c at line 994: *lengthp > 0
>>> internal ABORT - process 0
>>>
>>>
>>> Code
>>>
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <unistd.h>
>>> #include <mpi.h>
>>>
>>> void foo(void *sendbuf,
>>>          MPI_Datatype sendtype,
>>>          void *recvbuf,
>>>          MPI_Datatype recvtype)
>>> {
>>>     int blocks[2];
>>>     MPI_Aint struct_displs[2];
>>>     MPI_Datatype types[2], tmp_type;
>>>
>>>     /* displacements are fake addresses (see note above) */
>>>     blocks[0]        = 256;
>>>     struct_displs[0] = (size_t)sendbuf;
>>>     types[0]         = sendtype;
>>>     blocks[1]        = 256;
>>>     struct_displs[1] = (size_t)recvbuf;
>>>     types[1]         = MPI_BYTE;
>>>
>>>     printf("Creating  addr=[%p,%p]  types=[%x,%x]  struct_displs=[%x,%x]  "
>>>            "blocks=[%d,%d]  MPI_BOTTOM=%p\n",
>>>            sendbuf, recvbuf, types[0], types[1], struct_displs[0],
>>>            struct_displs[1], blocks[0], blocks[1], MPI_BOTTOM);
>>>     MPI_Type_create_struct(2, blocks, struct_displs, types, &tmp_type);
>>>     printf("%s:%d\n", __func__, __LINE__);
>>>     MPI_Type_commit(&tmp_type);   /* asserts for the MPI_SHORT_INT case */
>>>     printf("%s:%d\n", __func__, __LINE__);
>>>     MPI_Type_free(&tmp_type);
>>>     puts("Done");
>>> }
>>>
>>>
>>> int main()
>>> {
>>>     MPI_Init(NULL, NULL);
>>>
>>>     /* the first two pairtypes commit fine */
>>>     foo((void*)0x1,
>>>         MPI_FLOAT_INT,
>>>         (void*)0x2,
>>>         MPI_BYTE);
>>>     sleep(1);
>>>     foo((void*)0x1,
>>>         MPI_DOUBLE_INT,
>>>         (void*)0x2,
>>>         MPI_BYTE);
>>>     sleep(1);
>>>     /* ...but the MPI_SHORT_INT case hits the assert */
>>>     foo((void*)0x1,
>>>         MPI_SHORT_INT,
>>>         (void*)0x2,
>>>         MPI_BYTE);
>>>
>>>     MPI_Finalize();
>>>     return 0;
>>> }
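>>>
>>> For reference, the reproducer builds and runs in the usual way with
>>> a single process ("repro.c" is just a placeholder name for the code
>>> above; on BG/P the wrapper lives under
>>> /bgsys/drivers/ppcfloor/comm/bin):
>>>
>>>   $ mpicc -Wall -g repro.c -o repro
>>>   $ mpiexec -n 1 ./repro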
>>>
>>>
>>>
>>> I don't know anything about how this might be fixed, but we are  
>>> looking into it as well.
>>>
>>> Thanks,
>>> Joe Ratterman
>>> jratt at us.ibm.com
>>>
>>>
>>
>


