[mpich-discuss] wrong results from MPI_Scatter with derived data type

Siegmar Gross Siegmar.Gross at informatik.fh-fulda.de
Fri Jun 13 01:29:39 CDT 2008


Hi,

I have a problem with derived data types and MPI_Scatter/MPI_Gather
(Solaris 10 sparc, mpich2-1.0.6).

I want to distribute the columns of a matrix. At first I wrote a C
program which implemented a derived data type "coltype" and distributed
the columns via MPI_Send/MPI_Recv without problems. Next I modified the
program to distribute and collect the columns with MPI_Scatter/MPI_Gather.
I implemented "coltype" once more with MPI_Type_struct. The program
didn't work, so I used a 2x2 matrix to figure out what was wrong.
Each process prints its column elements after MPI_Scatter. The process
with rank 1 didn't get the values "2" and "4" (see below), but values
that are more or less 0. I then used a 4x2 matrix while keeping the
2-element column (so I should see the upper 2x2 "matrix" in my columns)
to find out which values process 1 actually receives. As you can see
below, it got "5" and "7", i.e. the values of the block that starts just
after the first block, and not the values of the block that starts after
the first element of the first block (a[2][0] instead of a[0][1]).
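
The relevant part of my struct version looks roughly like this (only a
sketch of the attached e5_1a.c, assuming a 2x2 matrix of doubles; the
exact displacements in the real file may differ):

#include <stdio.h>
#include <mpi.h>

#define P 2                             /* rows    */
#define Q 2                             /* columns */

int main (int argc, char *argv[])
{
  double       a[P][Q] = {{1, 2}, {3, 4}};
  double       column[P];
  int          rank;
  int          blocklens[P] = {1, 1};
  MPI_Aint     displs[P]    = {0, Q * sizeof (double)};
  MPI_Datatype dtypes[P]    = {MPI_DOUBLE, MPI_DOUBLE};
  MPI_Datatype coltype;

  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  /* one column: P blocks of one double, each Q doubles apart */
  MPI_Type_struct (P, blocklens, displs, dtypes, &coltype);
  MPI_Type_commit (&coltype);
  /* root scatters one "coltype" to every process; each process
     receives its column as P contiguous doubles               */
  MPI_Scatter (a, 1, coltype, column, P, MPI_DOUBLE,
               0, MPI_COMM_WORLD);
  printf ("rank: %d  c0: %g  c1: %g\n",
          rank, column[0], column[1]);
  MPI_Type_free (&coltype);
  MPI_Finalize ();
  return 0;
}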

Since I wasn't sure whether I was allowed to use MPI_Type_struct this
way, I rewrote the program with MPI_Type_vector. This time the result
was better but still not correct: process 1 got values from the second
column, but one element too late (starting with a[1][1] instead of
a[0][1]).
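
In the vector version (the attached e5_1b.c) only the construction of
"coltype" differs from the sketch above; it is roughly (again only a
sketch, same assumptions):

/* one column: P blocks of one double with a stride of Q doubles */
MPI_Type_vector (P, 1, Q, MPI_DOUBLE, &coltype);
MPI_Type_commit (&coltype);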

I assume that I have misunderstood a concept or that I have a
programming error in my code, because I run into the same problem with
MPICH, LAM-MPI, and Open MPI, and it is not very likely that all three
implementations have the same bug. Since I don't know how to proceed,
I would be very grateful if someone could tell me whether I must blame
myself for the error or whether it is perhaps a bug in the MPI
libraries (however unlikely that is). The line
"-n 2 -host <hostname> e5_1a" is stored in the file "app_e5_1a.mpich2".


MPI_Type_struct
===============

tyr e5 158 mpiexec -configfile app_e5_1a.mpich2

original matrix:

     1     2
     3     4

rank: 0  c0: 1  c1: 3
rank: 1  c0: 5.51719e-313  c1: 4.24399e-314


tyr e5 160 mpiexec -configfile app_e5_1a.mpich2

original matrix:

     1     2
     3     4
     5     6
     7     8

rank: 0  c0: 1  c1: 3
rank: 1  c0: 5  c1: 7



MPI_Type_vector
===============

tyr e5 119 mpiexec -configfile app_e5_1b.mpich2

original matrix:

     1     2
     3     4
     5     6
     7     8

rank: 0  c0: 1  c1: 3
rank: 1  c0: 4  c1: 6


Thank you very much in advance for any help or suggestions.


Kind regards

Siegmar
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/x-sun-c-file
Size: 2956 bytes
Desc: e5_1a.c
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20080613/0457103d/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/x-sun-c-file
Size: 2620 bytes
Desc: e5_1b.c
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20080613/0457103d/attachment-0001.bin>
