[mpich-discuss] Sending structures made of MPI Datatypes

Jayesh Krishna jayesh at mcs.anl.gov
Mon Nov 15 10:10:07 CST 2010


Hi,
 And you can try using MPI_Gather() to collect the chunks from the different processes at rank 0. Similarly, you can use MPI_Allgather() if you want all the processes to receive the chunks (instead of just rank 0).
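
 A minimal sketch of that pattern (the struct, its member names, and the chunk size here are illustrative, not taken from your code; it assumes MPICH2's MPI-2 type functions and that every rank owns exactly one contiguous M-element chunk):

```c
#include <mpi.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define M 4  /* elements per rank; illustrative */

typedef struct { int id; double val; } item;

int main(int argc, char *argv[])
{
    int rank, num_proc;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &num_proc);

    /* build a struct datatype and force its extent to sizeof(item),
       so that sends with count > 1 step through the array correctly */
    int blk[2] = {1, 1};
    MPI_Aint disp[2] = { offsetof(item, id), offsetof(item, val) };
    MPI_Datatype types[2] = { MPI_INT, MPI_DOUBLE };
    MPI_Datatype tmp, item_type;
    MPI_Type_create_struct(2, blk, disp, types, &tmp);
    MPI_Type_create_resized(tmp, 0, sizeof(item), &item_type);
    MPI_Type_commit(&item_type);
    MPI_Type_free(&tmp);

    item mine[M];
    for (int j = 0; j < M; j++) { mine[j].id = rank; mine[j].val = rank + 0.5 * j; }

    /* rank 0 receives num_proc contiguous chunks of M items each */
    item *all = (rank == 0) ? malloc((size_t)num_proc * M * sizeof(item)) : NULL;
    MPI_Gather(mine, M, item_type, all, M, item_type, 0, MPI_COMM_WORLD);
    /* MPI_Allgather(mine, M, item_type, all, M, item_type, MPI_COMM_WORLD)
       would instead deliver every chunk to every rank */

    if (rank == 0) {
        printf("last chunk starts with id %d\n", all[(num_proc - 1) * M].id);
        free(all);
    }
    MPI_Type_free(&item_type);
    MPI_Finalize();
    return 0;
}
```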

Regards,
Jayesh
----- Original Message -----
From: Jayesh Krishna <jayesh at mcs.anl.gov>
To: mpich-discuss at mcs.anl.gov
Sent: Mon, 15 Nov 2010 10:03:35 -0600 (CST)
Subject: Re: [mpich-discuss] Sending structures made of MPI Datatypes

Hi,
 What version of MPICH2 are you using? From the error message ("p4_error") it looks like you are using MPICH instead of MPICH2. Are you running your job on a heterogeneous cluster? We recommend that you use the latest stable release of MPICH2 (http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads).
 I tried compiling/running your job on a single node and it worked without any errors with MPICH2 1.3.

Regards,
Jayesh
----- Original Message -----
From: zkovacs <zkovacs at hku.hk>
To: mpich-discuss at mcs.anl.gov
Sent: Sun, 14 Nov 2010 21:51:45 -0600 (CST)
Subject: [mpich-discuss] Sending structures made of MPI Datatypes

Dear all,

I defined a structure named "st1", created an MPI datatype for it, and committed it in the function crt_dt(). I also defined another structure named "st2" containing 4 structures of the first type (st1); its MPI datatype is also defined and committed in crt_dt(). After allocating a 1D array of st2 structures of size N x M, I initialized every member of each structure in the array by calling loc_init(). Then I tried to send chunks of the array of size M from the processes with ranks higher than 0 to process 0. The offsets of the chunks for a process with rank r were set to (r + i x n) x M with i = 0, 1, 2, ..., as long as r + i x n < N, where n is the total number of processes. That would cover the whole array across the different processes. However, when the processes send the very first chunks, the error message 'p4_error: interrupt SIGSEGV: 11' is generated.
I wonder if there is some bug in the code, since the size of the chunks cannot be that big: for M = 1500 it is about M x sizeof(st2) = 1500 x 140 bytes. I also tested it with N = 10 and M = 15, but it does not work.
Has anybody worked with huge nested structure arrays before? Is there any efficient algorithm to redistribute the results stored in the families of chunks to all the processes? 

Thanks a lot,
Zoltán

The code is simple:

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#define  N 10
#define  M 15


typedef struct st1 {
  double m0, m1, m2, m3;
} st1;

typedef struct st2 {
  int m0;
  double m1;
  st1 m2, m3, m4, m5;
} st2;

st2 * data; 

MPI_Datatype st1_type, st2_type;

int crt_dt()
{
  int i;
  MPI_Aint base; /* must be MPI_Aint, not int: storing an address in an int truncates it on 64-bit systems */

    MPI_Datatype type1[4] = {MPI_DOUBLE,MPI_DOUBLE,MPI_DOUBLE,MPI_DOUBLE};
    MPI_Datatype type2[6]={MPI_INT,MPI_DOUBLE};
    int blk_len[6] = {1,1,1,1,1,1};
    MPI_Aint disp[6];
    st1 s1;
    st2 s2;

    MPI_Address( &s1.m0, disp );
    MPI_Address( &s1.m1, disp+1 );
    MPI_Address( &s1.m2, disp+2 );
    MPI_Address( &s1.m3, disp+3 );
    base = disp[0];
    for( i = 0; i < 4; i++ )
        disp[i] -= base;

    MPI_Type_struct( 4, blk_len, disp, type1, &st1_type );
    MPI_Type_commit(&st1_type);

    type2[2] = st1_type;
    type2[3] = st1_type;
    type2[4] = st1_type;
    type2[5] = st1_type;

    MPI_Address( &s2.m0, disp );
    MPI_Address( &s2.m1, disp+1 );
    MPI_Address( &s2.m2, disp+2 );
    MPI_Address( &s2.m3, disp+3 );
    MPI_Address( &s2.m4, disp+4 );
    MPI_Address( &s2.m5, disp+5 );
    base = disp[0];
    for( i = 0; i < 6; i++ )
        disp[i] -= base;

    MPI_Type_struct( 6, blk_len, disp, type2, &st2_type );
    MPI_Type_commit(&st2_type);

    return 0;
}


int loc_init( int rank )
{
  unsigned int i, j;

   for( i = 0; i < N; i ++ )
    {
      for( j = 0; j < M; j++ )
	{
	  (data+i*M+j)->m0 = rank + i*M + j;
	  (data+i*M+j)->m1   = (double)(rank + i*M + j); 

	  (data+i*M+j)->m2.m0 = (double)(rank + i*M + j);
	  (data+i*M+j)->m2.m1 = 0;
	  (data+i*M+j)->m2.m2 = 0;
	  (data+i*M+j)->m2.m3 = 0;

	  (data+i*M+j)->m3.m0 = (double)(rank + i*M + j);
	  (data+i*M+j)->m3.m1 = 0;
	  (data+i*M+j)->m3.m2 = 0;
	  (data+i*M+j)->m3.m3 = 0;

	  (data+i*M+j)->m4.m0 = (double)(rank + i*M + j);
	  (data+i*M+j)->m4.m1 = 0;
	  (data+i*M+j)->m4.m2 = 0;
	  (data+i*M+j)->m4.m3 = 0;

	  (data+i*M+j)->m5.m0 = (double)(rank + i*M + j);
	  (data+i*M+j)->m5.m1 = 0;
	  (data+i*M+j)->m5.m2 = 0;
	  (data+i*M+j)->m5.m3 = 0;
	}
    }

  return 0;
}


int main (int argc, char *argv[])
{
  int num_proc, rank, proc;
  unsigned int i, j;
  MPI_Status stat;

  /*** Initializations ***/
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &num_proc );
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  printf ("MPI proc %d has started...\n", rank);

  /* local memory allocations for the data array */
  if( ( data = (st2 *)malloc(N*M*sizeof(st2)) ) == NULL )
    {
      fprintf( stderr, "Proc %d: Not enough memory. Exit.\n", rank );
      exit(1);
    }

  /* local initializiation of the data array  */
  loc_init(rank);
 
  /* create user defined data type  */
  crt_dt();

  MPI_Barrier(MPI_COMM_WORLD);


  /* data transfer */
  if( rank > 0  )  /* Proc 0 does not send data */
    {
		/* send each row in the portion */
      for( i = rank; i < N; i += num_proc )
	{
	  printf( "Proc %d: sending %u.th portion to proc 0.\n", rank, i );
	  /* tagged by the i  */
	  MPI_Send( &data[i*M], M, st2_type, 0, i, MPI_COMM_WORLD );  
        }       
	
    }
    else	/* Proc 0 receives each portion */
    {
      for( proc = 1; proc < num_proc; proc++ )
	{
	  for( i = proc; i < N; i += num_proc )
	    {
	  printf( "Proc %d: receiving %u.th portion from proc %d ....\n",
		  rank, i, proc );
 
	      /* tagged by the i */
	      MPI_Recv( &data[i*M], M, st2_type, proc, i, MPI_COMM_WORLD,
			&stat );	
	      
	    }
	}
		
    }

 /* MPI_Barrier(MPI_COMM_WORLD);*/
 MPI_Type_free(&st1_type);
 MPI_Type_free(&st2_type);
 MPI_Finalize();
 free(data);

 return 0;
}



_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
