[mpich-discuss] Sending structures made of MPI Datatypes

Jayesh Krishna jayesh at mcs.anl.gov
Tue Nov 16 12:56:09 CST 2010


Hi,
 There is a small bug in your code. You should use MPI_Aint (instead of int) to store the base address of the structures.

int crt_dt()
{
  int i;
  MPI_Aint base; /* <<<< USE MPI_AINT for storing the base address */
  ...
 }


 Let us know if it works for you.

Regards,
Jayesh
----- Original Message -----
From: zkovacs <zkovacs at hku.hk>
To: mpich-discuss at mcs.anl.gov
Sent: Tue, 16 Nov 2010 05:17:45 -0600 (CST)
Subject: Re: [mpich-discuss] Sending structures made of MPI Datatypes

Dear Jayesh,

Thank you for the answer! Indeed, I used MPICH1 but I have just migrated to MPICH2 1.2.1, which is  available on the local cluster. I run the program on one node with 8 processors but I received the error messages:
  
MPI proc 2 has started...
MPI proc 5 has started...
MPI proc 6 has started...
MPI proc 4 has started...
MPI proc 0 has started...
MPI proc 7 has started...
MPI proc 3 has started...
MPI proc 1 has started...
rank 7 in job 1  j09.hpcc.hku.hk_44436   caused collective abort of all ranks
  exit status of rank 7: killed by signal 9
rank 5 in job 1  j09.hpcc.hku.hk_44436   caused collective abort of all ranks
  exit status of rank 5: killed by signal 9
rank 4 in job 1  j09.hpcc.hku.hk_44436   caused collective abort of all ranks
  exit status of rank 4: killed by signal 11
rank 3 in job 1  j09.hpcc.hku.hk_44436   caused collective abort of all ranks
  exit status of rank 3: killed by signal 11
rank 0 in job 1  j09.hpcc.hku.hk_44436   caused collective abort of all ranks
  exit status of rank 0: killed by signal 9

which might due to the same problem as for MPICH1 where I had SIGVSEV 11 too.
If i use MPI_Allgather() with

#define N 8*164

for( i = 0; i<N*M; i += num_proc )
 MPI_Allgather( (data+i*M), M, st2_type, (data+i*M), M, st2_type, MPI_COMM_WORLD  );

instead of the code fragment

#define N 1500

  if( rank > 0  )  /* Proc 0 does not send data */
    {
                /* send each row in the portion */
      for( i = rank; i < N; i += num_proc )
        {
          printf( "Proc %d: sending %u.th portion to proc 0.\n", rank, i );
          /* tagged by the i  */
          MPI_Send( &data[i*M], M, st2_type, 0, i, MPI_COMM_WORLD );
        }

    }
    else        /* Proc 0 recieves each portion */
    {
      for( proc = 1; proc < num_proc; proc++ )
        {
          for( i = proc; i < N; i += num_proc )
            {
          printf( "Proc %d: recieving %u.th portion from proc %d ....\n",
                  rank, i );

              /* tagged by the i */
              MPI_Recv( &data[i*M], M, st2_type, proc, i, MPI_COMM_WORLD,
                        &stat );

            }
        }

    }

I obtain only one error message from the root process:
rank 0 in job 1  j09.hpcc.hku.hk_44436   caused collective abort of all ranks  exit status of rank 0: killed by signal 9
That' must be the same error  As if there were some problems with the addressing of the buffers; when the process tries to read the buffer its address points to an invalid memory area which makes no sense since it is allocated properly. Anyway, thanks a lot.
Kind regards,
Zoltán
  


________________________________________
From: Jayesh Krishna [jayesh at mcs.anl.gov]
Sent: Tuesday, November 16, 2010 12:29 AM
To: mpich-discuss at mcs.anl.gov
Cc: zkovacs
Subject: Re: [mpich-discuss] Sending structures made of MPI Datatypes

 Also you might want to add the sender proc id in your dbg stmt,

 printf( "Proc %d: recieving %u.th portion from proc %d ....\n", rank, i );

 TO

 printf( "Proc %d: recieving %u.th portion from proc %d ....\n", rank, i, proc);

Regards,
Jayesh
----- Original Message -----
From: Jayesh Krishna <jayesh at mcs.anl.gov>
To: mpich-discuss at mcs.anl.gov
Sent: Mon, 15 Nov 2010 10:10:07 -0600 (CST)
Subject: Re: [mpich-discuss] Sending structures made of MPI Datatypes

Hi,
 And you can try using MPI_Gather() for getting the chunks from the different processes to rank=0 . Similarly you can use MPI_AllGather() if you want all the processes to receive the chunks (instead of just rank 0).

Regards,
Jayesh
----- Original Message -----
From: Jayesh Krishna <jayesh at mcs.anl.gov>
To: mpich-discuss at mcs.anl.gov
Sent: Mon, 15 Nov 2010 10:03:35 -0600 (CST)
Subject: Re: [mpich-discuss] Sending structures made of MPI Datatypes

Hi,
 What is the version of MPICH2 that you are using (From the error message it looks like you are using MPICH instead of MPICH2. Are you running your job on a heterogeneous cluster ?)? We recommend that you use the latest stable release of MPICH2 (http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads).
 I tried compiling/running your job on a single node and it worked without any errors with MPICH2 1.3 .

Regards,
Jayesh
----- Original Message -----
From: zkovacs <zkovacs at hku.hk>
To: mpich-discuss at mcs.anl.gov
Sent: Sun, 14 Nov 2010 21:51:45 -0600 (CST)
Subject: [mpich-discuss] Sending structures made of MPI Datatypes

Dear all,

I defined a structure with the name "st1", created an MPI datatype for it and commited it in the function crt_dt(). I also defined another structure with the name "st2" containing 4 structures of the first type (st1). Its MPI data type is also defined and commited it in the function crt_dt(). After allocating an 1D data array of structures st2 with the size N x M, I initialized each structure member of all the structures in the array by calling loc_init(). Then I tried to send chunks of the array with the size of M from the process with ranks higher than 0 to the process 0. The offsets of the chunks for a process with the rank r were set to (r + i x n) x M < N*(M-1) with i =1,2,... , where n is the total number of the processes. That would cover the whole array for different porcesses. However, when the processes send the very first chunks the error message 'p4_error: interrupt SIGSEGV: 11; is generated.
I wonder if there is some bug in the code since the size of the chunks cannot so big. For M = 1500 it is about M*sizeof(st2) = 1500 x 140 bytes. I also tested it with N=10, and M=15 but it does not work.
Has anybody worked with huge nested structure arrays before? Is there any efficient algorithm to redistribute the results stored in the families of chunks to all the processes?

Thanks a lot,
Zoltán

The code is simple:

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#define  N 10
#define  M 15


typedef struct st1 {
  double m0, m1, m2, m3;
} st1;

typedef struct st2 {
  int m0;
  double m1;
  st1 m2, m3, m4, m5;
} st2;

st2 * data;

MPI_Datatype st1_type, st2_type;

int crt_dt()
{
  int i, base;

    MPI_Datatype type1[4] = {MPI_DOUBLE,MPI_DOUBLE,MPI_DOUBLE,MPI_DOUBLE};
    MPI_Datatype type2[6]={MPI_INT,MPI_DOUBLE};
    int blk_len[6] = {1,1,1,1,1,1};
    MPI_Aint disp[6];
    st1 s1;
    st2 s2;

    MPI_Address( &s1.m0, disp );
    MPI_Address( &s1.m1, disp+1 );
    MPI_Address( &s1.m2, disp+2 );
    MPI_Address( &s1.m3, disp+3 );
    base = disp[0];
    for( i = 0; i < 4; i++ )
        disp[i] -= base;

    MPI_Type_struct( 4, blk_len, disp, type1, &st1_type );
    MPI_Type_commit(&st1_type);

    type2[2] = st1_type;
    type2[3] = st1_type;
    type2[4] = st1_type;
    type2[5] = st1_type;

    MPI_Address( &s2.m0, disp );
    MPI_Address( &s2.m1, disp+1 );
    MPI_Address( &s2.m2, disp+2 );
    MPI_Address( &s2.m3, disp+3 );
    MPI_Address( &s2.m4, disp+4 );
    MPI_Address( &s2.m5, disp+5 );
    base = disp[0];
    for( i = 0; i < 6; i++ )
        disp[i] -= base;

    MPI_Type_struct( 6, blk_len, disp, type2, &st2_type );
    MPI_Type_commit(&st2_type);

    return 0;
}


int loc_init( int rank )
{
  unsigned int i, j;

   for( i = 0; i < N; i ++ )
    {
      for( j = 0; j < M; j++ )
        {
          (data+i*M+j)->m0 = rank + i*M + j;
          (data+i*M+j)->m1   = (double)(rank + i*M + j);

          (data+i*M+j)->m2.m0 = (double)(rank + i*M + j);
          (data+i*M+j)->m2.m1 = 0;
          (data+i*M+j)->m2.m2 = 0;
          (data+i*M+j)->m2.m3 = 0;

          (data+i*M+j)->m3.m0 = (double)(rank + i*M + j);
          (data+i*M+j)->m3.m1 = 0;
          (data+i*M+j)->m3.m2 = 0;
          (data+i*M+j)->m3.m3 = 0;

          (data+i*M+j)->m4.m0 = (double)(rank + i*M + j);
          (data+i*M+j)->m4.m1 = 0;
          (data+i*M+j)->m4.m2 = 0;
          (data+i*M+j)->m4.m3 = 0;

          (data+i*M+j)->m5.m0 = (double)(rank + i*M + j);
          (data+i*M+j)->m5.m1 = 0;
          (data+i*M+j)->m5.m2 = 0;
          (data+i*M+j)->m5.m3 = 0;
        }
    }

  return 0;
}


int main (int argc, char *argv[])
{
  int num_proc, rank, proc;
  unsigned int i, j;
  MPI_Status stat;

  /*** Initializations ***/
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &num_proc );
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  printf ("MPI proc %d has started...\n", rank);

  /* local memory allocations for the data array */
  if( ( data = (st2 *)malloc(N*M*sizeof(st2)) ) == NULL )
    {
      fprintf( stderr, "Proc %d: Not enough memory. Exit.\n", rank );
      exit(1);
    }

  /* local initializiation of the data array  */
  loc_init(rank);

  /* create user defined data type  */
  crt_dt();

  MPI_Barrier(MPI_COMM_WORLD);


  /* data transfer */
  if( rank > 0  )  /* Proc 0 does not send data */
    {
                /* send each row in the portion */
      for( i = rank; i < N; i += num_proc )
        {
          printf( "Proc %d: sending %u.th portion to proc 0.\n", rank, i );
          /* tagged by the i  */
          MPI_Send( &data[i*M], M, st2_type, 0, i, MPI_COMM_WORLD );
        }

    }
    else        /* Proc 0 recieves each portion */
    {
      for( proc = 1; proc < num_proc; proc++ )
        {
          for( i = proc; i < N; i += num_proc )
            {
          printf( "Proc %d: recieving %u.th portion from proc %d ....\n",
                  rank, i );

              /* tagged by the i */
              MPI_Recv( &data[i*M], M, st2_type, proc, i, MPI_COMM_WORLD,
                        &stat );

            }
        }

    }

 /* MPI_Barrier(MPI_COMM_WORLD);*/
 MPI_Finalize();
 free(data);

}



_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss



More information about the mpich-discuss mailing list