[MPICH] bug in mpich2 communicator context id allocation scheme?

Howard Pritchard howardp at cray.com
Thu Dec 14 17:16:38 CST 2006


Hello Folks,

We have a customer who is having a problem with mpich2
with the following code:

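#include <stdio.h>
#include <stdlib.h>   /* rand() */
#include <mpi.h>

int main(int argc, char** argv) {

   int i=0;
   int randval;
   int rank;
   MPI_Comm newcomm;

   MPI_Init(&argc,&argv);
   MPI_Comm_rank(MPI_COMM_WORLD,&rank);

   /* Loops forever on purpose, so anything leaked per split accumulates. */
   while (1) {

     i++;

     if (i%100 == 0)
       printf("After %d\n",i);

     randval=rand();

     /* Some ranks join the new communicator, the rest opt out. */
     if (randval%(rank+2) == 0)
       MPI_Comm_split(MPI_COMM_WORLD,1,rank,&newcomm);
     else
       MPI_Comm_split(MPI_COMM_WORLD,MPI_UNDEFINED,rank,&newcomm);

     /* Ranks that passed MPI_UNDEFINED got MPI_COMM_NULL: nothing to free. */
     if (randval%(rank+2) == 0)
       MPI_Comm_free(&newcomm);

   }
}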

The problem is that, at least with the mpich2 on the customer
machine, the context id for newcomm is allocated collectively by
all processes in the old communicator, whether or not a process
is actually going to get a non-NULL communicator back.  A process
that passes MPI_UNDEFINED gets MPI_COMM_NULL and so never calls
MPI_Comm_free to release its id.  So there is a context id leak.
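
To make the suspected mechanism concrete, here is a toy model in plain C
(no MPI needed).  It is only a sketch under assumptions: the pool size
POOL_SIZE, the ctx_pool type, and every function name below are
hypothetical and are not taken from the mpich2 sources.  It models one
process that always takes a context id during a split but gives it back
only when it received a real communicator and later freed it.

```c
#include <stdbool.h>

#define POOL_SIZE 2048  /* hypothetical pool size, not mpich2's actual value */

/* Toy per-process pool of context ids: in_use[i] is true while id i is held. */
typedef struct {
    bool in_use[POOL_SIZE];
} ctx_pool;

/* Take the lowest free id, or return -1 when the pool is exhausted. */
static int ctx_alloc(ctx_pool *p) {
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Return an id to the pool. */
static void ctx_free(ctx_pool *p, int id) {
    p->in_use[id] = false;
}

/* Model one split as seen by a single process.  The id is always
 * allocated, but it is released only if the process got a real
 * communicator back.  Returns false when no id is left. */
static bool simulated_split(ctx_pool *p, bool got_comm) {
    int id = ctx_alloc(p);
    if (id < 0)
        return false;
    if (got_comm)
        ctx_free(p, id);   /* stands in for the later MPI_Comm_free */
    return true;
}

/* Count how many splits succeed before the pool runs dry,
 * giving up after max_iters splits. */
static int splits_until_exhausted(bool got_comm, int max_iters) {
    ctx_pool p = {{false}};
    int n = 0;
    while (n < max_iters && simulated_split(&p, got_comm))
        n++;
    return n;
}
```

In this model, splits_until_exhausted(true, max_iters) keeps reusing the
same id and runs to max_iters, while splits_until_exhausted(false,
max_iters) stops after POOL_SIZE splits.  If the real allocator behaves
like this, a rank that keeps passing MPI_UNDEFINED would eventually run
out of context ids.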

Thanks for any help,

Howard

-- 
Howard Pritchard      howardp at cray.com
206-579-2536(cell)
http://insidecray.us.cray.com/~howardp



