[MPICH] bug in mpich2 communicator context id allocation scheme?
Howard Pritchard
howardp at cray.com
Thu Dec 14 17:16:38 CST 2006
Hello Folks,
We have a customer who is having a problem with mpich2
with the following code:
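#include <stdio.h>
#include <stdlib.h>   /* rand() */
#include <mpi.h>

int main(int argc, char** argv) {
    int i = 0;
    int randval;
    int rank;
    MPI_Comm newcomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Loop forever; every rank calls MPI_Comm_split on each iteration. */
    while (1) {
        i++;
        if (i % 100 == 0)
            printf("After %d\n", i);
        randval = rand();
        if (randval % (rank + 2) == 0)
            /* This rank joins the new communicator. */
            MPI_Comm_split(MPI_COMM_WORLD, 1, rank, &newcomm);
        else
            /* This rank opts out and gets MPI_COMM_NULL back. */
            MPI_Comm_split(MPI_COMM_WORLD, MPI_UNDEFINED, rank, &newcomm);
        /* Only ranks that hold a real communicator free it. */
        if (randval % (rank + 2) == 0)
            MPI_Comm_free(&newcomm);
    }
}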
The problem is that, at least with the mpich2 on the customer
machine, the context id for newcomm is allocated by all processes
in the old communicator, whether or not a given process is actually
going to get a non-NULL communicator back. A process that passes
MPI_UNDEFINED gets MPI_COMM_NULL and so never calls MPI_Comm_free,
which means the context id allocated on that process is never
released. So there is a context id leak.
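The suspected behaviour can be sketched with a toy model. This is
not mpich2 source code. It assumes that context ids come from a
small fixed pool, that each rank tracks free ids in a bitmask, and
that an id can be handed out only if it is free on every rank. All
names here (ctx_model, ctx_alloc, ctx_free, simulate_splits,
release_on_undefined) are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RANKS 64

/* Toy model (assumption): one bitmask per rank, bit set = id is free. */
typedef struct {
    int nranks;
    uint32_t free_mask[MAX_RANKS];
} ctx_model;

/* Start with pool_size (1..32) free ids on each of nranks ranks. */
void ctx_model_init(ctx_model *m, int nranks, int pool_size)
{
    assert(nranks >= 1 && nranks <= MAX_RANKS);
    assert(pool_size >= 1 && pool_size <= 32);
    uint32_t all = (pool_size == 32) ? 0xFFFFFFFFu
                                     : ((1u << pool_size) - 1u);
    m->nranks = nranks;
    for (int r = 0; r < nranks; r++)
        m->free_mask[r] = all;
}

/* Collective allocation: pick the lowest id that is free on every
 * rank and mark it used on every rank. Returns -1 if none is left. */
int ctx_alloc(ctx_model *m)
{
    uint32_t common = 0xFFFFFFFFu;
    for (int r = 0; r < m->nranks; r++)
        common &= m->free_mask[r];
    if (common == 0)
        return -1;
    int id = 0;
    while (!(common & (1u << id)))
        id++;
    for (int r = 0; r < m->nranks; r++)
        m->free_mask[r] &= ~(1u << id);
    return id;
}

/* Local release of an id on one rank. */
void ctx_free(ctx_model *m, int rank, int id)
{
    m->free_mask[rank] |= (1u << id);
}

/* Repeat "split then free" up to max_iters times. joins[r] != 0 means
 * rank r passes a real color; 0 stands for MPI_UNDEFINED. When
 * release_on_undefined is 0, opted-out ranks keep the id marked used,
 * which models the leak. Returns the number of splits that succeeded. */
int simulate_splits(ctx_model *m, const int *joins,
                    int release_on_undefined, int max_iters)
{
    int done = 0;
    for (int it = 0; it < max_iters; it++) {
        int id = ctx_alloc(m);
        if (id < 0)
            break;
        for (int r = 0; r < m->nranks; r++)
            if (joins[r] || release_on_undefined)
                ctx_free(m, r, id);
        done++;
    }
    return done;
}
```

In this model, with 4 ranks, a pool of 8 ids, and only rank 0
passing a real color, simulate_splits stops after 8 splits when
release_on_undefined is 0, and runs all requested iterations when
it is 1. The real pool size and allocation scheme in mpich2 may
differ.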
Thanks for any help,
Howard
--
Howard Pritchard howardp at cray.com
206-579-2536(cell)
http://insidecray.us.cray.com/~howardp