[mpich-discuss] Right way to implement first passage problems?

Gideon Simpson gideon.simpson at gmail.com
Wed Jan 4 21:10:43 CST 2012


Hi, I'm trying to implement a first-passage type problem where I have N processes all performing a task, and when the first one completes the task (by exiting), it should alert the others to cease working.  I then want to communicate the results to everyone.  In principle, two of the processes could finish at the same time, and I'd like to guard against that, making a deterministic selection in the case of a tie.

Right now, I have a code which appears to work.  My concerns are:

1.  What happens if two workers finish at the same time and start sending termination messages to each other?  Couldn't this lead to collisions with regard to exit_id?

2.  Am I shutting down the outstanding communications properly?  Again, it would seem that if two processes exited at the same time and sent messages, we could have receive requests which had already completed but were subsequently given an MPI_Cancel.

Thanks, 
-gideon

Here's the relevant (I think) bit of my code:

...initialization steps

local_exit = 0; // flag that the local process has completed
global_exit = 0; // flag that some process has completed

exit_id = rank;  // id of the process which finishes
MPI_Irecv(&exit_id, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
	    MPI_COMM_WORLD, &request); 

for(i = 0; i < MAX_ITER && !local_exit && !global_exit; i++){

....local steps associated with the task


    if ( task complete  ) {

      local_exit = 1;  // flag that this process has exited

      /* Inform other processes that this process has exited.  */ 
      for(j = 0; j < size; j++) {
        if( j != rank){
          MPI_Send(&exit_id, 1, MPI_INT, j, tag, MPI_COMM_WORLD);
        }
      }
    }
    MPI_Test(&request, &global_exit, MPI_STATUS_IGNORE);
}
/* clean up any communications that are still running */
if (!local_exit){
	MPI_Wait(&request, MPI_STATUS_IGNORE);
}
else{
    MPI_Cancel(&request);
}
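
The tie case in concern 1 could be made deterministic by a "lowest rank wins" rule. This is a minimal sketch of that rule only, not part of the code above. It assumes each process contributes its own rank if it finished, or a sentinel if it did not. The names pick_winner and NO_WINNER are hypothetical. In an MPI program the same minimum could be taken with MPI_Allreduce and MPI_MIN over one int per rank, but that is a collective call, so every process would have to reach it at an agreed iteration.

```c
#include <limits.h>

/* Hypothetical sentinel: contributed by a process that has not finished. */
#define NO_WINNER INT_MAX

/* contrib[i] is rank i's contribution: its own rank if it finished,
 * NO_WINNER otherwise.  Returns the lowest finished rank, or NO_WINNER
 * if no process has finished.  Every process that evaluates this on the
 * same contributions gets the same answer, so the tie-break is
 * deterministic. */
static int pick_winner(const int *contrib, int n)
{
    int winner = NO_WINNER;
    for (int i = 0; i < n; i++) {
        if (contrib[i] < winner) {
            winner = contrib[i];
        }
    }
    return winner;
}
```

For example, if ranks 1 and 3 finish in the same round out of four processes, the contributions are {NO_WINNER, 1, NO_WINNER, 3} and pick_winner returns 1 on every process.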


