[mpich-discuss] Right way to implement first passage problems?
Gideon Simpson
gideon.simpson at gmail.com
Wed Jan 4 21:10:43 CST 2012
Hi, I'm trying to implement a first-passage-type problem in which N processes all perform the same task. When the first one completes the task (by exiting), it should alert the others to stop working. I then want to communicate the result to everyone. In principle, two of the processes could finish at the same time, and I'd like to guard against that by making a deterministic selection in the case of a tie.
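One possible way to make the tie-break deterministic is to agree that the lowest finishing rank wins. In MPI this could be a minimum reduction (for example MPI_Allreduce with MPI_MIN), where each process contributes its own rank if it finished and a sentinel otherwise. The sketch below only simulates that rule in plain C, without MPI. The function name pick_winner and the finished array are hypothetical and used for illustration; this is not the approach in the code that follows.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: finished[r] is nonzero if rank r completed the task.
 * Returns the lowest finishing rank, or -1 if no rank has finished.
 * This mirrors a min-reduction in which unfinished ranks contribute the
 * sentinel value `size`, which is larger than any valid rank.
 */
int pick_winner(const int *finished, int size)
{
    assert(finished != NULL && size > 0);

    int winner = size; /* sentinel: no rank has finished yet */
    for (int r = 0; r < size; r++) {
        int candidate = finished[r] ? r : size;
        if (candidate < winner) {
            winner = candidate;
        }
    }
    return (winner == size) ? -1 : winner;
}
```

Because every process would evaluate the same reduction over the same inputs, they would all agree on the winner even when several finish in the same step. Note that a collective reduction has to be called by every process, so using it would change the communication pattern compared with point-to-point termination messages.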
Right now, I have code that appears to work. My concerns are:
1. What happens if two workers finish at the same time and start sending termination messages to each other? Couldn't this lead to collisions with regard to exit_id?
2. Am I shutting down the outstanding communications properly? Again, it would seem that if two processes exited at the same time and sent messages, we could have MPI_Irecv requests that had already completed but were subsequently passed to MPI_Cancel.
Thanks,
-gideon
Here's the relevant (I think) bit of my code:
...initialization steps
local_exit = 0; // flag that the local process has completed
global_exit = 0; // flag that some process has completed
exit_id = rank; // id of the process which finishes
MPI_Irecv(&exit_id, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
          MPI_COMM_WORLD, &request);
for(i = 0; i < MAX_ITER && !local_exit && !global_exit; i++){
    ....local steps associated with the task
    if ( task complete ) {
        local_exit = 1; // flag that this process has exited
        /* Inform other processes that this process has exited. */
        for(j = 0; j < size; j++) {
            if( j != rank){
                MPI_Send(&exit_id, 1, MPI_INT, j, tag, MPI_COMM_WORLD);
            }
        }
    }
    MPI_Test(&request, &global_exit, MPI_STATUS_IGNORE);
}
/* clean up any communications that are still running */
if (!local_exit){
    MPI_Wait(&request, MPI_STATUS_IGNORE);
}
else{
    MPI_Cancel(&request);
}