[MPICH] Stopping processors?

Michaela Heyer mh4 at cs.ucc.ie
Thu Apr 26 04:09:40 CDT 2007


Hi Guys,
Thanks for the replies!

Rajeev, what you suggested was my initial idea also. The problem with this is 
that I have about 200 different algorithms, so the way I see it I'd have to 
add MPI_Iprobe (actually I was gonna use MPI_Test) to all the algorithms and 
possibly call it more than once in each to really make sure I get stopped 
quite quickly...which just doesn't seem feasible.

Darius, I like the idea with the threads! I haven't really worked much with 
threads, so I will have to look into it but I think it might work...

Michaela




On Tuesday 24 April 2007 16:47, Darius Buntinas wrote:
> One idea would be to have each worker process spawn a "computation"
> thread which actually does the computation.  The "main" thread of the
> process would issue a blocking receive waiting for a "done" message.
>
> When a computation thread at some process finishes, it sends messages to
> the other processes and to itself.  When a main thread receives a done
> message, it checks whether it received the message from its own
> computation thread.  If so, it does a pthread_join() and reads the
> result, otherwise it does a pthread_cancel().
>
> At this point the main thread can wait for a message to start the next
> computation.
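
The cancel/join part of this idea can be sketched with plain pthreads. This is
only a sketch under assumptions, not code from the thread: computation() and
run_and_cancel() are hypothetical names, and the blocking receive for the
"done" message is reduced to a comment so that the example runs without MPI.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for one of the algorithms; it never finishes
 * on its own, so the only way out is cancellation. */
static void *computation(void *arg)
{
    volatile unsigned long *progress = arg;
    for (;;) {
        (*progress)++;         /* pretend to do some work */
        pthread_testcancel();  /* explicit cancellation point */
    }
    return NULL;               /* not reached */
}

/* Starts a computation thread, cancels it, and reclaims it.
 * Returns 1 if the thread ended through cancellation, 0 if it ended
 * normally, -1 if it could not be created. */
int run_and_cancel(void)
{
    pthread_t worker;
    unsigned long progress = 0;
    void *result = NULL;

    if (pthread_create(&worker, NULL, computation, &progress) != 0)
        return -1;

    /* In the MPI setting, the main thread would block here in a
     * receive until a "done" message from another process arrives. */

    pthread_cancel(worker);         /* request cancellation */
    pthread_join(worker, &result);  /* wait for the thread and reclaim it */
    return result == PTHREAD_CANCELED;
}
```

Two things to keep in mind. With the default (deferred) cancellation type, a
thread only acts on pthread_cancel() at a cancellation point, so a pure
number-crunching loop needs either an occasional pthread_testcancel() or
PTHREAD_CANCEL_ASYNCHRONOUS set via pthread_setcanceltype(). And if the
computation thread sends MPI messages while the main thread sits in a blocking
receive, MPI has to be initialized with MPI_Init_thread() requesting
MPI_THREAD_MULTIPLE.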
>
>
> If you have a lot of processes, having one process send a done message to
> every other process isn't scalable.  You could optimize in that case by
> having the processes forward the message in a tree.
>
> -d
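
One possible shape for the tree forwarding, as a sketch: the message above
only says "a tree", so the binary tree rooted at the finishing process is an
assumption, and tree_children() is a hypothetical helper. Each process that
receives the done message would forward it to the ranks this function returns.

```c
/* Children of `rank` in a binary tree over `size` ranks, rooted at
 * `root` (the process that finished first). Writes up to two ranks
 * into children[] and returns how many were written. */
int tree_children(int rank, int root, int size, int children[2])
{
    int rel = (rank - root + size) % size;  /* position relative to root */
    int count = 0;

    for (int k = 1; k <= 2; k++) {
        int child_rel = 2 * rel + k;
        if (child_rel < size)
            children[count++] = (child_rel + root) % size;
    }
    return count;
}
```

With this layout no process sends more than two messages, and the done message
reaches everyone in roughly log2(size) forwarding steps instead of size - 1
sends from a single process.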




On Tuesday 24 April 2007 16:59, Rajeev Thakur wrote:
> You could have all processes post an MPI_Irecv and then periodically call
> MPI_Iprobe to see if there is an incoming message. The process that
> finishes first sends others a stop message.
>
> Rajeev
>
> > -----Original Message-----
> > From: owner-mpich-discuss at mcs.anl.gov
> > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Michaela Heyer
> > Sent: Tuesday, April 24, 2007 9:21 AM
> > To: mpich-discuss at mcs.anl.gov
> > Subject: [MPICH] Stopping processors?
> >
> > Hi,
> > I'm hoping someone can help me with this...
> > Essentially I'm looking for a way to tell one or more processors to stop
> > whatever they are currently doing and move on to something else. The
> > situation is as follows: Let's say we have n processors all working on
> > different algorithms. I only really need the result of one of these
> > algorithms so whichever processor finishes first should tell all the
> > other ones to stop so they can all move on to the next problem. It's a
> > bit like a race...and speed is vital! (That's why I can't really wait
> > for all processors to finish their algorithms)
> > At the moment I'm using MPI_Abort() to shut down the whole process. This
> > works fine and does exactly what I need but the problem is that it is
> > very very slow as it shuts down and restarts all the processors. So what
> > I'm looking for is something like a "milder" version of MPI_Abort() i.e.
> > stop everything but don't shut down the processors.
> > I have been looking but can't find anything suitable so I'm starting to
> > think that maybe it's impossible? It would be great if you could prove
> > me wrong :-)
> > Thanks,
> > Michaela



