[mpich-discuss] thread MPI calls
Pavan Balaji
balaji at mcs.anl.gov
Fri Jul 31 14:57:13 CDT 2009
Whenever any process/thread is within MPI, it'll use 100% CPU. Whenever
any process/thread is within do_work() or do_very_little_work(), it'll
use 100% CPU. The only time you might not see 100% CPU usage is when a
thread is blocked on a semaphore waiting for another thread to send
a signal.
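To make the difference between the two kinds of waiting concrete, here is a
minimal sketch. It is not MPICH code and does not use MPI at all; it assumes
Linux with POSIX threads and semaphores. The names blocking_waiter,
polling_waiter and compare_waiters are made up for illustration, and the
polling loop only stands in for a call that spins while it waits, which is
my assumption about why a thread inside MPI shows up as busy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <time.h>

static sem_t signal_sem;        /* waiter sleeps in the kernel until posted */
static atomic_int signal_flag;  /* waiter spins in user space until set */

/* CPU time consumed so far by the calling thread, in seconds. */
static double thread_cpu_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Like the main thread: blocks on a semaphore, using almost no CPU. */
static void *blocking_waiter(void *out)
{
    double start = thread_cpu_seconds();
    sem_wait(&signal_sem);
    *(double *)out = thread_cpu_seconds() - start;
    return NULL;
}

/* Stand-in for a thread that polls while waiting: burns CPU the whole time. */
static void *polling_waiter(void *out)
{
    double start = thread_cpu_seconds();
    while (!atomic_load(&signal_flag)) {
        /* spin */
    }
    *(double *)out = thread_cpu_seconds() - start;
    return NULL;
}

/* Keeps both waiters waiting for wait_ms milliseconds, then releases them
 * and reports how many CPU seconds each one consumed while waiting. */
void compare_waiters(long wait_ms, double *blocked_cpu, double *polling_cpu)
{
    pthread_t blocker, poller;
    struct timespec delay = { wait_ms / 1000, (wait_ms % 1000) * 1000000L };
    int rc;

    rc = sem_init(&signal_sem, 0, 0);
    assert(rc == 0);
    atomic_store(&signal_flag, 0);

    rc = pthread_create(&blocker, NULL, blocking_waiter, blocked_cpu);
    assert(rc == 0);
    rc = pthread_create(&poller, NULL, polling_waiter, polling_cpu);
    assert(rc == 0);
    (void)rc;

    nanosleep(&delay, NULL);        /* both threads are waiting meanwhile */
    sem_post(&signal_sem);          /* wake the blocked thread */
    atomic_store(&signal_flag, 1);  /* release the polling thread */

    pthread_join(blocker, NULL);
    pthread_join(poller, NULL);
    sem_destroy(&signal_sem);
}
```

Calling compare_waiters(200, &blocked, &polling) from a small main() and
printing the two values should show the semaphore waiter with close to zero
CPU time and the polling waiter with roughly the full wait, provided a core
is free for it.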
So, in your example, all worker processes should always see 100% CPU
usage. The recv thread should always see 100% CPU usage. The main thread
should only see CPU usage when it is not blocking waiting for a signal
from the recv thread.
Is the above what you are noticing?
-- Pavan
On 07/31/2009 02:43 PM, chong tan wrote:
>
> I am running 1 master process and 3 slave processes; the master has
> a recv thread, which makes 5 processes/threads in total.
>
> The box I have has 16 cores.
>
> tan
>
>
> ------------------------------------------------------------------------
> *From:* Pavan Balaji <balaji at mcs.anl.gov>
> *To:* mpich-discuss at mcs.anl.gov
> *Sent:* Friday, July 31, 2009 12:10:21 PM
> *Subject:* Re: [mpich-discuss] thread MPI calls
>
>
> How many processes+threads do you have? It looks like you are running 1
> master process (with two threads) + 3 slaves (= 5 processes/threads) on
> a 4-core system. Is this correct? If yes, all of these will contend for
> the 4 cores.
>
> -- Pavan
>
> On 07/30/2009 08:21 PM, chong tan wrote:
> > D,
> > the simplest test we have is 1 master and 3 slaves, 3 workers in total.
> > Data size starts at 5K bytes (1 time), then dwindles down to less than
> > 128 bytes in a hurry. MPI_Send/Irecv/Recv were the only MPI calls in our
> > application (besides MPI_Init(), 1 particular MPI_Barrier() when the
> > application is initialized, and 1 Finish() ).
> > do_work() has to do some amount of work, more in the master proc than in
> > the slaves. do_very_little_work() does what its name says: in our
> > application, it is 3 function calls that return an int value, and 1 check
> > for a trace/monitor flag (the code is like "if( trace ) print_trace(..)").
> > The test was run on a 4XQuad box, each process on its own physical CPU.
> > The master proc (proc 0) runs on the same physical CPU as its thread.
> > The box is AMD based, so no HT (our application filters out cores created
> > by hyperthreading by default). In the 3-worker test, we see 20% to 50%
> > sys activity constantly, which in turn slows down each proc to the point
> > that the main thread of the master (proc 0) is idle 40% of the time.
> > In the extreme case, we saw the threaded code being 70% slower than the
> > un-threaded one.
> > We have the tests ready to show the issues; it would be nice if you
> > are around the SF Bay area.
> > thanks
> > tan
> >
> > ------------------------------------------------------------------------
> > *From:* Darius Buntinas <buntinas at mcs.anl.gov>
> > *To:* mpich-discuss at mcs.anl.gov
> > *Sent:* Thursday, July 30, 2009 1:49:10 PM
> > *Subject:* Re: [mpich-discuss] thread MPI calls
> >
> > OK. Yes, unless do_work and do_very_little_work make any blocking calls
> > (like I/O), process 1 should have 100% cpu utilization. This should be
> > fine (from a performance standpoint), as long as you aren't
> > oversubscribing your processors.
> >
> > I'm going to try to reproduce your tests on our machines. How many
> > worker processes do you have? Is this all on one node? If not how many
> > nodes? How many cores do you have per node?
> >
> > In the meantime, can you check which processor each process is bound to?
> > Make sure that each process is bound to its own core, and not to a
> > hyperthread.
> >
> > Thanks,
> > -d
> >
> >
> >
> > On 07/30/2009 02:02 PM, chong tan wrote:
> > > D,
> > > > sorry for the confusion. In our application, the setting is different
> > > > from the code Pavan posted. I will try to line them up here (<--- is
> > > > between threads, <==== is between procs):
> > > >
> > > >            proc 0                                proc 1
> > > >   main thread     recv thread
> > > >   do_work()       MPI_Irecv                      do_work()
> > > >                   MPI_Wait*()   <=======         MPI_Send()
> > > >   blocked  <---   unblock                        do_very_little_work()
> > > >   MPI_Send        ==========>                    MPI_Recv()
> > > >
> > > > I don't know if the MPI_Recv call in Proc 1 is interfering with the
> > > > MPI_Wait*() in Proc 1. We see heavy system activity in Proc 1.
> > > >
> > > > tan
> > > >
> > > >
> > > ------------------------------------------------------------------------
> > > *From:* Darius Buntinas <buntinas at mcs.anl.gov>
> > > *To:* mpich-discuss at mcs.anl.gov
> > > *Sent:* Thursday, July 30, 2009 11:17:52 AM
> > > *Subject:* Re: [mpich-discuss] thread MPI calls
> > >
> > > That sounds fishy. If process 1 is doing a sleep(), you shouldn't see
> > > any activity from that process! Can you double check that?
> > >
> > > -d
> > >
> > > On 07/30/2009 01:05 PM, chong tan wrote:
> > >> pavan,
> > >> the behavior you described is the expected behavior. However, using
> > >> your example, we are also seeing a lot of system activity in
> > >> process 1 in all of our experiments. That contributes significantly
> > >> to the negative gain.
> > >>
> > >
> >
>
> -- Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji