[mpich-discuss] thread MPI calls

chong tan chong_guan_tan at yahoo.com
Fri Jul 31 15:13:33 CDT 2009


No, none of the processes ever reaches 100% CPU utilization.  All the slaves
see utilization between 50% and 90%.  The proc running on a CPU 2 hops away
sees 50% CPU utilization, which jibes well with the cross-process probing
performance hit on the AMD architecture.


The master is getting <60% CPU utilization and is 40% idle.  The recv thread is at 20% CPU,
80% sys.

tan

 

________________________________
From: Pavan Balaji <balaji at mcs.anl.gov>
To: mpich-discuss at mcs.anl.gov
Sent: Friday, July 31, 2009 12:57:13 PM
Subject: Re: [mpich-discuss] thread MPI calls


Whenever any process/thread is inside MPI, it'll use 100% CPU. Whenever any process/thread is inside do_work() or do_very_little_work(), it'll use 100% CPU. The only time you might not see 100% CPU usage is when a thread is blocked on a semaphore, waiting for another thread to send a signal.

So, in your example, all worker processes should always see 100% CPU usage. The recv thread should always see 100% CPU usage. The main thread should only see CPU usage when it is not blocked waiting for a signal from the recv thread.

Is the above what you are noticing?

-- Pavan

On 07/31/2009 02:43 PM, chong tan wrote:
> 
> I am running 1 master process and 3 slave processes; the master has
> a recv thread, which makes it 5 processes/threads in total.
> The box I have has 16 cores.
> tan
> 
>  ------------------------------------------------------------------------
> *From:* Pavan Balaji <balaji at mcs.anl.gov>
> *To:* mpich-discuss at mcs.anl.gov
> *Sent:* Friday, July 31, 2009 12:10:21 PM
> *Subject:* Re: [mpich-discuss] thread MPI calls
> 
> 
> How many processes+threads do you have? It looks like you are running 1 master process (with two threads) + 3 slaves (= 5 processes/threads) on a 4-core system. Is this correct? If yes, all of these will contend for the 4 cores.
> 
> -- Pavan
> 
> On 07/30/2009 08:21 PM, chong tan wrote:
>  > D,
>  > the simplest test we have is 1 master and 3 slaves, 3 workers in total.  Data size
>  > starts at 5K bytes (1 time), then dwindles down to less than 128 bytes in a hurry.  The MPI_Send/Irecv/Recv calls were the only ones in our application (besides
>  > MPI_Init(), 1 particular MPI_Barrier() when the application is initialized, and 1
>  > MPI_Finalize() ).
>  >  do_work() has do to some amount of work, more in master proc than slave. do_litle_work()
>  > does what it means, in our application, it is 3 function return with int value, and 1 check for
>  > trace/monitor flag.  (code is like "if( trace ) print_trace(..)" )
>  > The test was run on a 4XQuad box, each process on its own physical CPU.  The
>  > master proc (proc 0) runs on the same physical CPU as its thread.  The box
>  > is AMD based, so no HT (our application filters out cores created by hyperthreading
>  > by default).  On the 3-worker test, we see 20% to 50% sys activity constantly, which in turn slows
>  > down each proc to the point that the master (proc 0)'s main thread becomes idle 40% of the time.
>  > In the extreme case, we saw the threaded code being 70% slower than the un-threaded one.
>  >  We have the tests ready to show the issues; it would be nice if you are around the SF Bay area.
>  >  thanks
>  > tan
>  >
>  >  ------------------------------------------------------------------------
>  > *From:* Darius Buntinas <buntinas at mcs.anl.gov <mailto:buntinas at mcs.anl.gov>>
>  > *To:* mpich-discuss at mcs.anl.gov <mailto:mpich-discuss at mcs.anl.gov>
>  > *Sent:* Thursday, July 30, 2009 1:49:10 PM
>  > *Subject:* Re: [mpich-discuss] thread MPI calls
>  >
>  > OK.  Yes, unless do_work and do_very_little_work make any blocking calls
>  > (like I/O), process 1 should have 100% cpu utilization.  This should be
>  > fine (from a performance standpoint), as long as you aren't
>  > oversubscribing your processors.
>  >
>  > I'm going to try to reproduce your tests on our machines.  How many
>  > worker processes do you have?  Is this all on one node?  If not how many
>  > nodes?  How many cores do you have per node?
>  >
>  > In the meantime, can you check which processor each process is bound to?
>  > Make sure that each process is bound to its own core, and not to a
>  > hyperthread.
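On Linux, the binding Darius asks about can be read straight out of procfs (a sketch; `$$` here is the current shell's PID, which you would replace with the PID of each MPI worker):

```shell
# Show the set of cores this process is allowed to run on.
# In practice, substitute a worker's PID for $$.
grep Cpus_allowed_list /proc/$$/status
```

If two workers report the same single-core list they are sharing a core; comparing against /sys/devices/system/cpu/cpu*/topology/thread_siblings_list reveals whether a "core" is really a hyperthread sibling.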
>  >
>  > Thanks,
>  > -d
>  >
>  >
>  >
>  > On 07/30/2009 02:02 PM, chong tan wrote:
>  >  > D,
>  >  > sorry for the confusion.  In our application, the setup is different from the code
>  >  > Pavan posted.  I will try to line them up here (<--- is between threads,
>  >  > <==== is between procs):
>  >  >            proc 0                                proc 1
>  >  >  main thread      recv thread
>  >  >  do_work()        MPI_Irecv()                    do_work()
>  >  >                   MPI_Wait*()   <=========       MPI_Send()
>  >  >  blocked  <--- unblock                           do_very_little_work()
>  >  >  MPI_Send()                    =========>        MPI_Recv()
>  >  > I don't know if the MPI_Recv call in proc 1 is interfering with the
>  >  > MPI_Wait*() in proc 0.  We
>  >  > see heavy system activity in proc 1.
>  >  > tan
>  >  >  >
>  >  >  >
>  >  > ------------------------------------------------------------------------
>  >  > *From:* Darius Buntinas <buntinas at mcs.anl.gov <mailto:buntinas at mcs.anl.gov> <mailto:buntinas at mcs.anl.gov <mailto:buntinas at mcs.anl.gov>>>
>  >  > *To:* mpich-discuss at mcs.anl.gov <mailto:mpich-discuss at mcs.anl.gov> <mailto:mpich-discuss at mcs.anl.gov <mailto:mpich-discuss at mcs.anl.gov>>
>  >  > *Sent:* Thursday, July 30, 2009 11:17:52 AM
>  >  > *Subject:* Re: [mpich-discuss] thread MPI calls
>  >  >
>  >  > That sounds fishy.  If process 1 is doing a sleep(), you shouldn't see
>  >  > any activity from that process!  Can you double check that?
>  >  >
>  >  > -d
>  >  >
>  >  > On 07/30/2009 01:05 PM, chong tan wrote:
>  >  >> pavan,
>  >  >> the behavior you described is the expected behavior.  However, using
>  >  >> your example, we are also seeing
>  >  >> a lot of system activity in process 1 in all of our experiments.  That
>  >  >> contributes significantly
>  >  >> to the negative gain.
>  >  >>
>  >  >
>  >
> 
> -- Pavan Balaji
> http://www.mcs.anl.gov/~balaji
> 

-- Pavan Balaji
http://www.mcs.anl.gov/~balaji




