[mpich-discuss] thread MPI calls

chong tan chong_guan_tan at yahoo.com
Thu Jul 30 20:21:19 CDT 2009


D,
the simplest test we have is 1 master and 3 slaves, 3 workers in total.  Data size
starts at 5K bytes (one time), then dwindles down to less than 128 bytes in a hurry.
MPI_Send/Irecv/Recv are the only MPI calls in our application (besides
MPI_Init(), 1 particular MPI_Barrier() when the application is initialized, and 1
MPI_Finalize()).
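
Roughly, the exchange looks like this (a simplified sketch, not our real
code: the tags, message sizes, and helper names are made up, and in the
real application the Irecv/Wait* runs on the receive thread while the
main thread is inside do_work(); see also the diagram quoted below):

    /* simplified sketch of the master/worker exchange; tags, sizes and
       helper names are illustrative only, not the real application code */
    #include <mpi.h>

    #define NWORKERS   3
    #define TAG_RESULT 1
    #define TAG_WORK   2

    void do_work(void);
    void do_very_little_work(void);

    /* rank 0: in the real code MPI_Irecv/MPI_Wait* run on a separate
       receive thread while the main thread is inside do_work() */
    void master_step(char *buf, int len)
    {
        MPI_Request req[NWORKERS];
        int w;

        for (w = 0; w < NWORKERS; w++)
            MPI_Irecv(buf + w * len, len, MPI_BYTE, w + 1,
                      TAG_RESULT, MPI_COMM_WORLD, &req[w]);
        do_work();
        MPI_Waitall(NWORKERS, req, MPI_STATUSES_IGNORE);

        do_very_little_work();
        for (w = 0; w < NWORKERS; w++)
            MPI_Send(buf + w * len, len, MPI_BYTE, w + 1,
                     TAG_WORK, MPI_COMM_WORLD);
    }

    /* ranks 1..NWORKERS */
    void worker_step(char *buf, int len)
    {
        do_work();
        MPI_Send(buf, len, MPI_BYTE, 0, TAG_RESULT, MPI_COMM_WORLD);
        MPI_Recv(buf, len, MPI_BYTE, 0, TAG_WORK,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }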

do_work() has to do some amount of work, more in the master proc than in the
slaves.  do_little_work() does what its name says; in our application, it is 3
function calls returning int values, and 1 check of a trace/monitor flag
(the code is like "if( trace ) print_trace(..)").
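
In shape it is roughly (placeholder names, not our real identifiers):

    /* illustrative shape of do_little_work(); f1/f2/f3, trace and
       print_trace() are placeholders for the real names */
    extern int  f1(void), f2(void), f3(void);
    extern int  trace;
    extern void print_trace(int, int, int);

    int do_little_work(void)
    {
        int a = f1();              /* 3 calls that return int values */
        int b = f2();
        int c = f3();
        if (trace)                 /* the trace/monitor flag check */
            print_trace(a, b, c);
        return a + b + c;
    }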

The test was run on a 4XQuad box, with each process on its own physical CPU.
The master proc (proc 0) runs on the same physical CPU as its thread.  The box
is AMD based, so no HT (our application filters out cores created by
hyperthreading by default).


On the 3-worker test, we see 20% to 50% sys activity constantly, which in turn
slows down each proc to the point that the master (proc 0)'s main thread becomes
idle 40% of the time.

In the extreme case, we saw the threaded code being 70% slower than the
un-threaded one.

We have the tests ready to show the issues; it would be nice if you were around
the SF Bay area.

thanks
tan

 

________________________________
From: Darius Buntinas <buntinas at mcs.anl.gov>
To: mpich-discuss at mcs.anl.gov
Sent: Thursday, July 30, 2009 1:49:10 PM
Subject: Re: [mpich-discuss] thread MPI calls

OK.  Yes, unless do_work and do_very_little_work make any blocking calls
(like I/O), process 1 should have 100% CPU utilization: MPICH2 polls while
waiting for messages, so a process "blocked" in an MPI call still spins on
the CPU.  This should be fine (from a performance standpoint), as long as
you aren't oversubscribing your processors.

I'm going to try to reproduce your tests on our machines.  How many
worker processes do you have?  Is this all on one node?  If not how many
nodes?  How many cores do you have per node?

In the meantime, can you check which processor each process is bound to?
Make sure that each process is bound to its own core, and not to a
hyperthread.
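
On Linux you can check with something like this (a quick sketch, Linux
specific; compile with mpicc and launch it the same way you launch your
application):

    /* quick sketch: print each rank's CPU affinity mask (Linux only) */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        cpu_set_t mask;
        int rank, cpu;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        CPU_ZERO(&mask);
        if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
            printf("rank %d bound to CPUs:", rank);
            for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
                if (CPU_ISSET(cpu, &mask))
                    printf(" %d", cpu);
            printf("\n");
        }

        MPI_Finalize();
        return 0;
    }

Alternatively, "taskset -p <pid>" will print the current affinity mask of
each running process.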

Thanks,
-d



On 07/30/2009 02:02 PM, chong tan wrote:
> D,
> sorry for the confusion.  In our application, the setting is different
> from the code Pavan posted.  I will try to line them up here
> (<--- is between threads, <==== is between procs):
>
>   proc 0                                           proc 1
>
>   main thread              recv thread
>
>   do_work()                MPI_Irecv()             do_work()
>                            MPI_Wait*()  <========  MPI_Send()
>   blocked             <--- unblock
>   do_very_little_work()
>   MPI_Send()                            ========>  MPI_Recv()
>  
>  
> I don't know if the MPI_Recv call in Proc 1 is interfering with the
> MPI_Wait*() in Proc 0.  We see heavy system activity in Proc 1.
>
> tan
>  
> 
>  
> 
> ------------------------------------------------------------------------
> *From:* Darius Buntinas <buntinas at mcs.anl.gov>
> *To:* mpich-discuss at mcs.anl.gov
> *Sent:* Thursday, July 30, 2009 11:17:52 AM
> *Subject:* Re: [mpich-discuss] thread MPI calls
> 
> That sounds fishy.  If process 1 is doing a sleep(), you shouldn't see
> any activity from that process!  Can you double check that?
> 
> -d
> 
> On 07/30/2009 01:05 PM, chong tan wrote:
>> Pavan,
>> the behavior you described is the expected behavior.  However, using
>> your example, we are also seeing
>> a lot of system activity in process 1 in all of our experiments.  That
>> contributes significantly
>> to the negative gain.
>> 
> 



      

