<html><head><style type="text/css"><!-- DIV {margin:0px;} --></style></head><body><div style="font-family:times new roman, new york, times, serif;font-size:12pt"><DIV>D,</DIV>
<DIV>The simplest test we have is 1 master and 3 slaves, 3 workers in total. Data size</DIV>
<DIV>starts at 5K bytes (once), then quickly dwindles to less than 128 bytes. </DIV>
<DIV>MPI_Send/MPI_Irecv/MPI_Recv are the only MPI calls in our application (besides</DIV>
<DIV>MPI_Init(), 1 particular MPI_Barrier() when the application is initialized, and 1</DIV>
<DIV>Finish() ).</DIV>
<DIV> </DIV>
<DIV>do_work() has to do some amount of work, more in the master proc than in the slaves. do_litle_work()</DIV>
<DIV>does what its name says: in our application, it is 3 function calls that return an int value, and 1 check for a</DIV>
<DIV>trace/monitor flag (the code is like "if( trace ) print_trace(..)").</DIV>
<DIV> </DIV>
<DIV>The test was run on a 4 x quad-core box, each process on its own physical CPU. The </DIV>
<DIV>master proc (proc 0) runs on the same physical CPU as its thread. The box </DIV>
<DIV>is AMD based, so there is no HT (our application filters out cores created by hyperthreading</DIV>
<DIV>by default). </DIV>
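<DIV> </DIV>
<DIV>A minimal sketch, not taken from the application, of how each process could report which CPUs it is allowed to run on, so the binding described above can be confirmed. It assumes Linux with glibc, where sched_getaffinity() and the CPU_* macros are available; allowed_cpu_count() and print_allowed_cpus() are hypothetical helper names.</DIV>

```c
#define _GNU_SOURCE   /* must come first: exposes sched_getaffinity() and CPU_* on glibc */
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: number of logical CPUs the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
int allowed_cpu_count(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    /* pid 0 means "the calling thread" */
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Hypothetical helper: print the allowed CPU numbers, prefixed with a label
 * such as the MPI rank or thread name. */
void print_allowed_cpus(const char *label)
{
    assert(label != NULL);

    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_getaffinity");
        return;
    }

    printf("%s: allowed CPUs:", label);
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            printf(" %d", cpu);
    }
    printf("\n");
}
```

<DIV>Calling print_allowed_cpus() once in every process and thread, with the rank as the label, would show whether two of them share a CPU. A thread pinned to a single core should report a count of 1.</DIV>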
<DIV> </DIV>
<DIV> </DIV>
<DIV>On the 3-worker test, we see 20% to 50% sys activity constantly, which in turn slows</DIV>
<DIV>down each proc to the point that the main thread of the master (proc 0) is idle 40% of the time.<BR></DIV>
<DIV style="FONT-FAMILY: times new roman, new york, times, serif; FONT-SIZE: 12pt">In the extreme case, we saw the threaded code being 70% slower than the un-threaded one.</DIV>
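<DIV> </DIV>
<DIV>The sys share quoted above can also be measured from inside each process. The following is a rough sketch, assuming a POSIX system with getrusage(); sys_fraction() and self_sys_fraction() are hypothetical names, and this is not necessarily how the figures above were obtained.</DIV>

```c
#include <assert.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Fraction of CPU time spent in the kernel, given user and system seconds.
 * Returns 0.0 when no CPU time has been recorded yet. */
double sys_fraction(double user_s, double sys_s)
{
    assert(user_s >= 0.0 && sys_s >= 0.0);
    double total = user_s + sys_s;
    return total > 0.0 ? sys_s / total : 0.0;
}

/* Convert a struct timeval to seconds. */
static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* System-time fraction of the calling process so far (all of its threads
 * combined), or -1.0 if getrusage() fails. */
double self_sys_fraction(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return sys_fraction(tv_seconds(ru.ru_utime), tv_seconds(ru.ru_stime));
}
```

<DIV>Printing self_sys_fraction() from each process just before it exits would give one number per process to compare between the threaded and un-threaded builds.</DIV>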
<DIV style="FONT-FAMILY: times new roman, new york, times, serif; FONT-SIZE: 12pt"> </DIV>
<DIV style="FONT-FAMILY: times new roman, new york, times, serif; FONT-SIZE: 12pt">We have the tests ready to show the issues; it would be nice if you were around the SF Bay Area.</DIV>
<DIV style="FONT-FAMILY: times new roman, new york, times, serif; FONT-SIZE: 12pt"> </DIV>
<DIV style="FONT-FAMILY: times new roman, new york, times, serif; FONT-SIZE: 12pt">thanks</DIV>
<DIV style="FONT-FAMILY: times new roman, new york, times, serif; FONT-SIZE: 12pt">tan</DIV>
<DIV style="FONT-FAMILY: times new roman, new york, times, serif; FONT-SIZE: 12pt"><BR> </DIV>
<DIV style="FONT-FAMILY: arial, helvetica, sans-serif; FONT-SIZE: 13px"><FONT size=2 face=Tahoma>
<HR SIZE=1>
<B><SPAN style="FONT-WEIGHT: bold">From:</SPAN></B> Darius Buntinas &lt;buntinas@mcs.anl.gov&gt;<BR><B><SPAN style="FONT-WEIGHT: bold">To:</SPAN></B> mpich-discuss@mcs.anl.gov<BR><B><SPAN style="FONT-WEIGHT: bold">Sent:</SPAN></B> Thursday, July 30, 2009 1:49:10 PM<BR><B><SPAN style="FONT-WEIGHT: bold">Subject:</SPAN></B> Re: [mpich-discuss] thread MPI calls<BR></FONT><BR>OK. Yes, unless do_work and do_very_little_work make any blocking calls<BR>(like I/O), process 1 should have 100% cpu utilization. This should be<BR>fine (from a performance standpoint), as long as you aren't<BR>oversubscribing your processors.<BR><BR>I'm going to try to reproduce your tests on our machines. How many<BR>worker processes do you have? Is this all on one node? If not, how many<BR>nodes? How many cores do you have per node?<BR><BR>In the meantime, can you check to which processor each process is bound?<BR>Make sure that each process is
bound to its own core, and not to a<BR>hyperthread.<BR><BR>Thanks,<BR>-d<BR><BR><BR><BR>On 07/30/2009 02:02 PM, chong tan wrote:<BR>> D,<BR>> sorry for the confusion. In our application, the setting is different<BR>> from the code<BR>> Pavan posted. I will try to have them lined up here, (&lt;--- is between<BR>> thread,<BR>> &lt;==== is between proc)<BR>> <BR>
<PRE>
>           proc 0                                  proc 1
>
> main thread           recv thread
>
> do_work()             MPI_Irecv                   do_work()
>                       MPI_Wait*()    &lt;=======     MPI_Send()
> blocked     &lt;---      unblock
> do_very_litle_work()
> MPI_Send              ==========&gt;                 MPI_Recv()
</PRE>
> <BR>> I don't know if the MPI_Recv call in Proc 1 is interfering with the<BR>> MPI_Wait*() in Proc 1. We<BR>> see heavy system activity in Proc 1. <BR>> <BR>> <BR>> tan<BR>> <BR>> <BR>> <BR>> <BR>> ------------------------------------------------------------------------<BR>> *From:* Darius Buntinas &lt;<A href="mailto:buntinas@mcs.anl.gov" ymailto="mailto:buntinas@mcs.anl.gov">buntinas@mcs.anl.gov</A>&gt;<BR>> 
*To:* <A href="mailto:mpich-discuss@mcs.anl.gov" ymailto="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</A><BR>> *Sent:* Thursday, July 30, 2009 11:17:52 AM<BR>> *Subject:* Re: [mpich-discuss] thread MPI calls<BR>> <BR>> That sounds fishy. If process 1 is doing a sleep(), you shouldn't see<BR>> any activity from that process! Can you double check that?<BR>> <BR>> -d<BR>> <BR>> On 07/30/2009 01:05 PM, chong tan wrote:<BR>>> pavan,<BR>>> the behavior you described is the expected behavior. However, using<BR>>> your example, we are also seeing<BR>>> a lot of system activity in process 1 in all of our experiments. That<BR>>> contributes significantly<BR>>> to the negative gain.<BR>>> <BR>> <BR></DIV></div><br>
</body></html>