<table cellspacing='0' cellpadding='0' border='0' ><tr><td valign='top' style='font: inherit;'><P><BR>The two could behave the same, but the OS could also make them differ.&nbsp; The --membind part will behave the same in both.&nbsp; The difference is that with --cpunodebind the OS is free to move the processes among the cores of a physical CPU, resulting in context switching, whereas --physcpubind keeps each process on one core.&nbsp; </P>
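<P>To illustrate the per-core form, here is a minimal sketch of a shell function that assembles the long mpiexec command line instead of typing it by hand.&nbsp; The function name build_pinned_cmd and its arguments are made up for this example, and it assumes cores are numbered consecutively within each NUMA node (cores 0-3 on node 0, 4-7 on node 1, and so on), which is not guaranteed; numactl --hardware shows the real layout.</P>

<PRE>
```shell
#!/bin/sh
# Sketch: build one mpiexec command line that pins each MPI rank to its own core.
# Assumption: cores are numbered consecutively within each NUMA node
# (cores 0-3 on node 0, 4-7 on node 1, ...). Check with `numactl --hardware`.

# build_pinned_cmd TOTAL_CORES CORES_PER_NODE EXECUTABLE  (hypothetical helper)
build_pinned_cmd() {
    total=$1
    per_node=$2
    exe=$3
    cmd="mpiexec"
    core=0
    while [ "$core" -lt "$total" ]; do
        # Memory node that owns this core under the numbering assumption above.
        node=$((core / per_node))
        if [ "$core" -gt 0 ]; then
            cmd="$cmd :"
        fi
        cmd="$cmd -np 1 numactl --physcpubind=$core --membind=$node $exe"
        core=$((core + 1))
    done
    printf '%s\n' "$cmd"
}

# Print the command for the 4 x quad-core machine discussed in this thread.
build_pinned_cmd 16 4 ./parallel_executable
```
</PRE>

<P>The function only prints the command, so it can be inspected before it is run.&nbsp; The -np 1 is spelled out so that each segment starts exactly one process.</P>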

<P>&nbsp;</P>

<P>Please spend some time doing a small experiment on this.&nbsp; You will learn the behavior of your hardware and OS better.&nbsp; Try it on a few different machines, and you will be amused.</P>
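<P>One possible way to run such an experiment on Linux is to have each process report the CPUs it is allowed to use.&nbsp; The sketch below is only an illustration: the function name allowed_cpus is made up, and it assumes the kernel exposes a Cpus_allowed_list line in /proc/PID/status.</P>

<PRE>
```shell
#!/bin/sh
# Sketch: report which CPUs a process is allowed to run on (Linux only).
# Assumption: the status file contains a "Cpus_allowed_list:" line, as on
# reasonably recent Linux kernels.

# allowed_cpus STATUS_FILE  (hypothetical helper)
# Prints the allowed-CPU list found in STATUS_FILE, e.g. /proc/self/status.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "$1"
}

# Example: show the CPU list inherited by this script from its launcher.
if [ -r /proc/self/status ]; then
    printf 'allowed CPUs: %s\n' "$(allowed_cpus /proc/self/status)"
fi
```
</PRE>

<P>Saved under a name such as check_affinity.sh (hypothetical), running it as numactl --physcpubind=2 --membind=0 sh check_affinity.sh should report a single core, while numactl --cpunodebind=0 --membind=0 sh check_affinity.sh should report every core of node 0.&nbsp; Watching top while the real job runs then shows whether processes actually stay put.</P>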

<P>&nbsp;</P>

<P>tan</P>

<P><BR>--- On <B>Thu, 7/24/08, Franco Catalano <I>&lt;franco.catalano@uniroma1.it&gt;</I></B> wrote:<BR></P>

<BLOCKQUOTE style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: rgb(16,16,255) 2px solid">From: Franco Catalano &lt;franco.catalano@uniroma1.it&gt;<BR>Subject: Re: [mpich-discuss] processor/memory affinity on quad core systems<BR>To: mpich-discuss@mcs.anl.gov<BR>Date: Thursday, July 24, 2008, 2:38 AM<BR><BR><PRE>Hi,

Thanks to Chong Tan for the suggestion with numactl. The cluster in my

laboratory is used primarily by me, so I am not facing job

queue issues, and this is a fairly good solution for my computations.

I have a question about the use of numactl. Assuming a 4-processor

quad-core machine, is it the same to do this:

 mpiexec -np 4 numactl --cpunodebind=0

--membind=0 ./parallel_executable : -np 4 numactl --cpunodebind=1

--membind=1 ./parallel_executable : &lt;same for the remaining two nodes&gt;

instead of:

 mpiexec numactl --physcpubind=0 --membind=0 ./parallel_executable :

numactl --physcpubind=1 --membind=0 ./parallel_executable : numactl

--physcpubind=2 --membind=0 ./parallel_executable : numactl

--physcpubind=3 --membind=0 ./parallel_executable : numactl

--physcpubind=4 --membind=1 ./parallel_executable : numactl

--physcpubind=5 --membind=1 ./parallel_executable : &lt;same for the

remaining ten cores&gt;

In other words, since memory in the NUMA architecture is assigned to

nodes, what is the difference between binding 4 MPI processes to each

quad-core processor and binding a single MPI process to each core?

Thanks.

Franco

On Tue, 22/07/2008 at 10.11 -0700, chong tan wrote:

&gt; no easy way with mpiexec, especially if you do mpiexec -n.  But this

&gt; should work

&gt; 

&gt;  

&gt; 

&gt;  

&gt; 

&gt; mpiexec numactl --physcpubind N0 &lt;1st of your procs&gt; :

&gt; 

&gt;              numactl --physcpubind N1 &lt;2nd of your procs&gt; :

&gt; 

&gt;              &lt;same for the rest&gt;

&gt; 

&gt;  

&gt; 

&gt; add --membind if you want (and you definitely want it for Opteron).  

&gt; 

&gt;  

&gt; 

&gt; tan

&gt; 

&gt; 

&gt; 

&gt; --- On Tue, 7/22/08, Franco Catalano &lt;franco.catalano@uniroma1.it&gt;

&gt; wrote:

&gt; 

&gt; 

&gt;         From: Franco Catalano &lt;franco.catalano@uniroma1.it&gt;

&gt;         Subject: [mpich-discuss] processor/memory affinity on quad

&gt;         core systems

&gt;         To: mpich-discuss@mcs.anl.gov

&gt;         Date: Tuesday, July 22, 2008, 2:28 AM

&gt;         

&gt;         Hi,

&gt;         Is it possible to ensure processor/memory affinity for MPI jobs

&gt;         launched with mpiexec (or mpirun)?

&gt;         I am using mpich2 1.0.7 with WRF on a 4-processor quad-core Opteron

&gt;         machine (16 cores total), and I have observed a noticeable (more

&gt;         than 20%) variability in the time needed to compute a single time

&gt;         step. Looking at the output of top, I have noticed that the system

&gt;         moves processes across the 16 cores regardless of processor/memory

&gt;         affinity. So, when processes are running on cores far from their

&gt;         memory, a time step takes longer to compute.

&gt;         I know that, for example, OpenMPI provides a command line option

&gt;         for mpiexec (or mpirun) to ensure the affinity binding:

&gt;         --mca mpi_paffinity_alone 1

&gt;         I have tried this with WRF and it works.

&gt;         Is there a way to do this with mpich2?

&gt;         Otherwise, I think that it would be very useful to include such a

&gt;         capability in the next release.

&gt;         Thank you for any suggestion.

&gt;         

&gt;         Franco

&gt;         

&gt;         -- 

&gt;         ____________________________________________________

&gt;         Eng. Franco Catalano

&gt;         Ph.D. Student

&gt;         

&gt;         D.I.T.S.

&gt;         Department of Hydraulics, Transportation and Roads.

&gt;         Via Eudossiana 18, 00184 Rome 

&gt;         University of Rome "La Sapienza".

&gt;         tel: +390644585218</PRE></BLOCKQUOTE></td></tr></table><br>