<table cellspacing='0' cellpadding='0' border='0' ><tr><td valign='top' style='font: inherit;'><P> </P>
<P>You can use numactl on an AMD box. There are so many variants of it that I can only tell you to run:</P>
<P> </P>
<P>numactl -help</P>
<P> </P>
<P>to see which options your numactl supports. Then try to bind each process to a physical/logical CPU, and its memory to the matching physical CPU.</P>
<P> </P>
<P>The way to run the 'latest' numactl is this:</P>
<P> </P>
<P>numactl --physcpubind=C --membind=M &lt;your application&gt; -- bind to a physical CPU</P>
<P>numactl --cpunodebind=C --membind=M &lt;your application&gt; -- bind to a node</P>
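<P> </P>
<P>For an MPI job, one way to apply the node binding above to every rank is a small launcher that picks the node from the rank number. This is only a sketch under assumptions: it assumes the MPICH2 process manager exports the rank in the PMI_RANK environment variable (check your version), NUM_NUMA_NODES is a made-up variable for the number of nodes, and ranks are simply spread round-robin over the nodes.</P>
<PRE>
```python
import os
import sys


def numactl_command(rank, num_nodes, app_argv):
    """Return an argv list that runs app_argv bound to one NUMA node."""
    # Round-robin: rank 0 goes to node 0, rank 1 to node 1, and so on,
    # wrapping around once every node has one process.
    node = rank % num_nodes
    return ["numactl",
            "--cpunodebind=%d" % node,
            "--membind=%d" % node] + list(app_argv)


if __name__ == "__main__":
    # Assumption: the process manager exports the rank as PMI_RANK.
    rank = int(os.environ.get("PMI_RANK", "0"))
    # Hypothetical variable; 4 matches a four-socket Opteron box.
    num_nodes = int(os.environ.get("NUM_NUMA_NODES", "4"))
    app = sys.argv[1:]
    if app:
        # Replace this launcher process with numactl running the application.
        os.execvp("numactl", numactl_command(rank, num_nodes, app))
```
</PRE>
<P>If the sketch is saved as bind_rank.py (a made-up name), it could be started with something like: mpiexec -n 4 python bind_rank.py ./your_application. With 4 ranks on 4 nodes, each process then gets its own memory controller.</P>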
<P> </P>
<P>If you are running Linux, try to get the CPU info from /proc/cpuinfo.</P>
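<P> </P>
<P>A minimal sketch of pulling that information out, assuming an x86 Linux kernel whose /proc/cpuinfo has "processor" and "physical id" fields (some kernels and architectures omit "physical id"). The function name is my own. Note that a physical id is a socket number, which is not guaranteed to equal the NUMA node number that numactl expects, so compare against numactl --hardware.</P>
<PRE>
```python
from collections import defaultdict


def cpus_by_package(cpuinfo_text):
    """Map each 'physical id' (socket) to its logical 'processor' numbers."""
    packages = defaultdict(list)
    current = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        value = value.strip()
        if key == "processor":
            # Start of a new logical CPU entry.
            current = int(value)
        elif key == "physical id" and current is not None:
            # Socket that the current logical CPU belongs to.
            packages[int(value)].append(current)
    return dict(packages)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc is not mounted
    for package, cpus in sorted(cpus_by_package(text).items()):
        print("physical id %d: logical CPUs %s" % (package, cpus))
```
</PRE>
<P>Logical CPUs listed under different physical ids sit on different sockets, so those are the numbers to give to --physcpubind when you want one process per socket.</P>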
<P> </P>
<P>Hope this helps.</P>
<P> </P>
<P>tan</P>
<P><BR>--- On <B>Fri, 7/11/08, H. Sami Sozuer <I>&lt;hss@photon.iyte.edu.tr&gt;</I></B> wrote:<BR></P>
<BLOCKQUOTE style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: rgb(16,16,255) 2px solid">From: H. Sami Sozuer &lt;hss@photon.iyte.edu.tr&gt;<BR>Subject: Re: [mpich-discuss] Why is my quad core slower than cluster<BR>To: mpich-discuss@mcs.anl.gov<BR>Date: Friday, July 11, 2008, 2:55 PM<BR><BR><PRE>We have a quad cpu opteron system, and each cpu has 2 cores for a total
of 8 cores.
When I run an mpich job on up to 4 processors, everything is as expected
and I get
a decrease in turnaround time. The scaling is not linear, but still
there is a speedup of about
a factor of 2.5 with 4 nodes. But when I increase the number of nodes to
5-8, I actually get a slowdown,
even though there are in fact a total of 8 cores.
My interpretation of this is that each Opteron has an independent bus to
its own memory,
so when you use 4 nodes in mpich, each processor is still using its own
independent bus.
But when running on 8 nodes, now the two cores in each cpu are competing
for memory access
on the same bus, resulting in an overall slowdown.
BTW, when running on 4 nodes in multicore systems, is there any way to
guarantee that mpich
uses the cores on physically different processors, rather than using the
two cores on the same CPU?
It seems to me SMP architecture implies that each core is treated as an
independent processor
by the OS. If my understanding is correct, then the jobs may be assigned
to two cores on the
same processor while both cores of another CPU remain idle, resulting in
loss of efficiency due
to competition for memory access of the two cores on the same CPU.
Any thoughts ?
Sami
Matthew Bettencourt wrote:
>
> We have the same issue, the issue is memory bandwidth for us. We
> can't utilize the extra processing power that the multicore provides
> because we can't keep those cores fed.
> M
>
>
> zach wrote:
>> Following up on these suggestions and info queries...
>> (Thanks for the help!)
>>
>> I noticed my home pc processor is a
>> Core2 Quad CPU Q6600 @ 2.40GHz (Kentsfield)
>> whereas the cluster Xeon is 3.20GHz.
>> I don't think this is causing the degree of 3 in speed.
>>
>> compiler is gcc on home pc and cluster.
>>
>> same optimization option for both systems
>> yes cpuinfo and meminfo show the right #cpus and mem
>>
>> mpich is different versions i have discovered.
>>
>> on cluster (faster one),
>> MPICH Version: 1.2.7p1
>> MPICH Release date: $Date: 2005/11/04 11:54:51$
>> MPICH Patches applied: none
>> MPICH configure: --prefix=/opt/mpich/intel --enable-sharedlib
>> --with-romio --enable-f90modules -c++=icpc -cc=icc -fc=ifort
>> -f90=ifort
>> MPICH Device: ch_p4
>>
>> on home pc (the slug),
>> MPICH2 Version: 1.0.7
>> MPICH2 Release date: Unknown, built on Tue Jul 8 19:28:07 CDT 2008
>> MPICH2 Device: ch3:nemesis
>> MPICH2 configure: --prefix=/home/code/mpich
>> --with-device=ch3:nemesis
>> MPICH2 CC: gcc -O2
>> MPICH2 CXX: c++ -O2
>> MPICH2 F77:
>> MPICH2 F90:
>>
>> on the cluster I have been compiling with mpiCC and on the home pc
>> with mpicxx.
>>
>> kernel on home pc:
>> Linux myPC 2.6.24-18-generic #1 SMP Wed May 28 19:28:38 UTC 2008
>> x86_64 GNU/Linux
>>
>> I am using ubuntu hardy and did not use 'sudo' during installation of
>> mpich2 (not logged in as superuser) -don't know if this matters.
>>
>> zach
>>
>></PRE></BLOCKQUOTE></td></tr></table><br>