[mpich-discuss] MPICH2-1.0.8 performance issues on Opteron Cluster

Darius Buntinas buntinas at mcs.anl.gov
Tue Dec 16 18:12:28 CST 2008


I'd like to see if the problem has to do with internode or intranode
communication.

Can you try running 10 processes on one node with sock and nemesis?

If nemesis is still doing worse, please try 5 processes on one node.
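
For example (assuming a hypothetical host file listing just one of your
nodes), something like:

  echo c2node2:10 > mf_one_node
  mpiexec -machinefile ./mf_one_node -n 10 ./xhpl < /dev/null

once with the sock build and once with the nemesis build; for the
5-process run, change the host file entry to c2node2:5 and use -n 5.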

Thanks,
-d

On 12/16/2008 05:50 PM, Sarat Sreepathi wrote:
> Hello,
> 
> We got a new 10-node Opteron cluster in our research group. Each node
> has two quad-core Opterons. I installed MPICH2-1.0.8 with the Pathscale
> (3.2) compilers and three device configurations (nemesis, ssm, sock). I
> built and tested using the Linpack (HPL) benchmark with the ACML 4.2
> BLAS library for each of the three device configurations.
> 
> I observed some unexpected results: the 'nemesis' configuration gave
> the worst performance. For the same problem parameters, the 'sock'
> version was faster and the 'ssm' version hung. For further analysis, I
> obtained screenshots from the Ganglia monitoring tool for the three
> different runs. As you can see from the attached screenshots, the
> 'nemesis' version consumed more 'system CPU' according to Ganglia.
> The 'ssm' version fared slightly better, but it hung towards the end.
> 
> I may be missing something trivial here, but can anyone account for this
> discrepancy? Isn't the 'nemesis' or 'ssm' device recommended for this
> cluster configuration? Your help is greatly appreciated.
> 
> Thanks,
> Sarat.
> 
> Details:
> HPL built with the AMD ACML 4.2 BLAS libraries
> HPL output for a problem size N=60000
> nemesis - 1.653e+02 Gflops
> ssm - hangs
> sock - 2.029e+02 Gflops
> 
> c2master:~ # mpich2version
> MPICH2 Version:         1.0.8
> MPICH2 Release date:    Unknown, built on Fri Dec 12 16:31:15 EST 2008
> MPICH2 Device:          ch3:nemesis
> MPICH2 configure:       --with-device=ch3:nemesis --enable-f77
> --enable-f90 --enable-cxx
> --prefix=/usr/local/mpich2-1.0.8-pathscale-k8-nemesis
> MPICH2 CC:      pathcc -march=opteron -O3
> MPICH2 CXX:     pathCC -march=opteron -O3
> MPICH2 F77:     pathf90 -march=opteron -O3
> MPICH2 F90:     pathf90 -march=opteron -O3
> 
> and similar configuration using ch3:ssm and ch3:sock devices.
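> (The exact configure lines for those builds aren't reproduced here;
> assuming they mirror the nemesis one apart from the device and the
> install prefix, they would look roughly like:
> 
>   ./configure --with-device=ch3:ssm --enable-f77 --enable-f90 --enable-cxx \
>       --prefix=/usr/local/mpich2-1.0.8-pathscale-k8-ssm
>   ./configure --with-device=ch3:sock --enable-f77 --enable-f90 --enable-cxx \
>       --prefix=/usr/local/mpich2-1.0.8-pathscale-k8-sock
> 
> with the same pathcc/pathCC/pathf90 -march=opteron -O3 compiler settings.)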
> 
> > nohup mpiexec -machinefile ./mf -n 80 ./xhpl < /dev/null &
> Machine file used:
>> cat mf
> c2node2:8
> c2node3:8
> c2node4:8
> c2node5:8
> c2node6:8
> c2node7:8
> c2node8:8
> c2node9:8
> c2node10:8
> c2node11:8
> 
> c2master:~ # uname -a
> Linux c2master 2.6.22.18-0.2-default #1 SMP 2008-06-09 13:53:20 +0200
> x86_64 x86_64 x86_64 GNU/Linux
> Processor: Quad-Core AMD Opteron(tm) Processor 2350 - 2 GHz
> 
> -- 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Sarat Sreepathi
> Doctoral Student
> Dept. of Computer Science
> North Carolina State University
> sarat_s at ncsu.edu ~ (919)645-7775
> http://www.sarats.com
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> 