[mpich-discuss] MPI how to support the smp machine?

Gustavo Correa gus at ldeo.columbia.edu
Sat Dec 5 13:06:42 CST 2009


Hi Liaoyin

Your problem size N=1000 is too small.
The order of P,Q also matters.
See my previous message for suggestions.

Gus Correa

On Dec 5, 2009, at 8:06 AM, liaoyin wrote:

> Thank you.
> My MPI is MPICH2 1.2.
>
> My cluster is a diskless platform that loads its kernel and mounts its
> filesystem from the server.
> Each node has one CPU with four cores (the theoretical peak
> performance of a single core is 3.2 Gflops).
>
> I am running HPL (A Portable Implementation of the High-Performance
> Linpack Benchmark for Distributed-Memory Computers) to do a Linpack test.
>
> I use ATLAS as the BLAS library; it is a single-threaded build.
>
> Case 1: I run mpirun -np 1 ./xhlp and get 0.4 Gflops. (With the top
> command I see one core running; HPL.dat has N=1000 and Ps x Qs = 1 x 1.)
>
> Case 2: I run mpirun -np 2 ./xhlp and get 0.04 Gflops. (With the top
> command I see one core running; HPL.dat has N=1000 and Ps x Qs = 1 x 2.)
>
> I use NetPIPE (http://www.scl.ameslab.gov/netpipe/) and IMB-MPI
> to test the communication between two cores in one CPU.
>
>
> On a non-diskless platform:
>
> I run mpirun -np 2 ./NPmpi and mpirun -np 2 ./IMB-mpi on one node
> (two processes run on the two cores); the bandwidth is high and the
> latency is small.
>
> But on the diskless platform:
>
> I run mpirun -np 2 ./NPmpi and mpirun -np 2 ./IMB-mpi on one node
> (two processes run on the two cores); the bandwidth is very low and
> the latency is large.
>
> why?
>
>
> 2009/12/5 Gus Correa <gus at ldeo.columbia.edu>
> Hi Liaoyin
>
> Besides Dave Goodell's questions,
> I wonder if this may be more of an HPL problem
> than of an MPICH2 problem.
>
> Here are some questions/suggestions about HPL:
>
> ***
>
> 1) If you just want to check if MPICH2 is working,
> then don't use HPL.  Use something simpler.
> The cpi.c and hellow.c programs in the MPICH2 "examples" directory
> will tell you if MPICH2 is working properly, and are much
> simpler than HPL to set up and run.
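>
> For reference, a minimal sanity check looks roughly like this (a sketch
> in the spirit of hellow.c, not the exact file shipped with MPICH2):
>
>   #include <stdio.h>
>   #include <mpi.h>
>
>   /* Print one line per process; if "mpirun -np 2" prints two
>      different ranks, MPICH2 is launching processes correctly. */
>   int main(int argc, char *argv[])
>   {
>       int rank, size;
>       MPI_Init(&argc, &argv);
>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>       MPI_Comm_size(MPI_COMM_WORLD, &size);
>       printf("Hello from rank %d of %d\n", rank, size);
>       MPI_Finalize();
>       return 0;
>   }
>
> Compile it with mpicc and run it with "mpirun -np 2"; you should see
> one line from each of the two ranks.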
>
> ***
>
> 2) Note: there may be a typo on the mpirun command line
> in your message; it should be "xhpl", not "xhlp" as you wrote,
> unless you changed the executable name.
>
> ***
>
> 3) Are you sure you are running HPL on 2 cores?
>
> If you use Linux, you can submit your HPL job,
> then use "top" (and type "1") to see how many
> cores are actually running xhpl.
>
> Do you really see 2 cores in action on "top"?
>
> ***
>
> 4) What are the contents of your HPL.dat parameter file,
> when you try to run on 2 cores?
>
> Is it the same one you use for the one-core run, or is it different?
>
> This may not be news to you, but here it goes just in case:
>
> To run HPL on a given number of cores on your machine,
> the product of the Ps and Qs values in HPL.dat has to be
> equal to the N value in your "mpirun -np N".
> For example, if you want to use 2 cores (mpirun -np 2),
> you could use this in your HPL.dat file:
>
> 1 2 Ps
> 2 1 Qs
>
> Or to use four cores:
>
> 1 2 4 Ps
> 4 2 1 Qs
>
> (Tip: You *must* use a *single blank space* field separator in  
> HPL.dat.)
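>
> For illustration, the process-grid section of HPL.dat would then look
> roughly like this for the two-core case (abbreviated; the first number
> says how many P x Q pairs follow, and the text after the values on each
> line is only a description -- check the sample HPL.dat shipped with HPL
> for the exact layout):
>
>   2 # of process grids (P x Q)
>   1 2 Ps
>   2 1 Qs
>
> This asks HPL to try both a 1 x 2 and a 2 x 1 grid, and each of them
> needs "mpirun -np 2".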
>
> ***
>
> 5) Very small problem sizes don't scale well with
> the number of processors.
>
> To see speedups when more cores are added,
> you need to choose a relatively large value for Ns in HPL.dat.
> However, Ns is constrained by how much memory (RAM) your computer has.
> A reasonable upper bound to Ns is sqrt(0.8*memory_in_bytes/8).
>
> See item 6) below.
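>
> As a worked example (assuming, say, 4 GB of RAM in the node, which is
> an assumption, not a figure from your message):
> sqrt(0.8 * 4*1024^3 / 8) is about 20700, so an Ns somewhat below 20000
> would be a reasonable upper limit there, far above the N=1000 you are
> running now.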
>
> ***
>
> 6) Finally, if you haven't read it, the HPL TUNING file is a
> *must read* for anybody who wants to run the HPL benchmark:
>
> http://www.netlib.org/benchmark/hpl/tuning.html
>
> ***
>
> Good luck!
>
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
>
>
>
> Dave Goodell wrote:
> What MPI implementation are you using?  If you are using MPICH2,  
> please make sure that you are using the latest version (1.2.1).
>
> -Dave
>
> On Dec 4, 2009, at 3:02 AM, liaoyin wrote:
>
> I am running HPL (A Portable Implementation of the High-Performance
> Linpack Benchmark for Distributed-Memory Computers) to do a Linpack test.
>
> I use ATLAS as the BLAS library.
>
> My machine has one CPU with four cores.
>
> I run mpirun -np 1 ./xhlp and the Gflops is 0.4 (one core running).
>
> but when I run mpirun -np 2 ./xhlp the Gflops is 0.04 (two cores
> running).
>
> Why are two cores slower?
>
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss


