[mpich-discuss] MPI how to support the smp machine?

liaoyin ustcliao at gmail.com
Sat Dec 5 07:06:25 CST 2009


Thank you. My MPI is MPICH2 1.2.

*My cluster is a diskless platform: the nodes load their kernel and
mount their filesystem from the server.*
Each node has one CPU with four cores (the theoretical peak performance
of a single core is 3.2 Gflops).

I am running HPL (A Portable Implementation of the High-Performance
Linpack Benchmark for Distributed-Memory Computers) to do a Linpack test.

I use ATLAS as the BLAS library (a single-threaded build).

Case 1: I run mpirun -np 1 ./xhpl and get 0.4 Gflops. (Watching with
the top command, I see one core running; HPL.dat has N=1000 and
Ps*Qs=1*1.)

Case 2: I run mpirun -np 2 ./xhpl and get 0.04 Gflops. (Watching with
the top command, I still see only one core running; HPL.dat has N=1000
and Ps*Qs=1*2.)
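
(For scale: 0.4 Gflops is 0.4 / 3.2 = 12.5% of the single-core peak
quoted above, and the 2-process run reaches only a tenth of that.)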

I used NetPIPE (http://www.scl.ameslab.gov/netpipe/) and IMB-MPI to test
the communication between two cores of one CPU.
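
For example, to keep both ranks on one node (a sketch, assuming the MPD
process manager that MPICH2 1.2 uses by default):

    mpdboot -n 1            # start a single MPD daemon on the local node
    mpirun -np 2 ./NPmpi    # both ranks then run on cores of this node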


Without the diskless setup:

When I run mpirun -np 2 ./NPmpi and mpirun -np 2 ./IMB-mpi on one node
(two processes on two cores), the bandwidth is high and the latency is
small.

But with the diskless setup:

The same runs on one node (two processes on two cores) show very low
bandwidth and large latency.

Why is the intra-node communication so much worse on the diskless platform?


2009/12/5 Gus Correa <gus at ldeo.columbia.edu>

> Hi Liaoyin
>
> Besides Dave Goodell's questions,
> I wonder if this may be more of an HPL problem
> than of an MPICH2 problem.
>
> Here are some questions/suggestions about HPL:
>
> ***
>
> 1) If you just want to check if MPICH2 is working,
> then don't use HPL.  Use something simpler.
> The cpi.c and hellow.c programs in the MPICH2 "examples" directory
> will tell you if MPICH2 is working properly, and are much simpler
> than HPL to set up and run.
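>
> For example (a minimal sketch, assuming you are in the top of the
> MPICH2 source tree and mpicc/mpirun are on your PATH):
>
>    mpicc examples/cpi.c -o cpi
>    mpirun -np 2 ./cpi
>
> A healthy installation prints one "Process x of 2" line per rank,
> followed by an approximation of pi.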
>
> ***
>
> 2) Note, there may be a typo on the mpirun command line
> in your message: it should be "xhpl", not "xhlp" as you wrote,
> unless you changed the executable name.
>
> ***
>
> 3) Are you sure you are running HPL on 2 cores?
>
> If you use Linux, you can submit your HPL job,
> then use "top" (and type "1") to see how many
> cores are actually running xhpl.
>
> Do you really see 2 cores in action on "top"?
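>
> For example, during a 2-core xhpl run the per-CPU display might look
> roughly like this (illustrative only; the exact fields vary with your
> version of top):
>
>    Cpu0  : 99.0%us,  0.3%sy, ...
>    Cpu1  : 98.7%us,  0.7%sy, ...
>
> Both cores should sit near 100% user time while xhpl is computing.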
>
> ***
>
> 4) What are the contents of your HPL.dat parameter file,
> when you try to run on 2 cores?
>
> Is it the same one you use for the one-core run, or is it different?
>
> This may not be news to you, but here it goes just in case:
>
> To run HPL on a given number of cores on your machine,
> the product of the values of Ps and Qs in HPL.dat has to be
> equal to the N value in your "mpirun -np N".
> For example, if you want to use 2 cores (mpirun -np 2),
> you could use this on your HPL.dat file:
>
> 1 2 Ps
> 2 1 Qs
>
> Or to use four cores:
>
> 1 2 4 Ps
> 4 2 1 Qs
>
> (Tip: You *must* use a *single blank space* field separator in HPL.dat.)
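>
> Put together, the relevant lines of HPL.dat for a 2-core run on a
> single 1x2 grid might look like this (a sketch; the other lines of
> the stock HPL.dat stay as shipped):
>
>    1000         Ns
>    1            # of process grids (P x Q)
>    1            Ps
>    2            Qs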
>
> ***
>
> 5) Very small problem sizes don't scale well with
> the number of processors.
>
> To see speedups when more cores are added,
> you need to choose a relatively large value for Ns in HPL.dat.
> However, Ns is constrained by how much memory (RAM) your computer has.
> A reasonable upper bound to Ns is sqrt(0.8*memory_in_bytes/8).
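>
> For example, on a node with 4 GB of RAM (an assumed figure, just for
> illustration) that bound gives Ns <= sqrt(0.8 * 4*1024^3 / 8), or
> roughly 20,700, far above the Ns=1000 you are using now.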
>
> See item 6) below.
>
> ***
>
> 6) Finally, if you haven't read it, the HPL TUNING file is a
> *must read* for anybody who wants to run the HPL benchmark:
>
> http://www.netlib.org/benchmark/hpl/tuning.html
>
> ***
>
> Good luck!
>
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
>
>
>
> Dave Goodell wrote:
>
>> What MPI implementation are you using?  If you are using MPICH2, please
>> make sure that you are using the latest version (1.2.1).
>>
>> -Dave
>>
>> On Dec 4, 2009, at 3:02 AM, liaoyin wrote:
>>
>>> I am running HPL (A Portable Implementation of the High-Performance
>>> Linpack Benchmark for Distributed-Memory Computers) to do a Linpack
>>> test.
>>>
>>> I use ATLAS as the BLAS library.
>>>
>>> My machine has one CPU with four cores.
>>>
>>> I run mpirun -np 1 ./xhlp and the Gflops is 0.4 (one core running)
>>>
>>> but when I run mpirun -np 2 ./xhlp the Gflops is 0.04 (two cores running)
>>>
>>> Why are the two cores slower?
>>>
>
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>

