<div dir="ltr"><div><div><div>Hi Matt,<br><br></div><div>Thank you for the reply. <br><br></div>I am using my university's HPC cluster, which has multiple nodes and should be well suited to parallel computing. The poor performance might be due to the way I installed and ran PETSc...<br><br></div>Looking at the output from running streams, I can see that the processor names are all the same. <br>Does that mean only one processor was involved in the computation, and did that cause the poor performance?<br><br>Thank you very much.<br><br></div>Ph. <br><div><br>Below is the test output:<br><br><span lang="en-US"><font size="2" face="Tahoma" color="black"><span style="font-size:10pt" dir="ltr">[mpepvs@atlas5-c01
petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5
PETSC_ARCH=arch-linux-cxx-opt
streams
<br>
cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory
PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5
PETSC_ARCH=arch-linux-cxx-opt streams<br>
/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o
MPIVersion.o -c -wd1572 -g -O3 -fPIC
-I/home/svu/mpepvs/petsc/petsc-3.7.5/include
-I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include
-I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include
`pwd`/MPIVersion.c<br>
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<br>
The version of PETSc you are using is out-of-date, we recommend updating to the new release<br>
Available Version: 3.7.6 Installed Version: 3.7.5<br>
<a href="http://www.mcs.anl.gov/petsc/download/index.html">http://www.mcs.anl.gov/petsc/download/index.html</a><br>
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<br>
Running streams with 'mpiexec.hydra ' using 'NPMAX=12'<br>
Number of MPI processes 1 Processor names atlas5-c01<br>
Triad: 11026.7604 Rate (MB/s)<br>
Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01<br>
Triad: 14669.6730 Rate (MB/s)<br>
Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 atlas5-c01<br>
Triad: 12848.2644 Rate (MB/s)<br>
Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01<br>
Triad: 15033.7687 Rate (MB/s)<br>
Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01<br>
Triad: 13299.3830 Rate (MB/s)<br>
Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01<br>
Triad: 14382.2116 Rate (MB/s)<br>
Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01<br>
Triad: 13194.2573 Rate (MB/s)<br>
Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01
atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01<br>
Triad: 14199.7255 Rate (MB/s)<br>
Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01
atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
atlas5-c01<br>
Triad: 13045.8946 Rate (MB/s)<br>
Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01
atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
atlas5-c01 atlas5-c01<br>
Triad: 13058.3283 Rate (MB/s)<br>
Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01
atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
atlas5-c01 atlas5-c01 atlas5-c01<br>
Triad: 13037.3334 Rate (MB/s)<br>
Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01
atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01
atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01<br>
Triad: 12526.6096 Rate (MB/s)<br>
------------------------------------------------<br>
np speedup<br>
1 1.0<br>
2 1.33<br>
3 1.17<br>
4 1.36<br>
5 1.21<br>
6 1.3<br>
7 1.2<br>
8 1.29<br>
9 1.18<br>
10 1.18<br>
11 1.18<br>
12 1.14<br>
Estimation of possible speedup of MPI programs based on Streams benchmark.<br>
It appears you have 1 node(s)<br>
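The speedup column above looks like each Triad rate divided by the single-process rate. A minimal Python sketch, assuming that definition (the names triad_rates and speedups are made up for illustration and are not part of PETSc), reproduces the table from the rates printed in this run:<br>

```python
# Triad rates (MB/s) copied from the streams output above,
# keyed by the number of MPI processes.
triad_rates = {
    1: 11026.7604, 2: 14669.6730, 3: 12848.2644, 4: 15033.7687,
    5: 13299.3830, 6: 14382.2116, 7: 13194.2573, 8: 14199.7255,
    9: 13045.8946, 10: 13058.3283, 11: 13037.3334, 12: 12526.6096,
}


def speedups(rates):
    """Return {np: rate(np) / rate(1)}, rounded to two decimals.

    Assumption: speedup is the ratio to the single-process Triad rate.
    """
    base = rates[1]
    return {n: round(rate / base, 2) for n, rate in sorted(rates.items())}


for n, s in speedups(triad_rates).items():
    print(n, s)
```

The ratio stays between roughly 1.1 and 1.4 however many processes are added, which is consistent with a benchmark limited by the memory bandwidth of a single node rather than by the number of cores.<br>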
See graph in the file src/benchmarks/streams/scaling.png</span></font></span> </div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley <span dir="ltr"><<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="">On Fri, May 5, 2017 at 10:18 AM, Pham Pham <span dir="ltr"><<a href="mailto:pvsang002@gmail.com" target="_blank">pvsang002@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><div>Hi Satish,<br><br></div>It runs now, and shows a bad speed up:<br></div>Please help to improve this.<br></div></div></blockquote><div><br></div></span><div><a href="http://www.mcs.anl.gov/petsc/documentation/faq.html#computers" target="_blank">http://www.mcs.anl.gov/petsc/<wbr>documentation/faq.html#<wbr>computers</a><br></div><div><br></div><div>The short answer is: You cannot improve this without buying a different machine. 
This is</div><div>a fundamental algorithmic limitation that cannot be helped by threads, or vectorization, or</div><div>anything else.</div><div><br></div><div>   Matt</div><div><div class="h5"><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div></div><div>Thank you.<br></div><br><div><div><div><div><img src="cid:ii_j2bzavzv0_15bd92ab3e055ee0" width="568" height="426"><br><br></div></div></div></div></div><div class="m_-2541224196895441124gmail-HOEnZb"><div class="m_-2541224196895441124gmail-h5"><div class="gmail_extra"><br><div class="gmail_quote">On Fri, May 5, 2017 at 10:02 PM, Satish Balay <span dir="ltr"><<a href="mailto:balay@mcs.anl.gov" target="_blank">balay@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">With Intel MPI, it's best to use mpiexec.hydra [and not mpiexec]<br>
<br>
So you can do:<br>
<br>
make PETSC_DIR=/home/svu/mpepvs/pet<wbr>sc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test<br>
<br>
<br>
[you can also specify --with-mpiexec=mpiexec.hydra at configure time]<br>
<br>
Satish<br>
<br>
<br>
On Fri, 5 May 2017, Pham Pham wrote:<br>
<br>
> *Hi,*<br>
> *I can configure now, but fail when testing:*<br>
<div><div class="m_-2541224196895441124gmail-m_6268152859678446353h5">><br>
> [mpepvs@atlas7-c10 petsc-3.7.5]$ make<br>
> PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt test<br>
> Running test examples to verify correct installation<br>
> Using PETSC_DIR=/home/svu/mpepvs/pet<wbr>sc/petsc-3.7.5 and<br>
> PETSC_ARCH=arch-linux-cxx-opt<br>
> Possible error running C/C++ src/snes/examples/tutorials/ex<wbr>19 with 1 MPI<br>
> process<br>
> See <a href="http://www.mcs.anl.gov/petsc/documentation/faq.html" rel="noreferrer" target="_blank">http://www.mcs.anl.gov/petsc/d<wbr>ocumentation/faq.html</a><br>
> mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs);<br>
> possible causes:<br>
> 1. no mpd is running on this host<br>
> 2. an mpd is running but was started without a "console" (-n option)<br>
> Possible error running C/C++ src/snes/examples/tutorials/ex<wbr>19 with 2 MPI<br>
> processes<br>
> See <a href="http://www.mcs.anl.gov/petsc/documentation/faq.html" rel="noreferrer" target="_blank">http://www.mcs.anl.gov/petsc/d<wbr>ocumentation/faq.html</a><br>
> mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs);<br>
> possible causes:<br>
> 1. no mpd is running on this host<br>
> 2. an mpd is running but was started without a "console" (-n option)<br>
> Possible error running Fortran example src/snes/examples/tutorials/ex<wbr>5f<br>
> with 1 MPI process<br>
> See <a href="http://www.mcs.anl.gov/petsc/documentation/faq.html" rel="noreferrer" target="_blank">http://www.mcs.anl.gov/petsc/d<wbr>ocumentation/faq.html</a><br>
> mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs);<br>
> possible causes:<br>
> 1. no mpd is running on this host<br>
> 2. an mpd is running but was started without a "console" (-n option)<br>
> Completed test examples<br>
> ==============================<wbr>===========<br>
> Now to evaluate the computer systems you plan use - do:<br>
> make PETSC_DIR=/home/svu/mpepvs/pet<wbr>sc/petsc-3.7.5<br>
> PETSC_ARCH=arch-linux-cxx-opt streams<br>
><br>
><br>
><br>
><br>
</div></div>> *Please help on this.*<br>
> *Many thanks!*<br>
<div class="m_-2541224196895441124gmail-m_6268152859678446353HOEnZb"><div class="m_-2541224196895441124gmail-m_6268152859678446353h5">><br>
><br>
> On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay <<a href="mailto:balay@mcs.anl.gov" target="_blank">balay@mcs.anl.gov</a>> wrote:<br>
><br>
> > Sorry - should have mentioned:<br>
> ><br>
> > do 'rm -rf arch-linux-cxx-opt' and rerun configure.<br>
> ><br>
> > The mpich install from previous build [that is currently in<br>
> > arch-linux-cxx-opt/]<br>
> > is conflicting with --with-mpi-dir=/app1/centos6.3<wbr>/gnu/mvapich2-1.9/<br>
> ><br>
> > Satish<br>
> ><br>
> ><br>
> > On Wed, 19 Apr 2017, Pham Pham wrote:<br>
> ><br>
> > > I reconfigured PETSc with the installed MPI; however, I got a serious error:<br>
> > ><br>
> > > **************************ERRO<wbr>R*****************************<wbr>********<br>
> > > Error during compile, check arch-linux-cxx-opt/lib/petsc/c<wbr>onf/make.log<br>
> > > Send it and arch-linux-cxx-opt/lib/petsc/c<wbr>onf/configure.log to<br>
> > > <a href="mailto:petsc-maint@mcs.anl.gov" target="_blank">petsc-maint@mcs.anl.gov</a><br>
> > > ******************************<wbr>******************************<wbr>********<br>
> > ><br>
> > > Could you please explain what is happening?<br>
> > ><br>
> > > Thank you very much.<br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay <<a href="mailto:balay@mcs.anl.gov" target="_blank">balay@mcs.anl.gov</a>><br>
> > wrote:<br>
> > ><br>
> > > > Presumably your cluster already has a recommended MPI to use [which is<br>
> > > > already installed]. So you should use that instead of<br>
> > > > --download-mpich=1<br>
> > > ><br>
> > > > Satish<br>
> > > ><br>
> > > > On Wed, 19 Apr 2017, Pham Pham wrote:<br>
> > > ><br>
> > > > > Hi,<br>
> > > > ><br>
> > > > > I just installed petsc-3.7.5 on my university cluster. When evaluating<br>
> > > > > the computer system, PETSc reports "It appears you have 1 node(s)". I do not<br>
> > > > > understand this, since the system is a multi-node system. Could you please<br>
> > > > > explain this to me?<br>
> > > > ><br>
> > > > > Thank you very much.<br>
> > > > ><br>
> > > > > S.<br>
> > > > ><br>
> > > > > Output:<br>
> > > > > ==============================<wbr>===========<br>
> > > > > Now to evaluate the computer systems you plan use - do:<br>
> > > > > make PETSC_DIR=/home/svu/mpepvs/pet<wbr>sc/petsc-3.7.5<br>
> > > > > PETSC_ARCH=arch-linux-cxx-opt streams<br>
> > > > > [mpepvs@atlas7-c10 petsc-3.7.5]$ make<br>
> > > > > PETSC_DIR=/home/svu/mpepvs/pet<wbr>sc/petsc-3.7.5<br>
> > > > PETSC_ARCH=arch-linux-cxx-opt<br>
> > > > > streams<br>
> > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory<br>
> > > > > PETSC_DIR=/home/svu/mpepvs/pet<wbr>sc/petsc-3.7.5<br>
> > > > PETSC_ARCH=arch-linux-cxx-opt<br>
> > > > > streams<br>
> > > > > /home/svu/mpepvs/petsc/petsc-3<wbr>.7.5/arch-linux-cxx-opt/bin/mp<wbr>icxx -o<br>
> > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing<br>
> > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O<br>
> > > > > -I/home/svu/mpepvs/petsc/petsc<wbr>-3.7.5/include<br>
> > > > > -I/home/svu/mpepvs/petsc/petsc<wbr>-3.7.5/arch-linux-cxx-opt/incl<wbr>ude<br>
> > > > > `pwd`/MPIVersion.c<br>
> > > > > Running streams with<br>
> > > > > '/home/svu/mpepvs/petsc/petsc-<wbr>3.7.5/arch-linux-cxx-opt/bin/m<wbr>piexec '<br>
> > > > using<br>
> > > > > 'NPMAX=12'<br>
> > > > > Number of MPI processes 1 Processor names atlas7-c10<br>
> > > > > Triad: 9137.5025 Rate (MB/s)<br>
> > > > > Number of MPI processes 2 Processor names atlas7-c10 atlas7-c10<br>
> > > > > Triad: 9707.2815 Rate (MB/s)<br>
> > > > > Number of MPI processes 3 Processor names atlas7-c10 atlas7-c10<br>
> > > > atlas7-c10<br>
> > > > > Triad: 13559.5275 Rate (MB/s)<br>
> > > > > Number of MPI processes 4 Processor names atlas7-c10 atlas7-c10<br>
> > > > atlas7-c10<br>
> > > > > atlas7-c10<br>
> > > > > Triad: 14193.0597 Rate (MB/s)<br>
> > > > > Number of MPI processes 5 Processor names atlas7-c10 atlas7-c10<br>
> > > > atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10<br>
> > > > > Triad: 14492.9234 Rate (MB/s)<br>
> > > > > Number of MPI processes 6 Processor names atlas7-c10 atlas7-c10<br>
> > > > atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10 atlas7-c10<br>
> > > > > Triad: 15476.5912 Rate (MB/s)<br>
> > > > > Number of MPI processes 7 Processor names atlas7-c10 atlas7-c10<br>
> > > > atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10<br>
> > > > > Triad: 15148.7388 Rate (MB/s)<br>
> > > > > Number of MPI processes 8 Processor names atlas7-c10 atlas7-c10<br>
> > > > atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10<br>
> > > > > Triad: 15799.1290 Rate (MB/s)<br>
> > > > > Number of MPI processes 9 Processor names atlas7-c10 atlas7-c10<br>
> > > > atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10<br>
> > > > > Triad: 15671.3104 Rate (MB/s)<br>
> > > > > Number of MPI processes 10 Processor names atlas7-c10 atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10<br>
> > > > > Triad: 15601.4754 Rate (MB/s)<br>
> > > > > Number of MPI processes 11 Processor names atlas7-c10 atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10 atlas7-c10<br>
> > > > > Triad: 15434.5790 Rate (MB/s)<br>
> > > > > Number of MPI processes 12 Processor names atlas7-c10 atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10<br>
> > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10<br>
> > > > > Triad: 15134.1263 Rate (MB/s)<br>
> > > > > ------------------------------<wbr>------------------<br>
> > > > > np speedup<br>
> > > > > 1 1.0<br>
> > > > > 2 1.06<br>
> > > > > 3 1.48<br>
> > > > > 4 1.55<br>
> > > > > 5 1.59<br>
> > > > > 6 1.69<br>
> > > > > 7 1.66<br>
> > > > > 8 1.73<br>
> > > > > 9 1.72<br>
> > > > > 10 1.71<br>
> > > > > 11 1.69<br>
> > > > > 12 1.66<br>
> > > > > Estimation of possible speedup of MPI programs based on Streams<br>
> > > > benchmark.<br>
> > > > > It appears you have 1 node(s)<br>
> > > > > Unable to plot speedup to a file<br>
> > > > > Unable to open matplotlib to plot speedup<br>
> > > > > [mpepvs@atlas7-c10 petsc-3.7.5]$<br>
> > > > > [mpepvs@atlas7-c10 petsc-3.7.5]$<br>
> > > > ><br>
> > > ><br>
> > > ><br>
> > ><br>
> ><br>
> ><br>
><br>
<br>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div></div></div><span class="HOEnZb"><font color="#888888"><br><br clear="all"><div><br></div>-- <br><div class="m_-2541224196895441124gmail_signature">What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div>
</font></span></div></div>
</blockquote></div><br></div>
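On the "It appears you have 1 node(s)" message discussed in this thread: every "Processor names" line in the streams output lists the same host, so all MPI processes ran on a single node. A minimal Python sketch of that check, assuming the node count is simply the number of distinct host names on such a line (count_nodes is a hypothetical helper written for illustration, not PETSc's own script):<br>

```python
def count_nodes(line):
    """Count distinct host names listed after 'Processor names'.

    Assumption: one host name is printed per MPI process, separated
    by whitespace, as in the streams output quoted in this thread.
    """
    marker = "Processor names"
    _, _, names = line.partition(marker)
    return len(set(names.split()))


# One line taken from the streams output in this thread.
line = ("Number of MPI processes 4 Processor names "
        "atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01")
print(count_nodes(line))
```

Seeing more than one distinct name would require launching the processes across several nodes, typically through the cluster's job scheduler or a host file; the details depend on how the site is set up.<br>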