Satish,

I have reconfigured PETSc with --download-mpich=1 and --with-device=ch3:sock. With this build the speedup keeps increasing as the number of cores grows from 1 to 16, but the maximum speedup is still only around 6.0 on 16 cores. The new log files can be found in the attachment.
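For reference, the configure line was essentially the following (other options, such as compilers and optimization flags, are elided here):

  ./configure --download-mpich=1 --with-device=ch3:sock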
(1)

I checked the configuration of the first server again. It is a shared-memory machine with:

Processors: 4 CPUs x 4 cores/CPU, each core at 2500 MHz
Memory: 16 x 2 GB DDR2 333 MHz, dual channel, 64-bit data width, so the bandwidth of a dual-channel pair of modules is 64/8 x 166 x 2 x 2 ≈ 5.3 GB/s.

It seems that each core can then get about 2.7 GB/s of memory bandwidth, which should fulfill the basic requirement of sparse iterative solvers.

Is this correct? Or does a shared-memory machine give no benefit for PETSc once memory bandwidth becomes the limit?
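To sanity-check these figures, I intend to measure the sustainable bandwidth directly with a simple STREAM-style triad loop (a rough sketch of my own below, not the official STREAM benchmark; the array size is chosen arbitrarily to exceed the caches):

  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>

  #define N 20000000  /* 3 arrays x 160 MB = 480 MB, far beyond any cache */

  int main(void)
  {
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    size_t  i;

    if (!a || !b || !c) return 1;
    for (i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

    clock_t t0 = clock();
    for (i = 0; i < N; i++) a[i] = b[i] + 3.0 * c[i];  /* triad: 2 reads, 1 write */
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* 3 doubles (24 bytes) cross the memory bus per iteration */
    printf("triad: %.2f GB/s (check %g)\n",
           3.0 * N * sizeof(double) / secs / 1e9, a[N / 2]);

    free(a); free(b); free(c);
    return 0;
  }

If one core alone already sustains a large fraction of the total bandwidth, adding cores cannot help the sparse kernels much.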
(2)

Besides, we would like to continue this work by employing a matrix partitioning / reordering algorithm, such as METIS or ParMETIS, to improve the speedup of the program. (The current program works without any matrix decomposition.)
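Concretely, we had something like the following in mind (only a sketch against PETSc's MatPartitioning interface, assuming a recent PETSc; A stands for our assembled parallel AIJ matrix and error checking is omitted):

  #include <petscmat.h>

  MatPartitioning part;
  IS              is;

  MatPartitioningCreate(PETSC_COMM_WORLD, &part);
  MatPartitioningSetAdjacency(part, A);                  /* graph = nonzero pattern of A */
  MatPartitioningSetType(part, MATPARTITIONINGPARMETIS); /* needs --download-parmetis */
  MatPartitioningApply(part, &is);                       /* is[i] = destination rank of row i */
  /* ... redistribute the rows according to 'is' before solving ... */
  MatPartitioningDestroy(&part);
  ISDestroy(&is);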
Matt, as you said in http://lists.mcs.anl.gov/pipermail/petsc-users/2007-January/001017.html, "Reordering a matrix can result in fewer iterations for an iterative solver".

Do you think matrix partitioning/reordering will work for this program? Or do you have any further suggestions?
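For the reordering part we were thinking along these lines (again only a sketch; MATORDERINGRCM is just one candidate, and as far as I understand MatGetOrdering applies to a sequential matrix):

  IS  rperm, cperm;
  Mat B;

  MatGetOrdering(A, MATORDERINGRCM, &rperm, &cperm);  /* reverse Cuthill-McKee */
  MatPermute(A, rperm, cperm, &B);                    /* B = reordered copy of A */
  /* ... solve with B, permuting the RHS and solution consistently ... */
  ISDestroy(&rperm);
  ISDestroy(&cperm);
  MatDestroy(&B);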
Any comments are very welcome! Thank you!
On Mon, Dec 20, 2010 at 11:04 PM, Satish Balay <balay@mcs.anl.gov> wrote:

> On Mon, 20 Dec 2010, Yongjun Chen wrote:
>
> > Matt, Barry, thanks a lot for your reply! I will try mpich hydra first and
> > see what I can get.
>
> hydra is just the process manager.
>
> Also --download-mpich uses a slightly older version - with
> device=ch3:sock for portability and valgrind reasons [development]
>
> You might want to install the latest mpich manually with the default
> device=ch3:nemesis and recheck..
>
> satish