<div dir="ltr">I tried periodic boundary conditions but found load-imbalance still existed. So the boundary may not be a big issue. I am debugging the code to see why. Thanks. </div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">--Junchao Zhang</div></div></div>
<br><div class="gmail_quote">On Fri, Jun 15, 2018 at 2:26 AM, Michael Becker <span dir="ltr"><<a href="mailto:michael.becker@physik.uni-giessen.de" target="_blank">michael.becker@physik.uni-giessen.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>Hello,<br>
</p>
<p>thanks again for your efforts. <br>
</p>
<p><span class="">
> Boundary processors have fewer nonzeros than interior processors.

Doesn't that mean that boundary processors have less work to do than the others? Or does this affect the size of the coarse grids?
I know that defining the outermost nodes as boundary is not the most efficient way in this particular case (using Dirichlet boundary conditions on a smaller grid would do the same), but I need the solver to be able to handle arbitrarily shaped boundaries inside the domain, e.g. to calculate the potential inside a spherical capacitor (constant potential on the boundaries, charge distribution inside). Is there a better way to do that?
Michael
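For reference, one standard PETSc way to handle an arbitrarily shaped boundary inside the domain is to collect the global row indices of the boundary nodes and hand them to MatZeroRows, which turns those rows into identity rows enforcing the prescribed potential. A minimal sketch follows; the names bnd_rows/bnd_vals and the helper function are hypothetical, not from Michael's code.

/* Sketch: impose Dirichlet values on an arbitrary set of boundary nodes
   (e.g. the surface of a spherical capacitor). Assumes each rank has
   collected the global row indices of its boundary nodes in bnd_rows
   and the prescribed potentials in bnd_vals (hypothetical names). */
#include <petscmat.h>

PetscErrorCode ApplyDirichletRows(Mat A, Vec b, Vec x, PetscInt nbnd,
                                  const PetscInt bnd_rows[],
                                  const PetscScalar bnd_vals[])
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* Put the prescribed values into the solution vector at the boundary rows. */
  ierr = VecSetValues(x, nbnd, bnd_rows, bnd_vals, INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(x);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(x);CHKERRQ(ierr);
  /* Zero the boundary rows, put 1.0 on the diagonal, and let PETSc adjust
     b so that each boundary row enforces x_i = bnd_vals[i]. Collective. */
  ierr = MatZeroRows(A, nbnd, bnd_rows, 1.0, x, b);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Note that the boundary rows stay in the matrix as identity rows with few off-diagonal entries, so ranks owning many boundary nodes still end up with fewer nonzeros than interior ranks, which matches the imbalance discussed in this thread.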
<div class="m_576271657985320242moz-cite-prefix">Am 12.06.2018 um 22:07 schrieb Junchao
Zhang:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hello, Michael,
<div> Sorry for the delay. I am actively doing experiments with
your example code. I tested it on a cluster with 36
cores/node. To distribute MPI ranks evenly among nodes, I used
216 and 1728 ranks instead of 125, 1000. So far I have these
findings:<br>
</div>
1) It is not a strict weak-scaling test, since with 1728 ranks it needs more KSP iterations and hence more calls to MatSOR and related functions.

2) If I use half the cores per node but double the number of nodes (keeping the total MPI rank count the same), performance is 60-70% better. This implies that memory bandwidth plays an important role in performance.

3) I see that you define the outermost two layers of grid nodes as boundary. Boundary processors have fewer nonzeros than interior processors, which is a source of load imbalance, and it gets worse on coarser grids. But I still need to confirm that this is what caused the poor scaling and the large VecScatter delays in the experiment; a per-rank check like the sketch below would quantify it.
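For reference, a minimal sketch of how the per-rank nonzero imbalance and the iteration growth could be measured, assuming access to the assembled matrix A and the solver ksp; the helper function is hypothetical and not part of the example code.

/* Sketch: report the spread of local nonzeros across ranks and the
   KSP iteration count of the last solve. Hypothetical helper. */
#include <petscksp.h>

PetscErrorCode ReportImbalance(Mat A, KSP ksp)
{
  PetscErrorCode ierr;
  MatInfo        info;
  double         nzlocal, nzmin, nzmax;
  PetscInt       its;
  MPI_Comm       comm;

  PetscFunctionBeginUser;
  ierr = PetscObjectGetComm((PetscObject)A, &comm);CHKERRQ(ierr);
  /* Number of stored nonzeros owned by this rank */
  ierr = MatGetInfo(A, MAT_LOCAL, &info);CHKERRQ(ierr);
  nzlocal = (double)info.nz_used;
  ierr = MPI_Allreduce(&nzlocal, &nzmin, 1, MPI_DOUBLE, MPI_MIN, comm);CHKERRQ(ierr);
  ierr = MPI_Allreduce(&nzlocal, &nzmax, 1, MPI_DOUBLE, MPI_MAX, comm);CHKERRQ(ierr);
  /* Iteration count of the last solve, for the weak-scaling comparison */
  ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
  ierr = PetscPrintf(comm, "nonzeros per rank: min %g max %g (ratio %g); KSP iterations %D\n",
                     nzmin, nzmax, nzmax/nzmin, its);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Calling something like this after KSPSolve at each problem size would show whether the max/min nonzero ratio grows along with the VecScatter delays.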
Thanks.
--Junchao Zhang
<div class="gmail_quote">On Tue, Jun 12, 2018 at 12:42 AM,
Michael Becker <span dir="ltr"><<a href="mailto:michael.becker@physik.uni-giessen.de" target="_blank">michael.becker@physik.uni-<wbr>giessen.de</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
Hello,

Any new insights yet?

Michael
<div class="m_576271657985320242m_4694688645138750811moz-cite-prefix">Am
04.06.2018 um 21:56 schrieb Junchao Zhang:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Miachael, I can compile and run you
test. I am now profiling it. Thanks.</div>
<div class="gmail_extra"><br clear="all">
<div>
<div class="m_576271657985320242m_4694688645138750811gmail_signature" data-smartmail="gmail_signature">
<div dir="ltr">--Junchao Zhang</div>
</div>
</div>
<br>
</div>
</blockquote>