<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Sep 15, 2014 at 1:42 PM, Katy Ghantous <span dir="ltr"><<a href="mailto:katyghantous@gmail.com" target="_blank">katyghantous@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Matt, thanks! I will look into that and find other ways to make the computation faster.<br><br>Barry, the benchmark reports a speedup of up to 2, but says 1 node in the end. Either way, I was expecting a higher speedup. Is 2 the limit for two CPUs despite the multiple cores?</div></div></blockquote><div><br></div><div>This is a bit of a scam on the part of chip makers. The speed of your computations, say VecAXPY and simple stencil operations,</div><div>is not determined by the flop rate, but by the memory bandwidth. They sell you a computer with a great flop rate, but not much</div><div>bandwidth at all. This is much like a car dealer who sells you a car with an incredible amount of torque, just loads of torque, enough</div><div>torque to tear down a building, but that is not going to make you go faster.</div><div><br></div><div>   Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Please let me know if the attached file is what you are asking for. <br>Thank you!<br><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Sep 15, 2014 at 8:23 PM, Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
Please send the output from running<br>
<br>
make streams NPMAX=32<br>
<br>
in the PETSc root directory.<br>
<br>
<br>
Barry<br>
<br>
My guess is that it reports “one node” simply because it uses the hostname to distinguish nodes. Though your machine has two CPUs, from the point of view of the OS it has only a single hostname, and hence it reports just one “node”.<br>
<div><div><br>
<br>
On Sep 15, 2014, at 12:45 PM, Katy Ghantous <<a href="mailto:katyghantous@gmail.com" target="_blank">katyghantous@gmail.com</a>> wrote:<br>
<br>
> Hi,<br>
> I am using DMDA to run TS in parallel to solve a set of N equations. I am using DMDAGetCorners in the RHSfunction, with the stencil size set to 2, to solve a set of coupled ODEs on 30 cores.<br>
> The machine has 32 cores (2 physical CPUs with 2x8 cores each, at 3.4 GHz per core).<br>
> However, mpiexec with more than one core is showing no speedup.<br>
> Also, at the configuring/testing stage for PETSc on that machine, there was no speedup and it only reported one node.<br>
> Is there something wrong with how I configured PETSc, or is the approach inappropriate for the machine?<br>
> I am not sure what files (or sections of the code) you would need to be able to answer my question.<br>
><br>
> Thank you!<br>
<br>
</div></div></blockquote></div><br></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener
</div></div>
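<div><br></div>
<div>The memory bandwidth argument above can be put into rough numbers. The sketch below is only a back-of-the-envelope model, not PETSc code and not a measurement: it assumes VecAXPY on doubles moves 24 bytes (load x, load y, store y) for every 2 flops, and that bandwidth scales linearly with the number of processes until the machine total is saturated. The function names and the figures of 10 GB/s per core and 20 GB/s for the whole machine are hypothetical, chosen only so that the model saturates at a speedup of 2 like the benchmark reported in this thread.</div>

```python
# Back-of-the-envelope model of a bandwidth-bound kernel such as VecAXPY.
# All bandwidth figures below are hypothetical assumptions, not measurements.

BYTES_PER_DOUBLE = 8


def axpy_bytes_per_flop():
    """Bytes moved per flop for y[i] = a*x[i] + y[i] on doubles."""
    flops = 2                             # one multiply, one add
    bytes_moved = 3 * BYTES_PER_DOUBLE    # load x[i], load y[i], store y[i]
    return bytes_moved / flops


def bandwidth_bound_gflops(bandwidth_gb_s):
    """Upper bound on the AXPY flop rate when memory traffic is the limit."""
    return bandwidth_gb_s / axpy_bytes_per_flop()


def max_speedup(nprocs, core_bw_gb_s, machine_bw_gb_s):
    """Speedup over one process if bandwidth is the only limit.

    Assumes each process can draw core_bw_gb_s until the machine-wide
    total machine_bw_gb_s is saturated (a simplification).
    """
    achieved = min(nprocs * core_bw_gb_s, machine_bw_gb_s)
    return achieved / core_bw_gb_s


if __name__ == "__main__":
    core_bw, machine_bw = 10.0, 20.0      # hypothetical GB/s values
    for p in (1, 2, 4, 8, 16, 32):
        bw = min(p * core_bw, machine_bw)
        print(p, max_speedup(p, core_bw, machine_bw), bandwidth_bound_gflops(bw))
```

<div>In this model, adding processes beyond the point where the memory system is saturated does not raise the speedup, no matter how many cores or how high the peak flop rate. The sustainable bandwidth of a real machine has to be measured, for example with the streams benchmark mentioned earlier in the thread.</div>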