On Thu, Oct 4, 2012 at 5:38 PM, TAY wee-beng <span dir="ltr"><<a href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>On 4/10/2012 9:21 PM, Jed Brown wrote:<br>
</div>
<blockquote type="cite">Can you send a picture of what your domain looks like
and what shape the part owned by a given processor looks like?
Best would be to write out the mesh with a variable marking the
rank owning each vertex, then do a color plot in Paraview or
whatever you use to show the partition.
<div>
<br>
</div>
<div>VecScatterBegin/End is taking much more time than these, and
really a pretty unreasonable amount of time in general.<br>
</div>
</blockquote>
<br>
Hi,<br>
<br>
    I have attached my grid. I just used a simple paint program to color
    one particular partition. They are Cartesian grids. The center
    portion, where the wings are (in blue), has much more closely spaced
    grid lines because of the importance of the boundary layer. Hence,
    although the partitions there look thinner, the number of cells in
    each partition is roughly the same.<br>
<br>
As mentioned earlier, the grid is partitioned in the Z direction.
Hence, the variables are allocated as
u(1:size_x,1:size_y,ksta:kend), where ksta,kend refer to the
starting/ending indices in the z direction. Same for v,w etc. I hope
    it is clear enough now.<br></div></blockquote><div><br></div><div>This is way too many emails on this list. As I said before, the Mom-Z solve is bad because the assembly of the</div><div>operator is screwed up. You are communicating too many values. So, just go into your code and count how many</div>
<div>off process entries you set.</div><div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">
<blockquote type="cite">
<div>
<div class="gmail_quote">On Thu, Oct 4, 2012 at 2:16 PM,
Wee-Beng Tay <span dir="ltr"><<a href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>
<div>On 4/10/2012 5:11 PM, Matthew Knepley wrote:<br>
</div>
<blockquote type="cite">On Thu, Oct 4, 2012 at 11:01 AM,
TAY wee-beng <span dir="ltr"><<a href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span>
wrote:<br>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>On 4/10/2012 3:40 AM, Matthew Knepley
wrote:<br>
</div>
<blockquote type="cite">On Wed, Oct 3, 2012 at
4:05 PM, TAY wee-beng <span dir="ltr"><<a href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span>
wrote:<br>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>Hi Jed,<br>
<br>
                                I believe they are real cores. Anyway,
                                I have attached the log summary for
                                the 12/24/48 cores. I reran a smaller
                                case because the large problem can't
                                run on 12 cores.<br>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>Okay, look at VecScatterBegin/End for
24 and 48 cores (I am guessing you have 4
16-core chips, but please figure this
out).</div>
<div>The messages are logged in
ScatterBegin, and the time is logged in
ScatterEnd. From 24 to 48 cores the time
is cut in half.</div>
<div>If you were only communicating the
boundary, this is completely backwards, so
you are communicating a fair fraction of
ALL</div>
<div>the values in a subdomain. Figure out
why your partition is so screwed up and
this will go away.</div>
</div>
</blockquote>
<br>
What do you mean by "If you were only
communicating the boundary, this is completely
backwards, so you are communicating a fair
fraction of ALL the values in a subdomain"?<br>
</div>
</blockquote>
<div><br>
</div>
<div>If you have 48 partitions instead of 24, you
have a larger interface, so AssemblyEnd() should
take</div>
<div>slightly longer. However, your AssemblyEnd()
                      takes HALF the time, which means it's communicating</div>
                    <div>far fewer values, which means you are not
sending interface values, you are sending interior
values,</div>
<div>since the interior shrinks when you have more
partitions.</div>
<div><br>
</div>
<div>What this probably means is that your assembly
routines are screwed up, and sending data all over
the place.</div>
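                    <div><br>
                    </div>
                    <div>To put rough numbers on the argument above, here is a
                      small sketch. It assumes a structured grid cut into equal
                      slabs along z. The grid size and the helper name
                      slab_counts are made up for illustration and are not taken
                      from the logs in this thread.</div>
<pre>
```python
# Hypothetical grid size; these numbers are illustrative, not from the thread.
NX, NY, NZ = 100, 100, 480


def slab_counts(nranks, nx=NX, ny=NY, nz=NZ):
    """Return (owned, interface) point counts for one middle z-slab.

    Assumes nz is divisible by nranks and that each rank owns
    nz // nranks consecutive z-planes, so a middle rank touches
    its two z-neighbours through one plane each.
    """
    planes = nz // nranks
    owned = nx * ny * planes      # every point stored on this rank
    interface = 2 * nx * ny       # one plane shared with each z-neighbour
    return owned, interface


# Compare how the two counts change with the number of ranks.
for nranks in (12, 24, 48):
    owned, interface = slab_counts(nranks)
    print(nranks, "ranks: owned =", owned, "interface =", interface)
```
</pre>
                    <div>Under these assumptions the per-rank interface is the
                      same for 24 and 48 ranks while the owned volume halves, so
                      a communication cost that halves is tracking the volume,
                      not the interface.</div>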
<div><br>
</div>
</div>
</blockquote>
</div>
Ok I got it now. Looking at the AssemblyEnd time,<br>
<br>
<pre>12 procs
MatAssemblyEnd 145 1.0 1.6342e+01 1.8 0.00e+00 0.0 4.4e+01 6.0e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 388 1.0 1.4472e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0

24 procs
MatAssemblyEnd 145 1.0 1.1618e+01 2.4 0.00e+00 0.0 9.2e+01 6.0e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 388 1.0 2.3527e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0

48 procs
MatAssemblyEnd 145 1.0 7.4327e+00 2.4 0.00e+00 0.0 1.9e+02 6.0e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 388 1.0 2.8818e-03 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0</pre>
<br>
VecAssemblyEnd time increases with procs, does it mean
that there is nothing wrong with it?<br>
<br>
On the other hand, MatAssemblyEnd time decreases with
procs. So that's where the problem lies, is that so? <br>
<br>
I'm still scanning my code but haven't found the error
yet. It seems strange because I inserted the matrix and
vector exactly the same way for x,y,z. The u,v,w are also
allocated with the same indices. Shouldn't the error be
the same for x, y and z too?<br>
<br>
Trying to get some hints as to where else I need to look
in my code...<br>
<br>
Tks
<div>
<div><br>
<br>
<br>
<br>
<br>
<br>
<br>
<blockquote type="cite">
<div class="gmail_quote">
<div> Matt</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> I
partition my domain in the z direction, as
shown in the attached pic. The circled region
is where the airfoils are. I'm using an
immersed boundary method (IBM) code so the
grid is all Cartesian.<br>
<br>
I created my Z matrix using:<br>
<br>
call
MatCreateAIJ(MPI_COMM_WORLD,ijk_end-ijk_sta,ijk_end-ijk_sta,PETSC_DECIDE,PETSC_DECIDE,7,PETSC_NULL_INTEGER,7,PETSC_NULL_INTEGER,A_semi_z,ierr)<br>
<br>
where ijk_sta / ijk_end are the
starting/ending global indices of the row.<br>
<br>
7 is because the star-stencil is used in 3D.<br>
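                        <br>
                        <div>As a rough illustration of which of those 7
                          entries can refer to another rank when the grid is
                          cut along z, here is a small sketch. The x-fastest
                          numbering i + nx*(j + ny*k) is an assumption (the
                          ordering used in the actual code is not stated
                          here), and all function names and sizes are
                          hypothetical.</div>
<pre>
```python
def global_index(i, j, k, nx, ny):
    """0-based global row of grid point (i, j, k), assuming x-fastest ordering."""
    return i + nx * (j + ny * k)


def stencil_columns(i, j, k, nx, ny, nz):
    """Column indices of the 7-point star stencil at (i, j, k).

    Neighbours that fall outside the grid are skipped.
    """
    offsets = [(0, 0, 0), (-1, 0, 0), (1, 0, 0),
               (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]
    cols = []
    for di, dj, dk in offsets:
        ii, jj, kk = i + di, j + dj, k + dk
        if ii in range(nx) and jj in range(ny) and kk in range(nz):
            cols.append(global_index(ii, jj, kk, nx, ny))
    return cols


def off_rank_columns(cols, row_start, row_end):
    """Columns outside the half-open ownership range [row_start, row_end)."""
    owned = range(row_start, row_end)
    return [c for c in cols if c not in owned]


# Hypothetical 4 x 4 x 8 grid on 2 ranks: rank 0 owns k = 0..3, rows 0..63.
top_face = stencil_columns(1, 1, 3, 4, 4, 8)
print(off_rank_columns(top_face, 0, 64))
```
</pre>
                        <div>With this numbering only the k-1 and k+1
                          neighbours on the two faces of a slab point to
                          another rank. Those are off-rank columns in locally
                          owned rows; values set in rows owned by another rank
                          are a different matter, since those are what gets
                          sent during assembly.</div>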
<br>
I create my RHS vector using:<br>
<br>
<i>call
VecCreateMPI(MPI_COMM_WORLD,ijk_end-ijk_sta,PETSC_DECIDE,b_rhs_semi_z,ierr)</i><br>
<br>
<div>The values for the matrix and vector were
calculated before PETSc logging so they
don't come into play.<br>
<br>
                          They are also done in a similar fashion for
                          matrix x and y. I still don't understand why
                          solving the z momentum eqn takes so much
                          time. Which portion should I focus on?<br>
<br>
Tks!<br>
<br>
</div>
<blockquote type="cite">
<div class="gmail_quote">
<div><br>
</div>
<div> Matt</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>
<pre cols="72">Yours sincerely,
TAY wee-beng</pre>
<div>
<div> On 3/10/2012 5:59 PM, Jed
Brown wrote:<br>
</div>
</div>
</div>
<div>
<div>
<blockquote type="cite">There is
an inordinate amount of time
being spent in VecScatterEnd().
That sometimes indicates a very
bad partition. Also, are your
"48 cores" real physical cores
or just "logical cores" (look
like cores to the operating
system, usually advertised as
"threads" by the vendor, nothing
like cores in reality)? That can
cause a huge load imbalance and
very confusing results as
over-subscribed threads compete
for shared resources. Step it
back to 24 threads and 12
threads, send log_summary for
each.<br>
<br>
<div class="gmail_quote">On Wed,
Oct 3, 2012 at 8:08 AM, TAY
wee-beng <span dir="ltr"><<a href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>
<div>On 2/10/2012 2:43
PM, Jed Brown wrote:<br>
</div>
<blockquote type="cite">On
Tue, Oct 2, 2012 at
8:35 AM, TAY wee-beng
<span dir="ltr"><<a href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span>
wrote:<br>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>Hi,<br>
<br>
I have
combined the
momentum
linear eqns
involving
x,y,z into 1
large matrix.
The Poisson
eqn is solved
using HYPRE
                                                struct format
                                                so it's not
                                                included. I
                                                ran the code
                                                for 50
                                                timesteps
                                                (hence 50
                                                KSPSolve calls)
using 96
procs. The
log_summary is
given below. I
have some
questions:<br>
<br>
1. After
combining the
matrix, I
should have
only 1 PETSc
matrix. Why
                                                does it say
                                                there are 4
                                                matrices, 12
                                                vectors, etc.? <br>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>They are part
of
preconditioning.
Are you sure
you're using Hypre
for this? It looks
like you are using
bjacobi/ilu.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div> <br>
2. I'm looking
at the stages
which take the
longest time.
It seems that
MatAssemblyBegin,
VecNorm,
VecAssemblyBegin,
VecScatterEnd
have very high
ratios. The
ratios of some
others are
also not too
good (~ 1.6 -
2). So are
these stages
the reason why
my code is not
scaling well?
What can I do
to improve it?<br>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>3/4 of the
solve time is
evenly balanced
between MatMult,
MatSolve,
MatLUFactorNumeric,
and
VecNorm+VecDot.</div>
<div><br>
</div>
<div>The high
VecAssembly time
might be due to
generating a lot
of entries
off-process?</div>
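                                          <div><br>
                                          </div>
                                          <div>One way to check this is to
                                            count, on each rank, how many of
                                            the row indices being set fall
                                            outside that rank's ownership
                                            range. The sketch below shows only
                                            the counting logic; the function
                                            name and the sample range are
                                            hypothetical, and in the real code
                                            the range would come from
                                            MatGetOwnershipRange.</div>
<pre>
```python
def count_off_process_rows(rows, row_start, row_end):
    """Count 0-based row indices outside the half-open range [row_start, row_end)."""
    owned = range(row_start, row_end)
    return sum(1 for r in rows if r not in owned)


# Hypothetical ownership range for one rank.
sta, end = 100, 200

# 1-based loop index shifted to 0-based rows: every row is owned locally.
shifted = [ijk - 1 for ijk in range(sta + 1, end + 1)]
print(count_off_process_rows(shifted, sta, end))

# Same loop without the shift: the last row belongs to the next rank.
unshifted = list(range(sta + 1, end + 1))
print(count_off_process_rows(unshifted, sta, end))
```
</pre>
                                          <div>Any nonzero count means some
                                            values are set in rows owned by
                                            another rank, and those have to be
                                            sent there during assembly.</div>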
<div><br>
</div>
<div>In any case,
this looks like an
_extremely_ slow
network, perhaps
it's
misconfigured?</div>
</div>
</blockquote>
<br>
</div>
                                    My cluster is configured
                                    with 48 procs per node. I
                                    reran the case using
                                    only 48 procs, so
                                    there's no need to pass
                                    over a 'slow'
                                    interconnect. I'm now also
                                    using GAMG and BCGS for
                                    the Poisson and momentum
                                    eqns respectively. I have
                                    also split the x,y,z
                                    components of the momentum
                                    eqn into 3 separate linear
                                    eqns to debug the problem.
<br>
<br>
Results show that stage
"momentum_z" is taking a
lot of time. I wonder if
it has to do with the fact
that I am partitioning my
grids in the z direction.
                                    VecScatterEnd and MatMult are
                                    taking a lot of time.
                                    The ratios for VecNormalize,
                                    VecScatterEnd, VecNorm and
                                    VecAssemblyBegin
                                    are also not good.<br>
<br>
I wonder why a lot of
entries are generated
off-process.<br>
<br>
I create my RHS vector
using:<br>
<br>
<i>call
VecCreateMPI(MPI_COMM_WORLD,ijk_xyz_end-ijk_xyz_sta,PETSC_DECIDE,b_rhs_semi_z,ierr)</i><br>
<br>
where ijk_xyz_sta and
ijk_xyz_end are obtained
from<br>
<br>
<i>call
MatGetOwnershipRange(A_semi_z,ijk_xyz_sta,ijk_xyz_end,ierr)</i><br>
<br>
I then insert the values
into the vector using:<br>
<br>
<i>call
VecSetValues(b_rhs_semi_z
, ijk_xyz_end -
ijk_xyz_sta ,
(/ijk_xyz_sta :
ijk_xyz_end - 1/) ,
q_semi_vect_z(ijk_xyz_sta
+ 1 : ijk_xyz_end) ,
INSERT_VALUES , ierr)</i><br>
<br>
What should I do to
correct the problem?<br>
<br>
Thanks
<div>
<div><br>
<br>
<blockquote type="cite">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div> <br>
Btw, I insert
matrix using:<br>
<br>
<i>do
ijk=ijk_xyz_sta+1,ijk_xyz_end</i><i><br>
</i><i><br>
</i><i> II
= ijk - 1</i><i>
!Fortran shift
to 0-based</i><i><br>
</i><i> </i><i><br>
</i><i>
call
MatSetValues(A_semi_xyz,1,II,7,int_semi_xyz(ijk,1:7),semi_mat_xyz(ijk,1:7),INSERT_VALUES,ierr)</i><i><br>
</i><i><br>
</i><i>end do</i><br>
<br>
where
ijk_xyz_sta/ijk_xyz_end
are the
                                                starting/ending
                                                indices<br>
<br>
int_semi_xyz(ijk,1:7)
stores the 7
column global
indices<br>
<br>
semi_mat_xyz
has the
corresponding
values.<br>
<br>
and I insert
vectors using:<br>
<br>
call
VecSetValues(b_rhs_semi_xyz,ijk_xyz_end_mz-ijk_xyz_sta_mz,(/ijk_xyz_sta_mz:ijk_xyz_end_mz-1/),q_semi_vect_xyz(ijk_xyz_sta_mz+1:ijk_xyz_end_mz),INSERT_VALUES,ierr)<br>
<br>
Thanks!<br>
<br>
<i><br>
</i><br>
<pre cols="72">Yours sincerely,
TAY wee-beng</pre>
<div>
<div> On
30/9/2012
11:30 PM, Jed
Brown wrote:<br>
</div>
</div>
</div>
<div>
<div>
<blockquote type="cite">
<p>You can
measure the
time spent in
Hypre via
PCApply and
PCSetUp, but
you can't get
finer grained
integrated
profiling
because it was
not set up
that way.</p>
<div class="gmail_quote">On
Sep 30, 2012
3:26 PM, "TAY
wee-beng" <<a href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>>
wrote:<br type="attribution">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>On
27/9/2012 1:44
PM, Matthew
Knepley wrote:<br>
</div>
<blockquote type="cite">On
Thu, Sep 27,
2012 at 3:49
AM, TAY
wee-beng <span dir="ltr"><<a href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span>
wrote:<br>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
<br>
I'm doing a
log summary
for my 3d cfd
code. I have
some
questions:<br>
<br>
1. if I'm
solving 3
linear
equations
using ksp, is
the result
given in the
log summary
the total of
the 3 linear
eqns'
performance?
How can I get
the
performance
for each
individual
eqn?<br>
</blockquote>
<div><br>
</div>
<div>Use
logging
stages: <a href="http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogStagePush.html" target="_blank">http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogStagePush.html</a></div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
2. If I run my
code for 10
time steps,
does the log
                                                summary give
the total or
avg
performance/ratio?<br>
</blockquote>
<div><br>
</div>
<div>Total.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
3. Besides
PETSc, I'm
also using
HYPRE's native
geometric MG
(Struct) to
solve my
                                                Cartesian
                                                grid CFD
                                                Poisson eqn.
Is there any
way I can use
PETSc's log
summary to get
HYPRE's
performance?
If I use
boomerAMG thru
PETSc, can I
get its
performance?</blockquote>
<div><br>
</div>
<div>If you
mean flops,
only if you
count them
yourself and
tell PETSc
using <a href="http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogFlops.html" target="_blank">http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogFlops.html</a></div>
<div><br>
</div>
<div>This is
the
disadvantage
of using
packages that
do not
properly
monitor things
:)</div>
<div><br>
</div>
<div> Matt</div>
<div> </div>
</div>
</blockquote>
                                                So you mean if I
                                                use boomerAMG
                                                thru PETSc,
                                                there is no
                                                proper way of
                                                evaluating its
                                                performance,
                                                besides using
                                                PetscLogFlops?<br>
<blockquote type="cite">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span><font color="#888888"><br>
-- <br>
Yours
sincerely,<br>
<br>
TAY wee-beng<br>
<br>
</font></span></blockquote>
</div>
<br>
<br clear="all">
<span><font color="#888888">
<div><br>
</div>
-- <br>
What most
experimenters
take for
granted before
they begin
their
experiments is
infinitely
more
interesting
than any
results to
which their
experiments
lead.<br>
-- Norbert
Wiener<br>
</font></span></blockquote>
                                                </div>
                                                </blockquote>
                                                </div>
                                                </blockquote>
                                                </div>
                                            </div>
                                          </div>
                                        </blockquote>
                                      </div>
                                    </blockquote>
                                  </div>
                                </div>
                              </div>
                            </blockquote>
                          </div>
                        </blockquote>
                      </div>
                    </div>
                  </div>
                </blockquote>
              </div>
              </blockquote>
            </div>
          </blockquote>
        </div>
      </blockquote>
    </div>
    </div>
    </div>
  </blockquote>
  </div>
  </div>
</blockquote>
<br>
</div>
            </blockquote></div><br><br clear="all"><div><br></div>