On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun <span dir="ltr"><<a href="mailto:Harun.BAYRAKTAR@3ds.com">Harun.BAYRAKTAR@3ds.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div link="blue" vlink="purple" lang="EN-US">
<div>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);">Matt,</span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);">It is from the pressure Poisson equation for incompressible
Navier-Stokes, so it is elliptic. Also, on 1 cpu I am able to solve it with a
reasonable iteration count (i.e., 43 to reach a 1.e-5 relative tolerance on the true residual norm). It
is the parallel runs that really concern me.</span></p></div></div></blockquote><div>Actually, it was the 43 that really concerned me. In my experience, an MG that is doing what it is supposed<br>to on Poisson takes < 10 iterations. However, if your grid is pretty distorted, maybe it can get this bad.<br>
<br> Matt<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div link="blue" vlink="purple" lang="EN-US"><div><p><span style="font-size: 11pt; color: rgb(31, 73, 125);"> <br>
</span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);">Thanks,</span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);">Harun</span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);"> </span></p>
<p><span style="font-size: 11pt; color: rgb(31, 73, 125);"> </span></p>
<div style="border-style: solid none none; border-color: rgb(181, 196, 223) -moz-use-text-color -moz-use-text-color; border-width: 1pt medium medium; padding: 3pt 0in 0in;">
<p><b><span style="font-size: 10pt;">From:</span></b><span style="font-size: 10pt;">
<a href="mailto:petsc-users-bounces@mcs.anl.gov" target="_blank">petsc-users-bounces@mcs.anl.gov</a> [mailto:<a href="mailto:petsc-users-bounces@mcs.anl.gov" target="_blank">petsc-users-bounces@mcs.anl.gov</a>] <b>On
Behalf Of </b>Matthew Knepley<br>
<b>Sent:</b> Wednesday, July 29, 2009 5:00 PM<br>
<b>To:</b> PETSc users list<br>
<b>Subject:</b> Re: Smoother settings for AMG</span></p>
</div>
<p> </p>
<p>On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun <<a href="mailto:Harun.BAYRAKTAR@3ds.com" target="_blank">Harun.BAYRAKTAR@3ds.com</a>> wrote:</p>
<div>
<blockquote style="border-style: none none none solid; border-color: -moz-use-text-color -moz-use-text-color -moz-use-text-color rgb(204, 204, 204); border-width: medium medium medium 1pt; padding: 0in 0in 0in 6pt; margin-left: 4.8pt; margin-right: 0in;">
<p>Hi,<br>
<br>
I am trying to solve a system of equations and I am having difficulty<br>
picking the right smoothers for AMG (using ML as pc_type) in PETSc for<br>
parallel execution. First here is what happens in terms of CG (ksp_type)<br>
iteration counts (both columns use block jacobi):</p>
</blockquote>
<div>
<p><br>
Are you sure you have an elliptic system? These iteration counts are extremely<br>
high.<br>
<br>
Matt<br>
</p>
</div>
<blockquote style="border-style: none none none solid; border-color: -moz-use-text-color -moz-use-text-color -moz-use-text-color rgb(204, 204, 204); border-width: medium medium medium 1pt; padding: 0in 0in 0in 6pt; margin-left: 4.8pt; margin-right: 0in;">
<p style="margin-bottom: 12pt;"><br>
cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4<br>
------------------------------------------------------<br>
1 | 43 | 243<br>
4 | 699 | 379<br>
<br>
x1 or x4 means 1 or 4 applications of the smoother at each AMG<br>
level (all details from -ksp_view for the 4 cpu runs are below). The main<br>
observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls<br>
apart in parallel. SOR, on the other hand, sees about a 1.5X increase in<br>
iteration count, which is entirely expected given the quality of coarsening<br>
ML delivers in parallel.<br>
<br>
I would basically like to find a way (if possible) to keep the number of<br>
iterations in parallel within 1-2X of the 1 cpu iteration count for the<br>
AMG w/ ICC case. Is there a way to achieve this?<br>
<br>
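For reference, the growth factors implied by the table above can be checked with a few lines of Python.<br>
This is only an arithmetic sketch on the four iteration counts quoted in this message; the dictionary keys<br>
and variable names are made up for illustration.<br>

```python
# Iteration counts copied from the table in this message.
# Keys are (smoother label, number of cpus); the labels are illustrative only.
iterations = {
    ("icc0_x1", 1): 43,
    ("icc0_x1", 4): 699,
    ("sor_x4", 1): 243,
    ("sor_x4", 4): 379,
}


def growth(smoother):
    """Ratio of the 4-cpu iteration count to the 1-cpu count."""
    return iterations[(smoother, 4)] / iterations[(smoother, 1)]


icc_growth = growth("icc0_x1")  # roughly 16X
sor_growth = growth("sor_x4")   # roughly 1.5X

# Largest 4-cpu ICC count that would still meet the 1-2X goal.
icc_target_max = 2 * iterations[("icc0_x1", 1)]

print(f"ICC(0) x1: {icc_growth:.2f}X, SOR x4: {sor_growth:.2f}X")
print(f"ICC(0) x1 on 4 cpus would need at most {icc_target_max} iterations")
```

In other words, the SOR case grows by roughly 1.5X while the ICC(0) case grows by roughly 16X, and the<br>
1-2X goal corresponds to at most 86 iterations on 4 cpus.<br>
<br>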
Thanks,<br>
Harun<br>
<br>
%%%%%%%%%%%%%%%%%%%%%%%%%<br>
AMG w/ ICC(0) x1 ksp_view<br>
%%%%%%%%%%%%%%%%%%%%%%%%%<br>
KSP Object:<br>
type: cg<br>
maximum iterations=10000<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000<br>
left preconditioning<br>
PC Object:<br>
type: ml<br>
MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1,<br>
post-smooths=1<br>
Coarse gride solver -- level 0 -------------------------------<br>
KSP Object:(mg_coarse_)<br>
type: preonly<br>
maximum iterations=1, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_coarse_)<br>
type: redundant<br>
Redundant preconditioner: First (color=0) of 4 PCs
follows<br>
KSP Object:(mg_coarse_redundant_)<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000<br>
left preconditioning<br>
PC Object:(mg_coarse_redundant_)<br>
type: lu<br>
LU: out-of-place factorization<br>
matrix ordering: nd<br>
LU: tolerance for zero pivot 1e-12<br>
LU: factor fill ratio needed 2.17227<br>
Factored matrix follows<br>
Matrix Object:<br>
type=seqaij, rows=283,
cols=283<br>
total: nonzeros=21651,
allocated nonzeros=21651<br>
using I-node
routines: found 186 nodes, limit used is<br>
5<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=seqaij, rows=283, cols=283<br>
total: nonzeros=9967, allocated
nonzeros=14150<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=283, cols=283<br>
total: nonzeros=9967, allocated nonzeros=9967<br>
not using I-node (on process 0) routines<br>
Down solver (pre-smoother) on level 1 -------------------------------<br>
KSP Object:(mg_levels_1_)<br>
type: richardson<br>
Richardson: damping factor=0.9<br>
maximum iterations=1, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_1_)<br>
type: bjacobi<br>
block Jacobi: number of blocks = 4<br>
Local solve is same for all blocks, in the following
KSP and PC<br>
objects:<br>
KSP Object:(mg_levels_1_sub_)<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_1_sub_)<br>
type: icc<br>
ICC: 0 levels of fill<br>
ICC: factor fill ratio allocated 1<br>
ICC: using Manteuffel shift<br>
ICC: factor fill ratio needed 0.514899<br>
Factored matrix follows<br>
Matrix Object:<br>
type=seqsbaij,
rows=2813, cols=2813<br>
total: nonzeros=48609,
allocated nonzeros=48609<br>
block size
is 1<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=seqaij, rows=2813, cols=2813<br>
total: nonzeros=94405, allocated
nonzeros=94405<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=10654, cols=10654<br>
total: nonzeros=376634, allocated nonzeros=376634<br>
not using I-node (on process 0) routines<br>
Up solver (post-smoother) on level 1 -------------------------------<br>
KSP Object:(mg_levels_1_)<br>
type: richardson<br>
Richardson: damping factor=0.9<br>
maximum iterations=1<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_1_)<br>
type: bjacobi<br>
block Jacobi: number of blocks = 4<br>
Local solve is same for all blocks, in the following
KSP and PC<br>
objects:<br>
KSP Object:(mg_levels_1_sub_)<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_1_sub_)<br>
type: icc<br>
ICC: 0 levels of fill<br>
ICC: factor fill ratio allocated 1<br>
ICC: using Manteuffel shift<br>
ICC: factor fill ratio needed 0.514899<br>
Factored matrix follows<br>
Matrix Object:<br>
type=seqsbaij,
rows=2813, cols=2813<br>
total: nonzeros=48609,
allocated nonzeros=48609<br>
block size
is 1<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=seqaij, rows=2813, cols=2813<br>
total: nonzeros=94405, allocated
nonzeros=94405<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=10654, cols=10654<br>
total: nonzeros=376634, allocated nonzeros=376634<br>
not using I-node (on process 0) routines<br>
Down solver (pre-smoother) on level 2 -------------------------------<br>
KSP Object:(mg_levels_2_)<br>
type: richardson<br>
Richardson: damping factor=0.9<br>
maximum iterations=1, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_2_)<br>
type: bjacobi<br>
block Jacobi: number of blocks = 4<br>
Local solve is same for all blocks, in the following
KSP and PC<br>
objects:<br>
KSP Object:(mg_levels_2_sub_)<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_2_sub_)<br>
type: icc<br>
ICC: 0 levels of fill<br>
ICC: factor fill ratio allocated 1<br>
ICC: using Manteuffel shift<br>
ICC: factor fill ratio needed 0.519045<br>
Factored matrix follows<br>
Matrix Object:<br>
type=seqsbaij,
rows=101164, cols=101164<br>
total: nonzeros=1378558,
allocated nonzeros=1378558<br>
block size
is 1<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=seqaij, rows=101164, cols=101164<br>
total: nonzeros=2655952, allocated
nonzeros=5159364<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=411866, cols=411866<br>
total: nonzeros=10941434, allocated
nonzeros=42010332<br>
not using I-node (on process 0) routines<br>
Up solver (post-smoother) on level 2 -------------------------------<br>
KSP Object:(mg_levels_2_)<br>
type: richardson<br>
Richardson: damping factor=0.9<br>
maximum iterations=1<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_2_)<br>
type: bjacobi<br>
block Jacobi: number of blocks = 4<br>
Local solve is same for all blocks, in the following
KSP and PC<br>
objects:<br>
KSP Object:(mg_levels_2_sub_)<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_2_sub_)<br>
type: icc<br>
ICC: 0 levels of fill<br>
ICC: factor fill ratio allocated 1<br>
ICC: using Manteuffel shift<br>
ICC: factor fill ratio needed 0.519045<br>
Factored matrix follows<br>
Matrix Object:<br>
type=seqsbaij,
rows=101164, cols=101164<br>
total: nonzeros=1378558,
allocated nonzeros=1378558<br>
block size
is 1<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=seqaij, rows=101164, cols=101164<br>
total: nonzeros=2655952, allocated
nonzeros=5159364<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=411866, cols=411866<br>
total: nonzeros=10941434, allocated
nonzeros=42010332<br>
not using I-node (on process 0) routines<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=411866, cols=411866<br>
total: nonzeros=10941434, allocated nonzeros=42010332<br>
not using I-node (on process 0) routines<br>
<br>
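The matrix sizes reported above also show how quickly the hierarchy coarsens. The following is a small<br>
sketch that only does arithmetic on the rows and nonzeros printed for the three mpiaij matrices; "grid<br>
complexity" and "operator complexity" are the usual AMG ratios and are not quantities printed by ksp_view,<br>
and all variable names are assumptions made for illustration.<br>

```python
# (rows, nonzeros) per level, copied from the mpiaij matrices in the
# ksp_view output above; level 2 is the finest grid, level 0 the coarsest.
levels = {
    2: (411866, 10941434),
    1: (10654, 376634),
    0: (283, 9967),
}

fine_rows, fine_nnz = levels[2]

# Row reduction between consecutive levels.
coarsening_2_to_1 = levels[2][0] / levels[1][0]
coarsening_1_to_0 = levels[1][0] / levels[0][0]

# Grid complexity: total rows over all levels / rows on the finest level.
grid_complexity = sum(rows for rows, _ in levels.values()) / fine_rows

# Operator complexity: total nonzeros over all levels / finest-level nonzeros.
operator_complexity = sum(nnz for _, nnz in levels.values()) / fine_nnz

print(f"coarsening factors: {coarsening_2_to_1:.1f}, {coarsening_1_to_0:.1f}")
print(f"grid complexity: {grid_complexity:.3f}")
print(f"operator complexity: {operator_complexity:.3f}")
```

Each level has roughly 38 times fewer rows than the one above it, so the two coarse levels together add<br>
only a few percent to the finest-level nonzero count.<br>
<br>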
%%%%%%%%%%%%%%%%%%%%%%<br>
AMG w/ SOR x4 ksp_view<br>
%%%%%%%%%%%%%%%%%%%%%%<br>
<br>
KSP Object:<br>
type: cg<br>
maximum iterations=10000<br>
tolerances: relative=1e-05, absolute=1e-50, divergence=10000<br>
left preconditioning<br>
PC Object:<br>
type: ml<br>
MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1,<br>
post-smooths=1<br>
Coarse gride solver -- level 0 -------------------------------<br>
KSP Object:(mg_coarse_)<br>
type: preonly<br>
maximum iterations=1, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_coarse_)<br>
type: redundant<br>
Redundant preconditioner: First (color=0) of 4 PCs
follows<br>
KSP Object:(mg_coarse_redundant_)<br>
type: preonly<br>
maximum iterations=10000, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_coarse_redundant_)<br>
type: lu<br>
LU: out-of-place factorization<br>
matrix ordering: nd<br>
LU: tolerance for zero pivot 1e-12<br>
LU: factor fill ratio needed 2.17227<br>
Factored matrix follows<br>
Matrix Object:<br>
type=seqaij, rows=283,
cols=283<br>
total: nonzeros=21651,
allocated nonzeros=21651<br>
using I-node
routines: found 186 nodes, limit used is<br>
5<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=seqaij, rows=283, cols=283<br>
total: nonzeros=9967, allocated nonzeros=14150<br>
not using I-node routines<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=283, cols=283<br>
total: nonzeros=9967, allocated nonzeros=9967<br>
not using I-node (on process 0) routines<br>
Down solver (pre-smoother) on level 1 -------------------------------<br>
KSP Object:(mg_levels_1_)<br>
type: richardson<br>
Richardson: damping factor=1<br>
maximum iterations=4, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_1_)<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, omega =
1<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=10654, cols=10654<br>
total: nonzeros=376634, allocated nonzeros=376634<br>
not using I-node (on process 0) routines<br>
Up solver (post-smoother) on level 1 -------------------------------<br>
KSP Object:(mg_levels_1_)<br>
type: richardson<br>
Richardson: damping factor=1<br>
maximum iterations=4<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_1_)<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, omega =
1<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=10654, cols=10654<br>
total: nonzeros=376634, allocated nonzeros=376634<br>
not using I-node (on process 0) routines<br>
Down solver (pre-smoother) on level 2 -------------------------------<br>
KSP Object:(mg_levels_2_)<br>
type: richardson<br>
Richardson: damping factor=1<br>
maximum iterations=4, initial guess is zero<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_2_)<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, omega =
1<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=411866, cols=411866<br>
total: nonzeros=10941434, allocated
nonzeros=42010332<br>
not using I-node (on process 0) routines<br>
Up solver (post-smoother) on level 2 -------------------------------<br>
KSP Object:(mg_levels_2_)<br>
type: richardson<br>
Richardson: damping factor=1<br>
maximum iterations=4<br>
tolerances: relative=1e-05, absolute=1e-50,
divergence=10000<br>
left preconditioning<br>
PC Object:(mg_levels_2_)<br>
type: sor<br>
SOR: type = local_symmetric, iterations = 1, omega =
1<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=411866, cols=411866<br>
total: nonzeros=10941434, allocated
nonzeros=42010332<br>
not using I-node (on process 0) routines<br>
linear system matrix = precond matrix:<br>
Matrix Object:<br>
type=mpiaij, rows=411866, cols=411866<br>
total: nonzeros=10941434, allocated nonzeros=42010332<br>
not using I-node (on process 0) routines<br>
<br>
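The two smoother setups differ only in a handful of settings, which becomes easier to see when they are<br>
written as option strings. The sketch below rebuilds a plausible set of PETSc run-time options from the<br>
two ksp_view listings; it is a reconstruction, not the command line actually used for these runs. In<br>
particular, using the unnumbered mg_levels_ prefix to address every level at once is an assumption (the<br>
listings only show the per-level prefixes mg_levels_1_ and mg_levels_2_), and the helper function name is<br>
hypothetical.<br>

```python
def options(prefix, settings):
    """Format each setting as a '-' + prefix + name + ' ' + value string."""
    return [f"-{prefix}{name} {value}" for name, value in settings.items()]


# Outer solver, as reported at the top of both ksp_view listings.
outer = options("", {"ksp_type": "cg", "ksp_rtol": "1e-5", "pc_type": "ml"})

# Smoother for the "AMG w/ ICC(0) x1" run: one damped Richardson step
# preconditioned by block Jacobi with ICC on each block.
icc_smoother = (
    options("mg_levels_", {
        "ksp_type": "richardson",
        "ksp_richardson_scale": "0.9",
        "ksp_max_it": "1",
        "pc_type": "bjacobi",
    })
    + options("mg_levels_sub_", {"pc_type": "icc"})
)

# Smoother for the "AMG w/ SOR x4" run: four Richardson steps with SOR.
sor_smoother = options("mg_levels_", {
    "ksp_type": "richardson",
    "ksp_max_it": "4",
    "pc_type": "sor",
})

icc_command_line = " ".join(outer + icc_smoother)
sor_command_line = " ".join(outer + sor_smoother)

print(icc_command_line)
print(sor_command_line)
```

Seen this way, the ICC run applies its preconditioner per process block (bjacobi with a sub-solver), while<br>
the SOR run uses the sor preconditioner directly with four smoothing steps.<br>
<br>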
</p>
</blockquote>
</div>
<p><br>
<br clear="all">
<br>
-- <br>
What most experimenters take for granted before they begin their experiments is
infinitely more interesting than any results to which their experiments lead.<br>
-- Norbert Wiener</p>
</div>
</div>
</blockquote></div><br>