<div class="gmail_quote">On Mon, Jul 4, 2011 at 12:09, Haren, S.W. van (Steven) <span dir="ltr">&lt;<a href="mailto:vanharen@nrg.eu">vanharen@nrg.eu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div id=":gm">one of the ksp solvers (Conjugate Gradient method with ILU(0) preconditioning) gives poor parallel performance for the </div></blockquote><div><br></div><div><div>We need to identify how much of the poor scaling comes from the preconditioner changing with the number of processes (e.g. block Jacobi with ILU(0) on each process's block), so that more iterations are needed, and how much comes from limited memory bandwidth. Run with -ksp_monitor or -ksp_converged_reason to see the iteration counts. If they grow with the number of processes, you can try -pc_type asm (or algebraic multigrid using third-party libraries) to bring them back down.</div>
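<div><br></div><div>To illustrate the first effect, here is a minimal pure-Python sketch (not PETSc code) of preconditioned CG with a block Jacobi preconditioner. Everything in it is an assumption made for illustration: a 1D Laplacian stands in for the real matrix, the function names are hypothetical, and each block is solved exactly with the Thomas algorithm. For a tridiagonal block there is no fill, so the exact factorization coincides with ILU(0). The number of blocks plays the role of the number of processes.</div>

```python
import math

def apply_A(x):
    """Return A x for the 1D Laplacian A = tridiag(-1, 2, -1)."""
    n = len(x)
    y = [2.0 * xi for xi in x]
    for i in range(n - 1):
        y[i] -= x[i + 1]
        y[i + 1] -= x[i]
    return y

def solve_block(rhs):
    """Solve tridiag(-1, 2, -1) u = rhs exactly (Thomas algorithm)."""
    m = len(rhs)
    c = [0.0] * m  # modified super-diagonal
    d = [0.0] * m  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    u = [0.0] * m
    u[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u

def apply_block_jacobi(r, nblocks):
    """Apply the preconditioner: solve each diagonal block independently,
    ignoring the coupling between neighbouring blocks."""
    n = len(r)
    z = []
    for j in range(nblocks):
        lo, hi = (n * j) // nblocks, (n * (j + 1)) // nblocks
        z.extend(solve_block(r[lo:hi]))
    return z

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg_iterations(n, nblocks, rtol=1e-8, max_it=1000):
    """Run preconditioned CG on A x = b with b = ones; return the iteration count."""
    b = [1.0] * n
    x = [0.0] * n
    r = b[:]
    z = apply_block_jacobi(r, nblocks)
    p = z[:]
    rz = dot(r, z)
    bnorm = math.sqrt(dot(b, b))
    for k in range(1, max_it + 1):
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if rtol * bnorm >= math.sqrt(dot(r, r)):
            return k
        z = apply_block_jacobi(r, nblocks)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return max_it

# Same problem, different number of blocks ("processes").
iteration_counts = {nb: pcg_iterations(64, nb) for nb in (1, 2, 4)}
print(iteration_counts)
```

<div>With a single block the preconditioner is an exact solve, so CG stops after one iteration. As soon as the matrix is split into several blocks the coupling between blocks is dropped and more iterations are needed. That extra work has nothing to do with the hardware, which is why the iteration counts should be checked before looking at memory bandwidth.</div>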

<div><br></div><div>If you want help seeing what&#39;s going on, send -log_summary output for each case.</div></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div id=":gm">following settings:<br>

<br>

- number of unknowns ~ 2 million<br>

- 1, 2 and 4 processors (quad core CPU)<br></div></blockquote><div><br></div><div><div>What kind of CPU? In particular, what memory bus does it have and how many memory channels? Sparse matrix kernels are overwhelmingly limited by memory performance, so extra cores do very little good unless the memory system is very good (or the matrix fits in cache).</div>
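<div><br></div><div>A back-of-the-envelope estimate shows why. The sketch below counts the memory traffic of one sparse matrix-vector product in compressed-row storage and turns it into a lower bound on run time. The numbers fed in are assumptions rather than measurements: 7 nonzeros per row and 10 GB/s of sustained memory bandwidth are placeholders, the function names are hypothetical, and the model optimistically assumes the input vector is read from memory only once.</div>

```python
def csr_spmv_traffic(n_rows, nnz_per_row, value_bytes=8, index_bytes=4):
    """Estimate bytes moved and flops for one CSR matrix-vector product.

    Per nonzero: one matrix value and one column index are read.
    Per row: one row pointer is read, one entry of x is read (perfect
    cache reuse assumed) and one entry of y is written.
    """
    bytes_per_row = nnz_per_row * (value_bytes + index_bytes) + index_bytes + 2 * value_bytes
    bytes_moved = n_rows * bytes_per_row
    flops = 2 * n_rows * nnz_per_row  # one multiply and one add per nonzero
    return bytes_moved, flops

def bandwidth_bound(n_rows, nnz_per_row, bandwidth_bytes_per_s):
    """Return (minimum seconds per product, maximum flop rate) if memory
    bandwidth is the only limit."""
    bytes_moved, flops = csr_spmv_traffic(n_rows, nnz_per_row)
    seconds = bytes_moved / bandwidth_bytes_per_s
    return seconds, flops / seconds

# Assumed inputs: ~2 million unknowns (from the question), 7 nonzeros per row
# and 10 GB/s sustained bandwidth (both placeholders, not measured values).
seconds, flop_rate = bandwidth_bound(2_000_000, 7, 10e9)
print(f"time per product: {seconds * 1e3:.1f} ms, flop rate: {flop_rate / 1e9:.2f} GF/s")
```

<div>The bound depends only on the bandwidth, not on the number of cores. If one or two cores already saturate the memory bus, running on four cannot make the product any faster.</div>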

</div><div><br></div></div>