<div class="gmail_quote">On Thu, Dec 2, 2010 at 18:21, Matthew Knepley <span dir="ltr">&lt;<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div>Yes, you just capture the indices. You could even split up the velocity components as well. I do</div><div>for elasticity, but for Laplace it is probably not necessary.</div></blockquote><div><br></div><div>I don't know where Laplace entered the discussion.  You can split the components in the viscous part. Doing so tends to make it less likely that incomplete factorization will produce negative pivots, and it sometimes leads to a faster solver, but it means more data movement, usually more synchronization points, and definitely lower flop/s, so it's far from a clear win.</div>
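<div><br></div><div>As a rough illustration of what "capturing the indices" per component could look like, here is a minimal sketch. It assumes an interlaced layout (u0, v0, u1, v1, ...), which is an assumption on my part and not something stated in this thread; the function name component_indices is hypothetical. In PETSc, lists like these would be wrapped in index sets (IS) before being handed to the fieldsplit preconditioner.</div>

```python
def component_indices(n_nodes, n_comp):
    """Return one list of global indices per component.

    Assumes an interlaced layout: unknown c of node i lives at i * n_comp + c.
    (Hypothetical helper for illustration only.)
    """
    total = n_nodes * n_comp
    return [list(range(c, total, n_comp)) for c in range(n_comp)]


# Two velocity components on four nodes.
splits = component_indices(n_nodes=4, n_comp=2)
print(splits)  # [[0, 2, 4, 6], [1, 3, 5, 7]]
```

<div>Each inner list corresponds to one split, so the extra data movement mentioned above comes from gathering these strided entries into separate blocks.</div>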

<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


> Regarding all this, I have to mention that until now I've had to reorder<br>

> my matrix elements with<br>

><br>

> -pc_factor_mat_ordering_type rcm<br>

><br>

> because it seems like the nodes of my mesh are not numbered<br>

> appropriately. How should I proceed for parallel computations? In<br>

> sequential, I gain about a factor of 8 in CPU time doing this.<br></blockquote><div><br></div></div><div>You can do the same thing on each process. However, I would reorder my mesh</div><div>up front.</div></blockquote>

</div><br><div>I also recommend reordering the mesh up front. The -pc_factor_mat_ordering_type option only changes the order in which factorization occurs; it doesn't change the ordering of unknowns in the vector, so the matrix kernels will not see the higher throughput that comes from getting the vector to properly reuse cache.</div>
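<div><br></div><div>To make "reorder the mesh up front" concrete, here is a minimal pure-Python sketch of a reverse Cuthill-McKee style renumbering applied to the node adjacency graph before any matrix is assembled. This is not PETSc code; the functions, the toy mesh, and the tie-breaking rules are assumptions chosen for illustration, and a real mesh would normally use a library ordering routine.</div>

```python
from collections import deque


def bandwidth(adj, order):
    """Largest distance in 'order' between any two connected nodes."""
    pos = {node: i for i, node in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def reverse_cuthill_mckee(adj):
    """Breadth-first ordering from low-degree nodes, then reversed."""
    degree = {u: len(adj[u]) for u in adj}
    visited = set()
    order = []
    # Start each connected component from its lowest-degree node.
    for start in sorted(adj, key=lambda u: (degree[u], u)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            # Visit neighbours in order of increasing degree.
            for v in sorted(adj[u], key=lambda w: (degree[w], w)):
                if v not in visited:
                    visited.add(v)
                    queue.append(v)
    return order[::-1]


# Toy 1D "mesh": a chain whose nodes were numbered in a scattered way.
adj = {0: [3], 3: [0, 1], 1: [3, 4], 4: [1, 2], 2: [4, 5], 5: [2]}

natural = sorted(adj)
rcm = reverse_cuthill_mckee(adj)
new_index = {old: new for new, old in enumerate(rcm)}  # old node id to new id

print(bandwidth(adj, natural))  # 3
print(bandwidth(adj, rcm))      # 1
```

<div>Applying new_index to the mesh nodes changes the numbering of the unknowns themselves, so vectors and matrix rows both follow the better ordering, in serial and on each process.</div>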

<div><br></div><div>Jed</div>