<div class="gmail_quote">On Tue, Nov 15, 2011 at 10:57, Brandt Belson <span dir="ltr"><<a href="mailto:bbelson@princeton.edu">bbelson@princeton.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div>The matrix solves in each x-y plane are linear. The matrices depend on the z wavenumber, so each x-y slice has a different matrix. The equations are basically of Helmholtz and Poisson type.</div></blockquote><div><br></div>
<div>What is the sign of the shift ("good" or "bad" Helmholtz)? If bad, is the wavenumber high?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div> They are 3D, but when done in Fourier space, they decouple so each x-y plane can be solved independently. </div></blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div><br></div><div>I'd like to run on a few hundred processors, but if possible I'd like it to scale to more processors for higher Re. I agree that keeping the z-dimension data local is beneficial for FFTs.</div>
</blockquote></div><br><div>That process count still means about 1M dofs per process, so having 500 points in one direction is fine. It would be nice to avoid a direct solve on each slice, in which case the partition you describe should be fine. If you can't avoid it, then you may want to do a parallel "transpose" where you can solve planar problems on sub-communicators. Jack Poulson (Cc'd) may have some advice because he has been doing this for high-frequency Helmholtz.</div>
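<div>A minimal serial sketch of the decoupling described above: an FFT along a periodic z direction turns one 3D Poisson solve into nz independent 2D x-y plane solves, one per z wavenumber, each with a wavenumber-dependent shift. Grid sizes, the manufactured field, and the dense solver are illustrative assumptions; a real code would use sparse matrices and a parallel FFT.</div>

```python
import numpy as np

# Assumed toy problem: Dirichlet walls in x and y, periodic in z.
n, nz = 24, 16                    # interior x-y points, periodic z points
h, dz = 1.0 / (n + 1), 1.0 / nz

# 1D second-difference operator with homogeneous Dirichlet ends
D = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
A2 = np.kron(D, np.eye(n)) + np.kron(np.eye(n), D)  # 2D Laplacian, one plane

# Manufactured field U, then its discrete 3D Laplacian as the right-hand side
x = np.linspace(h, 1.0 - h, n)
X, Y = np.meshgrid(x, x, indexing="ij")
z = np.arange(nz) * dz
U = (np.sin(np.pi * X) * np.sin(np.pi * Y))[:, :, None] * np.cos(2 * np.pi * z)
f = (A2 @ U.reshape(n * n, nz)) + (
    (np.roll(U, -1, 2) - 2 * U + np.roll(U, 1, 2)) / dz**2).reshape(n * n, nz)

# FFT in z: wavenumber k sees an independent shifted plane problem
fhat = np.fft.fft(f, axis=1)
lam = (2.0 * np.cos(2.0 * np.pi * np.arange(nz) / nz) - 2.0) / dz**2
uhat = np.empty_like(fhat)
for k in range(nz):               # each solve could run on its own x-y slice
    uhat[:, k] = np.linalg.solve(A2 + lam[k] * np.eye(n * n), fhat[:, k])
u = np.fft.ifft(uhat, axis=1).real.reshape(n, n, nz)
err = np.max(np.abs(u - U))       # recovers U to round-off
```

<div>Because the right-hand side is built with the same discrete operators, the plane-by-plane solve reproduces U to round-off, which is a quick way to check the decoupling before parallelizing.</div>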
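<div>A serial stand-in for the "transpose" mentioned above (names and sizes are illustrative, not PETSc API): P ranks start with z-pencils, all of z for an x-strip, which is the layout convenient for z-FFTs; an all-to-all regroups the data so each rank ends up owning complete x-y planes and can run a planar direct solve locally, e.g. within a sub-communicator.</div>

```python
import numpy as np

nx, ny, nz, P = 8, 6, 4, 4        # toy sizes; assume nx % P == 0, nz % P == 0
u = np.random.default_rng(0).standard_normal((nx, ny, nz))

# Pencil layout: "rank" p owns an x-strip for every z level
pencils = [u[p * (nx // P):(p + 1) * (nx // P)] for p in range(P)]

# The transpose: rank p cuts its pencil into P z-chunks and "sends" chunk q
# to rank q (MPI_Alltoall in a real code); rank q concatenates along x.
chunks = [[pen[:, :, q * (nz // P):(q + 1) * (nz // P)] for q in range(P)]
          for pen in pencils]
planes = [np.concatenate([chunks[p][q] for p in range(P)], axis=0)
          for q in range(P)]
# planes[q] now holds whole x-y planes for its share of z levels
```

<div>With sub-communicators, each group of ranks would hold one plane's chunks, so the planar factorization itself can also be distributed rather than serial per plane.</div>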