I&#39;m not sure what you mean by the sign of the shift, but the equations are roughly of the form:<div><br></div><div>(I dt/Re - L) u = f</div><div><br></div><div>where dt ~ 0.1, Re ~ 1000, and L is the 3D Laplacian. Once the system is Fourier transformed in z, each x-y plane has an equation of this form:</div>


<div><br></div><div>(I dt/Re + I k_z^2 - L_{k_z}) \hat{u}_{k_z} = \hat{f}_{k_z}<br><br>I&#39;m not sure which wavenumber you mean, but k_z goes as nz. </div><div><br></div><div>Thanks,</div><div>Brandt<br><br><br><div class="gmail_quote">


On Tue, Nov 15, 2011 at 12:12 PM, Jed Brown <span dir="ltr">&lt;<a href="mailto:jedbrown@mcs.anl.gov">jedbrown@mcs.anl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<div class="gmail_quote"><div class="im">On Tue, Nov 15, 2011 at 10:57, Brandt Belson <span dir="ltr">&lt;<a href="mailto:bbelson@princeton.edu" target="_blank">bbelson@princeton.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div>The matrix solves in each x-y plane are linear. The matrices depend on the z wavenumber and so are different at each x-y slice. The equations are basically Helmholtz and Poisson type.</div></blockquote><div><br></div>


</div><div>What is the sign of the shift (&quot;good&quot; or &quot;bad&quot; Helmholtz)? If bad, is the wave number high?</div><div class="im"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div> They are 3D, but when done in Fourier space, they decouple so each x-y plane can be solved independently. </div></blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div><br></div><div>I&#39;d like to run on a few hundred processors, but if possible I&#39;d like it to scale to more processors for higher Re. I agree that keeping the z-dimension data local is beneficial for FFTs.</div>


</blockquote></div></div><br><div>That process count still means about 1M dofs per process, so having 500 in one direction is still fine. It would be nice to avoid a direct solve on each slice, in which case the partition you describe should be fine. If you can&#39;t avoid it, then you may want to do a parallel &quot;transpose&quot; where you can solve planar problems on sub-communicators. Jack Poulson (Cc&#39;d) may have some advice because he has been doing this for high frequency Helmholtz.</div>


</blockquote></div><br></div>
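<div><br></div><div>For reference, a sketch of how the per-plane equation above follows from the 3D one. It assumes (this is not stated explicitly in the thread) that L = d^2/dx^2 + d^2/dy^2 + d^2/dz^2, that L_{k_z} denotes the x-y part d^2/dx^2 + d^2/dy^2, and that u is expanded in Fourier modes in z:</div><div><br></div><div>u(x,y,z) = \sum_{k_z} \hat{u}_{k_z}(x,y) e^{i k_z z}</div><div>d^2/dz^2 [ \hat{u}_{k_z} e^{i k_z z} ] = -k_z^2 \hat{u}_{k_z} e^{i k_z z}</div><div>L [ \hat{u}_{k_z} e^{i k_z z} ] = [ (L_{k_z} - I k_z^2) \hat{u}_{k_z} ] e^{i k_z z}</div><div>(I dt/Re - L) u = f  =&gt;  (I dt/Re + I k_z^2 - L_{k_z}) \hat{u}_{k_z} = \hat{f}_{k_z}</div><div><br></div><div>Under these assumptions the scalar shift in each plane is dt/Re + k_z^2, which is positive for every k_z. If -L_{k_z} is positive semidefinite, as it is for the continuous Laplacian with Dirichlet, Neumann, or periodic boundary conditions, then each plane operator is positive definite, which would appear to correspond to the &quot;good&quot; Helmholtz case asked about in the quoted message.</div>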