# [petsc-users] Using petsc for banded matrices and 2D finite differences

Brandt Belson bbelson at princeton.edu
Tue Nov 15 15:42:36 CST 2011

```
Yes, I'm treating the nonlinear term explicitly.

OK, I now see what you mean about b = sqrt(N); I had misunderstood the
definition of bandwidth. With PETSc it should be easy to experiment with
multigrid versus the sparse direct solvers Jack mentioned, so maybe I'll try
both.

I see what you and Jack mean about using blocks of processes for xy planes,
transposing the data, and using multiple processes on each z. I'll code
things flexibly so I can change how the data is distributed.

I'll probably have more questions as I dig deeper, but I think I'm sold on
using PETSc. Thanks for your quick replies and guidance!
Brandt

On Tue, Nov 15, 2011 at 3:59 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:

> On Tue, Nov 15, 2011 at 14:49, Brandt Belson <bbelson at princeton.edu> wrote:
>
>> Sorry, I was mistaken about the dt/Re coefficient. I use an implicit time
>> step on the linear part of incompressible Navier-Stokes, so roughly
>> discretizing du/dt = 1/Re * Lu with implicit Euler for simplicity gives:
>>
>> (u^{n+1} - u^n) / dt = 1/Re L u^{n+1}
>> rearranging:
>> (I - dt/Re L) u^{n+1} = u^n
>>
>>
>
> Right. And you are treating the (nonlinear) convective term explicitly?
>
>
>>
>> For tri-diagonal matrices, I believe the direct solve is O(N), and
>> similar methods exist for block tri-diagonal matrices. I know multigrid is
>> also O(N), but I've heard it tends to be slower.
>>
>
> The "block tri-diagonal" matrices have to deal with fill. They are not
> O(N) because the bandwidth is b = sqrt(N), so you need b*N storage and b^2*N
> time (as Jack said). At high Reynolds number and short time steps (high
> resolution), your system is pretty well-conditioned because it is mostly the
> identity. You should be able to solve it very fast using iterative methods
> (perhaps with a lightweight multigrid or other preconditioner).
>
>
>>
>> I think the bandwidth (b) of the matrices is small. For example for a
>> 5-point stencil (center, left, right, below, above) the bandwidth is 5 and
>> for a 9-point stencil it is 9.
>>
>
> No, if the neighbors in the x-direction are "close", then the neighbors in
> the y-direction will be "far" (order n=sqrt(N) for an n*n grid).
>
>
>>
>> The other limiting factor of doing option 1, with the all-to-all, is the
>> number of nodes is limited by nz. That might be ok, but I would like the
>> code to scale to larger grids in the future.
>>
>
> Hence Jack's hybrid suggestion of using process blocks to handle chunks.
> As usual with parallel programming, choose a distribution and write your
> code, but as you write it, keep in mind that you will likely change the
> distribution later (e.g. don't hard-code the bounds for "local"
> computations).
>
```
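For reference, the O(N) direct solve for tri-diagonal systems discussed in the thread is the Thomas algorithm (forward elimination followed by back substitution). A minimal pure-Python sketch, not from the thread itself (the helper name `thomas_solve` is mine):

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system in O(N) with the Thomas algorithm.

    a: sub-diagonal (length n-1), b: main diagonal (length n),
    c: super-diagonal (length n-1), d: right-hand side (length n).
    No pivoting, so this assumes a diagonally dominant (or SPD) system,
    as arises from (I - dt/Re L) in 1D.
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    # Forward elimination: one O(1) update per row.
    for i in range(1, n):
        denom = b[i] - a[i - 1] * cp[i - 1]
        if i < n - 1:
            cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i - 1] * dp[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

For the block tri-diagonal matrices arising from a 2D discretization, the analogous elimination operates on dense n*n blocks rather than scalars, which is where the b^2*N work Jed mentions comes from.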
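Jed's point about bandwidth can be checked directly: with lexicographic ordering of an n*n grid, node (i, j) becomes row k = i*n + j, so a y-neighbor sits n columns away and the half-bandwidth is n = sqrt(N), even though each row has only 5 nonzeros. A small illustrative check (the function name is mine):

```python
def stencil_bandwidth(n):
    """Half-bandwidth of the 5-point stencil on an n*n grid with
    lexicographic ordering: max |row - col| over all nonzero entries."""
    bw = 0
    for i in range(n):
        for j in range(n):
            k = i * n + j  # row index of node (i, j)
            # Center, x-neighbors (offset 1), y-neighbors (offset n).
            for di, dj in ((0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)):
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:
                    bw = max(bw, abs(k - (ii * n + jj)))
    return bw
```

For example, `stencil_bandwidth(64)` returns 64 for a 64*64 grid, i.e. sqrt(N) for N = 4096, not the 5 suggested by the stencil width alone.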
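Jed's conditioning argument can also be illustrated: for small dt/Re, the operator I - (dt/Re) L is a small perturbation of the identity, so unpreconditioned conjugate gradient converges in a handful of iterations independent of grid size. A sketch in plain Python rather than PETSc (in PETSc this would be a KSP solve); the 1D model Laplacian and the names `matvec` and `cg` are mine:

```python
def matvec(u, alpha):
    """Apply A = I - alpha * L, where L is the 1D second-difference
    operator with homogeneous Dirichlet boundaries and
    alpha = dt / (Re * h^2)."""
    n = len(u)
    out = [0.0] * n
    for k in range(n):
        lap = -2.0 * u[k]
        if k > 0:
            lap += u[k - 1]
        if k < n - 1:
            lap += u[k + 1]
        out[k] = u[k] - alpha * lap
    return out

def cg(alpha, b, tol=1e-10, maxit=1000):
    """Unpreconditioned conjugate gradient for A x = b.
    Returns (solution, iteration count)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x with x = 0
    p = list(b)          # search direction
    rr = sum(ri * ri for ri in r)
    for it in range(1, maxit + 1):
        Ap = matvec(p, alpha)
        step = rr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rr_new = sum(ri * ri for ri in r)
        if rr_new ** 0.5 < tol:
            return x, it
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, maxit
```

With alpha = 0.1, the eigenvalues of A lie in (1, 1.4), so CG converges in roughly a dozen iterations regardless of N; this is the regime where iterative methods beat banded direct solves.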