[petsc-users] Tuning the parallel performance of a 3D FEM CFD code

Jed Brown jed at 59A2.org
Fri May 13 09:34:50 CDT 2011

How small is the Reynolds number? There is a difference between 0.1 and 100,
although both may be laminar.

On Fri, May 13, 2011 at 15:50, Henning Sauerland <uerland at gmail.com> wrote:

> The problem is discretized using FEM (more precisely XFEM) with stabilized,
> trilinear hexahedral elements. As the XFEM approximation space is
> time-dependent, as are the physical properties at the nodes, the resulting
> system may change quite significantly between time steps.

I assume the XFEM basis is just resolving the jump across the interface.
Does this mean the size of the matrix is changing as the interface moves? In
any case, it looks like you can't amortize setup costs between time steps,
so we need a solution short of a direct solve.

> ILU(2) requires less than half the number of KSP iterations, but it scales
> similarly to ILU(0) and requires about 1/3 more time.

Eventually, we're going to need a coarse level to battle the poor scaling
with more processes. Hopefully that will alleviate the need for these more
expensive local solves.

> I guess you are talking about the nonlinear iterations? I was always
> referring to the KSP iterations, and I thought that the growth of the KSP
> iteration count with an increasing number of processors is more or less
> solely related to the iterative solver and preconditioner.

I meant linear iterations. The count depends mostly on the preconditioner.
The increased iteration count (when a direct subdomain solver is used;
inexact subdomain solves confuse things) is likely due to the fundamental
scaling for elliptic problems. There are many ways to improve constants, but
you need a coarse level to fix the asymptotics.
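
A toy illustration of that scaling, added here as a sketch and not taken from
the code discussed in this thread: conjugate gradients on a 1D Poisson matrix
with a one-level block Jacobi preconditioner and exact subdomain solves. The
problem, the right-hand side, and all function names are assumptions chosen
for brevity. With the global size held fixed, the iteration count should grow
as the number of subdomains grows, because no coarse level carries
information across the whole domain.

```python
def apply_laplacian(x):
    """Return A x for the 1D Poisson matrix A = tridiag(-1, 2, -1)."""
    n = len(x)
    y = [2.0 * xi for xi in x]
    for i in range(n - 1):
        y[i] -= x[i + 1]
        y[i + 1] -= x[i]
    return y


def solve_block(rhs):
    """Exact solve of tridiag(-1, 2, -1) z = rhs (Thomas algorithm)."""
    m = len(rhs)
    diag = [2.0] * m
    z = list(rhs)
    for i in range(1, m):          # forward elimination
        w = -1.0 / diag[i - 1]
        diag[i] += w               # 2 - 1/diag[i-1]
        z[i] -= w * z[i - 1]
    z[m - 1] /= diag[m - 1]
    for i in range(m - 2, -1, -1):  # back substitution
        z[i] = (z[i] + z[i + 1]) / diag[i]
    return z


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg_iterations(n, nblocks, rtol=1e-8, maxit=500):
    """Iterations of block-Jacobi-preconditioned CG; n must be divisible by nblocks."""
    size = n // nblocks

    def precond(r):
        # One-level block Jacobi: independent exact solves on contiguous blocks.
        z = []
        for start in range(0, n, size):
            z.extend(solve_block(r[start:start + size]))
        return z

    b = [1.0 + i / n for i in range(n)]  # arbitrary smooth right-hand side
    x = [0.0] * n
    r = list(b)
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    bnorm = dot(b, b) ** 0.5
    for it in range(1, maxit + 1):
        Ap = apply_laplacian(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= rtol * bnorm:
            return it
        z = precond(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return maxit


if __name__ == "__main__":
    for nblocks in (1, 2, 4, 8, 16):
        print(nblocks, pcg_iterations(256, nblocks))
```

With a single block the preconditioner is a direct solve, so the count is
minimal; the growth for more blocks is the asymptotic behavior that a coarse
level is meant to remove.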

Unfortunately, multigrid methods for XFEM are a recent topic. Perhaps the
best results I have seen (at conferences) use some geometric information in
an otherwise algebraic framework. For this problem (unlike many fracture
problems), the influence of the extended basis functions may be local enough
that you can build a coarse level using the conventional part of the
problem. The first thing I would try is probably to see if a direct solve
with the conventional part makes an effective coarse level. If that works, I
would see if ML or Hypre can do a reasonable job with that part of the
problem.

I have no great confidence that this will work; it's highly dependent on how
local the influence of the extended basis functions is. Perhaps you have
enough experience with the method to hypothesize.

Note that the conventional part of the problem is still incompressible
flow. It sounds like you are using equal-order elements (Q1-Q1 stabilized;
PSPG or Dohrmann & Bochev?). For those elements, you will want to interlace
the velocity and pressure degrees of freedom, then use a smoother that
respects the block size. PETSc's ML and Hypre interfaces forward this
information if you set the block size on the matrix. When using ML, you'll
probably have to make the smoothers stronger. There are also some "energy
minimization" options that may help.
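
To make "interlace" concrete, here is a minimal sketch of the two orderings
for the conventional part only, assuming four unknowns per node (u, v, w, p)
in 3D. The function names and the segregated starting layout are
illustrative assumptions, not something stated in this thread. In the
interlaced layout all unknowns of a node are contiguous, which is what a
constant block size (set in PETSc with MatSetBlockSize) describes.

```python
def interlaced_index(node, comp, bs=4):
    """Global index when the bs unknowns of each node are stored together."""
    return node * bs + comp


def segregated_index(node, comp, n_nodes):
    """Global index when each field is stored as one contiguous chunk."""
    return comp * n_nodes + node


def segregated_to_interlaced(n_nodes, bs=4):
    """perm[i] is the interlaced index of segregated unknown i."""
    perm = [0] * (n_nodes * bs)
    for node in range(n_nodes):
        for comp in range(bs):
            old = segregated_index(node, comp, n_nodes)
            perm[old] = interlaced_index(node, comp, bs)
    return perm


if __name__ == "__main__":
    # Two nodes, fields (u, v, w, p): segregated order is u0 u1 v0 v1 w0 w1 p0 p1.
    print(segregated_to_interlaced(2))
```

The enriched XFEM unknowns exist only at some nodes, so they would not fit a
constant block size; this sketch deliberately leaves them out.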

> ibcgs is slightly faster, requiring fewer KSP iterations than lgmres.
> Unfortunately, the iteration count scales very similarly to lgmres, and in
> my experience the lack of robustness of bcgs solvers generally turns out to
> be problematic for tougher test cases.

Yes, that suggestion was only an attempt to improve the constants a little.