[petsc-users] Tuning the parallel performance of a 3D FEM CFD code

Jed Brown jed at 59A2.org
Thu May 12 09:02:38 CDT 2011


On Thu, May 12, 2011 at 15:41, Henning Sauerland <uerland at gmail.com> wrote:

> Applying -sub_pc_type lu helped a lot in 2D, but in 3D, apart from reducing
> the number of iterations, the whole solution takes more than 10 times longer.
>


Does -sub_pc_type ilu -sub_pc_factor_levels 2 (default is 0) help relative
to the default? Direct subdomain solves in 3D are very expensive. How much
does the system change between time steps?
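
For reference, here is a minimal sketch of the runtime-options plumbing,
assuming your code (like most PETSc applications) calls KSPSetFromOptions.
The 1D Laplacian is only a stand-in for your FEM operator, and the call
signatures follow the petsc-3.2 series (e.g. the MatStructure argument to
KSPSetOperators), so other releases may need small adjustments:

#include <petscksp.h>

int main(int argc,char **argv)
{
  Mat         A;
  Vec         x,b;
  KSP         ksp;
  PetscInt    i,n = 100,col[3],rstart,rend,its;
  PetscScalar v[3];

  PetscInitialize(&argc,&argv,(char*)0,(char*)0);

  /* 1D Laplacian as a stand-in for the real FEM operator */
  MatCreate(PETSC_COMM_WORLD,&A);
  MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);
  MatSetFromOptions(A);
  MatSeqAIJSetPreallocation(A,3,PETSC_NULL);
  MatMPIAIJSetPreallocation(A,3,PETSC_NULL,1,PETSC_NULL);
  MatGetOwnershipRange(A,&rstart,&rend);
  for (i=rstart; i<rend; i++) {
    if (i == 0) {
      col[0] = 0;   col[1] = 1;   v[0] = 2.0;  v[1] = -1.0;
      MatSetValues(A,1,&i,2,col,v,INSERT_VALUES);
    } else if (i == n-1) {
      col[0] = n-2; col[1] = n-1; v[0] = -1.0; v[1] = 2.0;
      MatSetValues(A,1,&i,2,col,v,INSERT_VALUES);
    } else {
      col[0] = i-1; col[1] = i;   col[2] = i+1;
      v[0] = -1.0;  v[1] = 2.0;   v[2] = -1.0;
      MatSetValues(A,1,&i,3,col,v,INSERT_VALUES);
    }
  }
  MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);

  VecCreate(PETSC_COMM_WORLD,&b);
  VecSetSizes(b,PETSC_DECIDE,n);
  VecSetFromOptions(b);
  VecDuplicate(b,&x);
  VecSet(b,1.0);

  KSPCreate(PETSC_COMM_WORLD,&ksp);
  KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN); /* drop the last argument in petsc >= 3.5 */
  KSPSetFromOptions(ksp); /* picks up -pc_type, -sub_pc_type, -sub_pc_factor_levels, ... */
  KSPSolve(ksp,b,x);
  KSPGetIterationNumber(ksp,&its);
  PetscPrintf(PETSC_COMM_WORLD,"iterations: %D\n",its);

  VecDestroy(&x); VecDestroy(&b); MatDestroy(&A); KSPDestroy(&ksp);
  PetscFinalize();
  return 0;
}

Run it with, e.g., mpiexec -n 8 ./app -pc_type bjacobi -sub_pc_type ilu
-sub_pc_factor_levels 2 -ksp_monitor (the executable name is a placeholder);
switching between subdomain ILU levels and LU then requires no source changes.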

What "CFD" formulation is this (physics, discretization) and what regime
(Reynolds and Mach numbers, etc)?

> I attached the log_summary output for a problem with about 240000 unknowns
> (1 time step) using 4, 8, and 16 Intel Xeon E5450 processors
> (InfiniBand-connected). As far as I can see, the number of iterations seems
> to be the major issue here, or am I missing something?


Needing more iterations is the algorithmic part of the problem, but the
relative cost of orthogonalization is also going up. You may want to see
whether the iteration count stays reasonable with -ksp_type ibcgs. If this
works algorithmically, it may ease the pain. Beyond that, the algorithmic
scaling needs to be improved. How does the iteration count scale if you use a
direct solver? (I acknowledge that it is not practical, but it provides some
insight into the underlying problem.)
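
Concretely, and reading the direct-solver question as exact subdomain solves
(what -sub_pc_type lu already does in your 2D runs), the two experiments might
look like the following run lines, where the executable name and process count
are placeholders:

  mpiexec -n 8 ./app -ksp_type ibcgs -pc_type bjacobi -sub_pc_type ilu \
      -sub_pc_factor_levels 2 -ksp_converged_reason -log_summary
  mpiexec -n 8 ./app -pc_type bjacobi -sub_pc_type lu \
      -ksp_converged_reason -log_summary

Repeating each at 4, 8, and 16 processes and comparing iteration counts and
timings would separate the growth in iteration count from the rising cost per
iteration.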