Performance Issues on ccNuma-System
Matthew Knepley
knepley at gmail.com
Mon Oct 13 08:26:58 CDT 2008
On Mon, Oct 13, 2008 at 7:12 AM, Christoph Statz
<christoph.statz at ifn.et.tu-dresden.de> wrote:
> Dear PETSc users,
> I'm trying to work with PETSc on a ccNUMA system, where I am confronted with
> severe performance problems.
> Is anyone using PETSc on, e.g., an SGI Altix system?
> Which are the best kernels to use on cache-coherent systems?
> The Fortran kernels produce many cache misses (in functions like fsolve and
> fmatmul), slowing down a 3 GFLOP/s machine to about 200 MFLOP/s.
> Does anyone have any advice for increasing speed on a ccNUMA system?
1) With any performance question, please send the output of -log_summary.
2) I think it is unlikely that cache misses are responsible for this
performance. It is much more likely that memory bandwidth limitations are
responsible. Please see the paper by Kaushik and Gropp, which models sparse
matvec performance (on Dinesh's website).
3) You would see better performance using a block method. Sparse matvec
without blocks will never see good percentages of peak (ditto for backsolve).
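
To make the bandwidth argument in 2) and 3) concrete, here is a rough
back-of-the-envelope sketch. It is not the model from the paper, only a
simplified illustration: it assumes 8-byte values, 4-byte column indices, one
index per block, and it ignores vector traffic entirely. The function names
and the bandwidth figure are hypothetical, chosen for illustration.

```python
# Rough bandwidth-bound estimate for a sparse matrix-vector product.
# All numbers are illustrative assumptions, not measurements.


def bytes_per_nonzero(block_size, value_bytes=8, index_bytes=4):
    """Matrix traffic per stored nonzero: one value, plus a column index
    shared by the block_size * block_size entries of a block."""
    return value_bytes + index_bytes / (block_size * block_size)


def matvec_flops_bound(bandwidth_bytes_per_s, block_size=1):
    """Upper bound on the flop rate: 2 flops (multiply and add) per nonzero,
    limited by how fast the matrix can be streamed from memory.
    Vector traffic is ignored, so the achievable rate is lower still."""
    return 2.0 * bandwidth_bytes_per_s / bytes_per_nonzero(block_size)


if __name__ == "__main__":
    bandwidth = 2.0e9  # hypothetical sustained bandwidth per process, bytes/s
    for bs in (1, 2, 4):
        mflops = matvec_flops_bound(bandwidth, bs) / 1e6
        print(f"block size {bs}: at most {mflops:.0f} MFLOP/s")
```

Under these assumptions the bound depends only on memory bandwidth, not on
the peak flop rate of the processor, and blocking raises it by reducing index
traffic. The sketch does not capture the additional gain from reusing vector
entries within a block.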
Matt
> Sincerely,
> Christoph Statz
> --
> Christoph Statz
> Institut für Nachrichtentechnik
> Technische Universität Dresden
> 01062 Dresden
> Email: christoph.statz at mailbox.tu-dresden.de
> Phone: +49 351 463 32287
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener