Petsc on Blue Gene

Brian Biskeborn bbiskebo at us.ibm.com
Wed Jul 11 10:55:51 CDT 2007


> Can you send a log of these messages? Is this on BGL or BGP? Does the
> program abort? [on encountering these messages]

The program does not abort on exceptions - the only evidence of the problem
is messages in the event log reading "Kernel detected X floating point
alignment exceptions" (where X is a number usually on the order of 10^5)
followed by what looks like a series of register values. I'm running on
BGL.

> With the minimal runs I've done on BGL - I don't remember seeing any
> such messages.

> [Barry can confirm this] the code in mal.c attempts to make sure the
> memory allocated by PETSc is aligned properly. [8 byte boundary for
> doubles]

> One possibility is that the data passed in to MatAssemblyBegin() is
> not aligned?

This says to me that the unaligned data is probably being generated outside
of PETSc. Thanks for the info; I now have a much better idea about where to
look for the problem!
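
One way to narrow this down is to test the addresses of the arrays before
they are handed to PETSc. The following is only a minimal sketch: is_aligned()
is a hypothetical helper (not part of PETSc), and the 8-byte boundary is the
figure quoted above for doubles; the boundary that actually triggers the
kernel messages on BGL may differ, so it is left as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a PETSc function.
 * Returns 1 if ptr lies on an `alignment`-byte boundary, 0 otherwise.
 * `alignment` must be a nonzero power of two (e.g. 8 for doubles). */
int is_aligned(const void *ptr, size_t alignment)
{
    /* Reject alignments that are zero or not a power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* For a power of two, the low bits of the address are the remainder. */
    return ((uintptr_t)ptr & (uintptr_t)(alignment - 1)) == 0;
}
```

Calling is_aligned(values, 8) on each buffer of doubles just before it is
passed in, and printing the address when the check fails, should show whether
the unaligned data comes from the application side.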

Brian



