PETSc on Blue Gene

Brian Biskeborn bbiskebo at us.ibm.com
Wed Jul 11 19:25:39 CDT 2007


> How do you know the location of these exceptions? Can you narrow down
> further to the correct function name/source line?

I found the locations of the exceptions by forcing an abort at various
places in the code and counting the exceptions.

Line 62 (counting from 1) of ex2.c generates 10 errors:
ierr = MatView(mat,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

Line 71 generates 12 errors:
ierr = MatTranspose(mat,&tmat);CHKERRQ(ierr);;

Line 82 generates 10 errors:
ierr = MatView(tmat,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

Line 91 generates 10 errors:
ierr = MatView(tmat,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

So the exceptions are occurring in MatView and MatTranspose here.

> Also do you use --with-debugging=0 for this build? Do you get the same
> errors with '--with-debugging=1' build?

I've been running with debugging=0, but the same errors occur with
debugging=1.

I have also improved my understanding of Blue Gene's alignment
requirements: experimentally, it looks like double values must be 4-byte
aligned and must not cross a 16-byte boundary. That is, the address of
a double must be 0, 4, or 8 modulo 16; at offset 12, an 8-byte double would
span bytes 12 through 19 and cross the boundary. So if everything is indeed
8-byte aligned, there should be no problem.
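
For illustration only, the rule above can be written as a small C check. This
is a sketch, not anything from PETSc or the Blue Gene toolchain: the function
name double_addr_ok is made up here, and it assumes sizeof(double) is 8 and
that the experimentally observed rule (4-byte aligned, no crossing of a
16-byte boundary) is the whole story.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>
#include <stdint.h>

/* Assumption: doubles are 8 bytes, as on Blue Gene and most platforms. */
static_assert(sizeof(double) == 8, "sketch assumes 8-byte doubles");

/* Hypothetical helper: returns true if a double stored at address p is
   4-byte aligned and does not straddle a 16-byte boundary, i.e. its
   address is 0, 4, or 8 modulo 16. */
static bool double_addr_ok(const void *p)
{
    uintptr_t off = (uintptr_t)p % 16u;   /* offset within the 16-byte line */
    return (off % 4u == 0) && (off + sizeof(double) <= 16u);
}
```

Calling double_addr_ok on the pointer handed to a routine such as MatView,
just before the call, might be one way to test whether a given array is the
source of the exceptions.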

Lisandro:
The compiler guarantees proper alignment of stack-allocated and
statically-allocated data. Also, I think the Blue Gene implementation of
malloc always returns 16-byte aligned addresses. That means the only way to
get floating point exceptions is to use malloc'ed memory in such a way that
alignment is disrupted.
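
As a hypothetical example of how that could happen: if one malloc'ed block
holds a small header followed by an array of doubles, a 12-byte header puts
the first double at offset 12 from a 16-byte aligned base, which is exactly
the bad case. The sketch below pads the header so the doubles start on an
8-byte boundary. The names round_up and doubles_offset are invented for this
example, the layout is not claimed to be what PETSc does, and it assumes
malloc really does return 16-byte aligned addresses.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align; align must be a power of two. */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical layout: [header_bytes of header][array of doubles] inside a
   single malloc'ed block. Returns the byte offset from the start of the
   block at which the doubles should begin so that each one stays 8-byte
   aligned (given a base address that is at least 8-byte aligned). */
static size_t doubles_offset(size_t header_bytes)
{
    return round_up(header_bytes, sizeof(double));
}
```

With a 12-byte header, doubles_offset returns 16 rather than 12, so the
array starts at 0 modulo 16 instead of straddling the boundary.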

Brian



