[petsc-users] PETSc finite difference performance question (calculating gradient square)

Tue Jun 18 22:46:46 CDT 2013

Hi PETSc users and developers,

I am new to PETSc and would like to ask users and developers about a
question that concerns me: the performance of calculating nabla^2 (the
Laplacian) in a parallel program that uses PETSc.

For example, say one has a three-dimensional function f(x,y,z) and wants
nabla^2 f, i.e. (d^2/dx^2 + d^2/dy^2 + d^2/dz^2) f(x,y,z). (To keep the
problem simple, assume d^2 f/dxdy = 0.)

On a computer, however, what we can usually do is a finite difference. Thus,
in order to calculate nabla^2 f at the point (x,y,z), we need all of its
nearest neighbors.

That is to say, at least we need the values at x+1, x-1, y+1, y-1, z+1 and
z-1 (assuming a 2nd-order central difference).
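
To make the stencil concrete, here is a minimal sketch in plain Python. It is
not PETSc code; the function name laplacian_at, the nested-list layout
f[z][y][x] and the uniform grid spacing h are assumptions chosen only for
illustration.

```python
def laplacian_at(f, x, y, z, h=1.0):
    """2nd-order central-difference Laplacian of f at grid point (x, y, z).

    f is a nested list indexed as f[z][y][x]; h is the uniform grid
    spacing (both are illustrative assumptions).
    """
    center = f[z][y][x]
    # Sum of the six nearest neighbors: x+1, x-1, y+1, y-1, z+1, z-1.
    neighbors = (
        f[z][y][x + 1] + f[z][y][x - 1]
        + f[z][y + 1][x] + f[z][y - 1][x]
        + f[z + 1][y][x] + f[z - 1][y][x]
    )
    return (neighbors - 6.0 * center) / (h * h)


# Sample f(x, y, z) = x^2 + y^2 + z^2 on a 5x5x5 grid.
# Its exact Laplacian is 6 everywhere, and the central difference
# reproduces it exactly because f is quadratic.
n = 5
f = [[[float(x * x + y * y + z * z) for x in range(n)]
      for y in range(n)]
     for z in range(n)]

result = laplacian_at(f, 2, 2, 2)
print(result)
```

Each evaluation reads seven values (the point itself plus six neighbors) and
does only a handful of additions, which is why the cost of fetching the
values matters so much.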

In RAM, all numbers are stored in a ONE-dimensional array, so the values
are laid out like this: ..., f(x-1,y-1,z-1), f(x,y-1,z-1), f(x+1,y-1,z-1),
..., f(x-1,y,z-1), f(x,y,z-1), f(x+1,y,z-1), ..., f(x-1,y+1,z-1),
f(x,y+1,z-1), f(x+1,y+1,z-1), ...
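
The layout above (x varying fastest, then y, then z) can be sketched as an
index calculation. This is a minimal illustration in plain Python, not how
PETSc itself is implemented; the helper name flat_index and the 100x100x100
grid size are assumptions made up for the example.

```python
def flat_index(x, y, z, nx, ny):
    """Position of grid point (x, y, z) in a 1-D array, x varying fastest."""
    return x + nx * (y + ny * z)


# Hypothetical grid size, chosen only for illustration.
nx, ny, nz = 100, 100, 100

center = flat_index(50, 50, 50, nx, ny)

# Distance (in array elements) from the center to each stencil neighbor.
offsets = {
    "x+1": flat_index(51, 50, 50, nx, ny) - center,
    "x-1": flat_index(49, 50, 50, nx, ny) - center,
    "y+1": flat_index(50, 51, 50, nx, ny) - center,
    "y-1": flat_index(50, 49, 50, nx, ny) - center,
    "z+1": flat_index(50, 50, 51, nx, ny) - center,
    "z-1": flat_index(50, 50, 49, nx, ny) - center,
}

for name, offset in offsets.items():
    print(name, offset)
```

On this grid the x neighbors sit right next to the center, the y neighbors
are nx = 100 elements away, and the z neighbors are nx*ny = 10000 elements
away, which is the scattering in memory that the question is about.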

So, in order to work it out, the first step is to pick out the numbers
needed for the calculation. Each time, one needs to load a block of numbers
from RAM, pick 1, 2 or 3 of them out, and discard the rest, until all the
numbers needed for the calculation have been gathered.

It is clear that there is a huge waste here: the numbers needed for a given
point cannot all be loaded at once. Thus, the bottleneck is not the
computing time but the loading time.

Q: How does the PETSc library handle this kind of problem? Could you please
explain to me how it works, if you know?

I would appreciate any explanation.

Thanks.
