Which way to decompose domain/grid

Jed Brown jed at 59A2.org
Fri Dec 11 05:00:14 CST 2009


On Fri, 11 Dec 2009 18:24:58 +0800, Wee-Beng Tay <zonexo at gmail.com> wrote:

> But you mention about latency, so shouldn't minimizing the number of neighbor
> processes reduce latency and improve performance?

Perhaps, depending on your network.  But there are many tricks to hide
the latency of ghost updates; global reductions (in dot products) are
harder to hide, especially since MPI collectives are synchronous.  The
higher iteration counts are far more painful than a marginally higher
update cost.
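
For concreteness, here is a minimal sketch of one such trick: overlap the
ghost (halo) exchange with computation that does not need ghost values.
It uses the current DM-prefixed names (this thread predates that rename),
and da, g, and l are illustrative names, not from your code:

#include <petscdmda.h>

/* Start the ghost update, compute on points that need no ghost data,
   then finish the update before touching points near the subdomain
   boundary.  da is a DMDA, g a global vector, l a ghosted local vector. */
static PetscErrorCode UpdateWithOverlap(DM da, Vec g, Vec l)
{
  PetscFunctionBeginUser;
  PetscCall(DMGlobalToLocalBegin(da, g, INSERT_VALUES, l)); /* post halo exchange */
  /* ... compute on strictly interior points here ... */
  PetscCall(DMGlobalToLocalEnd(da, g, INSERT_VALUES, l));   /* complete halo exchange */
  /* ... now compute on points adjacent to the subdomain boundary ... */
  PetscFunctionReturn(PETSC_SUCCESS);
}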

> For both, do you mean dividing 1 big grid into 4 55x35 grids?

Yes, instead of 4 thin slices.  And so on as you refine.  DA does this
automatically; just don't choose a prime number of processes (because
then it would be forced into doing slices).
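
As a sketch (using the current DM-prefixed names and PetscCall, which
postdate this thread; the 110x70 global grid is only a guess consistent
with the 4 x 55x35 figure above), passing PETSC_DECIDE for the process
grid lets the DA pick the layout itself, typically 2x2 on 4 processes:

#include <petscdmda.h>

int main(int argc, char **argv)
{
  DM       da;
  PetscInt m, n;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* 110x70 global grid, scalar problem, 1-deep ghost layer; PETSC_DECIDE
     lets PETSc choose the process grid. */
  PetscCall(DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                         DMDA_STENCIL_STAR, 110, 70,
                         PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, &da));
  PetscCall(DMSetFromOptions(da));
  PetscCall(DMSetUp(da));
  PetscCall(DMDAGetInfo(da, NULL, NULL, NULL, NULL, &m, &n, NULL, NULL, NULL,
                        NULL, NULL, NULL, NULL));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "process grid: %d x %d\n", (int)m, (int)n));
  PetscCall(DMDestroy(&da));
  PetscCall(PetscFinalize());
  return 0;
}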

> so whichever method I use (horizontal or vertical) doesn't matter? But
> splitting into 4 55x35 grids will be better?

Trying to send directly from some contiguous array is not a worthwhile
optimization.  My comment about latency was to guard against another
"optimization": sending some components of a vector problem separately
when not all components "need" updating (it is likely faster to do one
update of 5 values per ghost node than two separate updates of 1 value
per node).
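
To make the 5-values-per-node case concrete, here is a sketch (grid sizes
and names are illustrative, current DM-prefixed API): give the DA dof=5
so a single global-to-local scatter carries all five components per ghost
node in one message per neighbor:

#include <petscdmda.h>

int main(int argc, char **argv)
{
  DM  da;
  Vec g, l;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* dof = 5: all five components live interleaved in one DA/vector. */
  PetscCall(DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                         DMDA_STENCIL_BOX, 110, 70,
                         PETSC_DECIDE, PETSC_DECIDE, 5, 1, NULL, NULL, &da));
  PetscCall(DMSetFromOptions(da));
  PetscCall(DMSetUp(da));
  PetscCall(DMCreateGlobalVector(da, &g));
  PetscCall(DMCreateLocalVector(da, &l));
  /* One ghost update moves all 5 components at once. */
  PetscCall(DMGlobalToLocalBegin(da, g, INSERT_VALUES, l));
  PetscCall(DMGlobalToLocalEnd(da, g, INSERT_VALUES, l));
  PetscCall(VecDestroy(&l));
  PetscCall(VecDestroy(&g));
  PetscCall(DMDestroy(&da));
  PetscCall(PetscFinalize());
  return 0;
}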

Splitting into 4 subdomains isn't "better" than 2 subdomains, but when
using many subdomains, they should not be thin slices.

DA manages all of this.  If you have some compelling reason *not* to use
DA, you won't go far wrong by copying the design decisions in DA.


Jed

