[petsc-dev] Generality of VecScatter
Jed Brown
jedbrown at mcs.anl.gov
Fri Nov 25 15:22:56 CST 2011
On Fri, Nov 25, 2011 at 13:07, Mark F. Adams <mark.adams at columbia.edu> wrote:
> OK, that's a good start. You do need two communication steps: 1) get
> offsets, 2) get data (right?).
>
Yes, to send variable-sized data, you need two steps. For constant-sized
data, it is just one.
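For concreteness, here is a minimal sketch of that two-step pattern with
one-sided MPI, assuming each process exposes its count and its data in
separate windows; the target rank "owner", the sizes, and the window layout
are made up for illustration, not taken from any PETSc code.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  /* Each process exposes a variable-sized chunk of doubles plus its length. */
  int mylen = 2 + rank % 3;
  double *mydata = malloc(mylen * sizeof(double));
  for (int i = 0; i < mylen; i++) mydata[i] = 100.0 * rank + i;

  MPI_Win lenwin, datawin;
  MPI_Win_create(&mylen, sizeof(int), sizeof(int), MPI_INFO_NULL,
                 MPI_COMM_WORLD, &lenwin);
  MPI_Win_create(mydata, mylen * sizeof(double), sizeof(double),
                 MPI_INFO_NULL, MPI_COMM_WORLD, &datawin);

  int owner = (rank + 1) % size;   /* hypothetical process we need data from */

  /* Step 1: fetch the remote size. */
  int remotelen;
  MPI_Win_fence(0, lenwin);
  MPI_Get(&remotelen, 1, MPI_INT, owner, 0, 1, MPI_INT, lenwin);
  MPI_Win_fence(0, lenwin);

  /* Step 2: now that the size is known, allocate and fetch the data itself. */
  double *ghost = malloc(remotelen * sizeof(double));
  MPI_Win_fence(0, datawin);
  MPI_Get(ghost, remotelen, MPI_DOUBLE, owner, 0, remotelen, MPI_DOUBLE, datawin);
  MPI_Win_fence(0, datawin);

  free(ghost);
  free(mydata);
  MPI_Win_free(&datawin);
  MPI_Win_free(&lenwin);
  MPI_Finalize();
  return 0;
}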
> You avoid packing the send data but you need to unpack the receive data,
> converting the raw graph data into the reduced ghost data. Not simple, but
> writing from one side only and with only one-to-one communication is
> attractive.
>
That depends a bit on what representation you want for the ghost data. You
never "unpack" in a by-rank way since it gets put into the correct part of
the domain according to where an MPI_Datatype says to put it. (This was
part of the local setup when defining the MPI_Get() calls, but those are
internal to the library.)
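To illustrate what I mean (this is a hypothetical setup, not the library's
internals; the window contents, ghost slots, and displacements are invented):
the origin datatype handed to MPI_Get describes where the fetched values land
in the local array, so no unpack step follows.

#include <mpi.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  double owned[4];                 /* locally owned entries, exposed in a window */
  for (int i = 0; i < 4; i++) owned[i] = 4.0 * rank + i;
  double local[6];                 /* owned copy followed by two ghost slots */
  for (int i = 0; i < 4; i++) local[i] = owned[i];

  MPI_Win win;
  MPI_Win_create(owned, 4 * sizeof(double), sizeof(double),
                 MPI_INFO_NULL, MPI_COMM_WORLD, &win);

  /* Origin datatype: the ghost region lives at local indices 4 and 5, so the
   * datatype scatters the fetched values straight into those slots. */
  int blocklens[2] = {1, 1}, local_disps[2] = {4, 5};
  MPI_Datatype ghost_type;
  MPI_Type_indexed(2, blocklens, local_disps, MPI_DOUBLE, &ghost_type);
  MPI_Type_commit(&ghost_type);

  int owner = (rank + 1) % size;   /* hypothetical owner of our ghost points */
  MPI_Win_fence(0, win);
  /* Fetch two contiguous entries starting at offset 1 of the owner's window;
   * there is no by-rank unpack because the origin datatype did the placement. */
  MPI_Get(local, 1, ghost_type, owner, 1, 2, MPI_DOUBLE, win);
  MPI_Win_fence(0, win);

  MPI_Type_free(&ghost_type);
  MPI_Win_free(&win);
  MPI_Finalize();
  return 0;
}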
If you are increasing the level of ghosting, you probably want to number
the new ghost nodes that you acquired and make the local data structure
point directly at them instead of at the (rank, offset) at which they are
owned. This is a global-to-local translation of the connectivity, so you
can do it with a hash table or sorting. I think you can also skip sorting
by pushing the new indices back through the global space.
That is, I effectively have two local-to-global maps, both pointing at the
same global space: L1toG and L2toG. I want to translate some local
information defined on L1 into L2. I can do this by fetching global indices
and inverting the local part of the map, or (with the assistance of some
memory provided by the global owner) I can do a gather L2 -> G* (G* has a
slot for each process sharing via this map) and then get those indices back
in L1 using scatter G* -> L1. I think the hash is usually better, but with
a fast network and/or highly vectorized hardware, there might be an
advantage to pushing the indices L2 -> G* -> L1 without the need for
hashing or sorting.
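A small serial sketch of the sort-based variant of that translation, with
qsort/bsearch standing in for a hash table; the L1toG map and the connectivity
array below are made-up examples rather than anything from PETSc or GAMG.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int global, local; } GLPair;

static int cmp_pair(const void *a, const void *b)
{
  return ((const GLPair *)a)->global - ((const GLPair *)b)->global;
}

static int cmp_key(const void *key, const void *elem)
{
  return *(const int *)key - ((const GLPair *)elem)->global;
}

int main(void)
{
  /* Local-to-global map for the L1 space (owned plus existing ghosts). */
  int L1toG[] = {40, 7, 12, 33, 5};
  int nL1 = 5;
  /* Connectivity of newly acquired points, still in global numbering. */
  int conn_global[] = {12, 5, 40, 7};
  int nconn = 4;

  /* Invert the local part of L1toG: sort (global, local) pairs by global index. */
  GLPair *pairs = malloc(nL1 * sizeof(GLPair));
  for (int i = 0; i < nL1; i++) { pairs[i].global = L1toG[i]; pairs[i].local = i; }
  qsort(pairs, nL1, sizeof(GLPair), cmp_pair);

  /* Translate each global index to the local index it maps to. */
  for (int i = 0; i < nconn; i++) {
    GLPair *hit = bsearch(&conn_global[i], pairs, nL1, sizeof(GLPair), cmp_key);
    printf("global %d -> local %d\n", conn_global[i], hit ? hit->local : -1);
  }
  free(pairs);
  return 0;
}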
> So keep going. AMR is good but I'm not sure I'm up to specifying a
> complete unstructured AMR algorithm at the moment. My other example of AMG
> coarse grid process aggregation is another....
>
> Or perhaps a parallel maximal independent set algorithm, preferably my
> algorithm which is implemented in GAMG and documented in an old paper.
>
MIS is straightforward. To define the parallel graph, let's say that
ghosting processes have (owner rank, index) of neighbors. An owned point is
only a candidate for selection if all neighbors have the same rank or
smaller (according to Rule 2 of your paper).
while (candidates remain):
    forall v in owned_candidates with neighbor ranks smaller than p:
        mark v as selected, mark neighbors(v) as deleted (in local space)
    reduce status to owner
    broadcast status to ghosters
In any round, multiple procs might have marked the same vertex for deletion,
but only if the vertex has a neighbor with a larger owner rank, in which case
it could not have been selected.
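Here is a hedged serial skeleton of those rounds; the tiny graph is made up,
everything is owned locally so the rank filter is trivially satisfied, and the
reduce/broadcast steps are only marked as comments where they would occur in
the parallel version.

#include <stdio.h>

enum Status { CANDIDATE, SELECTED, DELETED };

int main(void)
{
  /* Tiny made-up graph on one process: a chain of 5 vertices, adjacency lists
   * padded with -1. */
  int nv = 5;
  int adj[][2] = {{1, -1}, {0, 2}, {1, 3}, {2, 4}, {3, -1}};
  enum Status status[5] = {CANDIDATE, CANDIDATE, CANDIDATE, CANDIDATE, CANDIDATE};

  int remaining = nv;
  while (remaining > 0) {
    /* forall v in owned candidates (in parallel, only those whose neighbor
     * ranks are no larger than ours): select v, delete its candidate neighbors. */
    for (int v = 0; v < nv; v++) {
      if (status[v] != CANDIDATE) continue;
      status[v] = SELECTED;
      for (int k = 0; k < 2; k++) {
        int w = adj[v][k];
        if (w >= 0 && status[w] == CANDIDATE) status[w] = DELETED;
      }
    }
    /* In parallel: reduce ghost statuses to the owner (a vertex deleted by any
     * sharer becomes DELETED), then broadcast the owner's status back to the
     * ghosting processes before the next round. */
    remaining = 0;
    for (int v = 0; v < nv; v++) if (status[v] == CANDIDATE) remaining++;
  }

  for (int v = 0; v < nv; v++)
    printf("vertex %d: %s\n", v, status[v] == SELECTED ? "in MIS" : "deleted");
  return 0;
}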