[petsc-dev] Generality of VecScatter

Jed Brown jedbrown at mcs.anl.gov
Fri Nov 25 09:51:19 CST 2011


On Fri, Nov 25, 2011 at 09:38, Mark F. Adams <mark.adams at columbia.edu> wrote:

> My point is that at some point the rubber has to hit the road and you need
> two-sided information.  You can then set up pointers for some fast one-sided
> stuff, and that is fine, but you seem to agree that this is trivial.  It's
> not clear to me that your model is not simply removing some (redundant)
> data from the user space, while it is probably in the MPI library.
>

Whether this information is ever visible inside the MPI stack depends on
the network. In general, it is never all stored at once within the stack, and
the operations may be done without involving the CPU on the owning process
(some network cards can perform these operations directly on memory). If the
MPI stack wanted to, it could log these operations, but I really meant it when
I said it's not needed.

Even with some funky implementation that built full two-sided information
internally, I think the simpler user-level abstraction would be worthwhile.
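
To make that concrete, here is a minimal sketch (not PETSc or VecScatter
code; the owner rank and the offset into its window are simply assumed to be
known on the ghosting side). The owner only exposes the array it owns, and any
rank that ghosts a point fetches it with MPI_Get, so the owner never
enumerates its ghosters:

/* Minimal sketch, not PETSc code: each rank exposes the values it owns in an
 * MPI window; a rank that ghosts a point fetches it with MPI_Get.  The owner
 * rank and the offset within its window are assumed known on the ghosting
 * side; the owner never learns who, or how many, ghosted it. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int    rank, size, owner;
  double owned[4], ghost;
  MPI_Win win;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  for (int i = 0; i < 4; i++) owned[i] = 100.0*rank + i;  /* values this rank owns */
  MPI_Win_create(owned, 4*sizeof(double), sizeof(double),
                 MPI_INFO_NULL, MPI_COMM_WORLD, &win);

  /* Fetch entry 2 from the next rank; the target takes no matching action. */
  owner = (rank + 1) % size;
  MPI_Win_lock(MPI_LOCK_SHARED, owner, 0, win);
  MPI_Get(&ghost, 1, MPI_DOUBLE, owner, 2, 1, MPI_DOUBLE, win);
  MPI_Win_unlock(owner, win);

  printf("[%d] ghosted %g from rank %d\n", rank, ghost, owner);
  MPI_Win_free(&win);
  MPI_Finalize();
  return 0;
}

With passive-target synchronization like this, the owning process's CPU need
not be involved at all when the network supports it.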


>
>> This information is communicated to the ghosters of that point (again
>> without knowing how many there are),
>
> But you knew how many there were, and chose to forget it, and the MPI
> layer must know how many there are so it knows when it's done communicating.
> I'm not sure how you ensure correctness with your model.
>

I never knew. We can ask: what is required to add another level of
ghosting? With my model, the owner never needs to be informed that some
procs are ghosting another level, nor be aware of what those ghosted nodes
are. It may place some data in an array for the "pointwise broadcast" of
connectivity, but it doesn't need semantic knowledge that that information
will be used to increase the ghosting. Similarly, any process can stop
ghosting a point without informing the owner in any way.
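
Put differently, the only state that changes is on the ghosting side. A
hypothetical record like the following (my naming, nothing from PETSc;
MPI_Aint comes from mpi.h) is all a rank needs per ghosted point, and growing
or shrinking a list of these never touches the owner:

/* Hypothetical ghoster-side record: where to find one ghosted point. */
typedef struct {
  int      owner;   /* rank that owns the point */
  MPI_Aint offset;  /* displacement into the owner's exposed window */
} GhostLink;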

Completion comes through MPI_Win_fence(), which is collective, but not
globally synchronizing (so neighbor communication must have completed, but
it's not like an MPI_Barrier or MPI_Allreduce).
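
Assuming the window and variables from the sketch above, the fence-bounded
variant is just:

MPI_Win_fence(0, win);    /* open an access/exposure epoch on win */
MPI_Get(&ghost, 1, MPI_DOUBLE, owner, 2, 1, MPI_DOUBLE, win);
MPI_Win_fence(0, win);    /* all Gets/Puts on win have completed here */

Every rank in the window's communicator calls the fence, but there is no
reduction payload; it only has to ensure that the RMA operations on that
window have completed.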

>> I can assume that my fine grid was well partitioned, and to keep the
>> process local assume the fine grid was partitioned with a space-filling
>> curve, or something, such that when I aggregate processors (e.g., (0-3),
>> (4-7) ... if I reduced the number of active processors by 4) I still have
>> decent data locality.  Eventually I would like a local algorithm to improve
>> load balance at this point, but that is more than you might want to start
>> with.
>>
>
>> If you have a local graph partitioner that can tell you what to move for
>> better load balance, then I think I can move it efficiently and without
>> redundant storage or specification.
>
> How do you load balance with a local partitioner?


You said you wanted a local algorithm to improve load balance. I took that
to mean that you would redistribute within a subcommunicator and, with
simple communication, inform the rest of the processes of the new
partition. I think this will work fine.
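
Concretely, the grouping itself is just a communicator split; the local
partitioner and the actual movement of points are the application-specific
part and are left as a comment below (the factor of 4 matches your
(0-3), (4-7), ... example and is only an assumption):

/* Sketch of aggregating ranks in groups of 4; the local partitioner and the
 * redistribution of points would run entirely within 'subcomm', and the
 * remaining processes can then be told about the new owners with ordinary
 * neighbor communication. */
#include <mpi.h>

int main(int argc, char **argv)
{
  int      rank;
  MPI_Comm subcomm;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  const int factor = 4;                         /* aggregate 4 fine ranks per group */
  MPI_Comm_split(MPI_COMM_WORLD, rank/factor, rank, &subcomm);

  /* ... run the local partitioner and redistribute within subcomm ... */

  MPI_Comm_free(&subcomm);
  MPI_Finalize();
  return 0;
}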

What did you have in mind?