[petsc-dev] Parallel Gauss - Seidel

Mark F. Adams mark.adams at columbia.edu
Sun Jun 10 18:22:59 CDT 2012


On Jun 10, 2012, at 7:02 PM, Barry Smith wrote:

> 
> On Jun 10, 2012, at 5:54 PM, Mark F. Adams wrote:
> 
>> Barry,
>> 
>> Your algorithm is the classic CS algorithm.  My algorithm exploits the (low-dimensional) geometric nature of our graphs.  Assuming you have a decent partitioning then, as Jed says, you have, for instance, "interior" vertices that have no external dependence, so coloring them is silly.
> 
>    Sure, absolutely, in the MPI world. If you are going for vectorization it may not be silly.

Yes, but you are just in the same boat as Mat-Vec.  And you still have the freedom to do what you want even if you may end up coloring.
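
The "interior" vertices point above can be made concrete. Here is a hypothetical serial sketch (not PETSc code; the `classify` helper and the toy chain graph are illustrative assumptions) that splits a rank's vertices into interior and boundary sets; the interior set has no off-process dependency, so a Gauss-Seidel sweep can relax it in any order without waiting for messages:

```python
# Hypothetical sketch (not PETSc code): classify locally owned vertices of a
# partitioned graph as "interior" (all neighbors on-process) or "boundary"
# (at least one off-process neighbor).  Interior vertices carry no external
# data dependency, so coloring them buys nothing.

def classify(vertices, owner, adjacency, rank):
    """Split this rank's owned vertices into interior and boundary lists."""
    interior, boundary = [], []
    for v in vertices:
        if owner[v] != rank:
            continue
        if all(owner[u] == rank for u in adjacency[v]):
            interior.append(v)
        else:
            boundary.append(v)
    return interior, boundary

# Toy example: a 1-D chain 0-1-2-3-4-5 split between two ranks.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
owner = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

interior0, boundary0 = classify(range(6), owner, adjacency, 0)
print(interior0, boundary0)  # [0, 1] [2] -- only vertex 2 touches rank 1
```

For a well-partitioned PDE mesh the interior set dominates, which is why coloring the whole local subdomain is wasted effort.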

> 
>> The algorithm enforces only the (external) data dependencies that are required for correctness, which for normal PDE problems results in a far shorter dependency depth than generic coloring and leaves you some freedom to reorder for performance.
>> 
>> From a theoretical point of view there is actually a coloring in the algorithm, but it is not obvious.  For instance, if there is only one vertex per process then it basically reverts to a coloring algorithm (although it is still asynchronous) where the processor IDs (in the way that I implement it, for simplicity) are the "random" numbers that one can use for an MIS or coloring.
> 
>   I think it would be useful to understand the algorithm exactly in terms of "block colorings", "dependency graphs", etc.
> 

I think of it entirely in terms of dependency graphs, combined with the obvious intuition about the nature of PDE graphs.  The paper explains the algorithm exactly in these terms, and brings up its connection to graph coloring only out of theoretical/pedagogical interest.  There may be better or different ways to understand the algorithm ...
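
The "far shorter dependency depth" claim can be illustrated with a hypothetical sketch (not from the paper; the grid layout and `grid_depth` helper are my assumptions): view the ranks themselves as a dependency DAG in which each rank waits only on neighboring ranks with smaller ID, the inter-process dependencies a consistent Gauss-Seidel ordering requires. The longest path in that DAG is the number of pipeline stages:

```python
# Hypothetical sketch: ranks on a px-by-py grid, each depending on its
# west and south neighbors (the smaller-ID neighbors).  The longest
# dependency chain is px + py - 1, far less than the px * py stages a
# fully sequential rank-by-rank sweep would need.

from functools import lru_cache

def grid_depth(px, py):
    """Longest dependency chain over all ranks on a px-by-py grid."""
    @lru_cache(maxsize=None)
    def level(i, j):
        # A rank's stage is one more than the latest neighbor it waits on.
        deps = []
        if i > 0:
            deps.append(level(i - 1, j))
        if j > 0:
            deps.append(level(i, j - 1))
        return 1 + max(deps, default=0)
    return max(level(i, j) for i in range(px) for j in range(py))

print(grid_depth(4, 4))  # 7 pipeline stages instead of 16 sequential ranks
```

Generic coloring would instead serialize over the number of colors on the fine graph, with the poor memory access pattern Jed mentions below in the thread.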

Mark

> 
>    Barry
> 
>> 
>> Mark
>> 
>> On Jun 10, 2012, at 1:36 PM, Jed Brown wrote:
>> 
>>> On Sun, Jun 10, 2012 at 12:30 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>  For clarification it sounds like you color your matrix (with independent sets) and then march through the colors, each color gets smoothed (in parallel since they are independent) then on to the next color?
>>> 
>>> Not independent sets, just by data dependency. For example, everyone relaxes a block along the right boundary (implying an ordering that visits those blocks in rank order, except that those strips never touch each other, so they can be done in parallel), then sends the updated state to the proc on the right. While waiting for the message, a process might smooth another set that does not have a data dependency. When a process receives the first message (from the proc to the left), it can relax that part. There is a picture in Mark's paper that should make the algorithm clear. This approach doesn't suffer from the atrocious memory access pattern of classical coloring; it is also asynchronous and completes with fewer messages in total.
>> 
> 
> 

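
The pipelined sweep Jed describes in the quoted thread can be simulated serially. This is a hypothetical sketch (not MPI code; the event bookkeeping and `pipelined_sweep` helper are illustrative assumptions), for a 1-D chain of ranks: each rank relaxes its independent set while it "waits", and relaxes its dependent boundary strip only once the update from the rank to its left has arrived:

```python
# Hypothetical serial simulation of the pipelined sweep: interior work
# overlaps communication, boundary strips complete in rank order, and
# each neighbor pair exchanges one message per sweep -- versus one round
# of messages per color in a classically colored sweep.

def pipelined_sweep(nranks):
    """Return the global order of (rank, task) events and the message count."""
    events, messages = [], 0
    ready = {0}  # rank 0's boundary strip has no left dependency
    # Every rank first relaxes the set with no off-process dependency.
    for r in range(nranks):
        events.append((r, "interior"))
    # Boundary strips then complete in pipeline (rank) order.
    for r in range(nranks):
        assert r in ready            # dependency satisfied before relaxing
        events.append((r, "boundary"))
        if r + 1 < nranks:
            messages += 1            # send updated state to the right
            ready.add(r + 1)
    return events, messages

events, messages = pipelined_sweep(4)
print(messages)  # 3 messages for a 4-rank chain
```

With k colors, a colored sweep would instead need a message round per color per neighbor pair, which is the "fewer messages in total" point.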


