<div dir="ltr">Thank you Matt. Indeed I have looked into p4est and also Dendro. p4est uses parallel octrees/quadtrees but for what I intend to do I only need to distribute a single tree that is created in serial among processors. I definitely like to have the tree data-structure in parallel but that would be another project. I also looked into Dendro and they kind of follow the same strategy. i.e every single processor has a local copy of the whole tree. What they do differently, however, is they somehow manage to use DA instead of a general unstructured numbering which is quite interesting but I still don't know how they do it. Unfortunately, they do not handle (as far as I understood from their manual) non-graded trees which are the ones I work with.<div>
So all I need to do is somehow distribute my grid among processors, and since each one has a local copy of the data structure, I can get around the problem. Just another question: if the partitioning is not unique, do you at least get a better numbering than the one you start with?
Mohammad

On Thu, Jul 28, 2011 at 9:25 PM, Matthew Knepley <knepley@gmail.com> wrote:
<div class="im">On Fri, Jul 29, 2011 at 3:49 AM, Mohammad Mirzadeh <span dir="ltr"><<a href="mailto:mirzadeh@gmail.com" target="_blank">mirzadeh@gmail.com</a>></span> wrote:<br></div><div class="gmail_quote"><div class="im">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Hi all,<div><br></div><div>I am trying to write a code to do parallel computation on quadtree adaptive grids and to do so , I need to distribute the grid in parallel. I have selected a general unstructured framework for telling PETSc about my node numbering. An example of such grid is schematically shown below. </div>
0) If you are doing this, I think you should at least look at the p4est package before proceeding.
<div dir="ltr">
<div><br></div><div>1 16 7 3 </div><div>+---------------+---------------+------------------------------+</div><div>| | | |</div>
<div>| | | |</div>
<div>|14 | 15 | 17 |</div>
<div>+---------------+---------------+ |</div>
<div>| | | |</div><div>| | | |</div>
<div>| 4 | 12 | 6 |8</div><div>+---------------+---------------+------------------------------+</div><div>
<div>| | | |</div><div>| | | |</div>
<div>| 9 | 11 | 13 |</div><div>+---------------+---------------+ |</div><div>
| | | |</div><div>| | | |</div>
<div>| 0 | 10 |5 | 2</div></div><div>+---------------+---------------+------------------------------+</div>
<div>
<pre style="margin-top:0px;margin-bottom:0px;margin-left:0px;margin-right:0px;text-indent:0px"><br></pre><pre style="margin-top:0px;margin-bottom:0px;margin-left:0px;margin-right:0px;text-indent:0px"><font face="arial"><span style="white-space:normal">To distribute this in parallel I am using the ParMetis interface via MatPartitioning object and I follow(more or less) the example in $PETSC_DIR/src/dm/ao/examples/tutorials/ex2.c; To make the initial distribution, I choose nodal based partitioning by creating the adjacency matrix, for which I create ia and ja arrays accordingly. </span></font><span style="font-family:arial;white-space:normal">once the grid is processed and the new orderings are generated, I follow all required steps to generate the AO needed to map between PETSc ordering and the new global numbering and this is the result:</span></pre>
This is the result:

Number of elements in ordering 18
PETSc->App  App->PETSc
   0    9      0    1
   1    0      1    3
   2   10      2    4
   3    1      3    7
   4    2      4   12
   5   11      5   14
   6   12      6   15
   7    3      7   16
   8   13      8   17
   9   14      9    0
  10   15     10    2
  11   16     11    5
  12    4     12    6
  13   17     13    8
  14    5     14    9
  15    6     15   10
  16    7     16   11
  17    8     17   13

Now I have two questions/concerns:

1) Do processors always get the nodes in contiguous chunks of the PETSc ordering, i.e. 0-8 on rank 0 and 9-17 on rank 1? If so, this particular ordering does not seem "good" for this grid, since it cuts too many edges of the graph (13 here), and just by looking at the graph I can think of a better distribution with only 6 edge cuts ({0, 9, 4, 14, 1, 10, 11, 12, 15} on rank 0 and the rest on rank 1). For reference, a small routine for counting cuts is sketched below.
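Here is the kind of count I mean (a minimal sketch; CountEdgeCuts is just an illustrative name, and it assumes the full CSR graph is at hand, which it is here since the tree is built in serial):

#include <petscsys.h>

/* Count the edges of the node graph cut by a given partition.
   n      - number of nodes (18 for the grid above)
   ia/ja  - full CSR adjacency of the graph: the neighbors of node v
            are ja[ia[v]] .. ja[ia[v+1]-1]
   owner  - owner[v] = rank that node v is assigned to
   Every undirected edge is stored twice in ja, so only ordered pairs
   with v < ja[k] are counted. */
PetscInt CountEdgeCuts(PetscInt n, const PetscInt ia[], const PetscInt ja[],
                       const PetscInt owner[])
{
  PetscInt v, k, cuts = 0;
  for (v = 0; v < n; v++) {
    for (k = ia[v]; k < ia[v + 1]; k++) {
      if (v < ja[k] && owner[v] != owner[ja[k]]) cuts++;
    }
  }
  return cuts;
}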
Yes, the PETSc ordering is always contiguous. Perhaps you are not providing the graph you think you are for partitioning.
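One quick way to check is to view the graph you actually pass in, e.g. (a minimal sketch, where adj stands for the MPIAdj matrix given to MatPartitioningSetAdjacency()):

  ierr = MatView(adj, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

and compare the printed rows against the neighbor lists you expect for each node.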
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><div>2) Isn't it true that the final distribution should be independent of initial grid numbering? When I try the same grid but with the following (hand-generated) numbering:</div><div><br></div><div>
14              15              16                             17
+---------------+---------------+------------------------------+
|               |               |                              |
|               |               |                              |
|11             | 12            | 13                           |
+---------------+---------------+                              |
|               |               |                              |
|               |               |                              |
| 7             | 8             | 9                            |10
+---------------+---------------+------------------------------+
|               |               |                              |
|               |               |                              |
| 4             | 5             | 6                            |
+---------------+---------------+                              |
|               |               |                              |
|               |               |                              |
| 0             | 1             |2                             | 3
+---------------+---------------+------------------------------+
I get the following AO:

Number of elements in ordering 18
PETSc->App  App->PETSc
   0    9      0    9
   1   10      1   10
   2   11      2   11
   3   12      3   12
   4   13      4   13
   5   14      5   14
   6   15      6   15
   7   16      7   16
   8   17      8   17
   9    0      9    0
  10    1     10    1
  11    2     11    2
  12    3     12    3
  13    4     13    4
  14    5     14    5
  15    6     15    6
  16    7     16    7
  17    8     17    8

which is simply the initial ordering, with only a change in which processor handles which half of the nodes. Could it be that the partitioning is not unique, and each time the algorithm only finds the "best" ordering it can, given the initial distribution? If so, how do I know which ordering to start with?
Yes, ParMetis does not provide a unique "best" ordering; computing one is at least NP-complete, if not worse.

   Matt
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><div>I am really confused and would appreciate if someone could provide some insights.</div><div><br></div><div>Thanks,</div><div>Mohammad</div></div>
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener