<div dir="ltr">Hi all,<div><br></div><div>I am trying to write a code to do parallel computation on quadtree adaptive grids and to do so , I need to distribute the grid in parallel. I have selected a general unstructured framework for telling PETSc about my node numbering. An example of such grid is schematically shown below. </div>
<div><br></div><div>1 16 7 3 </div><div>+---------------+---------------+------------------------------+</div><div>| | | |</div>
<meta http-equiv="content-type" content="text/html; charset=utf-8"><meta http-equiv="content-type" content="text/html; charset=utf-8"><meta http-equiv="content-type" content="text/html; charset=utf-8"><div>| | | |</div>
<meta http-equiv="content-type" content="text/html; charset=utf-8"><meta http-equiv="content-type" content="text/html; charset=utf-8"><meta http-equiv="content-type" content="text/html; charset=utf-8"><div>|14 | 15 | 17 |</div>
<meta http-equiv="content-type" content="text/html; charset=utf-8"><meta http-equiv="content-type" content="text/html; charset=utf-8"><meta http-equiv="content-type" content="text/html; charset=utf-8"><div>+---------------+---------------+ |</div>
<meta http-equiv="content-type" content="text/html; charset=utf-8"><div><meta http-equiv="content-type" content="text/html; charset=utf-8">| | | |</div><div><meta http-equiv="content-type" content="text/html; charset=utf-8">| | | |</div>
<div><meta http-equiv="content-type" content="text/html; charset=utf-8">| 4 | 12 | 6 |8</div><div>+---------------+---------------+------------------------------+</div><div>
<meta http-equiv="content-type" content="text/html; charset=utf-8"><div><meta http-equiv="content-type" content="text/html; charset=utf-8">| | | |</div><div><meta http-equiv="content-type" content="text/html; charset=utf-8">| | | |</div>
<div><meta http-equiv="content-type" content="text/html; charset=utf-8">| 9 | 11 | 13 |</div><div>+---------------+---------------+ |</div><meta http-equiv="content-type" content="text/html; charset=utf-8"><div>
<meta http-equiv="content-type" content="text/html; charset=utf-8">| | | |</div><div><meta http-equiv="content-type" content="text/html; charset=utf-8">| | | |</div>
<div><meta http-equiv="content-type" content="text/html; charset=utf-8">| 0 | 10 |5 | 2</div></div><div><meta http-equiv="content-type" content="text/html; charset=utf-8">+---------------+---------------+------------------------------+</div>
<div><meta name="qrichtext" content="1"><style type="text/css">
p, li { white-space: pre-wrap; }
</style>
<pre style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;"><br></pre><pre style=" margin-top:0px; margin-bottom:0px; margin-left:0px; margin-right:0px; -qt-block-indent:0; text-indent:0px;">
<meta http-equiv="content-type" content="text/html; charset=utf-8"><font class="Apple-style-span" face="arial"><span class="Apple-style-span" style="white-space: normal;">To distribute this in parallel I am using the ParMetis interface via MatPartitioning object and I follow(more or less) the example in $PETSC_DIR/src/dm/ao/examples/tutorials/ex2.c; To make the initial distribution, I choose nodal based partitioning by creating the adjacency matrix, for which I create ia and ja arrays accordingly. </span></font><span class="Apple-style-span" style="font-family: arial; white-space: normal; ">once the grid is processed and the new orderings are generated, I follow all required steps to generate the AO needed to map between PETSc ordering and the new global numbering and this is the result:</span></pre>
</div><div><br></div><div><div>Number of elements in ordering 18</div><div>PETSc->App App->PETSc</div><div> 0 9 0 1</div><div> 1 0 1 3</div><div> 2 10 2 4</div>
<div> 3 1 3 7</div><div> 4 2 4 12</div><div> 5 11 5 14</div><div> 6 12 6 15</div><div> 7 3 7 16</div><div> 8 13 8 17</div>
<div> 9 14 9 0</div><div> 10 15 10 2</div><div> 11 16 11 5</div><div> 12 4 12 6</div><div> 13 17 13 8</div><div> 14 5 14 9</div>
<div> 15 6 15 10</div><div> 16 7 16 11</div><div> 17 8 17 13</div></div><div><br></div><div>Now I have two questions/concerns: </div><div><br></div><div>1) Do processors always have the nodes in contiguous chunks of PETSc ordering? i.e 0-8 on rank 0 and 9-17 on rank 1 ? If so, this particular ordering does not seem to be "good" for this grid since it seems to cross too many edges in the graph (here 13 edges) and by just looking at the graph I can(at least) think of a better distribution with only 6 edge cuts. (if you are wondering how, having {0,9,4,14,1,10,11,12,15} on rank 0 and rest on rank 1).</div>
<div><br></div><div>2) Isn't it true that the final distribution should be independent of initial grid numbering? When I try the same grid but with the following (hand-generated) numbering:</div><div><br></div><div>
<meta http-equiv="content-type" content="text/html; charset=utf-8"><div>
14 15 16 17 </div><div>+---------------+---------------+------------------------------+</div><div>| | | |</div>
<div>| | | |</div><div>|11 | 12 | 13 |</div><div>+---------------+---------------+ |</div>
<div>| | | |</div><div>| | | |</div><div>| 7 | 8 | 9 |10</div>
<div>+---------------+---------------+------------------------------+</div><div><div>| | | |</div><div>| | | |</div>
<div>| 4 | 5 | 6 |</div><div>+---------------+---------------+ |</div><div>| | | |</div>
<div>| | | |</div><div>| 0 | 1 |2 | 3</div></div><div>+---------------+---------------+------------------------------+</div>
</div><div><br></div><div>I get the following AO:</div><div><br></div><div><div>Number of elements in ordering 18</div><div>PETSc->App App->PETSc</div><div> 0 9 0 9</div><div> 1 10 1 10</div>
<div> 2 11 2 11</div><div> 3 12 3 12</div><div> 4 13 4 13</div><div> 5 14 5 14</div><div> 6 15 6 15</div><div>
7 16 7 16</div>
<div> 8 17 8 17</div><div> 9 0 9 0</div><div> 10 1 10 1</div><div> 11 2 11 2</div><div> 12 3 12 3</div><div> 13 4 13 4</div>
<div> 14 5 14 5</div><div> 15 6 15 6</div><div> 16 7 16 7</div><div> 17 8 17 8</div></div><div><br></div><div><br></div><div>which is simply the initial ordering with a change in the order in which processors handle nodes. Could it be that the partitioning is not unique and each time the algorithm only tries to obtain the "best" possible ordering depending on the initial distribution? If so, how should I know what ordering to start with?</div>
<div><br></div><div>I am really confused and would appreciate if someone could provide some insights.</div><div><br></div><div>Thanks,</div><div>Mohammad</div></div>