On Mon, Mar 7, 2011 at 11:39 AM, M. Scot Breitenfeld <span dir="ltr">&lt;<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a>&gt;</span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

On 03/04/2011 03:20 PM, Matthew Knepley wrote:<br>

&gt; On Fri, Mar 4, 2011 at 3:14 PM, M. Scot Breitenfeld &lt;<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a><br>

&gt; &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a>&gt;&gt; wrote:<br>

&gt;<br>

&gt;     On 03/03/2011 12:18 PM, Matthew Knepley wrote:<br>

&gt;     &gt; On Wed, Mar 2, 2011 at 4:52 PM, M. Scot Breitenfeld<br>

&gt;     &lt;<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a> &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a>&gt;<br>

&gt;     &gt; &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a> &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a>&gt;&gt;&gt; wrote:<br>

&gt;     &gt;<br>

&gt;     &gt;     I don&#39;t number my global degree&#39;s of freedom from low-high<br>

&gt;     &gt;     continuously<br>

&gt;     &gt;     per processor as PETSc uses for ordering, but I use the natural<br>

&gt;     &gt;     ordering<br>

&gt;     &gt;     of the application, I then use AOcreateBasic to obtain the<br>

&gt;     mapping<br>

&gt;     &gt;     between the PETSc and my ordering.<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     &gt; I would suggest using the LocalToGlobalMapping functions, which are<br>

&gt;     &gt; scalable.<br>

&gt;     &gt; AO is designed for complete global permutations.<br>

&gt;     I don&#39;t understand how I can avoid not using AO if my global dof per<br>

&gt;     processor are not arranged in PETSc global ordering (continuous row<br>

&gt;     ordering, i.e. proc1_ 0-n, proc2_n+1:m, proc3_m+1:p, etc...). In the<br>

&gt;     LocalToGlobalMapping routines, doesn&#39;t the &quot;GlobalMapping&quot; part mean<br>

&gt;     PETSc ordering and not my application&#39;s ordering.<br>

&gt;<br>

&gt;     I thought I understood the difference between AO and<br>

&gt;     LocalToGlobalMapping but now I&#39;m confused. I tried to use the<br>

&gt;     LocalToGlobalMapping routines and the solution values are correct but<br>

&gt;     the ordering corresponds the global node ordering, not how I<br>

&gt;     partitioned<br>

&gt;     the mesh. In other words, the values are returned in the same ordering<br>

&gt;     as for a serial run, which makes sense since this is how PETSc orders<br>

&gt;     the rows. If I had used PETSc ordering then this would be fine.<br>

&gt;<br>

&gt;     Is the moral of the story, if I want to get scalability I need to<br>

&gt;     rearrange my global dof in PETSc ordering so that I can use<br>

&gt;     LocalToGlobalMapping?<br>

&gt;<br>

&gt;<br>

&gt; I am having a really hard time understanding what you want? If you<br>

&gt; want Natural<br>

&gt; Ordering or any other crazy ordering on input/output go ahead and use<br>

&gt; AO there<br>

&gt; because the non-scalability is amortized over the run. The PETSc<br>

&gt; ordering should<br>

&gt; be used for all globally assembled structures in the solve because its<br>

&gt; efficient, and<br>

&gt; there is no reason for the user to care about these structures. For<br>

&gt; integration/assembly,<br>

&gt; use local orderings since that is all you need for a PDE. If you have<br>

&gt; an exotic<br>

&gt; equation that really does need global information, I would like to<br>

&gt; hear about it, but<br>

&gt; it would most likely be non-scalable on its own.<br>

I don&#39;t think I&#39;m doing anything crazy, just probably a misunderstanding<br>

on my part, I&#39;ll try to explain it again:<br>

<br>

Take this 1D example (dof matches the node numbering) where &#39;o&#39; are nodes<br>

<br>

Global--0********4********3********1********2********5<br>

______o--------o--------o--------o--------o--------o<br>

Local---0********1********2********0********1********2<br>

______|------PROC0---------|-------PROC1-----------|<br>

<br>

PROC0: indices = 0,4,3, input = 0, 1, 2<br>

PROC1: indices = 1,2,5, input = 0, 1, 2<br>

<br>

CALL ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 3, indices, mapping,<br>

ierr)<br>

CALL ISLocalToGlobalMappingApply(mapping,3,input,output,ierr)<br>

<br>

PROC0: output = 0,4,3, input = 0,1,2<br>

PROC1: output = 1,2,5, input = 0,1,2<br>

<br>

CALL VecCreateMPI(PETSC_COMM_WORLD,3, 6, b,ierr)<br>

<br>

CALL MatCreateMPISBAIJ(PETSC_COMM_WORLD, 1, &amp;<br>

       3, 3, &amp;<br>

       6, 6, &amp;<br>

       0, d_nnz, 0, o_nnz, A, ierr)<br>

<br>

CALL MatSetLocalToGlobalMapping(A, mapping, ierr)<br>

CALL VecSetLocalToGlobalMapping(b, mapping, ierr)<br>

<br>

... use MatSetValuesLocal, VecSetValuesLocal to fill arrays.<br>

<br>

... solve and place solution in &#39;b&#39; vector<br>

<br>

Now it is my understanding that when I use VecGetValues (and is also<br>

what I observe)<br>

<br>

PROC0: b will have solutions for global nodes 0,1,2<br>

PROC1: b will have solutions for global nodes 3,4,5<br>

<br>

But, I would like to have,<br>

<br>

PROC0: b will have solutions for global nodes 0,4,3<br>

PROC1: b will have solutions for global nodes 1,2,5<br>

<br>

It is my understanding that I either need to use AO to get the<br>

these values, or I should have renumber the nodes such that,<br></blockquote><div><br></div><div>Yes, however the global numbering during a solve is really only</div><div>important to the PETSc internals, and it designed to be efficient.</div>

<div>You can use AO for global reordering on input/output. As I said,</div><div>this is amortized over the entire computation, and thus some</div><div>non-scalability is usually no problem. If it is, consider renumbering</div>

<div>because the entire solve would run slowly with an arbitrary global</div><div>ordering.</div><div><br></div><div>   Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


Global--0********1********2********3********4********5<br>

______o--------o--------o--------o--------o--------o<br>

Local---0********1********2********0********1********2<br>

______|------PROC0---------|-------PROC1-----------|<br>

<br>

Scot<br>

<br>

<br>

<br>

&gt;<br>

&gt;     Matt<br>

&gt;<br>

&gt;<br>

&gt;     &gt;<br>

&gt;     &gt;   Thanks,<br>

&gt;     &gt;<br>

&gt;     &gt;      Matt<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     &gt;      CALL VecGetOwnershipRange(b, low, high, ierr)<br>

&gt;     &gt;<br>

&gt;     &gt;      icnt = 0<br>

&gt;     &gt;<br>

&gt;     &gt;      DO mi = 1, mctr ! these are the nodes local to processor<br>

&gt;     &gt;            mi_global = myglobal(mi)<br>

&gt;     &gt;<br>

&gt;     &gt;            irowx = 3*mi_global-2<br>

&gt;     &gt;            irowy = 3*mi_global-1<br>

&gt;     &gt;            irowz = 3*mi_global<br>

&gt;     &gt;<br>

&gt;     &gt;            mappings(icnt+1:icnt+3) = (/            &amp;<br>

&gt;     &gt;                 nrow_global(row_from_dof(1,mi))-1, &amp;<br>

&gt;     &gt;                 nrow_global(row_from_dof(2,mi))-1, &amp;<br>

&gt;     &gt;                 nrow_global(row_from_dof(3,mi))-1  &amp;<br>

&gt;     &gt;                 /)<br>

&gt;     &gt;<br>

&gt;     &gt;            petscOrdering(icnt+1:icnt+3) =<br>

&gt;     &gt;     (/low+icnt,low+icnt+1,low+icnt+2/)<br>

&gt;     &gt;<br>

&gt;     &gt;            icnt = icnt + 3<br>

&gt;     &gt;      END DO<br>

&gt;     &gt;<br>

&gt;     &gt;      CALL AOCreateBasic(PETSC_COMM_WORLD, icnt, mappings,<br>

&gt;     petscOrdering,<br>

&gt;     &gt;     toao, ierr)<br>

&gt;     &gt;<br>

&gt;     &gt;      DO mi = mctr+1, myn ! these are the ghost nodes not on this<br>

&gt;     processor<br>

&gt;     &gt;<br>

&gt;     &gt;         mi_global = myglobal(mi)<br>

&gt;     &gt;<br>

&gt;     &gt;         mappings(icnt+1:icnt+3) = (/            &amp;<br>

&gt;     &gt;              nrow_global(row_from_dof(1,mi))-1, &amp;<br>

&gt;     &gt;              nrow_global(row_from_dof(2,mi))-1, &amp;<br>

&gt;     &gt;              nrow_global(row_from_dof(3,mi))-1  &amp;<br>

&gt;     &gt;              /)<br>

&gt;     &gt;<br>

&gt;     &gt;         icnt = icnt + 3<br>

&gt;     &gt;      ENDDO<br>

&gt;     &gt;      CALL AOApplicationToPetsc(toao, 3*myn, mappings, ierr)<br>

&gt;     &gt;<br>

&gt;     &gt;      CALL AODestroy(toao, ierr)<br>

&gt;     &gt;<br>

&gt;     &gt;     I then use  mapping to input the values into the correct row<br>

&gt;     as wanted<br>

&gt;     &gt;     by PETSc<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     &gt;     On 03/02/2011 04:29 PM, Matthew Knepley wrote:<br>

&gt;     &gt;     &gt; On Wed, Mar 2, 2011 at 4:25 PM, M. Scot Breitenfeld<br>

&gt;     &gt;     &lt;<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a> &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a>&gt;<br>

&gt;     &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a> &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a>&gt;&gt;<br>

&gt;     &gt;     &gt; &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a> &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a>&gt;<br>

&gt;     &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a> &lt;mailto:<a href="mailto:brtnfld@uiuc.edu">brtnfld@uiuc.edu</a>&gt;&gt;&gt;&gt; wrote:<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;     Hi,<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;     First, thanks for the suggestion on using MPISBAIJ for<br>

&gt;     my A<br>

&gt;     &gt;     matrix, it<br>

&gt;     &gt;     &gt;     seems to have cut down on my memory and assembly time.<br>

&gt;     For a 1.5<br>

&gt;     &gt;     &gt;     million<br>

&gt;     &gt;     &gt;     dof problem:<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;        # procs:                  2        4        8<br>

&gt;         16<br>

&gt;     &gt;     &gt;<br>

&gt;     ----------------------------------------------------------------<br>

&gt;     &gt;     &gt;     Assembly (sec):        245     124      63       86<br>

&gt;     &gt;     &gt;     Solver (sec):              924     578     326     680<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;     Memory (GB):          2.5        1.4     .877    .565<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;     The problem I have is the amount of time it&#39;s taking in<br>

&gt;     &gt;     AOCreateBasic,<br>

&gt;     &gt;     &gt;     it takes longer then assembling,<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;        # procs:                            2        4        8<br>

&gt;     &gt;           16<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     ---------------------------------------------------------------------<br>

&gt;     &gt;     &gt;     AOCreateBasic (sec):         .6      347      170     197<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;     Is there something that I can change or something I<br>

&gt;     can look<br>

&gt;     &gt;     for that<br>

&gt;     &gt;     &gt;     might be causing this increase in time as I go from 2 to 4<br>

&gt;     &gt;     processors<br>

&gt;     &gt;     &gt;     (at least it scales from 4 to 8 processors). I read in the<br>

&gt;     &gt;     archive<br>

&gt;     &gt;     &gt;     that<br>

&gt;     &gt;     &gt;     AOCreateBasic is not meant to be scalable so maybe<br>

&gt;     there is<br>

&gt;     &gt;     nothing I<br>

&gt;     &gt;     &gt;     can do.<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt; Yes, this is non-scalable. What are you using it for?<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;    Matt<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;     Thanks,<br>

&gt;     &gt;     &gt;     Scot<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt;<br>

&gt;     &gt;     &gt; --<br>

&gt;     &gt;     &gt; What most experimenters take for granted before they begin<br>

&gt;     their<br>

&gt;     &gt;     &gt; experiments is infinitely more interesting than any<br>

&gt;     results to which<br>

&gt;     &gt;     &gt; their experiments lead.<br>

&gt;     &gt;     &gt; -- Norbert Wiener<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     &gt; --<br>

&gt;     &gt; What most experimenters take for granted before they begin their<br>

&gt;     &gt; experiments is infinitely more interesting than any results to which<br>

&gt;     &gt; their experiments lead.<br>

&gt;     &gt; -- Norbert Wiener<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; --<br>

&gt; What most experimenters take for granted before they begin their<br>

&gt; experiments is infinitely more interesting than any results to which<br>

&gt; their experiments lead.<br>

&gt; -- Norbert Wiener<br>

<br>

</blockquote></div><br><br clear="all"><br>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener<br>