On Mon, Mar 7, 2011 at 11:39 AM, M. Scot Breitenfeld <brtnfld@uiuc.edu> wrote:
> On 03/04/2011 03:20 PM, Matthew Knepley wrote:
> > On Fri, Mar 4, 2011 at 3:14 PM, M. Scot Breitenfeld <brtnfld@uiuc.edu> wrote:
> >
> > On 03/03/2011 12:18 PM, Matthew Knepley wrote:
> > > On Wed, Mar 2, 2011 at 4:52 PM, M. Scot Breitenfeld <brtnfld@uiuc.edu> wrote:
> > >
> > > I don't number my global degrees of freedom from low to high
> > > continuously per processor, as PETSc uses for its ordering; I use
> > > the natural ordering of the application. I then use AOCreateBasic
> > > to obtain the mapping between the PETSc ordering and mine.
> > >
> > >
> > > I would suggest using the LocalToGlobalMapping functions, which
> > > are scalable. AO is designed for complete global permutations.
> > I don't understand how I can avoid using AO if my global dofs per
> > processor are not arranged in PETSc global ordering (contiguous row
> > ordering, i.e. proc1: 0..n, proc2: n+1..m, proc3: m+1..p, etc.). In
> > the LocalToGlobalMapping routines, doesn't the "GlobalMapping" part
> > mean PETSc ordering, and not my application's ordering?
> >
> > I thought I understood the difference between AO and
> > LocalToGlobalMapping, but now I'm confused. I tried to use the
> > LocalToGlobalMapping routines and the solution values are correct,
> > but the ordering corresponds to the global node ordering, not to how
> > I partitioned the mesh. In other words, the values are returned in
> > the same ordering as for a serial run, which makes sense since this
> > is how PETSc orders the rows. If I had used PETSc ordering, this
> > would be fine.
> >
> > Is the moral of the story that, if I want scalability, I need to
> > rearrange my global dofs in PETSc ordering so that I can use
> > LocalToGlobalMapping?
> >
> > I am having a really hard time understanding what you want. If you
> > want Natural Ordering or any other crazy ordering on input/output,
> > go ahead and use AO there, because the non-scalability is amortized
> > over the run. The PETSc ordering should be used for all globally
> > assembled structures in the solve, because it is efficient, and
> > there is no reason for the user to care about these structures. For
> > integration/assembly, use local orderings, since that is all you
> > need for a PDE. If you have an exotic equation that really does need
> > global information, I would like to hear about it, but it would most
> > likely be non-scalable on its own.
> I don't think I'm doing anything crazy; it's probably just a
> misunderstanding on my part. I'll try to explain it again.
>
> Take this 1D example, where the dof numbering matches the node
> numbering and the 'o' are nodes:
>
> Global   0        4        3        1        2        5
>          o--------o--------o--------o--------o--------o
> Local    0        1        2        0        1        2
>          |--------PROC0--------|--------PROC1---------|
>
> PROC0: indices = 0,4,3, input = 0,1,2
> PROC1: indices = 1,2,5, input = 0,1,2
>
>     CALL ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 3, indices, &
>                                       mapping, ierr)
>     CALL ISLocalToGlobalMappingApply(mapping, 3, input, output, ierr)
>
> PROC0: output = 0,4,3, input = 0,1,2
> PROC1: output = 1,2,5, input = 0,1,2
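>
> (As a sanity check, the mapping can be printed with
>
>     CALL ISLocalToGlobalMappingView(mapping, PETSC_VIEWER_STDOUT_WORLD, ierr)
>
> which should list 0,4,3 on PROC0 and 1,2,5 on PROC1.)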
>
>     CALL VecCreateMPI(PETSC_COMM_WORLD, 3, 6, b, ierr)
>
>     CALL MatCreateMPISBAIJ(PETSC_COMM_WORLD, 1, 3, 3, 6, 6, &
>                            0, d_nnz, 0, o_nnz, A, ierr)
>
>     CALL MatSetLocalToGlobalMapping(A, mapping, ierr)
>     CALL VecSetLocalToGlobalMapping(b, mapping, ierr)
>
> ... use MatSetValuesLocal and VecSetValuesLocal to fill the arrays,
> as in the sketch below.
>
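> As a minimal sketch for one element of this 1D mesh (the element
> arrays 'elemMat' and 'elemVec' are placeholders; remember that SBAIJ
> stores only the upper triangle):
>
>     lidx(1:2) = (/ el-1, el /)   ! 0-based local nodes of element 'el'
>     CALL MatSetValuesLocal(A, 2, lidx, 2, lidx, elemMat, ADD_VALUES, ierr)
>     CALL VecSetValuesLocal(b, 2, lidx, elemVec, ADD_VALUES, ierr)
>
>     CALL MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
>     CALL MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)
>     CALL VecAssemblyBegin(b, ierr)
>     CALL VecAssemblyEnd(b, ierr)
>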
> ... solve and place the solution in the 'b' vector.
>
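> A bare-bones sketch of that step ('ksp' is a placeholder; passing b
> as both right-hand side and solution puts the answer in b):
>
>     CALL KSPCreate(PETSC_COMM_WORLD, ksp, ierr)
>     CALL KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN, ierr)
>     CALL KSPSetFromOptions(ksp, ierr)
>     CALL KSPSolve(ksp, b, b, ierr)
>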
> Now, it is my understanding (and it is also what I observe) that when
> I use VecGetValues:
>
> PROC0: b will have the solutions for global nodes 0,1,2
> PROC1: b will have the solutions for global nodes 3,4,5
>
> But I would like to have:
>
> PROC0: b will have the solutions for global nodes 0,4,3
> PROC1: b will have the solutions for global nodes 1,2,5
>
> It is my understanding that I either need to use AO to get these
> values, or I should have renumbered the nodes such that:

Yes, however the global numbering during a solve is really only
important to the PETSc internals, and it is designed to be efficient.
You can use AO for global reordering on input/output. As I said, this
is amortized over the entire computation, and thus some
non-scalability is usually no problem. If it is, consider renumbering,
because the entire solve would run slowly with an arbitrary global
ordering.
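
If you want each process to get back exactly its own nodes, a scatter
also works; a rough sketch (the 'from', 'bloc', and 'ctx' names are
just illustrative):

    CALL ISCreateGeneral(PETSC_COMM_WORLD, 3, indices, from, ierr)
    CALL VecCreateSeq(PETSC_COMM_SELF, 3, bloc, ierr)
    CALL VecScatterCreate(b, from, bloc, PETSC_NULL_OBJECT, ctx, ierr)
    CALL VecScatterBegin(ctx, b, bloc, INSERT_VALUES, SCATTER_FORWARD, ierr)
    CALL VecScatterEnd(ctx, b, bloc, INSERT_VALUES, SCATTER_FORWARD, ierr)
    ! bloc now holds the solutions for nodes 0,4,3 on PROC0
    ! and for nodes 1,2,5 on PROC1
    CALL VecScatterDestroy(ctx, ierr)
    CALL ISDestroy(from, ierr)

   Matt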
> Global   0        1        2        3        4        5
>          o--------o--------o--------o--------o--------o
> Local    0        1        2        0        1        2
>          |--------PROC0--------|--------PROC1---------|
>
> Scot
> >
> >    Matt
> >
> > >
> > > Thanks,
> > >
> > >    Matt
> > >
> > >     CALL VecGetOwnershipRange(b, low, high, ierr)
> > >
> > >     icnt = 0
> > >     DO mi = 1, mctr   ! these are the nodes local to this processor
> > >        mi_global = myglobal(mi)
> > >
> > >        irowx = 3*mi_global-2
> > >        irowy = 3*mi_global-1
> > >        irowz = 3*mi_global
> > >
> > >        mappings(icnt+1:icnt+3) = (/ &
> > >             nrow_global(row_from_dof(1,mi))-1, &
> > >             nrow_global(row_from_dof(2,mi))-1, &
> > >             nrow_global(row_from_dof(3,mi))-1 &
> > >             /)
> > >
> > >        petscOrdering(icnt+1:icnt+3) = (/ low+icnt, low+icnt+1, low+icnt+2 /)
> > >
> > >        icnt = icnt + 3
> > >     END DO
> > >
> > >     CALL AOCreateBasic(PETSC_COMM_WORLD, icnt, mappings, petscOrdering, &
> > >                        toao, ierr)
> > >
> > >     DO mi = mctr+1, myn   ! these are the ghost nodes not on this processor
> > >        mi_global = myglobal(mi)
> > >
> > >        mappings(icnt+1:icnt+3) = (/ &
> > >             nrow_global(row_from_dof(1,mi))-1, &
> > >             nrow_global(row_from_dof(2,mi))-1, &
> > >             nrow_global(row_from_dof(3,mi))-1 &
> > >             /)
> > >
> > >        icnt = icnt + 3
> > >     END DO
> > >
> > >     CALL AOApplicationToPetsc(toao, 3*myn, mappings, ierr)
> > >     CALL AODestroy(toao, ierr)
> > >
> > > I then use this mapping to put the values into the correct rows, as
> > > wanted by PETSc.
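> > >
> > > For example, roughly (the column and value arrays here are
> > > placeholders; the column indices must be run through the same AO as
> > > the rows):
> > >
> > >     irow(1) = mappings(1)   ! PETSc row for the first local dof
> > >     CALL MatSetValues(A, 1, irow, ncols, jcols, vals, ADD_VALUES, ierr)
> > >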
> > > On 03/02/2011 04:29 PM, Matthew Knepley wrote:
> > > > On Wed, Mar 2, 2011 at 4:25 PM, M. Scot Breitenfeld <brtnfld@uiuc.edu> wrote:
> > > >
> > > > Hi,
> > > >
> > > > First, thanks for the suggestion on using MPISBAIJ for my A
> > > > matrix; it seems to have cut down on my memory and assembly time.
> > > > For a 1.5-million-dof problem:
> > > >
> > > >     # procs:             2      4      8     16
> > > >     --------------------------------------------
> > > >     Assembly (sec):    245    124     63     86
> > > >     Solver (sec):      924    578    326    680
> > > >     Memory (GB):       2.5    1.4   .877   .565
> > > >
> > > > The problem I have is the amount of time it takes in
> > > > AOCreateBasic; it takes longer than the assembly:
> > > >
> > > >     # procs:                2      4      8     16
> > > >     -----------------------------------------------
> > > >     AOCreateBasic (sec):   .6    347    170    197
> > > >
> > > > Is there something I can change, or something I can look for,
> > > > that might be causing this increase in time as I go from 2 to 4
> > > > processors (at least it scales from 4 to 8 processors)? I read in
> > > > the archive that AOCreateBasic is not meant to be scalable, so
> > > > maybe there is nothing I can do.
> > > >
> > > > Yes, this is non-scalable. What are you using it for?
> > > >
> > > >    Matt
> > > >
> > > > Thanks,
> > > > Scot

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener