[petsc-users] AOCreateBasic is not scaling
M. Scot Breitenfeld
brtnfld at uiuc.edu
Mon Mar 7 11:39:27 CST 2011
On 03/04/2011 03:20 PM, Matthew Knepley wrote:
> On Fri, Mar 4, 2011 at 3:14 PM, M. Scot Breitenfeld <brtnfld at uiuc.edu> wrote:
>
> On 03/03/2011 12:18 PM, Matthew Knepley wrote:
> > On Wed, Mar 2, 2011 at 4:52 PM, M. Scot Breitenfeld <brtnfld at uiuc.edu> wrote:
> >
> > I don't number my global degrees of freedom continuously from low to
> > high per processor, as PETSc's ordering does; instead I use the
> > application's natural ordering. I then use AOCreateBasic to obtain the
> > mapping between the PETSc ordering and mine.
> >
> >
> > I would suggest using the LocalToGlobalMapping functions, which are
> > scalable.
> > AO is designed for complete global permutations.
> I don't understand how I can avoid using AO if my global dof per
> processor are not arranged in PETSc global ordering (continuous row
> ordering, i.e. proc1: 0-n, proc2: n+1-m, proc3: m+1-p, etc.). In the
> LocalToGlobalMapping routines, doesn't the "GlobalMapping" part mean
> PETSc ordering and not my application's ordering?
>
> I thought I understood the difference between AO and
> LocalToGlobalMapping, but now I'm confused. I tried to use the
> LocalToGlobalMapping routines and the solution values are correct, but
> the ordering corresponds to the global node ordering, not to how I
> partitioned the mesh. In other words, the values are returned in the
> same ordering as for a serial run, which makes sense since this is how
> PETSc orders the rows. If I had used PETSc ordering, this would be fine.
>
> Is the moral of the story, if I want to get scalability I need to
> rearrange my global dof in PETSc ordering so that I can use
> LocalToGlobalMapping?
>
>
> I am having a really hard time understanding what you want. If you
> want natural ordering or any other crazy ordering on input/output, go
> ahead and use AO there, because the non-scalability is amortized over
> the run. The PETSc ordering should be used for all globally assembled
> structures in the solve because it's efficient, and there is no reason
> for the user to care about these structures. For integration/assembly,
> use local orderings, since that is all you need for a PDE. If you have
> an exotic equation that really does need global information, I would
> like to hear about it, but it would most likely be non-scalable on its
> own.
I don't think I'm doing anything crazy; it's probably just a
misunderstanding on my part. I'll try to explain it again:
Take this 1D example (the dof numbering matches the node numbering), where the 'o' are nodes:
Global--0********4********3********1********2********5
______o--------o--------o--------o--------o--------o
Local---0********1********2********0********1********2
______|------PROC0---------|-------PROC1-----------|
PROC0: indices = 0,4,3, input = 0, 1, 2
PROC1: indices = 1,2,5, input = 0, 1, 2
CALL ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 3, indices, &
     mapping, ierr)
CALL ISLocalToGlobalMappingApply(mapping,3,input,output,ierr)
PROC0: output = 0,4,3, input = 0,1,2
PROC1: output = 1,2,5, input = 0,1,2
CALL VecCreateMPI(PETSC_COMM_WORLD,3, 6, b,ierr)
CALL MatCreateMPISBAIJ(PETSC_COMM_WORLD, 1, &
3, 3, &
6, 6, &
0, d_nnz, 0, o_nnz, A, ierr)
CALL MatSetLocalToGlobalMapping(A, mapping, ierr)
CALL VecSetLocalToGlobalMapping(b, mapping, ierr)
... use MatSetValuesLocal, VecSetValuesLocal to fill arrays.
... solve and place solution in 'b' vector
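For reference, the retrieval after the solve is roughly the following
untested sketch ('rows' and 'vals' are placeholder names, declarations
omitted as above; only the locally owned block of 'b' is read):

CALL VecGetOwnershipRange(b, low, high, ierr)
DO i = 1, high-low
   rows(i) = low + i - 1   ! global row numbers owned by this process
END DO
CALL VecGetValues(b, high-low, rows, vals, ierr)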
Now, it is my understanding that when I use VecGetValues (and this is
also what I observe):
PROC0: b will have solutions for global nodes 0,1,2
PROC1: b will have solutions for global nodes 3,4,5
But, I would like to have,
PROC0: b will have solutions for global nodes 0,4,3
PROC1: b will have solutions for global nodes 1,2,5
It is my understanding that I either need to use AO to get these
values, or I should have renumbered the nodes such that:
Global--0********1********2********3********4********5
______o--------o--------o--------o--------o--------o
Local---0********1********2********0********1********2
______|------PROC0---------|-------PROC1-----------|
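For concreteness, here is roughly what I understand the AO route to look
like for this 1D example, in the same spirit as my AOCreateBasic code
quoted further down (untested sketch; 'appnodes', 'rows', 'vals' are
placeholder names and declarations are omitted as above):

nloc = 3   ! PROC0: appnodes = (/0,4,3/), PROC1: appnodes = (/1,2,5/)
CALL VecCreateMPI(PETSC_COMM_WORLD, nloc, 6, b, ierr)
CALL VecGetOwnershipRange(b, low, high, ierr)
DO i = 1, nloc
   rows(i) = low + i - 1   ! contiguous PETSc rows owned by this process
END DO
! AO between application node numbers and the owned PETSc rows
CALL AOCreateBasic(PETSC_COMM_WORLD, nloc, appnodes, rows, toao, ierr)
! translate the application node numbers to PETSc row numbers, in place
! (ghost node numbers would be translated the same way)
CALL AOApplicationToPetsc(toao, nloc, appnodes, ierr)
CALL AODestroy(toao, ierr)
! ... assemble with MatSetValues/VecSetValues using the translated rows,
! solve, and then each process can read back exactly its own nodes:
CALL VecGetValues(b, nloc, appnodes, vals, ierr)

With this, vals on PROC0 holds the solutions for nodes 0,4,3 and on
PROC1 for nodes 1,2,5, in that order, which is what I want.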
Scot
>
> Matt
>
>
> >
> > Thanks,
> >
> > Matt
> >
> >
> > CALL VecGetOwnershipRange(b, low, high, ierr)
> >
> > icnt = 0
> >
> > DO mi = 1, mctr ! these are the nodes local to processor
> > mi_global = myglobal(mi)
> >
> > irowx = 3*mi_global-2
> > irowy = 3*mi_global-1
> > irowz = 3*mi_global
> >
> > mappings(icnt+1:icnt+3) = (/ &
> > nrow_global(row_from_dof(1,mi))-1, &
> > nrow_global(row_from_dof(2,mi))-1, &
> > nrow_global(row_from_dof(3,mi))-1 &
> > /)
> >
> > petscOrdering(icnt+1:icnt+3) = (/ low+icnt, low+icnt+1, low+icnt+2 /)
> >
> > icnt = icnt + 3
> > END DO
> >
> > CALL AOCreateBasic(PETSC_COMM_WORLD, icnt, mappings, petscOrdering, &
> >      toao, ierr)
> >
> > DO mi = mctr+1, myn ! these are the ghost nodes not on this processor
> >
> > mi_global = myglobal(mi)
> >
> > mappings(icnt+1:icnt+3) = (/ &
> > nrow_global(row_from_dof(1,mi))-1, &
> > nrow_global(row_from_dof(2,mi))-1, &
> > nrow_global(row_from_dof(3,mi))-1 &
> > /)
> >
> > icnt = icnt + 3
> > ENDDO
> > CALL AOApplicationToPetsc(toao, 3*myn, mappings, ierr)
> >
> > CALL AODestroy(toao, ierr)
> >
> > I then use this mapping to put the values into the correct rows, as
> > wanted by PETSc.
> >
> >
> > On 03/02/2011 04:29 PM, Matthew Knepley wrote:
> > > On Wed, Mar 2, 2011 at 4:25 PM, M. Scot Breitenfeld <brtnfld at uiuc.edu> wrote:
> > >
> > > Hi,
> > >
> > > First, thanks for the suggestion on using MPISBAIJ for my A matrix;
> > > it seems to have cut down on my memory and assembly time. For a
> > > 1.5 million dof problem:
> > >
> > > # procs:             2      4      8     16
> > > ----------------------------------------------
> > > Assembly (sec):    245    124     63     86
> > > Solver (sec):      924    578    326    680
> > > Memory (GB):       2.5    1.4   .877   .565
> > >
> > > The problem I have is the amount of time it's taking in
> > > AOCreateBasic; it takes longer than assembling:
> > >
> > > # procs:                 2      4      8     16
> > > ----------------------------------------------
> > > AOCreateBasic (sec):    .6    347    170    197
> > >
> > > Is there something that I can change, or something I can look for,
> > > that might be causing this increase in time as I go from 2 to 4
> > > processors (at least it scales from 4 to 8 processors)? I read in the
> > > archive that AOCreateBasic is not meant to be scalable, so maybe
> > > there is nothing I can do.
> > >
> > >
> > > Yes, this is non-scalable. What are you using it for?
> > >
> > > Matt
> > >
> > >
> > > Thanks,
> > > Scot
> > >
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener