[petsc-dev] asm / gasm

Barry Smith bsmith at mcs.anl.gov
Thu Jun 23 16:46:19 CDT 2016


   Mark, 

    It is not as simple as this to convert to ASM. It will take a little bit of work to use ASM here instead of GASM.

    But before that, please tell me the command-line arguments and the example you use where GASM crashes so I can get that fixed. Then I will look at using ASM instead, once I have the current GASM code running again.

   Barry


> On Jun 23, 2016, at 4:19 PM, Mark Adams <mfadams at lbl.gov> wrote:
> 
> The question boils down to, for empty processors do we:
> 
> ierr = ISCreateGeneral(PETSC_COMM_SELF, 0, NULL, PETSC_COPY_VALUES, &is);CHKERRQ(ierr);
> ierr = PCASMSetLocalSubdomains(subpc, 1, &is, NULL);CHKERRQ(ierr);
> ierr = ISDestroy(&is);CHKERRQ(ierr);
> 
> or
> 
> PCASMSetLocalSubdomains(subpc, 0, NULL, NULL);
> 
> The latter gives an error that one domain is needed, and the former gives an error (appended).
> 
> I've checked in the code for this second error in ksp (make runex56)
> 
> Thanks,
> 
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------    
> [0]PETSC ERROR: Petsc has generated inconsistent data                                                                 
> [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors                    
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.                         
> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.2-633-g4f88208  GIT Date: 2016-06-23 18:53:31 +0200              
> [0]PETSC ERROR: /global/u2/m/madams/petsc/src/ksp/ksp/examples/tutorials/./ex56 on a arch-xc30-dbg64-intel named nid00495 by madams Thu Jun 23 14:12:57 2016                                                                                
> [0]PETSC ERROR: Configure options --COPTFLAGS="-no-ipo -g -O0" --CXXOPTFLAGS="-no-ipo -g -O0" --FOPTFLAGS="-fast -no-ipo -g -O0" --download-parmetis --download-metis --with-ssl=0 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=1 --with-fc=0 --with-shared-libraries=0 --with-x=0 --with-mpiexec=srun LIBS=-lstdc++ --with-64-bit-indices PETSC_ARCH=arch-xc30-dbg64-intel                                                          
> [0]PETSC ERROR: #1 MatGetSubMatrices_MPIAIJ() line 1147 in /global/u2/m/madams/petsc/src/mat/impls/aij/mpi/mpiov.c    
> [0]PETSC ERROR: #2 MatGetSubMatrices_MPIAIJ() line 1147 in /global/u2/m/madams/petsc/src/mat/impls/aij/mpi/mpiov.c    
> [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------    
> 
> On Thu, Jun 23, 2016 at 8:05 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
>   Where is the command line that generates the error?
> 
> 
> > On Jun 23, 2016, at 12:08 AM, Mark Adams <mfadams at lbl.gov> wrote:
> >
> > [adding Garth]
> >
> > On Thu, Jun 23, 2016 at 12:52 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> >   Mark,
> >
> >    I think there is a misunderstanding here. With GASM an individual block problem is __solved__ (via a parallel KSP) in parallel by several processes, with ASM each block is "owned" by and solved on a single process.
> >
> > Ah, OK, so this is for multiple processors in a block. Yes, we are looking at small, smoother, blocks.
> >
> >
> >    With both the "block" can come from any unknowns on any processes. You can have, for example a block that comes from a region snaking across several processes if you like (or it makes sense due to coupling in the matrix).
> >
> >    By default if you use ASM it will create one non-overlapping block defined by all unknowns owned by a single process and then extend it by "one level" (defined by the nonzero structure of the matrix) to get overlapping.
> >
> > The default in ASM is one level of overlap? That is new.  (OK, I have not looked at ASM in like over 10 years)
> >
> > If you use multiple blocks per process it defines the non-overlapping blocks within a single process's unknowns
> >
> > I assume this still chops the matrix and does not call a partitioner.
> >
> > and extends each of them to have overlap (again by the non-zero structure of the matrix). The default is simple because the user only need indicate the number of blocks per process, the drawback is of course that it does depend on the process layout, number of processes etc and does not take into account particular "coupling information" that the user may know about with their problem.
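
A rough sketch of what "extend by one level" via the nonzero structure means, in plain C with no PETSc dependency. It is serial only, and the CSR arrays (rowptr, colidx) and the function name extend_one_level are hypothetical names chosen for illustration. PETSc does the equivalent internally and in parallel, so treat this as a picture of the idea rather than the library's implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Extend a block of row indices by one level of overlap, using the nonzero
 * structure of an n x n matrix stored in CSR form (rowptr, colidx).
 * Every column that has a nonzero in a row of the block joins the block.
 * The extended block is written to out in ascending order; out must have
 * room for n entries. Returns the size of the extended block.
 * (Hypothetical helper for illustration; not a PETSc routine.) */
int extend_one_level(int n, const int *rowptr, const int *colidx,
                     const int *block, int nblock, int *out)
{
    char *in_block = calloc(n > 0 ? (size_t)n : 1, 1); /* membership flags */
    int i, k, count = 0;

    assert(in_block != NULL);
    for (i = 0; i < nblock; i++) {
        int row = block[i];
        assert(row >= 0 && row < n);          /* indices are 0-based */
        in_block[row] = 1;
        for (k = rowptr[row]; k < rowptr[row + 1]; k++)
            in_block[colidx[k]] = 1;          /* neighbour through a nonzero */
    }
    for (i = 0; i < n; i++)
        if (in_block[i]) out[count++] = i;
    free(in_block);
    return count;
}
```

For a 6 x 6 tridiagonal matrix, the non-overlapping block {2, 3} grows to {1, 2, 3, 4} after one level. Calling the function again on its own output gives a second level of overlap.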
> >
> >   If the user wishes to define the blocks themselves that is also possible with PCASMSetLocalSubdomains(). Each process provides 1 or more index sets for the subdomains it will solve on. Note that the index sets can contain any unknowns in the entire problem so the blocks do not have to "line up" with the parallel decomposition at all.
> >
> > Oh, OK, this is what I want. (I thought this worked).
> >
> > Of course determining and providing good such subdomains may not always be clear.
> >
> > In smoothed aggregation there is an argument that the aggregates are good, but the scale is fixed obviously.  On a regular grid smoothed aggregation wants 3^D sized aggregates, which is obviously wonderful for ASM.  And for anisotropy you want your ASM blocks to be on strongly connected components, which is what smoothed aggregation wants (not that I do this very well).
> >
> >
> >   I see in GAMG you have PCGAMGSetUseASMAggs
> >
> > But the code calls PCGASMSetSubdomains and the command line is -pc_gamg_use_agg_gasm, so this is all messed up.  (more below)
> >
> > which sadly does not have an explanation in the users manual and sadly does not have a matching options database name (it is -pc_gamg_use_agg_gasm). Following the rule of dropping the word Set, using all lower case, and putting _ between words, the option should be -pc_gamg_use_asm_aggs.
> >
> > BUT, THIS IS THE WAY IT WAS!  It looks like someone hijacked this code and made it gasm.  I never did this.
> >
> > Barry: you did this apparently in 2013.
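
The naming rule quoted above (drop "Set", lower-case everything, join words with _) can be sketched in plain C. This is only an illustration: the function name option_name_from_words is hypothetical, and it assumes the caller has already split the routine name into words, because splitting runs of capitals such as PCGAMG or ASM automatically is ambiguous.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build an options-database style name from the words of a routine name:
 * skip the word "Set", lower-case the rest, join with '_', prefix with '-'.
 * Returns 0 on success, -1 if buf is too small.
 * (Hypothetical helper for illustration; not part of PETSc.) */
int option_name_from_words(const char *const *words, int nwords,
                           char *buf, size_t buflen)
{
    size_t pos = 0, j;
    int i, first = 1;

    assert(buf != NULL && buflen >= 2);
    buf[pos++] = '-';
    for (i = 0; i < nwords; i++) {
        const char *w = words[i];
        if (strcmp(w, "Set") == 0) continue;   /* drop the word Set */
        if (!first) {
            if (pos + 1 >= buflen) return -1;
            buf[pos++] = '_';                  /* _ between words */
        }
        for (j = 0; w[j] != '\0'; j++) {
            if (pos + 1 >= buflen) return -1;
            buf[pos++] = (char)tolower((unsigned char)w[j]);
        }
        first = 0;
    }
    buf[pos] = '\0';
    return 0;
}
```

Fed the words PC, GAMG, Set, Use, ASM, Aggs, this produces -pc_gamg_use_asm_aggs, the name suggested in the message above.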
> >
> >
> >    In addition to this one you could also have one that uses the aggs but use the PCASM to manage the solves instead of GASM, it would likely be less buggy and more efficient.
> >
> > yes
> >
> >
> >   Please tell me exactly what example you tried to run with what options and I will debug it.
> >
> > We got an error message:
> >
> > ** Max-trans not allowed because matrix is distributed
> >
> > Garth: is this from your code perhaps? I don't see it in PETSc.
> >
> > Note that ALL functionality that is included in PETSc should have tests that test that functionality; then we will find out immediately when it is broken instead of two years later when it is much harder to debug. If this -pc_gamg_use_agg_gasm had had a test we wouldn't be in this mess now. (Jed's damn code reviews sure don't pick up this stuff).
> >
> > First we need to change gasm to asm.
> >
> > We could add this argument pc_gamg_use_agg_asm  to ksp/ex56 (runex56 or make a new test).  The SNES version (also ex56) is my current test that I like to refer to as recommended parameters for elasticity. So I'd like to keep that clean, but we can add junk to ksp/ex56.
> >
> > I've done this in a branch mark/gamg-agg-asm.  I get an error (appended). It looks like the second coarsest grid, which has 36 dof on one processor has an index 36 in the block on every processor. Strange.  I can take a look at it later.
> >
> > Mark
> >
> > > [3]PETSC ERROR: [4]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> > > [4]PETSC ERROR: Petsc has generated inconsistent data
> > > [4]PETSC ERROR: ith 0 block entry 36 not owned by any process, upper bound 36
> > > [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [4]PETSC ERROR: Petsc Development GIT revision: v3.7.2-630-g96e0c40  GIT Date: 2016-06-22 10:03:02 -0500
> > > [4]PETSC ERROR: ./ex56 on a arch-macosx-gnu-g named MarksMac-3.local by markadams Thu Jun 23 06:53:27 2016
> > > [4]PETSC ERROR: Configure options COPTFLAGS="-g -O0" CXXOPTFLAGS="-g -O0" FOPTFLAGS="-g -O0" --download-hypre=1 --download-parmetis=1 --download-metis=1 --download-ml=1 --download-p4est=1 --download-exodus=1 --download-triangle=1 --with-hdf5-dir=/Users/markadams/Codes/hdf5 --with-x=0 --with-debugging=1 PETSC_ARCH=arch-macosx-gnu-g --download-chaco
> > > [4]PETSC ERROR: #1 VecScatterCreate_PtoS() line 2348 in /Users/markadams/Codes/petsc/src/vec/vec/utils/vpscat.c
> > > [4]PETSC ERROR: #2 VecScatterCreate() line 1552 in /Users/markadams/Codes/petsc/src/vec/vec/utils/vscat.c
> > > [4]PETSC ERROR: Petsc has generated inconsistent data
> > > [3]PETSC ERROR: ith 0 block entry 36 not owned by any process, upper bound 36
> > > [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [3]PETSC ERROR: Petsc Development GIT revision: v3.7.2-630-g96e0c40  GIT Date: 2016-06-22 10:03:02 -0500
> > > [3]PETSC ERROR: ./ex56 on a arch-macosx-gnu-g named MarksMac-3.local by markadams Thu Jun 23 06:53:27 2016
> > > [3]PETSC ERROR: Configure options COPTFLAGS="-g -O0" CXXOPTFLAGS="-g -O0" FOPTFLAGS="-g -O0" --download-hypre=1 --download-parmetis=1 --download-metis=1 --download-ml=1 --download-p4est=1 --download-exodus=1 --download-triangle=1 --with-hdf5-dir=/Users/markadams/Codes/hdf5 --with-x=0 --with-debugging=1 PETSC_ARCH=arch-macosx-gnu-g --download-chaco
> > > [3]PETSC ERROR: #1 VecScatterCreate_PtoS() line 2348 in /Users/markadams/Codes/petsc/src/vec/vec/utils/vpscat.c
> > > [3]PETSC ERROR: #2 VecScatterCreate() line 1552 in /Users/markadams/Codes/petsc/src/vec/vec/utils/vscat.c
> > > [3]PETSC ERROR: #3 PCSetUp_ASM() line 279 in /Users/markadams/Codes/petsc/src/ksp/pc/impls/asm/asm.c
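
The message in the trace above reports entry 36 of block 0 as unowned with an upper bound of 36. With 36 unknowns numbered from 0, the valid global indices are 0 through 35, so an index of 36 is one past the end. A sketch of that kind of range check in plain C follows; the function name first_unowned_index is hypothetical and this is not the check PETSc itself performs, only the same idea.

```c
#include <assert.h>
#include <stddef.h>

/* Scan a block's global indices and return the position of the first one
 * that falls outside [0, nglobal), or -1 if every index is in range.
 * (Hypothetical helper for illustration; not a PETSc routine.) */
int first_unowned_index(const int *idx, int n, int nglobal)
{
    int i;

    assert(n == 0 || idx != NULL);
    for (i = 0; i < n; i++)
        if (idx[i] < 0 || idx[i] >= nglobal)
            return i;      /* idx[i] == nglobal is already out of range */
    return -1;
}
```

Running such a check on each index set before handing it to the preconditioner would point at the offending block entry directly, rather than surfacing later inside the scatter setup.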
> >
> >    Barry
> >
> > > On Jun 22, 2016, at 5:20 PM, Mark Adams <mfadams at lbl.gov> wrote:
> > >
> > >
> > >
> > > On Wed, Jun 22, 2016 at 8:06 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> > >
> > >    I suggest focusing on asm.
> > >
> > > OK, I will switch gasm to asm, this does not work anyway.
> > >
> > > > Having blocks that span multiple processes seems like overkill for a smoother?
> > >
> > > > No, because it is a pain to have the math convolved with the parallel decomposition strategy (i.e., I can't tell an application how to partition their problem). If an aggregate spans processor boundaries, which is fine and needed, and let's say we have a pretty uniform problem, then if the block gets split up, H is small in part of the domain and convergence could suffer along processor boundaries.  And having the math change as the parallel decomposition changes is annoying.
> > >
> > > (Major league overkill) in fact doesn't one want multiple blocks per process, ie. pretty small blocks.
> > >
> > > No, it is just doing what would be done in serial.  If the cost of moving the data across the processor is a problem then that is a tradeoff to consider.
> > >
> > > > And I think you are misunderstanding me.  There are lots of blocks per process (the aggregates are say 3^D in size).  And many of the aggregates/blocks along the processor boundary will be split between processors, resulting in small blocks and a weak ASM PC on processor boundaries.
> > >
> > > I can understand ASM not being general and not letting blocks span processor boundaries, but I don't think the extra matrix communication costs are a big deal (done just once) and the vector communication costs are not bad, it probably does not include (too many) new processors to communicate with.
> > >
> > >
> > >    Barry
> > >
> > > > On Jun 22, 2016, at 7:51 AM, Mark Adams <mfadams at lbl.gov> wrote:
> > > >
> > > > I'm trying to get block smoothers to work for gamg.  We (Garth) tried this and got this error:
> > > >
> > > >
> > > >  - Another option is use '-pc_gamg_use_agg_gasm true' and use '-mg_levels_pc_type gasm'.
> > > >
> > > >
> > > > Running in parallel, I get
> > > >
> > > >      ** Max-trans not allowed because matrix is distributed
> > > >  ----
> > > >
> > > > First, what is the difference between asm and gasm?
> > > >
> > > > Second, I need to fix this to get block smoothers. This used to work.  Did we lose the capability to have blocks that span processor subdomains?
> > > >
> > > > gamg only aggregates across processor subdomains within one layer, so maybe I could use one layer of overlap in some way?
> > > >
> > > > Thanks,
> > > > Mark
> > > >
> > >
> > >
> >
> >
> 
> 



