[petsc-dev] asm / gasm

Mark Adams mfadams at lbl.gov
Fri Jun 24 09:37:38 CDT 2016


>
>
>> > Just to be clear: ASM used to work.  Did the semantics of ASM change?
>>
>
> Hi Mark,
>
> I assume you meant that GASM used to work and is no longer working.
>
>
I see everyone thinks that, but I said (or intended to say) that ASM, not
GASM, used to work.


> GASM was originally written by Dmitry. The basic idea is to allow
> multi-rank blocks, that is, a multi-rank subdomain problem can be solved
> in parallel using a small number of processor cores. This is different from
> ASM.
>
> I was involved in the development of GASM last summer. There were some
> changes:
>
> (1) Added a function to increase overlap of the multi-rank subdomains. The
> function is called by GASM by default.
>
> (2) Added a hierarchical partitioning to optimize data exchange. It ensures
> that small subdomains in a multi-rank subdomain are geometrically
> connected. GASM does not use this functionality by default.
>
> Anyway, if you have an example (like Barry asked for) showing the broken
> GASM, I will debug it (of course, if Barry does not mind).
>
> Fande Kong,
>
>
>>
>>    Show me a commit where ASM worked!  Do you mean that the GASM worked?
>> The code has GASM calls in it, not ASM so how could ASM have previously
>> worked? It is possible that something changed in GASM that broke GAMG's
>> usage of GASM. Once you tell me how to reproduce the problem with GASM I
>> can try to track down the problem.
>>
>> >
>> >     But before that please please tell me the command line argument and
>> example you use where the GASM crashes so I can get that fixed. Then I will
>> look at using ASM instead after I have the current GASM code running again.
>> >
>> > In branch mark/gamg-agg-asm in ksp ex56, 'make runex56':
>>
>>    I don't care about this! This is where you have tried to change from
>> GASM to ASM which I told you is non-trivial.  Give me the example and
>> command line where the GASM version in master (or maint) doesn't work where
>> the error message includes ** Max-trans not allowed because matrix is
>> distributed
>>
>>    We are not communicating very well, you jumped from stating GASM
>> crashed to monkeying with ASM and now refuse to tell me how to reproduce
>> the GASM crash. We have to start by fixing the current code to work with
>> GASM (if it ever worked) and then move on to using ASM (which is just an
>> optimization of the GASM usage.)
>>
>>
>> Barry
>>
>>
>> >
>> > 14:12 nid00495  ~/petsc/src/ksp/ksp/examples/tutorials$ make runex56
>> > [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> > [0]PETSC ERROR: Petsc has generated inconsistent data
>> > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
>> lines) on different processors
>> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> > [0]PETSC ERROR: Petsc Development GIT revision: v3.7.2-633-g4f88208
>> GIT Date: 2016-06-23 18:53:31 +0200
>> > [0]PETSC ERROR:
>> /global/u2/m/madams/petsc/src/ksp/ksp/examples/tutorials/./ex56 on a
>> arch-xc30-dbg64-intel named nid00495 by madams Thu Jun 23 14:12:57 2016
>> > [0]PETSC ERROR: Configure options --COPTFLAGS="-no-ipo -g -O0"
>> --CXXOPTFLAGS="-no-ipo -g -O0" --FOPTFLAGS="-fast -no-ipo -g -O0"
>> --download-parmetis --download-metis --with-ssl=0 --with-cc=cc
>> --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0
>> --with-debugging=1 --with-fc=0 --with-shared-libraries=0 --with-x=0
>> --with-mpiexec=srun LIBS=-lstdc++ --with-64-bit-indices
>> PETSC_ARCH=arch-xc30-dbg64-intel
>> > [0]PETSC ERROR: #1 MatGetSubMatrices_MPIAIJ() li
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >    Barry
>> >
>> >
>> > > On Jun 23, 2016, at 4:19 PM, Mark Adams <mfadams at lbl.gov> wrote:
>> > >
>> > > The question boils down to, for empty processors do we:
>> > >
>> > > ierr = ISCreateGeneral(PETSC_COMM_SELF, 0, NULL, PETSC_COPY_VALUES,
>> &is);CHKERRQ(ierr);
>> > >           ierr = PCASMSetLocalSubdomains(subpc, 1, &is,
>> NULL);CHKERRQ(ierr);
>> > >           ierr = ISDestroy(&is);CHKERRQ(ierr);
>> > >
>> > > or
>> > >
>> > > PCASMSetLocalSubdomains(subpc, 0, NULL, NULL);
>> > >
>> > > The latter gives an error that one domain is needed and the former gives
>> an error (appended).
>> > >
>> > > I've checked in the code for this second error in ksp (make runex56)
>> > >
>> > > Thanks,
>> > >
>> > > [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> > > [0]PETSC ERROR: Petsc has generated inconsistent data
>> > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
>> lines) on different processors
>> > > [0]PETSC ERROR: See
>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> > > [0]PETSC ERROR: Petsc Development GIT revision: v3.7.2-633-g4f88208
>> GIT Date: 2016-06-23 18:53:31 +0200
>> > > [0]PETSC ERROR:
>> /global/u2/m/madams/petsc/src/ksp/ksp/examples/tutorials/./ex56 on a
>> arch-xc30-dbg64-intel named nid00495 by madams Thu Jun 23 14:12:57 2016
>> > > [0]PETSC ERROR: Configure options --COPTFLAGS="-no-ipo -g -O0"
>> --CXXOPTFLAGS="-no-ipo -g -O0" --FOPTFLAGS="-fast -no-ipo -g -O0"
>> --download-parmetis --download-metis --with-ssl=0 --with-cc=cc
>> --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0
>> --with-debugging=1 --with-fc=0 --with-shared-libraries=0 --with-x=0
>> --with-mpiexec=srun LIBS=-lstdc++ --with-64-bit-indices
>> PETSC_ARCH=arch-xc30-dbg64-intel
>> > > [0]PETSC ERROR: #1 MatGetSubMatrices_MPIAIJ() line 1147 in
>> /global/u2/m/madams/petsc/src/mat/impls/aij/mpi/mpiov.c
>> > > [0]PETSC ERROR: #2 MatGetSubMatrices_MPIAIJ() line 1147 in
>> /global/u2/m/madams/petsc/src/mat/impls/aij/mpi/mpiov.c
>> > > [1]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> > >
>> > > On Thu, Jun 23, 2016 at 8:05 PM, Barry Smith <bsmith at mcs.anl.gov>
>> wrote:
>> > >
>> > >   Where is the command line that generates the error?
>> > >
>> > >
>> > > > On Jun 23, 2016, at 12:08 AM, Mark Adams <mfadams at lbl.gov> wrote:
>> > > >
>> > > > [adding Garth]
>> > > >
>> > > > On Thu, Jun 23, 2016 at 12:52 AM, Barry Smith <bsmith at mcs.anl.gov>
>> wrote:
>> > > >
>> > > >   Mark,
>> > > >
>> > > >    I think there is a misunderstanding here. With GASM an
>> individual block problem is __solved__ (via a parallel KSP) in parallel by
>> several processes, with ASM each block is "owned" by and solved on a single
>> process.
>> > > >
>> > > > Ah, OK, so this is for multiple processors in a block. Yes, we are
>> looking at small, smoother, blocks.
>> > > >
>> > > >
>> > > >    With both the "block" can come from any unknowns on any
>> processes. You can have, for example a block that comes from a region
>> snaking across several processes if you like (or it makes sense due to
>> coupling in the matrix).
>> > > >
>> > > >    By default if you use ASM it will create one non-overlapping
>> block defined by all unknowns owned by a single process and then extend it
>> by "one level" (defined by the nonzero structure of the matrix) to get
>> overlapping.
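
To make "extend it by one level" concrete, here is a minimal standalone sketch, assuming the matrix is stored in CSR form. The helper name extend_overlap_one_level and the array layout are hypothetical and not PETSc API; as far as I know PETSc's own routine for this is MatIncreaseOverlap().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not PETSc API: grow a block of rows by one level of
   overlap using the nonzero structure of an n x n CSR matrix.
   rowptr/colidx are the usual CSR arrays; in_block[i] is 1 if row i is in
   the block. On return in_block marks the extended block.
   Returns the extended block size, or -1 if allocation fails. */
int extend_overlap_one_level(int n, const int *rowptr, const int *colidx,
                             unsigned char *in_block)
{
  assert(n >= 0);
  unsigned char *added = calloc((size_t)n + 1, 1);
  if (!added) return -1;

  /* Mark every column that is coupled to a row already in the block. */
  for (int i = 0; i < n; i++) {
    if (!in_block[i]) continue;
    for (int k = rowptr[i]; k < rowptr[i + 1]; k++) added[colidx[k]] = 1;
  }

  /* Merge the newly reached unknowns into the block and count its size. */
  int size = 0;
  for (int i = 0; i < n; i++) {
    if (added[i]) in_block[i] = 1;
    size += in_block[i];
  }
  free(added);
  return size;
}
```

For a tridiagonal matrix, a block holding rows 0 and 1 picks up row 2 and nothing further, which is the "one level" described above.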
>> > > >
>> > > > The default in ASM is one level of overlap? That is new.  (OK, I
>> have not looked at ASM in like over 10 years)
>> > > >
>> > > > If you use multiple blocks per process it defines the
>> non-overlapping blocks within a single process's unknowns
>> > > >
>> > > > I assume this still chops the matrix and does not call a
>> partitioner.
>> > > >
>> > > > and extends each of them to have overlap (again by the non-zero
>> structure of the matrix). The default is simple because the user only need
>> indicate the number of blocks per process, the drawback is of course that
>> it does depend on the process layout, number of processes etc and does not
>> take into account particular "coupling information" that the user may know
>> about with their problem.
>> > > >
>> > > >   If the user wishes to define the blocks themselves that is also
>> possible with PCASMSetLocalSubdomains(). Each process provides 1 or more
>> index sets for the subdomains it will solve on. Note that the index sets
>> can contain any unknowns in the entire problem so the blocks do not have to
>> "line up" with the parallel decomposition at all.
>> > > >
>> > > > Oh, OK, this is what I want. (I thought this worked).
>> > > >
>> > > > Of course determining and providing good such subdomains may not
>> always be clear.
>> > > >
>> > > > In smoothed aggregation there is an argument that the aggregates
>> are good, but the scale is fixed obviously.  On a regular grid smoothed
>> aggregation wants 3^D sized aggregates, which is obviously wonderful for
>> ASM.  And for anisotropy you want your ASM blocks to be on strongly
>> connected components, which is what smoothed aggregation wants (not that I
>> do this very well).
>> > > >
>> > > >
>> > > >   I see in GAMG you have PCGAMGSetUseASMAggs
>> > > >
>> > > > But the code calls PCGASMSetSubdomains and the command line is
>> -pc_gamg_use_agg_gasm, so this is all messed up.  (more below)
>> > > >
>> > > > which sadly does not have an explanation in the users manual and
>> sadly does not have a matching options database name
>> (-pc_gamg_use_agg_gasm). Following the rule of dropping the word Set, using
>> all lower case, and putting _ between words, the option should be
>> -pc_gamg_use_asm_aggs.
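
The naming rule stated above can be sketched as a small standalone C function. This is only an illustration: option_name_from_setter is a hypothetical helper, not something PETSc provides, and it assumes the caller supplies the length of the leading class abbreviation ("PC" has length 2) because that boundary cannot be inferred from capitalization alone.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not PETSc API: derive an options database name from
   a setter name by dropping the word "Set", lower-casing everything, and
   joining the words with '_'. class_len is the length of the leading class
   abbreviation (2 for "PC"). Returns 0 on success, -1 if out is too small. */
int option_name_from_setter(const char *fn, size_t class_len,
                            char *out, size_t outsz)
{
  assert(fn && out);
  size_t len = strlen(fn), o = 0, start = 0;
  if (outsz < 2) return -1;
  out[o++] = '-';
  for (size_t i = 1; i <= len; i++) {
    unsigned char prev = (unsigned char)fn[i - 1];
    unsigned char cur  = (unsigned char)fn[i];
    unsigned char next = cur ? (unsigned char)fn[i + 1] : 0;
    /* A word ends at the end of the string, after the class abbreviation,
       at a lower-to-upper transition, or before the last capital of an
       acronym that is followed by a lower-case letter (ASM|Aggs). */
    int boundary = (i == len) || (i == class_len) ||
                   (isupper(cur) && islower(prev)) ||
                   (isupper(cur) && isupper(prev) && islower(next));
    if (!boundary) continue;
    size_t wlen = i - start;
    if (!(wlen == 3 && strncmp(fn + start, "Set", 3) == 0)) {
      if (o + wlen + 2 > outsz) return -1;   /* '_' + word + '\0' */
      if (o > 1) out[o++] = '_';
      for (size_t k = start; k < i; k++)
        out[o++] = (char)tolower((unsigned char)fn[k]);
    }
    start = i;
  }
  out[o] = '\0';
  return 0;
}
```

Applied to PCGAMGSetUseASMAggs with class_len 2, this yields the option name given in the text, -pc_gamg_use_asm_aggs.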
>> > > >
>> > > > BUT, THIS IS THE WAY IT WAS!  It looks like someone hijacked this
>> code and made it gasm.  I never did this.
>> > > >
>> > > > Barry: you did this apparently in 2013.
>> > > >
>> > > >
>> > > >    In addition to this one you could also have one that uses the
>> aggs but use the PCASM to manage the solves instead of GASM, it would
>> likely be less buggy and more efficient.
>> > > >
>> > > > yes
>> > > >
>> > > >
>> > > >   Please tell me exactly what example you tried to run with what
>> options and I will debug it.
>> > > >
>> > > > We got an error message:
>> > > >
>> > > > ** Max-trans not allowed because matrix is distributed
>> > > >
>> > > > Garth: is this from your code perhaps? I don't see it in PETSc.
>> > > >
>> > > > Note that ALL functionality that is included in PETSc should have
>> tests that test that functionality; then we will find out immediately when
>> it is broken instead of two years later when it is much harder to debug. If
>> this -pc_gamg_use_agg_gasm had had a test we wouldn't be in this mess now.
>> (Jed's damn code reviews sure don't pick up this stuff).
>> > > >
>> > > > First we need to change gasm to asm.
>> > > >
>> > > > We could add this argument pc_gamg_use_agg_asm  to ksp/ex56
>> (runex56 or make a new test).  The SNES version (also ex56) is my current
>> test that I like to refer to as recommended parameters for elasticity. So
>> I'd like to keep that clean, but we can add junk to ksp/ex56.
>> > > >
>> > > > I've done this in a branch mark/gamg-agg-asm.  I get an error
>> (appended). It looks like the second coarsest grid, which has 36 dof on one
>> processor has an index 36 in the block on every processor. Strange.  I can
>> take a look at it later.
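
With 36 dof the valid global indices are 0 through 35, so an entry equal to 36 is one past the end. A minimal standalone check of a block's indices before handing them to the preconditioner might look like the sketch below; first_out_of_range is a hypothetical helper, not PETSc API, and plain int stands in for PetscInt.

```c
#include <assert.h>

/* Hypothetical helper, not PETSc API: return the position of the first
   index outside [0, n_global), or -1 if every index is valid. */
int first_out_of_range(const int *idx, int n, int n_global)
{
  assert(n >= 0 && n_global >= 0);
  for (int i = 0; i < n; i++) {
    if (idx[i] < 0 || idx[i] >= n_global) return i;
  }
  return -1;
}
```

A block whose first entry is 36 on a 36-dof grid is flagged at position 0, which matches the "ith 0 block entry 36 ... upper bound 36" wording of the error.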
>> > > >
>> > > > Mark
>> > > >
>> > > > > [3]PETSC ERROR: [4]PETSC ERROR: --------------------- Error
>> Message --------------------------------------------------------------
>> > > > > [4]PETSC ERROR: Petsc has generated inconsistent data
>> > > > > [4]PETSC ERROR: ith 0 block entry 36 not owned by any process,
>> upper bound 36
>> > > > > [4]PETSC ERROR: See
>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> > > > > [4]PETSC ERROR: Petsc Development GIT revision:
>> v3.7.2-630-g96e0c40  GIT Date: 2016-06-22 10:03:02 -0500
>> > > > > [4]PETSC ERROR: ./ex56 on a arch-macosx-gnu-g named
>> MarksMac-3.local by markadams Thu Jun 23 06:53:27 2016
>> > > > > [4]PETSC ERROR: Configure options COPTFLAGS="-g -O0"
>> CXXOPTFLAGS="-g -O0" FOPTFLAGS="-g -O0" --download-hypre=1
>> --download-parmetis=1 --download-metis=1 --download-ml=1 --download-p4est=1
>> --download-exodus=1 --download-triangle=1
>> --with-hdf5-dir=/Users/markadams/Codes/hdf5 --with-x=0 --with-debugging=1
>> PETSC_ARCH=arch-macosx-gnu-g --download-chaco
>> > > > > [4]PETSC ERROR: #1 VecScatterCreate_PtoS() line 2348 in
>> /Users/markadams/Codes/petsc/src/vec/vec/utils/vpscat.c
>> > > > > [4]PETSC ERROR: #2 VecScatterCreate() line 1552 in
>> /Users/markadams/Codes/petsc/src/vec/vec/utils/vscat.c
>> > > > > [4]PETSC ERROR: Petsc has generated inconsistent data
>> > > > > [3]PETSC ERROR: ith 0 block entry 36 not owned by any process,
>> upper bound 36
>> > > > > [3]PETSC ERROR: See
>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> > > > > [3]PETSC ERROR: Petsc Development GIT revision:
>> v3.7.2-630-g96e0c40  GIT Date: 2016-06-22 10:03:02 -0500
>> > > > > [3]PETSC ERROR: ./ex56 on a arch-macosx-gnu-g named
>> MarksMac-3.local by markadams Thu Jun 23 06:53:27 2016
>> > > > > [3]PETSC ERROR: Configure options COPTFLAGS="-g -O0"
>> CXXOPTFLAGS="-g -O0" FOPTFLAGS="-g -O0" --download-hypre=1
>> --download-parmetis=1 --download-metis=1 --download-ml=1 --download-p4est=1
>> --download-exodus=1 --download-triangle=1
>> --with-hdf5-dir=/Users/markadams/Codes/hdf5 --with-x=0 --with-debugging=1
>> PETSC_ARCH=arch-macosx-gnu-g --download-chaco
>> > > > > [3]PETSC ERROR: #1 VecScatterCreate_PtoS() line 2348 in
>> /Users/markadams/Codes/petsc/src/vec/vec/utils/vpscat.c
>> > > > > [3]PETSC ERROR: #2 VecScatterCreate() line 1552 in
>> /Users/markadams/Codes/petsc/src/vec/vec/utils/vscat.c
>> > > > > [3]PETSC ERROR: #3 PCSetUp_ASM() line 279 in
>> /Users/markadams/Codes/petsc/src/ksp/pc/impls/asm/asm.c
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >    Barry
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > > On Jun 22, 2016, at 5:20 PM, Mark Adams <mfadams at lbl.gov> wrote:
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Wed, Jun 22, 2016 at 8:06 PM, Barry Smith <bsmith at mcs.anl.gov>
>> wrote:
>> > > > >
>> > > > >    I suggest focusing on asm.
>> > > > >
>> > > > > OK, I will switch gasm to asm, this does not work anyway.
>> > > > >
>> > > > > Having blocks that span multiple processes seems like overkill
>> for a smoother?
>> > > > >
>> > > > > No, because it is a pain to have the math convolved with the
>> parallel decomposition strategy (i.e., I can't tell an application how to
>> partition their problem). If an aggregate spans processor boundaries, which
>> is fine and needed, and let's say we have a pretty uniform problem, then if
>> the block gets split up, H is small in part of the domain and convergence
>> could suffer along processor boundaries.  And having the math change as the
>> parallel decomposition changes is annoying.
>> > > > >
>> > > > > (Major league overkill) in fact doesn't one want multiple blocks
>> per process, ie. pretty small blocks.
>> > > > >
>> > > > > No, it is just doing what would be done in serial.  If the cost
>> of moving the data across the processor is a problem then that is a
>> tradeoff to consider.
>> > > > >
>> > > > > And I think you are misunderstanding me.  There are lots of
>> blocks per process (the aggregates are say 3^D in size).  And many of the
>> aggregates/blocks along the processor boundary will be split between
>> processors, resulting in small blocks and a weak ASM PC on processor
>> boundaries.
>> > > > >
>> > > > > I can understand ASM not being general and not letting blocks
>> span processor boundaries, but I don't think the extra matrix communication
>> costs are a big deal (done just once) and the vector communication costs
>> are not bad, it probably does not include (too many) new processors to
>> communicate with.
>> > > > >
>> > > > >
>> > > > >    Barry
>> > > > >
>> > > > > > On Jun 22, 2016, at 7:51 AM, Mark Adams <mfadams at lbl.gov>
>> wrote:
>> > > > > >
>> > > > > > I'm trying to get block smoothers to work for gamg.  We (Garth)
>> tried this and got this error:
>> > > > > >
>> > > > > >
>> > > > > >  - Another option is use '-pc_gamg_use_agg_gasm true' and use
>> '-mg_levels_pc_type gasm'.
>> > > > > >
>> > > > > >
>> > > > > > Running in parallel, I get
>> > > > > >
>> > > > > >      ** Max-trans not allowed because matrix is distributed
>> > > > > >  ----
>> > > > > >
>> > > > > > First, what is the difference between asm and gasm?
>> > > > > >
>> > > > > > Second, I need to fix this to get block smoothers. This used to
>> work.  Did we lose the capability to have blocks that span processor
>> subdomains?
>> > > > > >
>> > > > > > gamg only aggregates across processor subdomains within one
>> layer, so maybe I could use one layer of overlap in some way?
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Mark
>> > > > > >
>> > > > >
>> > > > >
>> > > >
>> > > >
>> > >
>> > >
>> >
>> >
>>
>>
>>
>
>