[petsc-dev] plans for preconditioners for SeqSELL
Richard Tran Mills
rtmills at anl.gov
Tue Jun 26 01:47:19 CDT 2018
On Mon, Jun 25, 2018 at 6:12 PM, Richard Tran Mills <rtmills at anl.gov> wrote:
> Hi Hong,
>
> On Mon, Jun 25, 2018 at 2:52 PM, Zhang, Hong <hongzhang at anl.gov> wrote:
>
>>
>>
>> On Jun 25, 2018, at 4:09 PM, Mills, Richard Tran <rtmills at anl.gov> wrote:
>>
>> Hi Hong,
>>
>> Thanks for your reply. Yes, I was just looking at what goes on in
>> MatConvert_Basic() and I see the problem. I note that my AIJSELL code seems
>> to always work correctly when the matrix block size is one -- I guess that
>> this is because MatSeqSELLSetPreallocation() is called in this case.
>>
>> Are you saying that the
>>
>> if (A->rmap->bs > 1) {
>>   ierr = MatConvert_Basic(A,newtype,reuse,newmat);CHKERRQ(ierr);
>>   PetscFunctionReturn(0);
>> }
>>
>> lines in sell.c should be removed and then things should work?
>>
>>
>> If your conversion function for AIJSELL calls MatConvert_SeqAIJ_SeqSELL(),
>> then yes. Otherwise you might want to copy some code from the link I
>> attached (sell.c).
>>
>
> I've tried to construct things so that there is as little code duplication
> as possible, so I am just calling MatConvert_SeqAIJ_SeqSELL().
>
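> In case it is useful to see concretely what I mean, the conversion path
> boils down to roughly the following (a sketch only -- the wrapper name and
> the 'Z' shadow argument are placeholders, not necessarily what is in the
> branch):
>
> static PetscErrorCode MatAIJSELLUpdateShadow(Mat A,Mat *Z)
> {
>   PetscErrorCode ierr;
>
>   PetscFunctionBegin;
>   /* Defer entirely to the existing SeqAIJ->SeqSELL converter; reuse the
>      shadow SELL matrix if it has already been created. */
>   ierr = MatConvert_SeqAIJ_SeqSELL(A,MATSEQSELL,*Z ? MAT_REUSE_MATRIX : MAT_INITIAL_MATRIX,Z);CHKERRQ(ierr);
>   PetscFunctionReturn(0);
> }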
>
>> If so, I'll remove this, test on my machine, and then merge this change
>> into 'next'. Or is more code needed to make SELL work properly with a block
>> size > 1?
>>
>>
>> I don't think we need more code.
>>
>
> Alright, sounds like the presence of the code path that sends things down
> MatConvert_Basic() should be considered a bug, then. I'll start a fix off
> of 'maint', test things locally, and then (if things pass) merge into
> 'next' for testing.
>
> If removing the code in question still doesn't fix things, well, then it's
> still a bug, but will need some more work to fix.
>
Hong: Thanks for your help in identifying this bug. Removing the code that
called MatConvert_Basic() fixes the bugs I was seeing with my AIJSELL
class. I've created a branch with this fix off of maint. It passes all
tests and I've merged it into 'next'.
Best regards,
Richard
> Thanks,
> Richard
>
>
>> Thanks,
>> Hong (Mr.)
>>
>>
>> --Richard
>>
>> On Mon, Jun 25, 2018 at 2:04 PM, Zhang, Hong <hongzhang at anl.gov> wrote:
>>
>>> Hi Richard,
>>>
>>> MatConvert_Basic() does not work for most cases when converting AIJ to
>>> SELL, because SELL may require padding and needs to be preallocated based
>>> on the nonzeros per row.
>>> See the correct conversion in
>>> https://bitbucket.org/petsc/petsc/src/master/src/mat/impls/sell/seq/sell.c?fileviewer=file-view-default%20#sell.c-251
>>>
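>>> Schematically, a correct conversion has to do something like the following
>>> (a rough sketch only, not the code at the link above -- just to show why
>>> the per-row counts matter, since SELL pads each slice to its longest row):
>>>
>>> PetscErrorCode ConvertAIJToSELLSketch(Mat A,Mat *B)
>>> {
>>>   PetscErrorCode    ierr;
>>>   PetscInt          m,n,i,ncols,*rlen;
>>>   const PetscInt    *cols;
>>>   const PetscScalar *vals;
>>>
>>>   PetscFunctionBegin;
>>>   ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
>>>   ierr = PetscMalloc1(m,&rlen);CHKERRQ(ierr);
>>>   for (i=0; i<m; i++) {  /* nonzeros per row determine the slice widths/padding */
>>>     ierr = MatGetRow(A,i,&ncols,NULL,NULL);CHKERRQ(ierr);
>>>     rlen[i] = ncols;
>>>     ierr = MatRestoreRow(A,i,&ncols,NULL,NULL);CHKERRQ(ierr);
>>>   }
>>>   ierr = MatCreate(PetscObjectComm((PetscObject)A),B);CHKERRQ(ierr);
>>>   ierr = MatSetSizes(*B,m,n,m,n);CHKERRQ(ierr);
>>>   ierr = MatSetType(*B,MATSEQSELL);CHKERRQ(ierr);
>>>   ierr = MatSeqSELLSetPreallocation(*B,PETSC_DEFAULT,rlen);CHKERRQ(ierr);
>>>   for (i=0; i<m; i++) {  /* copy the entries row by row */
>>>     ierr = MatGetRow(A,i,&ncols,&cols,&vals);CHKERRQ(ierr);
>>>     ierr = MatSetValues(*B,1,&i,ncols,cols,vals,INSERT_VALUES);CHKERRQ(ierr);
>>>     ierr = MatRestoreRow(A,i,&ncols,&cols,&vals);CHKERRQ(ierr);
>>>   }
>>>   ierr = MatAssemblyBegin(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>   ierr = MatAssemblyEnd(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>   ierr = PetscFree(rlen);CHKERRQ(ierr);
>>>   PetscFunctionReturn(0);
>>> }
>>>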
>>> You were probably misled by the code
>>>
>>> if (A->rmap->bs > 1) {
>>>   ierr = MatConvert_Basic(A,newtype,reuse,newmat);CHKERRQ(ierr);
>>>   PetscFunctionReturn(0);
>>> }
>>> which should be removed.
>>>
>>> Thanks,
>>> Hong (Mr.)
>>>
>>>
>>> On Jun 25, 2018, at 3:06 PM, Richard Tran Mills <rtmills at anl.gov> wrote:
>>>
>>> Hi Everyone (especially Herr Doktor Hong Zhang),
>>>
>>> It sure took me a while to get around to it, but I coded up (in branch
>>> rmills/feature-add-mataijsell) the skeleton of what I'm currently calling
>>> the 'AIJSELL' matrix type: a "meta" type that inherits from AIJ but
>>> maintains a "shadow" MATSEQSELL copy of the matrix for use when that is
>>> appropriate. This works much like AIJMKL, in that
>>> the PetscObjectState is tracked and used to determine when the SELL
>>> "shadow" copy needs to be updated before an operation like MatMult() is
>>> applied. (Note that what I have so far only does MatMult, but it should be
>>> easy to add other operations once I've verified that my implementation is
>>> working.) I decided on the approach of using a MATSEQSELL instance inside
>>> an AIJ-derived matrix type because I still want to be able to support
>>> standalone SELL matrices when there really isn't a need for any of the
>>> AIJ-only operations.
>>>
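>>> To make the bookkeeping concrete, the MatMult path boils down to something
>>> like the following (an untested sketch with made-up names, not the actual
>>> code in the branch):
>>>
>>> typedef struct {
>>>   Mat              Z;      /* "shadow" MATSEQSELL copy of the AIJ matrix */
>>>   PetscObjectState state;  /* state of the AIJ matrix when Z was built   */
>>> } Mat_AIJSELL_sketch;
>>>
>>> static PetscErrorCode MatMult_AIJSELL_sketch(Mat A,Vec x,Vec y)
>>> {
>>>   Mat_AIJSELL_sketch *a = (Mat_AIJSELL_sketch*)A->spptr;
>>>   PetscObjectState   state;
>>>   PetscErrorCode     ierr;
>>>
>>>   PetscFunctionBegin;
>>>   ierr = PetscObjectStateGet((PetscObject)A,&state);CHKERRQ(ierr);
>>>   if (!a->Z || state != a->state) {  /* AIJ data changed: refresh the SELL copy */
>>>     ierr = MatConvert(A,MATSEQSELL,a->Z ? MAT_REUSE_MATRIX : MAT_INITIAL_MATRIX,&a->Z);CHKERRQ(ierr);
>>>     a->state = state;
>>>   }
>>>   ierr = MatMult(a->Z,x,y);CHKERRQ(ierr);  /* delegate to the SELL kernel */
>>>   PetscFunctionReturn(0);
>>> }
>>>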
>>> My AIJSELL implementation seems to work fine in many cases, but I've
>>> spent several hours chasing some bugs that crop up in certain cases. I now
>>> think that the bugs are not a problem with my AIJSELL implementation
>>> (though I'd welcome another pair of eyes to help me verify this), but are
>>> lurking somewhere in the SELL code and haven't been hit before because
>>> AIJSELL makes it easier to exercise the SELL class on more problems.
>>>
>>> I've been using the well-loved SNES ex19 example (that is used for 'make
>>> check') for some of my debugging. I've found that everything works great
>>> when I size the problem grid such that the resulting matrices have a row
>>> count that is evenly divisible by 8 (when using double precision scalars).
>>> So
>>>
>>> ./ex19 -ksp_type gmres -pc_type none -snes_monitor -ksp_monitor -da_grid_x 4 -da_grid_y 5 -mat_seqaij_type seqaijsell -snes_max_it 1 -ksp_monitor
>>>
>>> which results in an 80x80 Jacobian (since there are 4 degrees of freedom
>>> per grid point), runs fine, but
>>>
>>> ./ex19 -ksp_type gmres -pc_type none -snes_monitor -ksp_monitor -da_grid_x 3 -da_grid_y 5 -mat_seqaij_type seqaijsell -snes_max_it 1 -ksp_monitor
>>>
>>> results in either a segfault or the GMRES(30) solve terminating at
>>> 10,000 iterations because the residual norm just keeps bouncing up and down.
>>>
>>> Sticking some MatView() calls in, it appears that the MatMult() results
>>> inside the KSP solve are incorrect because the SELL copy of the matrix
>>> generated by my MatConvert() call is incorrect when the row count isn't
>>> evenly divisible by eight. In the case of SNES ex19, MatConvert_Basic() is
>>> currently used because the block size is greater than unity. So I think
>>> something is going wrong either in MatSetValues() or (my guess) in
>>> MatAssemblyBegin()/End().
>>>
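>>> (For anyone who wants to reproduce this outside the solver, the check I
>>> have in mind is just to compare the two MatMult results directly -- a
>>> rough sketch, where 'Aaij' stands for whatever assembled AIJ matrix you
>>> have on hand and 'ierr' is declared as usual:
>>>
>>>   Mat       Asell;
>>>   Vec       x,y1,y2;
>>>   PetscReal nrm;
>>>
>>>   ierr = MatConvert(Aaij,MATSEQSELL,MAT_INITIAL_MATRIX,&Asell);CHKERRQ(ierr);
>>>   ierr = MatCreateVecs(Aaij,&x,&y1);CHKERRQ(ierr);
>>>   ierr = VecDuplicate(y1,&y2);CHKERRQ(ierr);
>>>   ierr = VecSetRandom(x,NULL);CHKERRQ(ierr);
>>>   ierr = MatMult(Aaij,x,y1);CHKERRQ(ierr);   /* reference result from the AIJ copy */
>>>   ierr = MatMult(Asell,x,y2);CHKERRQ(ierr);  /* result from the converted SELL copy */
>>>   ierr = VecAXPY(y2,-1.0,y1);CHKERRQ(ierr);
>>>   ierr = VecNorm(y2,NORM_INFINITY,&nrm);CHKERRQ(ierr);  /* ~0 if the conversion is right */
>>>
>>> which takes the KSP solve out of the picture entirely.)
>>>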
>>> Poking around with a debugger, I see that the column index array
>>> ("colidx") that is part of the SEQSELLHEADER ends up with nonsense values,
>>> and the segfaults I see tend to occur when indexing into the x vector with
>>> this.
>>>
>>> I'm not familiar enough with the internals of SELL to look further into
>>> debugging this without a lot of effort. Hong, can you help me look at this?
>>>
>>> Best regards,
>>> Richard
>>>
>>> On Tue, Mar 6, 2018 at 3:59 AM, Karl Rupp <rupp at iue.tuwien.ac.at> wrote:
>>>
>>>>
>>>> > Karl, are you thinking of a matrix subclass that has everything that an
>>>> > AIJ matrix does, but also keeps a SELL copy around for operations like
>>>> > MatMult()? Would it make sense to just keep another Mat inside (like how
>>>> > MPIAIJ keeps multiple Mat instances) that *is* of type MATSELL, that
>>>> > gets built/updated as needed? Would this involve carrying too much
>>>> > baggage around, associated with a complete Mat instance?
>>>>
>>>> What I have in mind is to put the SELL datastructures into A->spptr,
>>>> very much like you did for AIJMKL.
>>>>
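>>>> Roughly like this (invented names, just to sketch the wiring -- not
>>>> actual PETSc code; 'aijsell' is a pointer to a small struct holding the
>>>> SELL shadow and the saved object state):
>>>>
>>>>   /* in a hypothetical MatCreate_SeqAIJSELL() */
>>>>   ierr = MatSetType(A,MATSEQAIJ);CHKERRQ(ierr);  /* build a plain AIJ first */
>>>>   ierr = PetscNewLog(A,&aijsell);CHKERRQ(ierr);  /* SELL shadow + saved state live here */
>>>>   A->spptr     = (void*)aijsell;
>>>>   A->ops->mult = MatMult_SeqAIJSELL;             /* override only what SELL accelerates */
>>>>   ierr = PetscObjectChangeTypeName((PetscObject)A,MATSEQAIJSELL);CHKERRQ(ierr);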
>>>>
>>>> > I like the idea of having a MATSELL type available that is lean
>>>> > (insofar as not having to store an AIJ matrix) for those cases when a
>>>> > user really knows that the AIJ stuff will not be needed. But maybe it
>>>> > makes sense to be able to use that inside another matrix class. Perhaps
>>>> > we could have something, called, say, MATAIJMUTABLE that uses AIJ but
>>>> > might also create copies in SELL (or other formats, potentially) when
>>>> > appropriate -- perhaps based on some performance model indicating which
>>>> > format is fastest for MatMult() or whatever.
>>>>
>>>> The actual overhead of also storing a SELL datastructure in terms of
>>>> memory footprint is at most 2x. When you keep in mind that extra memory
>>>> during the matrix assembly is also needed in the stash, then the impact
>>>> on overall memory consumption is about 50 percent extra. Given that SELL
>>>> is a performance optimization for SpMV (and hence the SELL datastructure
>>>> is only populated if you call MatMult on the particular matrix), I'm not
>>>> too worried about the increased memory footprint at this time.
>>>>
>>>>
>>>> > Having a class that's an AIJ but can also use SELL is more convenient
>>>> > than adding a fallback to AIJ format inside MATSELL. I wonder if the
>>>> > latter option might be preferable in some circumstances, however,
>>>> > because it can avoid the extra memory footprint of also keeping the
>>>> > matrix in AIJ -- maybe AIJ operations are rarely needed and the AIJ
>>>> > conversion can just happen on a lazy basis.
>>>>
>>>> I advocate an 'optimize as needed' approach here. Let's first make SELL
>>>> compatible with the full range of AIJ operations and preconditioners.
>>>>
>>>> Best regards,
>>>> Karli
>>>>
>>>>
>>>>
>>>> >
>>>> > --Richard
>>>> >
>>>> > On 3/4/18 2:58 AM, Karl Rupp wrote:
>>>> >> Hi all,
>>>> >>
>>>> >> I'm getting increasingly concerned about SELL not being a subclass of
>>>> >> AIJ. As such, we have to deal with all these fallback operations now,
>>>> >> whereas as a subclass of AIJ we could just selectively make use of the
>>>> >> SELL format where we really benefit from it. "Use AIJ by default
>>>> >> unless we have something optimized for SELL" is just much more
>>>> >> appropriate for the few use cases of SELL than the current "SELL has
>>>> >> to implement everything and usually this means to manually convert
>>>> >> back to AIJ".
>>>> >>
>>>> >> If there are no objections I'd like to clean this up. (Subclassing AIJ
>>>> >> was unfortunately not available at the time Hong started his great
>>>> >> work on SELL)
>>>> >>
>>>> >> Best regards,
>>>> >> Karli
>>>> >>
>>>> >>
>>>> >>
>>>> >> On 03/03/2018 07:52 AM, Richard Tran Mills wrote:
>>>> >>> Resurrecting a few weeks old thread:
>>>> >>>
>>>> >>> Stefano, did you get around to coding something up to do an automatic
>>>> >>> conversion to SeqAIJ for operations unsupported by the SELL format? I
>>>> >>> did some hacking the other day to try to get PCGAMG to use SELL inside
>>>> >>> the smoothers and this turns out to be way more complicated than I'd
>>>> >>> like and very bug-prone (I haven't found all of mine, anyway). I
>>>> >>> think it may be preferable to be able to pass a SELL matrix to PCGAMG
>>>> >>> and have an internal conversion to AIJ format happen in the SELL
>>>> >>> matrix for doing the MatPtAP and LU solves. Support for this would
>>>> >>> certainly make it easier for users in a lot of other cases as well,
>>>> >>> and might make the use of SELL much more likely. If no one has already
>>>> >>> done some work on this, I'll take a stab at it.
>>>> >>>
>>>> >>> --Richard
>>>> >>>
>>>> >>> On Mon, Feb 12, 2018 at 10:04 AM, Richard Tran Mills <rtmills at anl.gov> wrote:
>>>> >>>
>>>> >>> On Mon, Feb 12, 2018 at 8:47 AM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> > On Feb 12, 2018, at 10:25 AM, Stefano Zampini <stefano.zampini at gmail.com> wrote:
>>>> >>> >
>>>> >>> > Barry,
>>>> >>> >
>>>> >>> > for sure Amat,Pmat is the right approach; however, with
>>>> >>> > complicated user codes, we are not always in control of having a
>>>> >>> > different Jacobian matrix.
>>>> >>> > Since Mat*SELL does not currently support any preconditioning
>>>> >>> > except PCSOR and PCJACOBI, we ask the user to put code like
>>>> >>> >
>>>> >>> >   if (type is SELL)
>>>> >>> >     create two matrices (and maybe modify the code in many other parts)
>>>> >>> >   else
>>>> >>> >     ok with the previous code
>>>> >>>
>>>> >>>             I don't disagree with what you are saying and am not
>>>> >>> opposed to the proposed work.
>>>> >>>
>>>> >>>             Perhaps we need to do a better job with making the
>>>> >>> mat,pmat approach simpler or better documented so more people use it
>>>> >>> naturally in their applications.
>>>> >>>
>>>> >>>
>>>> >>>     I wrote some code like that in some of the Jacobian/function
>>>> >>> routines in PFLOTRAN to experiment with MATSELL, and it works, but
>>>> >>> looks and feels pretty hacky. And if I wanted to support it for all
>>>> >>> of the different systems that PFLOTRAN can model, then I'd have to
>>>> >>> reproduce it in many different Jacobian and function evaluation
>>>> >>> routines. I also don't like that it makes it awkward to play with
>>>> >>> the many combinations of matrix types and preconditioners that PETSc
>>>> >>> allows: the above pseudocode should really say "if (type is SELL)
>>>> >>> and (preconditioner is not PCSOR or PCJACOBI)". I do think that
>>>> >>> Amat,Pmat is a good approach in many situations, but it's easy to
>>>> >>> construct scenarios in which it falls short.
>>>> >>>
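>>>> >>>     (For concreteness, the mat,pmat pattern Barry is referring to is
>>>> >>> basically the following -- a rough sketch, where 'ksp' and 'Jaij' stand
>>>> >>> for whatever the application already has:
>>>> >>>
>>>> >>>   Mat Jsell;
>>>> >>>   ierr = MatConvert(Jaij,MATSEQSELL,MAT_INITIAL_MATRIX,&Jsell);CHKERRQ(ierr);
>>>> >>>   /* SELL copy drives MatMult in the Krylov solve; AIJ copy feeds the PC */
>>>> >>>   ierr = KSPSetOperators(ksp,Jsell,Jaij);CHKERRQ(ierr);
>>>> >>>
>>>> >>> and the awkward part is keeping Jsell in sync with Jaij every time the
>>>> >>> Jacobian is reassembled, which is exactly the bookkeeping that ends up
>>>> >>> in the user code.)
>>>> >>>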
>>>> >>>     In some situations, what I'd like to have happen is what Stefano
>>>> >>> is talking about, with an automatic conversion to AIJ happening if
>>>> >>> SELL doesn't support an operation. But, ideally, I think this sort of
>>>> >>> implicit format conversion shouldn't be something hard-coded into
>>>> >>> the workings of SELL. Instead, there should be some general
>>>> >>> mechanism by which PETSc recognizes that a particular operation is
>>>> >>> unsupported for a given matrix format, and then it can (optionally)
>>>> >>> copy/convert to a different matrix type (probably defaulting to AIJ,
>>>> >>> but it shouldn't have to be AIJ) that supports the operation. This
>>>> >>> sort of implicit data rearrangement game may actually become more
>>>> >>> important if future computer architectures strongly prefer different
>>>> >>> data layouts for different types of operations (though let's not get
>>>> >>> ahead of ourselves).
>>>> >>>
>>>> >>> --Richard
>>>> >>>
>>>> >>>
>>>> >>> Barry
>>>> >>>
>>>> >>> >
>>>> >>> > Just my two cents.
>>>> >>> >
>>>> >>> >
>>>> >>> > 2018-02-12 19:10 GMT+03:00 Smith, Barry F. <bsmith at mcs.anl.gov>:
>>>> >>> >
>>>> >>> >
>>>> >>> > > On Feb 12, 2018, at 9:59 AM, Stefano Zampini <stefano.zampini at gmail.com> wrote:
>>>> >>> > >
>>>> >>> > > FYI, I just checked and MatSOR_*SELL does not use any
>>>> >>> > > vectorized instructions.
>>>> >>> > > Why not just convert to SeqAIJ, factor, and then use the
>>>> >>> > > AIJ implementation for MatSolve for the moment?
>>>> >>> >
>>>> >>> >     Why not use the mat, pmat feature of the solvers to pass in
>>>> >>> > both matrices and have the solvers handle using two formats
>>>> >>> > simultaneously instead of burdening the MatSELL code with tons
>>>> >>> > of special code for automatically converting to AIJ for
>>>> >>> > solvers etc?
>>>> >>> >
>>>> >>> >
>>>> >>> > >
>>>> >>> > > 2018-02-12 18:06 GMT+03:00 Stefano Zampini <stefano.zampini at gmail.com>:
>>>> >>> > >
>>>> >>> > >
>>>> >>> > > 2018-02-12 17:36 GMT+03:00 Jed Brown <jed at jedbrown.org>:
>>>> >>> > > Karl Rupp <rupp at iue.tuwien.ac.at> writes:
>>>> >>> > >
>>>> >>> > > > Hi Stefano,
>>>> >>> > > >
>>>> >>> > > >> Is there any plan to write code for native ILU/ICC etc
>>>> >>> > > >> for SeqSELL, at least to have BJACOBI in parallel?
>>>> >>> > > >
>>>> >>> > > > (imho) ILU/ICC is a pain to do with SeqSELL. Point-Jacobi
>>>> >>> > > > should be possible, yes. SELL is really just tailored to
>>>> >>> > > > MatMults and a pain for anything that is not very similar
>>>> >>> > > > to a MatMult...
>>>> >>> > >
>>>> >>> > > There is already MatSOR_*SELL. MatSolve_SeqSELL wouldn't be any
>>>> >>> > > harder. I think it would be acceptable to convert to SeqAIJ,
>>>> >>> > > factor, and convert the factors back to SELL.
>>>> >>> > >
>>>> >>> > > Yes, this was my idea. Today I have started coding something.
>>>> >>> > > I'll push the branch whenever I have anything working
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >
>>>> >>> > > --
>>>> >>> > > Stefano
>>>> >>> > >
>>>> >>> > >
>>>> >>> > >
>>>> >>> > > --
>>>> >>> > > Stefano
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>> >
>>>> >>> > --
>>>> >>> > Stefano
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >
>>>>
>>>
>>>
>>>
>>
>>
>