[petsc-dev] Bug introduced in MatGetSubmatrices()

Matthew Knepley knepley at gmail.com
Thu Jan 26 20:15:51 CST 2017


On Wed, Jan 25, 2017 at 10:17 AM, Hong <hzhang at mcs.anl.gov> wrote:

> Matt,
> This bug is fixed and merged to master. Please give it a try and let me
> know if there is a problem.
>

Thanks Hong. I have verified that PyLith works normally now and closed the
issue.

  Thanks,

    Matt


> Hong
>
> On Mon, Jan 23, 2017 at 7:04 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>
>> OK, I removed flag MAT_SUBMAT_SINGLEIS
>> https://bitbucket.org/petsc/petsc/commits/65b5f10fe03da0cc38
>> 3d7dd0f31fadf6a7ab9abe
>>
>> Hong
>> ________________________________________
>> From: Barry Smith [bsmith at mcs.anl.gov]
>> Sent: Monday, January 23, 2017 2:29 PM
>> To: Zhang, Hong
>> Cc: Matthew Knepley; PETSc
>> Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
>>
>> > On Jan 23, 2017, at 2:23 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>> >
>> > We actually check if ismax=1 for all process in PCSetUp_ASM():
>> > if (outwork.max == 1 && outwork.sum == size) {
>> >        /* osm->n_local_true = 1 on all processes, set this option may
>> enable use of optimized MatGetSubMatrices() implementation */
>> >        ierr = MatSetOption(pc->pmat,MAT_SUBM
>> AT_SINGLEIS,PETSC_TRUE);CHKERRQ(ierr);
>> >      }
>> > which is free of communication, then set the flag to avoid any
>> MPI_Allreduce call in
>> > MatGetSubMatrices(). Without this flag, it calls general impl of
>> MatGetSubMatrices() which calls reduce once.
>>
>>    Oh, the problem is that pc->pmat now has this flag and if someone
>> calls any other MatGetSubMatrices() on this matrix the flag is set and thus
>> trouble ensues. So the flag can't be stashed in the matrix.
>>
>> >
>> > Hong
>> >
>> > ________________________________________
>> > From: Barry Smith [bsmith at mcs.anl.gov]
>> > Sent: Monday, January 23, 2017 11:34 AM
>> > To: Zhang, Hong
>> > Cc: Matthew Knepley; PETSc
>> > Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
>> >
>> >> On Jan 23, 2017, at 10:51 AM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>> >>
>> >>
>> >>
>> >> What if a MPI process has no submatrices, how does it save the flag?
>> >>
>> >> case of 'SingleIS' is for ismax=1 in all processes. For PCASM, all
>> process has ismax=1, isn't it?
>> >
>> >   No, definitely not always 1.
>> >
>> >>
>> >> Hong
>> >>
>> >>
>> >>> On Jan 22, 2017, at 8:58 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>> >>>
>> >>> The 'saved' context is in submatrices, not original matrix. I made
>> following fix:
>> >>>
>> >>> if (C->submat_singleis) { /* flag is set in PCSetUp_ASM() to skip
>> MPIU_Allreduce() */
>> >>>   ierr = MatGetSubMatrices_MPIAIJ_SingleIS(C,ismax,isrow,iscol,scall,
>> submat);CHKERRQ(ierr);
>> >>> +    C->submat_singleis = PETSC_FALSE; /* resume its default value in
>> case C will be used for non-singlis */
>> >>>   PetscFunctionReturn(0);
>> >>> }
>> >>> ...
>> >>> +  } else { /* MAT_REUSE_MATRIX */
>> >>>   subc = (Mat_SeqAIJ*)((*submat)[0]->data);
>> >>>   smat   = subc->submatis1;
>> >>>   if (!smat) {
>> >>>     /* smat is not generated by MatGetSubMatrix_MPIAIJ_All(...,MAT_INITIAL_MATRIX,...)
>> */
>> >>>     wantallmatrix = PETSC_TRUE;
>> >>> +    } else if (smat->singleis) {
>> >>> +      ierr = MatGetSubMatrices_MPIAIJ_Singl
>> eIS(C,ismax,isrow,iscol,scall,submat);CHKERRQ(ierr);
>> >>> +      PetscFunctionReturn(0);
>> >>>   } else {
>> >>>
>> >>> ...
>> >>> ie., after first call of MatGetSubMatrices_MPIAIJ_SingleIS, this
>> flag is set in saved context of submatrices
>> >>> and is reset as 'false' to the original matrix C.
>> >>>
>> >>> Do you agree with this fix? Is so, I'll push it in a branch for
>> nightly test.
>> >>>
>> >>> Hong
>> >>>
>> >>> ________________________________________
>> >>> From: Barry Smith [bsmith at mcs.anl.gov]
>> >>> Sent: Friday, January 20, 2017 8:19 PM
>> >>> To: Matthew Knepley
>> >>> Cc: Zhang, Hong; PETSc
>> >>> Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
>> >>>
>> >>> Ok, so the problem comes from setting the flag in in the original
>> matrix that other people have access to and can be used for other purposes
>> by anyone,.
>> >>>
>> >>> So it would be better to instead put the flag into the generated sub
>> matrix which no one else has access to. This should be do-able and is the
>> right way to do it anyways.
>> >>>
>> >>> Hong, can you try putting all the "saved" context into the generated
>> submatrices instead of the original matrix? If that is not possible then
>> the context saved with the original matrix should have a reference to the
>> sub matrix and this context should only be reused if the sub matrix passed
>> in matches the sub matrix referenced in the context (this is a less
>> desirable solution).
>> >>>
>> >>> Barry
>> >>>
>> >>>> On Jan 20, 2017, at 8:06 PM, Matthew Knepley <knepley at gmail.com>
>> wrote:
>> >>>>
>> >>>> On Fri, Jan 20, 2017 at 7:39 PM, Barry Smith <bsmith at mcs.anl.gov>
>> wrote:
>> >>>>
>> >>>>> On Jan 20, 2017, at 7:01 PM, Matthew Knepley <knepley at gmail.com>
>> wrote:
>> >>>>>
>> >>>>> On Fri, Jan 20, 2017 at 3:08 PM, Barry Smith <bsmith at mcs.anl.gov>
>> wrote:
>> >>>>>
>> >>>>>> On Jan 20, 2017, at 3:01 PM, Zhang, Hong <hzhang at mcs.anl.gov>
>> wrote:
>> >>>>>>
>> >>>>>> We set C->submat_singleis = false as default.
>> >>>>>> User turns it on when he knows a single IS will be used for
>> MatGetSubMatrices(), e.g., when asm is used.
>> >>>>>> User should turn the flag off after it is being used.
>> >>>>>
>> >>>>> Right, this is why I am confused why this is messing up Mat since
>> presumably he is not setting it?
>> >>>>>
>> >>>>> ASM is turning this flag on, and it is corrupting my UNRELATED call
>> to MatGetSubMatrices().
>> >>>>
>> >>>> Yes, but you need to tell us exactly how. The flag is set into
>> C->submat_singleis  which is associated with the C from ASM, so why is it
>> affecting your UNRELATED call to MatGetSubMatrices(). It is not a global
>> variable so shouldn't.
>> >>>>
>> >>>> It sets the flag on the primary matrix. I will go step by step
>> >>>>
>> >>>> 1) I have a linear system with matrix A
>> >>>>
>> >>>> 2) To form a preconditioner for FieldSplit, I extract some diagonal
>> blocks from A using MatGetSubMatrices()
>> >>>>
>> >>>> 3) I also use ASM on the another block of FS
>> >>>>
>> >>>> 4) The second Newton iterate, I try to call MatGetSubMatrices(), but
>> ASM has set a flag on the matrix that causes this to fail.
>> >>>>
>> >>>> Explain or send a code that reproduces the problem.
>> >>>>
>> >>>> You can easily reproduce this by installing PyLith (a widely used
>> package) and running an automatic test. However, since this
>> >>>> is so simple, I am not sure this is necessary.
>> >>>>
>> >>>> Matt
>> >>>>
>> >>>> Barry
>> >>>>
>> >>>>>
>> >>>>> Matt
>> >>>>>
>> >>>>>>
>> >>>>>> This flag is for optimization -- avoid MPI_Allreduce, which is
>> expensive for np>10k.
>> >>>>>> MatGetSubMatrices() works without this flag turns on.
>> >>>>>> I just merged the improved MatGetSubMatrices_MPIAIJ() which only
>> calls MPI_Allreduce once
>> >>>>>> for repeated use. Previously, it calls MPI_Allreduce twice for
>> each call of MatGetSubMatrices_MPIAIJ().
>> >>>>>>
>> >>>>>> Hong
>> >>>>>>
>> >>>>>> ________________________________________
>> >>>>>> From: Barry Smith [bsmith at mcs.anl.gov]
>> >>>>>> Sent: Friday, January 20, 2017 1:11 PM
>> >>>>>> To: Matthew Knepley
>> >>>>>> Cc: Zhang, Hong; PETSc
>> >>>>>> Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
>> >>>>>>
>> >>>>>>> On Jan 20, 2017, at 12:45 PM, Matthew Knepley <knepley at gmail.com>
>> wrote:
>> >>>>>>>
>> >>>>>>> On Fri, Jan 20, 2017 at 11:55 AM, Barry Smith <bsmith at mcs.anl.gov>
>> wrote:
>> >>>>>>>
>> >>>>>>>> On Jan 20, 2017, at 11:49 AM, Matthew Knepley <knepley at gmail.com>
>> wrote:
>> >>>>>>>>
>> >>>>>>>> On Fri, Jan 20, 2017 at 10:49 AM, Hong <hzhang at mcs.anl.gov>
>> wrote:
>> >>>>>>>> Matt,
>> >>>>>>>> By default, the flag C->submat_singleis = false.
>> >>>>>>>> In PCSetUp_ASM(), we set it as 'true' to use
>> MatGetSubMatrices_MPIAIJ_SingleIS().
>> >>>>>>>>
>> >>>>>>>> Can you check the value of this flag in your case?
>> >>>>>>>>
>> >>>>>>>> The problem is the following:
>> >>>>>>>>
>> >>>>>>>> 1) We use MatGetSubMatrices() to extract small matrices in order
>> to form a preconditioner
>> >>>>>>>>
>> >>>>>>>> 2) We do this at each Newton iteration
>> >>>>>>>>
>> >>>>>>>> 3) We use ASM as a preconditioner for the eventual Newton solve
>> >>>>>>>>
>> >>>>>>>> 4) The second time we call MatGetSubMatrices(), it has this flag
>> set, even though we are using multiple ISes
>> >>>>>>>>
>> >>>>>>>> Solution: ALSO check that the user is in fact passing a single
>> IS.
>> >>>>>>>>
>> >>>>>>> This requires a communication. I am confused, is the number of IS
>> changing each time? If not why is the flag set?
>> >>>>>>>
>> >>>>>>> ASM sets this flag because it knows that IT is going to call
>> MatGetSubMatrices() later, but it unsafe if any user calls
>> >>>>>>> MatGetSubMatrices() as well. I think overall its a fragile design
>> and should be scrapped.
>> >>>>>>
>> >>>>>> You mean the user calls the SAME MatGetSubMatrices with the same
>> matrices, right? Not a completely different unrelated MatGetSubMatrices()
>> which should not be affect by the previous unrelated call.
>> >>>>>>
>> >>>>>>
>> >>>>>> Barry
>> >>>>>>
>> >>>>>>>
>> >>>>>>> Matt
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Barry
>> >>>>>>>
>> >>>>>>>> Matt
>> >>>>>>>>
>> >>>>>>>> Hong
>> >>>>>>>>
>> >>>>>>>> It comes from here:
>> >>>>>>>>
>> >>>>>>>> https://bitbucket.org/petsc/petsc/commits/c10200c1442b553b7a
>> d65c70101560db4fa22e78
>> >>>>>>>>
>> >>>>>>>> If we ask for more than 1 matrix, it dispatches to
>> >>>>>>>>
>> >>>>>>>> MatGetSubMatrices_MPIAIJ_SingleIS()
>> >>>>>>>>
>> >>>>>>>> but then fails here
>> >>>>>>>>
>> >>>>>>>> https://bitbucket.org/petsc/petsc/annotate/2e559809f9aee9c95
>> ee79eb0939630cfe5502c8d/src/mat/impls/aij/mpi/mpiov.c?at=
>> master&fileviewer=file-view-default#mpiov.c-1306
>> >>>>>>>>
>> >>>>>>>> because ismax > 1. I think the ismax check needs to move up to
>> here
>> >>>>>>>>
>> >>>>>>>> https://bitbucket.org/petsc/petsc/annotate/2e559809f9aee9c95
>> ee79eb0939630cfe5502c8d/src/mat/impls/aij/mpi/mpiov.c?at=
>> master&fileviewer=file-view-default#mpiov.c-2012
>> >>>>>>>>
>> >>>>>>>> but I don't know for sure. Please fix this since it is breaking
>> PyLith.
>> >>>>>>>>
>> >>>>>>>> Matt
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> >>>>>>>> -- Norbert Wiener
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> >>>>>>>> -- Norbert Wiener
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> >>>>>>> -- Norbert Wiener
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> >>>>> -- Norbert Wiener
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> >>>> -- Norbert Wiener
>> >>>
>> >>>
>> >>
>> >>
>> >
>> >
>>
>>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20170126/cb95d6b1/attachment.html>


More information about the petsc-dev mailing list