[petsc-dev] Bug introduced in MatGetSubmatrices()

Hong hzhang at mcs.anl.gov
Wed Jan 25 10:17:01 CST 2017


Matt,
This bug is fixed and merged to master. Please give it a try and let me
know if there is a problem.
Hong

On Mon, Jan 23, 2017 at 7:04 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:

> OK, I removed flag MAT_SUBMAT_SINGLEIS
> https://bitbucket.org/petsc/petsc/commits/65b5f10fe03da0cc383d7dd0f31fad
> f6a7ab9abe
>
> Hong
> ________________________________________
> From: Barry Smith [bsmith at mcs.anl.gov]
> Sent: Monday, January 23, 2017 2:29 PM
> To: Zhang, Hong
> Cc: Matthew Knepley; PETSc
> Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
>
> > On Jan 23, 2017, at 2:23 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
> >
> > We actually check if ismax=1 for all process in PCSetUp_ASM():
> > if (outwork.max == 1 && outwork.sum == size) {
> >        /* osm->n_local_true = 1 on all processes, set this option may
> enable use of optimized MatGetSubMatrices() implementation */
> >        ierr = MatSetOption(pc->pmat,MAT_SUBMAT_SINGLEIS,PETSC_TRUE);
> CHKERRQ(ierr);
> >      }
> > which is free of communication, then set the flag to avoid any
> MPI_Allreduce call in
> > MatGetSubMatrices(). Without this flag, it calls general impl of
> MatGetSubMatrices() which calls reduce once.
>
>    Oh, the problem is that pc->pmat now has this flag and if someone calls
> any other MatGetSubMatrices() on this matrix the flag is set and thus
> trouble ensues. So the flag can't be stashed in the matrix.
>
> >
> > Hong
> >
> > ________________________________________
> > From: Barry Smith [bsmith at mcs.anl.gov]
> > Sent: Monday, January 23, 2017 11:34 AM
> > To: Zhang, Hong
> > Cc: Matthew Knepley; PETSc
> > Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
> >
> >> On Jan 23, 2017, at 10:51 AM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
> >>
> >>
> >>
> >> What if a MPI process has no submatrices, how does it save the flag?
> >>
> >> case of 'SingleIS' is for ismax=1 in all processes. For PCASM, all
> process has ismax=1, isn't it?
> >
> >   No, definitely not always 1.
> >
> >>
> >> Hong
> >>
> >>
> >>> On Jan 22, 2017, at 8:58 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
> >>>
> >>> The 'saved' context is in submatrices, not original matrix. I made
> following fix:
> >>>
> >>> if (C->submat_singleis) { /* flag is set in PCSetUp_ASM() to skip
> MPIU_Allreduce() */
> >>>   ierr = MatGetSubMatrices_MPIAIJ_SingleIS(C,ismax,isrow,iscol,
> scall,submat);CHKERRQ(ierr);
> >>> +    C->submat_singleis = PETSC_FALSE; /* resume its default value in
> case C will be used for non-singlis */
> >>>   PetscFunctionReturn(0);
> >>> }
> >>> ...
> >>> +  } else { /* MAT_REUSE_MATRIX */
> >>>   subc = (Mat_SeqAIJ*)((*submat)[0]->data);
> >>>   smat   = subc->submatis1;
> >>>   if (!smat) {
> >>>     /* smat is not generated by MatGetSubMatrix_MPIAIJ_All(...,MAT_INITIAL_MATRIX,...)
> */
> >>>     wantallmatrix = PETSC_TRUE;
> >>> +    } else if (smat->singleis) {
> >>> +      ierr = MatGetSubMatrices_MPIAIJ_SingleIS(C,ismax,isrow,iscol,
> scall,submat);CHKERRQ(ierr);
> >>> +      PetscFunctionReturn(0);
> >>>   } else {
> >>>
> >>> ...
> >>> ie., after first call of MatGetSubMatrices_MPIAIJ_SingleIS, this flag
> is set in saved context of submatrices
> >>> and is reset as 'false' to the original matrix C.
> >>>
> >>> Do you agree with this fix? Is so, I'll push it in a branch for
> nightly test.
> >>>
> >>> Hong
> >>>
> >>> ________________________________________
> >>> From: Barry Smith [bsmith at mcs.anl.gov]
> >>> Sent: Friday, January 20, 2017 8:19 PM
> >>> To: Matthew Knepley
> >>> Cc: Zhang, Hong; PETSc
> >>> Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
> >>>
> >>> Ok, so the problem comes from setting the flag in in the original
> matrix that other people have access to and can be used for other purposes
> by anyone,.
> >>>
> >>> So it would be better to instead put the flag into the generated sub
> matrix which no one else has access to. This should be do-able and is the
> right way to do it anyways.
> >>>
> >>> Hong, can you try putting all the "saved" context into the generated
> submatrices instead of the original matrix? If that is not possible then
> the context saved with the original matrix should have a reference to the
> sub matrix and this context should only be reused if the sub matrix passed
> in matches the sub matrix referenced in the context (this is a less
> desirable solution).
> >>>
> >>> Barry
> >>>
> >>>> On Jan 20, 2017, at 8:06 PM, Matthew Knepley <knepley at gmail.com>
> wrote:
> >>>>
> >>>> On Fri, Jan 20, 2017 at 7:39 PM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
> >>>>
> >>>>> On Jan 20, 2017, at 7:01 PM, Matthew Knepley <knepley at gmail.com>
> wrote:
> >>>>>
> >>>>> On Fri, Jan 20, 2017 at 3:08 PM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
> >>>>>
> >>>>>> On Jan 20, 2017, at 3:01 PM, Zhang, Hong <hzhang at mcs.anl.gov>
> wrote:
> >>>>>>
> >>>>>> We set C->submat_singleis = false as default.
> >>>>>> User turns it on when he knows a single IS will be used for
> MatGetSubMatrices(), e.g., when asm is used.
> >>>>>> User should turn the flag off after it is being used.
> >>>>>
> >>>>> Right, this is why I am confused why this is messing up Mat since
> presumably he is not setting it?
> >>>>>
> >>>>> ASM is turning this flag on, and it is corrupting my UNRELATED call
> to MatGetSubMatrices().
> >>>>
> >>>> Yes, but you need to tell us exactly how. The flag is set into
> C->submat_singleis  which is associated with the C from ASM, so why is it
> affecting your UNRELATED call to MatGetSubMatrices(). It is not a global
> variable so shouldn't.
> >>>>
> >>>> It sets the flag on the primary matrix. I will go step by step
> >>>>
> >>>> 1) I have a linear system with matrix A
> >>>>
> >>>> 2) To form a preconditioner for FieldSplit, I extract some diagonal
> blocks from A using MatGetSubMatrices()
> >>>>
> >>>> 3) I also use ASM on the another block of FS
> >>>>
> >>>> 4) The second Newton iterate, I try to call MatGetSubMatrices(), but
> ASM has set a flag on the matrix that causes this to fail.
> >>>>
> >>>> Explain or send a code that reproduces the problem.
> >>>>
> >>>> You can easily reproduce this by installing PyLith (a widely used
> package) and running an automatic test. However, since this
> >>>> is so simple, I am not sure this is necessary.
> >>>>
> >>>> Matt
> >>>>
> >>>> Barry
> >>>>
> >>>>>
> >>>>> Matt
> >>>>>
> >>>>>>
> >>>>>> This flag is for optimization -- avoid MPI_Allreduce, which is
> expensive for np>10k.
> >>>>>> MatGetSubMatrices() works without this flag turns on.
> >>>>>> I just merged the improved MatGetSubMatrices_MPIAIJ() which only
> calls MPI_Allreduce once
> >>>>>> for repeated use. Previously, it calls MPI_Allreduce twice for each
> call of MatGetSubMatrices_MPIAIJ().
> >>>>>>
> >>>>>> Hong
> >>>>>>
> >>>>>> ________________________________________
> >>>>>> From: Barry Smith [bsmith at mcs.anl.gov]
> >>>>>> Sent: Friday, January 20, 2017 1:11 PM
> >>>>>> To: Matthew Knepley
> >>>>>> Cc: Zhang, Hong; PETSc
> >>>>>> Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
> >>>>>>
> >>>>>>> On Jan 20, 2017, at 12:45 PM, Matthew Knepley <knepley at gmail.com>
> wrote:
> >>>>>>>
> >>>>>>> On Fri, Jan 20, 2017 at 11:55 AM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
> >>>>>>>
> >>>>>>>> On Jan 20, 2017, at 11:49 AM, Matthew Knepley <knepley at gmail.com>
> wrote:
> >>>>>>>>
> >>>>>>>> On Fri, Jan 20, 2017 at 10:49 AM, Hong <hzhang at mcs.anl.gov>
> wrote:
> >>>>>>>> Matt,
> >>>>>>>> By default, the flag C->submat_singleis = false.
> >>>>>>>> In PCSetUp_ASM(), we set it as 'true' to use
> MatGetSubMatrices_MPIAIJ_SingleIS().
> >>>>>>>>
> >>>>>>>> Can you check the value of this flag in your case?
> >>>>>>>>
> >>>>>>>> The problem is the following:
> >>>>>>>>
> >>>>>>>> 1) We use MatGetSubMatrices() to extract small matrices in order
> to form a preconditioner
> >>>>>>>>
> >>>>>>>> 2) We do this at each Newton iteration
> >>>>>>>>
> >>>>>>>> 3) We use ASM as a preconditioner for the eventual Newton solve
> >>>>>>>>
> >>>>>>>> 4) The second time we call MatGetSubMatrices(), it has this flag
> set, even though we are using multiple ISes
> >>>>>>>>
> >>>>>>>> Solution: ALSO check that the user is in fact passing a single IS.
> >>>>>>>>
> >>>>>>> This requires a communication. I am confused, is the number of IS
> changing each time? If not why is the flag set?
> >>>>>>>
> >>>>>>> ASM sets this flag because it knows that IT is going to call
> MatGetSubMatrices() later, but it unsafe if any user calls
> >>>>>>> MatGetSubMatrices() as well. I think overall its a fragile design
> and should be scrapped.
> >>>>>>
> >>>>>> You mean the user calls the SAME MatGetSubMatrices with the same
> matrices, right? Not a completely different unrelated MatGetSubMatrices()
> which should not be affect by the previous unrelated call.
> >>>>>>
> >>>>>>
> >>>>>> Barry
> >>>>>>
> >>>>>>>
> >>>>>>> Matt
> >>>>>>>
> >>>>>>>
> >>>>>>> Barry
> >>>>>>>
> >>>>>>>> Matt
> >>>>>>>>
> >>>>>>>> Hong
> >>>>>>>>
> >>>>>>>> It comes from here:
> >>>>>>>>
> >>>>>>>> https://bitbucket.org/petsc/petsc/commits/
> c10200c1442b553b7ad65c70101560db4fa22e78
> >>>>>>>>
> >>>>>>>> If we ask for more than 1 matrix, it dispatches to
> >>>>>>>>
> >>>>>>>> MatGetSubMatrices_MPIAIJ_SingleIS()
> >>>>>>>>
> >>>>>>>> but then fails here
> >>>>>>>>
> >>>>>>>> https://bitbucket.org/petsc/petsc/annotate/
> 2e559809f9aee9c95ee79eb0939630cfe5502c8d/src/mat/impls/aij/
> mpi/mpiov.c?at=master&fileviewer=file-view-default#mpiov.c-1306
> >>>>>>>>
> >>>>>>>> because ismax > 1. I think the ismax check needs to move up to
> here
> >>>>>>>>
> >>>>>>>> https://bitbucket.org/petsc/petsc/annotate/
> 2e559809f9aee9c95ee79eb0939630cfe5502c8d/src/mat/impls/aij/
> mpi/mpiov.c?at=master&fileviewer=file-view-default#mpiov.c-2012
> >>>>>>>>
> >>>>>>>> but I don't know for sure. Please fix this since it is breaking
> PyLith.
> >>>>>>>>
> >>>>>>>> Matt
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> >>>>>>>> -- Norbert Wiener
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> >>>>>>>> -- Norbert Wiener
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> >>>>>>> -- Norbert Wiener
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> >>>>> -- Norbert Wiener
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> >>>> -- Norbert Wiener
> >>>
> >>>
> >>
> >>
> >
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20170125/3741c104/attachment.html>


More information about the petsc-dev mailing list