[petsc-dev] Bug introduced in MatGetSubmatrices()
Barry Smith
bsmith at mcs.anl.gov
Mon Jan 23 14:29:32 CST 2017
> On Jan 23, 2017, at 2:23 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>
> We actually check if ismax=1 for all process in PCSetUp_ASM():
> if (outwork.max == 1 && outwork.sum == size) {
> /* osm->n_local_true = 1 on all processes, set this option may enable use of optimized MatGetSubMatrices() implementation */
> ierr = MatSetOption(pc->pmat,MAT_SUBMAT_SINGLEIS,PETSC_TRUE);CHKERRQ(ierr);
> }
> which is free of communication, then set the flag to avoid any MPI_Allreduce call in
> MatGetSubMatrices(). Without this flag, it calls general impl of MatGetSubMatrices() which calls reduce once.
Oh, the problem is that pc->pmat now has this flag and if someone calls any other MatGetSubMatrices() on this matrix the flag is set and thus trouble ensues. So the flag can't be stashed in the matrix.
>
> Hong
>
> ________________________________________
> From: Barry Smith [bsmith at mcs.anl.gov]
> Sent: Monday, January 23, 2017 11:34 AM
> To: Zhang, Hong
> Cc: Matthew Knepley; PETSc
> Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
>
>> On Jan 23, 2017, at 10:51 AM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>>
>>
>>
>> What if a MPI process has no submatrices, how does it save the flag?
>>
>> case of 'SingleIS' is for ismax=1 in all processes. For PCASM, all process has ismax=1, isn't it?
>
> No, definitely not always 1.
>
>>
>> Hong
>>
>>
>>> On Jan 22, 2017, at 8:58 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>>>
>>> The 'saved' context is in submatrices, not original matrix. I made following fix:
>>>
>>> if (C->submat_singleis) { /* flag is set in PCSetUp_ASM() to skip MPIU_Allreduce() */
>>> ierr = MatGetSubMatrices_MPIAIJ_SingleIS(C,ismax,isrow,iscol,scall,submat);CHKERRQ(ierr);
>>> + C->submat_singleis = PETSC_FALSE; /* resume its default value in case C will be used for non-singlis */
>>> PetscFunctionReturn(0);
>>> }
>>> ...
>>> + } else { /* MAT_REUSE_MATRIX */
>>> subc = (Mat_SeqAIJ*)((*submat)[0]->data);
>>> smat = subc->submatis1;
>>> if (!smat) {
>>> /* smat is not generated by MatGetSubMatrix_MPIAIJ_All(...,MAT_INITIAL_MATRIX,...) */
>>> wantallmatrix = PETSC_TRUE;
>>> + } else if (smat->singleis) {
>>> + ierr = MatGetSubMatrices_MPIAIJ_SingleIS(C,ismax,isrow,iscol,scall,submat);CHKERRQ(ierr);
>>> + PetscFunctionReturn(0);
>>> } else {
>>>
>>> ...
>>> ie., after first call of MatGetSubMatrices_MPIAIJ_SingleIS, this flag is set in saved context of submatrices
>>> and is reset as 'false' to the original matrix C.
>>>
>>> Do you agree with this fix? Is so, I'll push it in a branch for nightly test.
>>>
>>> Hong
>>>
>>> ________________________________________
>>> From: Barry Smith [bsmith at mcs.anl.gov]
>>> Sent: Friday, January 20, 2017 8:19 PM
>>> To: Matthew Knepley
>>> Cc: Zhang, Hong; PETSc
>>> Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
>>>
>>> Ok, so the problem comes from setting the flag in in the original matrix that other people have access to and can be used for other purposes by anyone,.
>>>
>>> So it would be better to instead put the flag into the generated sub matrix which no one else has access to. This should be do-able and is the right way to do it anyways.
>>>
>>> Hong, can you try putting all the "saved" context into the generated submatrices instead of the original matrix? If that is not possible then the context saved with the original matrix should have a reference to the sub matrix and this context should only be reused if the sub matrix passed in matches the sub matrix referenced in the context (this is a less desirable solution).
>>>
>>> Barry
>>>
>>>> On Jan 20, 2017, at 8:06 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>
>>>> On Fri, Jan 20, 2017 at 7:39 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>
>>>>> On Jan 20, 2017, at 7:01 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>>
>>>>> On Fri, Jan 20, 2017 at 3:08 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>
>>>>>> On Jan 20, 2017, at 3:01 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>>>>>>
>>>>>> We set C->submat_singleis = false as default.
>>>>>> User turns it on when he knows a single IS will be used for MatGetSubMatrices(), e.g., when asm is used.
>>>>>> User should turn the flag off after it is being used.
>>>>>
>>>>> Right, this is why I am confused why this is messing up Mat since presumably he is not setting it?
>>>>>
>>>>> ASM is turning this flag on, and it is corrupting my UNRELATED call to MatGetSubMatrices().
>>>>
>>>> Yes, but you need to tell us exactly how. The flag is set into C->submat_singleis which is associated with the C from ASM, so why is it affecting your UNRELATED call to MatGetSubMatrices(). It is not a global variable so shouldn't.
>>>>
>>>> It sets the flag on the primary matrix. I will go step by step
>>>>
>>>> 1) I have a linear system with matrix A
>>>>
>>>> 2) To form a preconditioner for FieldSplit, I extract some diagonal blocks from A using MatGetSubMatrices()
>>>>
>>>> 3) I also use ASM on the another block of FS
>>>>
>>>> 4) The second Newton iterate, I try to call MatGetSubMatrices(), but ASM has set a flag on the matrix that causes this to fail.
>>>>
>>>> Explain or send a code that reproduces the problem.
>>>>
>>>> You can easily reproduce this by installing PyLith (a widely used package) and running an automatic test. However, since this
>>>> is so simple, I am not sure this is necessary.
>>>>
>>>> Matt
>>>>
>>>> Barry
>>>>
>>>>>
>>>>> Matt
>>>>>
>>>>>>
>>>>>> This flag is for optimization -- avoid MPI_Allreduce, which is expensive for np>10k.
>>>>>> MatGetSubMatrices() works without this flag turns on.
>>>>>> I just merged the improved MatGetSubMatrices_MPIAIJ() which only calls MPI_Allreduce once
>>>>>> for repeated use. Previously, it calls MPI_Allreduce twice for each call of MatGetSubMatrices_MPIAIJ().
>>>>>>
>>>>>> Hong
>>>>>>
>>>>>> ________________________________________
>>>>>> From: Barry Smith [bsmith at mcs.anl.gov]
>>>>>> Sent: Friday, January 20, 2017 1:11 PM
>>>>>> To: Matthew Knepley
>>>>>> Cc: Zhang, Hong; PETSc
>>>>>> Subject: Re: [petsc-dev] Bug introduced in MatGetSubmatrices()
>>>>>>
>>>>>>> On Jan 20, 2017, at 12:45 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>>>>
>>>>>>> On Fri, Jan 20, 2017 at 11:55 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>>>>
>>>>>>>> On Jan 20, 2017, at 11:49 AM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>>>>>
>>>>>>>> On Fri, Jan 20, 2017 at 10:49 AM, Hong <hzhang at mcs.anl.gov> wrote:
>>>>>>>> Matt,
>>>>>>>> By default, the flag C->submat_singleis = false.
>>>>>>>> In PCSetUp_ASM(), we set it as 'true' to use MatGetSubMatrices_MPIAIJ_SingleIS().
>>>>>>>>
>>>>>>>> Can you check the value of this flag in your case?
>>>>>>>>
>>>>>>>> The problem is the following:
>>>>>>>>
>>>>>>>> 1) We use MatGetSubMatrices() to extract small matrices in order to form a preconditioner
>>>>>>>>
>>>>>>>> 2) We do this at each Newton iteration
>>>>>>>>
>>>>>>>> 3) We use ASM as a preconditioner for the eventual Newton solve
>>>>>>>>
>>>>>>>> 4) The second time we call MatGetSubMatrices(), it has this flag set, even though we are using multiple ISes
>>>>>>>>
>>>>>>>> Solution: ALSO check that the user is in fact passing a single IS.
>>>>>>>>
>>>>>>> This requires a communication. I am confused, is the number of IS changing each time? If not why is the flag set?
>>>>>>>
>>>>>>> ASM sets this flag because it knows that IT is going to call MatGetSubMatrices() later, but it unsafe if any user calls
>>>>>>> MatGetSubMatrices() as well. I think overall its a fragile design and should be scrapped.
>>>>>>
>>>>>> You mean the user calls the SAME MatGetSubMatrices with the same matrices, right? Not a completely different unrelated MatGetSubMatrices() which should not be affect by the previous unrelated call.
>>>>>>
>>>>>>
>>>>>> Barry
>>>>>>
>>>>>>>
>>>>>>> Matt
>>>>>>>
>>>>>>>
>>>>>>> Barry
>>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>> Hong
>>>>>>>>
>>>>>>>> It comes from here:
>>>>>>>>
>>>>>>>> https://bitbucket.org/petsc/petsc/commits/c10200c1442b553b7ad65c70101560db4fa22e78
>>>>>>>>
>>>>>>>> If we ask for more than 1 matrix, it dispatches to
>>>>>>>>
>>>>>>>> MatGetSubMatrices_MPIAIJ_SingleIS()
>>>>>>>>
>>>>>>>> but then fails here
>>>>>>>>
>>>>>>>> https://bitbucket.org/petsc/petsc/annotate/2e559809f9aee9c95ee79eb0939630cfe5502c8d/src/mat/impls/aij/mpi/mpiov.c?at=master&fileviewer=file-view-default#mpiov.c-1306
>>>>>>>>
>>>>>>>> because ismax > 1. I think the ismax check needs to move up to here
>>>>>>>>
>>>>>>>> https://bitbucket.org/petsc/petsc/annotate/2e559809f9aee9c95ee79eb0939630cfe5502c8d/src/mat/impls/aij/mpi/mpiov.c?at=master&fileviewer=file-view-default#mpiov.c-2012
>>>>>>>>
>>>>>>>> but I don't know for sure. Please fix this since it is breaking PyLith.
>>>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>> --
>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>>>>> -- Norbert Wiener
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>>>>> -- Norbert Wiener
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>>>> -- Norbert Wiener
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>> -- Norbert Wiener
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>> -- Norbert Wiener
>>>
>>>
>>
>>
>
>
More information about the petsc-dev
mailing list