[petsc-users] Near nullspace lost in fieldsplit
Barry Smith
bsmith at petsc.dev
Tue Feb 13 13:20:56 CST 2024
Thank you for the code.
A) By default, MatZeroRows() does change the nonzero structure of the matrix.
B) PCFIELDSPLIT loses the null spaces attached to the submatrices if the nonzero structure of the matrix changes.
For the example code, if one sets MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE); then MatZeroRows() no longer changes the nonzero structure, and the code behaves the same for both MatZeroRows() and MatZeroRowsColumns().
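For example, a minimal sketch (A, nrows, and rows stand in for whatever the application assembles; 1.0 is the value placed on the zeroed diagonal):

    PetscCall(MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE));
    /* eliminated entries are stored as explicit zeros instead of being
       removed, so the nonzero structure (and hence the matrix
       nonzerostate) is unchanged */
    PetscCall(MatZeroRows(A, nrows, rows, 1.0, NULL, NULL));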
Should B lose the null spaces when the nonzero structure changes? The documentation for MatSetNullSpace() does not explicitly state that the attached null space remains with the matrix for its entire life unless the user calls MatSetNullSpace() again, but it implies as much. Thus, I think B should not lose the attached null spaces. I will change the code to preserve the null spaces in PCFIELDSPLIT when the nonzero structure changes.
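That is, after a sequence like the sketch below (nv and nullvecs are placeholders for the vectors spanning the null space), the null space should remain attached to A until MatSetNullSpace() is called again or A is destroyed, regardless of later changes to the nonzero structure:

    MatNullSpace sp;
    PetscCall(MatNullSpaceCreate(PetscObjectComm((PetscObject)A), PETSC_FALSE, nv, nullvecs, &sp));
    PetscCall(MatSetNullSpace(A, sp)); /* the Mat keeps its own reference */
    PetscCall(MatNullSpaceDestroy(&sp));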
Going back to your two points:
1. Code going through line 692 loses the near nullspace of the matrices attached to the sub-KSPs
2. The call to MatZeroRowsColumns() changes the nonzero structure for MPIAIJ but not for SEQAIJ
My new branch will prevent the loss of the nullspace in 1.
My previous fix (that is now in main) fixes the bug where MatZeroRowsColumns() in parallel reported that it had changed the nonzero structure when it actually had not.
Note: independent of 1 and 2, when you use MatZeroRows() in your code as you describe, you will most likely want to set MatSetOption(A, MAT_KEEP_NONZERO_PATTERN, PETSC_TRUE); since the code will likely be a bit more efficient and have less memory churn.
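With the new branch, a check along the following lines (a sketch, assuming pc is the PCFIELDSPLIT; PCFieldSplitGetSubKSP() must be called after PCSetUp()) should find the near null space still attached after MatZeroRows():

    PetscInt     nsplits;
    KSP         *subksp;
    Mat          Asub;
    MatNullSpace nearsp;
    PetscCall(PCFieldSplitGetSubKSP(pc, &nsplits, &subksp));
    PetscCall(KSPGetOperators(subksp[0], &Asub, NULL));
    PetscCall(MatGetNearNullSpace(Asub, &nearsp)); /* NULL if the near null space was lost */
    PetscCall(PetscFree(subksp));                  /* caller frees the sub-KSP array */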
> On Feb 13, 2024, at 7:12 AM, Jeremy Theler (External) <jeremy.theler-ext at ansys.com> wrote:
>
> Hi Barry
>
> > 7279 does change the code for MatZeroRowsColumns_MPIAIJ(). But perhaps that does not resolve the problem you are seeing? If that is the case we will need a reproducible example so we can determine exactly what else is happening in your code to cause the difficulties.
> >
> > Here is the diff for MatZeroRowsColumns_MPIAIJ():
> >
> > @@ -1026,7 +1023,7 @@ static PetscErrorCode MatZeroRowsColumns_MPIAIJ(Mat A, PetscInt N, const PetscIn
> >
> > PetscCall(PetscFree(lrows));
> >
> > /* only change matrix nonzero state if pattern was allowed to be changed */
> > - if (!((Mat_SeqAIJ *)(l->A->data))->keepnonzeropattern) {
> > + if (!((Mat_SeqAIJ *)(l->A->data))->nonew) {
> > PetscObjectState state = l->A->nonzerostate + l->B->nonzerostate;
> > PetscCall(MPIU_Allreduce(&state, &A->nonzerostate, 1, MPIU_INT64, MPI_SUM, PetscObjectComm((PetscObject)A)));
> > }
>
> Fair enough, I might have overlooked this single line. But this change does not fix our issue. The if condition is still true.
> I'll need some time to come up with a reproducible example because it involves setting up the PC of the KSP of an SNES of a TS before doing TSSolve() with the DM-allocated Jacobian.
>
> However, regarding bug #1, I do have an MWE. It is not CI-friendly. I tried to modify ex47.c to illustrate the issue but could not make it work.
> Anyway, the attached tarball:
> - reads a global matrix A from a file
> - reads two ISs from another file
> - sets up a field split PC
> - attaches a near nullspace (read from another file) to one of the sub-KSPs
> - sets "Dirichlet BCs" with MatZeroRowsColumns() or MatZeroRows(), depending on whether -columns was given on the command line
>
> The issue comes when calling MatZeroRowsColumns() or MatZeroRows().
> In the first case the near nullspace is not lost, but in the second it is:
>
> $ make lost
> $ ./lost -columns -ksp_view | grep "near null"
> has attached near null space
> has attached near null space
> $ mpiexec -n 2 ./lost -columns -ksp_view | grep "near null"
> has attached near null space
> has attached near null space
> $ ./lost -ksp_view | grep "near null"
> $ mpiexec -n 2 ./lost -ksp_view | grep "near null"
> $
>
> When using MatZeroRows(), the code passes through fieldsplit.c:692, which loses the near nullspace.
>
>
> Note that the original issue we see in our code is that there is a difference between the serial and parallel implementations of MatZeroRowsColumns().
> We lose the near nullspace only in parallel, not in serial, because of a combination of bugs #1 and #2.
>
>
> --
> jeremy
>
> <lost.tar.gz>