[petsc-users] nondeterministic behavior of MUMPS when filtering out zero rows and columns

Zhang, Hong hzhang at mcs.anl.gov
Thu Nov 7 09:28:26 CST 2019


Run your code with option '-ksp_error_if_not_converged' to get more info.
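If it is more convenient than changing the run command, the option can also be set from the code before the options are processed. A minimal sketch, assuming KSPSetFromOptions() is called on a KSP object named ksp, as in a typical PETSc program:

error = PetscOptionsSetValue(NULL, "-ksp_error_if_not_converged", NULL);  /* same effect as the command-line flag */
CHKERRABORT(PETSC_COMM_WORLD, error);
error = KSPSetFromOptions(ksp);
CHKERRABORT(PETSC_COMM_WORLD, error);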
Hong

On Thu, Nov 7, 2019 at 5:45 AM s.a.hack--- via petsc-users <petsc-users at mcs.anl.gov> wrote:
Hi,

I am doing calculations with version 3.12.0 of PETSc.
Using the finite-element method, I solve the Maxwell equations on the interior of a 3D domain, coupled with boundary condition auxiliary equations on the boundary of the domain. The auxiliary equations employ auxiliary variables g.

For ease of implementation of the element matrix assembly, the auxiliary variables g are defined on the entire domain. However, only the basis functions for g with nonzero values on the boundary contribute nonzero entries to the system matrix.

The element matrices hence have the structure
[A B; C D]
at the boundary.

In the interior the element matrices have the structure
[A 0; 0 0].

The degrees of freedom in the system matrix can be ordered by element [u_e1 g_e1 u_e2 g_e2 …] or by parallel process [u_p1 g_p1 u_p2 g_p2 …].

To solve the linear system, I need to filter out the zero rows and columns:
error = MatFindNonzeroRows(stiffnessMatrix, &nonzeroRows);  /* index set of rows that contain at least one nonzero */
CHKERRABORT(PETSC_COMM_WORLD, error);
error = MatCreateSubMatrix(stiffnessMatrix, nonzeroRows, nonzeroRows, MAT_INITIAL_MATRIX, &stiffnessMatrixSubMatrix);  /* keep only those rows and columns */
CHKERRABORT(PETSC_COMM_WORLD, error);
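The right-hand side then has to be restricted to the same index set before the solve. A minimal sketch of how that part could look (the names rhs, rhsSub, solutionSub, and ksp are illustrative, not taken from my actual code):

Vec rhsSub, solutionSub;
error = MatCreateVecs(stiffnessMatrixSubMatrix, &solutionSub, NULL);  /* solution vector with the submatrix layout */
CHKERRABORT(PETSC_COMM_WORLD, error);
error = VecGetSubVector(rhs, nonzeroRows, &rhsSub);  /* restrict the right-hand side to the retained rows */
CHKERRABORT(PETSC_COMM_WORLD, error);
error = KSPSetOperators(ksp, stiffnessMatrixSubMatrix, stiffnessMatrixSubMatrix);
CHKERRABORT(PETSC_COMM_WORLD, error);
error = KSPSolve(ksp, rhsSub, solutionSub);
CHKERRABORT(PETSC_COMM_WORLD, error);
error = VecRestoreSubVector(rhs, nonzeroRows, &rhsSub);
CHKERRABORT(PETSC_COMM_WORLD, error);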

I solve the linear system in parallel on multiple nodes connected with InfiniBand.
The problem is that the MUMPS solver frequently (nondeterministically) hangs during KSPSolve() (after KSPSetUp() has completed).
Running with the options -ksp_view and -info, the last printed statement is:

[0] VecScatterCreate_SF(): Using StarForest for vector scatter

In the calculations where the program does not hang, the calculated solution is correct.

The problem doesn’t occur for calculations on a single node, or for calculations with the SuperLU solver (but SuperLU does not allow calculations that are as large).
The problem also doesn’t seem to occur for small problems.
The problem doesn’t occur either when I put ones on the diagonal of the zero rows and columns instead of extracting a submatrix, but this is computationally expensive:
error = MatFindZeroRows(stiffnessMatrix, &zeroRows);  /* index set of rows that contain no nonzeros */
CHKERRABORT(PETSC_COMM_WORLD, error);
error = MatZeroRowsColumnsIS(stiffnessMatrix, zeroRows, diagEntry, PETSC_IGNORE, PETSC_IGNORE);  /* diagEntry = 1.0 */
CHKERRABORT(PETSC_COMM_WORLD, error);

Would you have any ideas on what I could check?

Best regards,
Sjoerd
