[petsc-users] Random Error of mumps: out of memory: INFOG(1)=-9

Zongze Yang yangzongze at gmail.com
Sat Mar 4 07:30:38 CST 2023


Hi,

I am writing to seek your advice regarding a problem I encountered while
using multigrid.
The coarse problem is solved with PCLU, using MUMPS for the factorization.
However, the PC fails randomly with the error below (the value of INFO(2)
may differ between runs):
```shell
[ 0] Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9,
INFO(2)=36
```

Upon checking the MUMPS documentation, I found that increasing the value
of ICNTL(14) may help resolve the issue. Specifically, I set the option
-mat_mumps_icntl_14 to higher values (such as 40), and the error seemed to
disappear once I set it to 80. However, I am still curious as to why MUMPS
failed randomly in the first place.
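
For reference, here is a minimal sketch of how the ICNTL(14) setting could be passed as Firedrake-style solver parameters. It is not the content of the attached script. The nesting under `mg_coarse` and the `flatten` helper are assumptions for illustration only; the helper merely mimics how nested option dictionaries are joined with underscores into PETSc option names.

```python
# Illustrative solver parameters (assumed layout, not taken from test_mumps.py).
coarse_lu = {
    "ksp_type": "preonly",
    "pc_type": "lu",
    "pc_factor_mat_solver_type": "mumps",
    # Percentage increase of the estimated MUMPS working space.
    "mat_mumps_icntl_14": 80,
}

solver_parameters = {
    "pc_type": "mg",
    "mg_coarse": coarse_lu,  # assumed prefix for the coarse-level solver
}


def flatten(params, prefix=""):
    """Hypothetical helper: join nested keys with '_' to form option names."""
    flat = {}
    for key, value in params.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


flat_options = flatten(solver_parameters)
# Equivalent command-line form of the same options.
command_line = " ".join(f"-{k} {v}" for k, v in flat_options.items())
print(command_line)
```

If the LU solve is not nested under a multigrid coarse level, the unprefixed option -mat_mumps_icntl_14 would be used instead.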

Upon further inspection, I found that the number of nonzeros reported for
the MUMPS matrix differs from that of the PETSc matrix and also changes
every time I run the code. I am now left with the following questions:

1. What could be causing the number of nonzeros of the MUMPS matrix to
change every time I run the code?
2. Why is the number of nonzeros of the MUMPS matrix significantly greater
than that of the PETSc matrix (as seen in the output of ksp_view, 115025949
vs 7346177)?
3. Is it possible that the varying number of nonzeros of the MUMPS matrix
is the cause of the random failure?

I have attached a test example written in Firedrake. The output of
`ksp_view` after running the code twice is included below for your
reference.
In the output, the number of nonzeros of the MUMPS matrix was 115025949 and
115377847, respectively, while that of the PETSc matrix was only 7346177.

```shell
(complex-int32-mkl) $ mpiexec -n 32 python test_mumps.py -ksp_view ::ascii_info_detail | grep -A3 "type: "
  type: preonly
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
--
  type: lu
    out-of-place factorization
    tolerance for zero pivot 2.22045e-14
    matrix ordering: external
--
          type: mumps
          rows=1050625, cols=1050625
          package used to perform factorization: mumps
          total: nonzeros=115025949, allocated nonzeros=115025949
--
    type: mpiaij
    rows=1050625, cols=1050625
    total: nonzeros=7346177, allocated nonzeros=7346177
    total number of mallocs used during MatSetValues calls=0
(complex-int32-mkl) $ mpiexec -n 32 python test_mumps.py -ksp_view ::ascii_info_detail | grep -A3 "type: "
  type: preonly
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
--
  type: lu
    out-of-place factorization
    tolerance for zero pivot 2.22045e-14
    matrix ordering: external
--
          type: mumps
          rows=1050625, cols=1050625
          package used to perform factorization: mumps
          total: nonzeros=115377847, allocated nonzeros=115377847
--
    type: mpiaij
    rows=1050625, cols=1050625
    total: nonzeros=7346177, allocated nonzeros=7346177
    total number of mallocs used during MatSetValues calls=0
```
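
To put these counts in perspective, the short sketch below computes the ratio of MUMPS nonzeros to PETSc nonzeros for each run, and the run-to-run variation. It uses only the numbers reported in the output above; the variable names are illustrative. It assumes that the `mumps` entry describes the factored matrix, so the ratio is an estimate of fill-in.

```python
# Nonzero counts copied from the two ksp_view outputs above.
petsc_nnz = 7346177
mumps_nnz_runs = [115025949, 115377847]

# Ratio of MUMPS nonzeros to PETSc nonzeros for each run
# (interpreted as fill-in under the assumption stated above).
fill_ratios = [n / petsc_nnz for n in mumps_nnz_runs]

# Absolute and relative difference between the two runs.
run_to_run_diff = mumps_nnz_runs[1] - mumps_nnz_runs[0]
relative_diff = run_to_run_diff / mumps_nnz_runs[0]

for run, ratio in enumerate(fill_ratios, start=1):
    print(f"run {run}: MUMPS/PETSc nonzero ratio = {ratio:.2f}")
print(f"difference between runs: {run_to_run_diff} nonzeros "
      f"({100 * relative_diff:.2f}%)")
```

Both runs give a ratio between 15 and 16, while the two MUMPS counts differ from each other by about 0.3%.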

I would greatly appreciate any insights you may have on this matter. Thank
you in advance for your time and assistance.

Best wishes,
Zongze
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_mumps.py
Type: text/x-python
Size: 763 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20230304/72957a42/attachment.py>