[petsc-users] Random Error of mumps: out of memory: INFOG(1)=-9
Zongze Yang
yangzongze at gmail.com
Sat Mar 4 07:30:38 CST 2023
Hi,
I am writing to seek your advice regarding a problem I encountered while
solving a system with multigrid.
I am currently using multigrid with the coarse problem solved by PCLU.
However, the PC failed randomly with the error below (the value of INFO(2)
may differ):
```shell
[ 0] Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=36
```
Upon checking the MUMPS documentation, I found that increasing the value of
ICNTL(14) may help resolve the issue. I therefore raised the option
-mat_mumps_icntl_14 (first to values such as 40), and the error seemed to
disappear once ICNTL(14) was set to 80. However, I am still curious as to
why MUMPS failed randomly in the first place.
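For reference, the setting described above can be written down as a plain
options dictionary. This is only a sketch: the helper names are hypothetical,
and whether the attached script passes options this way (for example as
Firedrake `solver_parameters`) or only on the command line is an assumption.
The option names themselves are the standard PETSc ones; ICNTL(14) is the
percentage by which MUMPS increases its estimated working space.

```python
def mumps_lu_options(icntl_14=80):
    """Return PETSc options selecting LU via MUMPS with a given ICNTL(14).

    Hypothetical helper for illustration; not taken from test_mumps.py.
    """
    return {
        "ksp_type": "preonly",
        "pc_type": "lu",
        "pc_factor_mat_solver_type": "mumps",
        # ICNTL(14): percentage increase of MUMPS's estimated working space.
        "mat_mumps_icntl_14": icntl_14,
    }


def as_command_line(options):
    """Render the options dictionary as PETSc command-line flags."""
    return " ".join(f"-{key} {value}" for key, value in options.items())


print(as_command_line(mumps_lu_options(icntl_14=80)))
```

The printed flags can be appended to the `mpiexec` command line, which keeps
the value of ICNTL(14) easy to vary between runs.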
Upon further inspection, I found that the number of nonzeros of the MUMPS
matrix differs from that of the PETSc matrix and changes every time I run
the code, while the PETSc count stays the same. I am now left with the
following questions:
1. What could be causing the number of nonzeros of the MUMPS matrix to
change every time I run the code?
2. Why is the number of nonzeros of the MUMPS matrix significantly greater
than that of the PETSc matrix (as seen in the output of ksp_view, 115025949
vs 7346177)?
3. Is it possible that the varying number of nonzeros of the MUMPS matrix
is the cause of the random failure?
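To put numbers on questions 2 and 3, the counts reported by `ksp_view` in the
two runs can be compared directly. The snippet below is plain arithmetic on
the values quoted in this message; it makes no claim about the cause.

```python
# Nonzero counts copied from the two ksp_view outputs in this message.
petsc_nnz = 7346177
mumps_nnz_runs = [115025949, 115377847]

# Ratio of MUMPS nonzeros to PETSc nonzeros for each run.
ratios = [n / petsc_nnz for n in mumps_nnz_runs]

# Relative change in the MUMPS count between the two runs.
relative_change = (mumps_nnz_runs[1] - mumps_nnz_runs[0]) / mumps_nnz_runs[0]

for run, ratio in enumerate(ratios, start=1):
    print(f"run {run}: MUMPS/PETSc nonzero ratio = {ratio:.2f}")
print(f"run-to-run change in MUMPS nonzeros = {relative_change:.2%}")
```

In both runs the MUMPS count is between 15 and 16 times the PETSc count, and
the difference between the two runs is well under one percent.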
I have attached a test example written in Firedrake. The output of
`ksp_view` after running the code twice is included below for your
reference.
In the output, the number of nonzeros of the MUMPS matrix was 115025949 and
115377847, respectively, while that of the PETSc matrix was only 7346177.
```shell
(complex-int32-mkl) $ mpiexec -n 32 python test_mumps.py -ksp_view ::ascii_info_detail | grep -A3 "type: "
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
--
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: external
--
type: mumps
rows=1050625, cols=1050625
package used to perform factorization: mumps
total: nonzeros=115025949, allocated nonzeros=115025949
--
type: mpiaij
rows=1050625, cols=1050625
total: nonzeros=7346177, allocated nonzeros=7346177
total number of mallocs used during MatSetValues calls=0
(complex-int32-mkl) $ mpiexec -n 32 python test_mumps.py -ksp_view ::ascii_info_detail | grep -A3 "type: "
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
--
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: external
--
type: mumps
rows=1050625, cols=1050625
package used to perform factorization: mumps
total: nonzeros=115377847, allocated nonzeros=115377847
--
type: mpiaij
rows=1050625, cols=1050625
total: nonzeros=7346177, allocated nonzeros=7346177
total number of mallocs used during MatSetValues calls=0
```
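To compare many runs without reading the output by hand, the nonzero counts
can be extracted from the captured text. This is a minimal sketch that
assumes the output has the same layout as the `grep` result shown above; the
function name and the sample string are for illustration only.

```python
import re


def nonzeros_by_type(ksp_view_text):
    """Map each 'type: NAME' section to its 'total: nonzeros=' value."""
    counts = {}
    current_type = None
    for raw_line in ksp_view_text.splitlines():
        line = raw_line.strip()
        type_match = re.match(r"type:\s*(\S+)", line)
        if type_match:
            # Remember which section the following lines belong to.
            current_type = type_match.group(1)
            continue
        nnz_match = re.match(r"total: nonzeros=(\d+)", line)
        if nnz_match and current_type is not None:
            counts[current_type] = int(nnz_match.group(1))
    return counts


# Excerpt of the first run's output, copied from this message.
sample = """type: mumps
rows=1050625, cols=1050625
package used to perform factorization: mumps
total: nonzeros=115025949, allocated nonzeros=115025949
--
type: mpiaij
rows=1050625, cols=1050625
total: nonzeros=7346177, allocated nonzeros=7346177
"""

print(nonzeros_by_type(sample))
```

Sections without a `total: nonzeros=` line (such as `preonly` and `lu`) are
simply absent from the returned dictionary.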
I would greatly appreciate any insights you may have on this matter. Thank
you in advance for your time and assistance.
Best wishes,
Zongze
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_mumps.py
Type: text/x-python
Size: 763 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20230304/72957a42/attachment.py>