[petsc-dev] Petsc "make test" have more failures for --with-openmp=1
Eric Chamberland
Eric.Chamberland at giref.ulaval.ca
Thu Mar 18 20:46:40 CDT 2021
Hi again,
OK, I just saw that some matrices have rows of zeros in the case of 3D Hermite
DOFs (e.g. du/dz derivatives) when they are used on a 2D planar mesh...
So my last problem with the hypre smoother is "normal".
However, just to play with one of these matrices, I tried an "LU" with the
MUMPS icntl_24 option activated on the global system: fine, it works.
Then I tried to switch to GAMG with MUMPS for the coarse_sub level, but
it seems my icntl_24 option is then ignored, and I don't know why...
See my KSP:
KSP Object: (Options_ProjectionL2_0) 1 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-15, absolute=1e-15, divergence=1e+12
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (Options_ProjectionL2_0) 1 MPI processes
type: gamg
type is MULTIPLICATIVE, levels=2 cycles=v
Cycles per PCApply=1
Using externally compute Galerkin coarse grid matrices
GAMG specific options
Threshold for dropping small values in graph on each level =
Threshold scaling factor for each level not specified = 1.
AGG specific options
Symmetric graph false
Number of levels to square graph 1
Number smoothing steps 1
Complexity: grid = 1.09756
Coarse grid solver -- level -------------------------------
KSP Object: (Options_ProjectionL2_0mg_coarse_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (Options_ProjectionL2_0mg_coarse_) 1 MPI processes
type: bjacobi
number of blocks = 1
Local solver is the same for all blocks, as in the following
KSP and PC objects on rank 0:
KSP Object: (Options_ProjectionL2_0mg_coarse_sub_) 1 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (Options_ProjectionL2_0mg_coarse_sub_) 1 MPI processes
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
matrix ordering: nd
factor fill ratio given 0., needed 0.
Factored matrix follows:
Mat Object: 1 MPI processes
type: mumps
rows=8, cols=8
package used to perform factorization: mumps
total: nonzeros=64, allocated nonzeros=64
MUMPS run parameters:
SYM (matrix type): 0
PAR (host participation): 1
ICNTL(1) (output for error): 6
ICNTL(2) (output of diagnostic msg): 0
ICNTL(3) (output for global info): 0
ICNTL(4) (level of printing): 0
ICNTL(5) (input mat struct): 0
ICNTL(6) (matrix prescaling): 7
ICNTL(7) (sequential matrix ordering):7
ICNTL(8) (scaling strategy): 77
ICNTL(10) (max num of refinements): 0
ICNTL(11) (error analysis): 0
ICNTL(12) (efficiency control): 1
ICNTL(13) (sequential factorization of the root node): 0
ICNTL(14) (percentage of estimated workspace increase): 20
ICNTL(18) (input mat struct): 0
ICNTL(19) (Schur complement info): 0
ICNTL(20) (RHS sparse pattern): 0
ICNTL(21) (solution struct): 0
ICNTL(22) (in-core/out-of-core facility): 0
ICNTL(23) (max size of memory can be allocated locally):0
ICNTL(24) (detection of null pivot rows): 0
ICNTL(25) (computation of a null space basis): 0
ICNTL(26) (Schur options for RHS or solution): 0
ICNTL(27) (blocking size for multiple RHS): -32
ICNTL(28) (use parallel or sequential ordering): 1
ICNTL(29) (parallel ordering): 0
ICNTL(30) (user-specified set of entries in inv(A)): 0
ICNTL(31) (factors is discarded in the solve phase): 0
ICNTL(33) (compute determinant): 0
ICNTL(35) (activate BLR based factorization): 0
ICNTL(36) (choice of BLR factorization variant): 0
ICNTL(38) (estimated compression rate of LU factors): 333
CNTL(1) (relative pivoting threshold): 0.01
CNTL(2) (stopping criterion of refinement): 1.49012e-08
CNTL(3) (absolute pivoting threshold): 0.
CNTL(4) (value of static pivoting): -1.
CNTL(5) (fixation for null pivots): 0.
CNTL(7) (dropping parameter for BLR): 0.
RINFO(1) (local estimated flops for the elimination after analysis):
[0] 308.
RINFO(2) (local estimated flops for the assembly after factorization):
[0] 0.
RINFO(3) (local estimated flops for the elimination after factorization):
[0] 0.
INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization):
[0] 0
INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization):
[0] 0
INFO(23) (num of pivots eliminated on this processor after factorization):
[0] 6
RINFOG(1) (global estimated flops for the elimination after analysis): 308.
RINFOG(2) (global estimated flops for the assembly after factorization): 0.
RINFOG(3) (global estimated flops for the elimination after factorization): 0.
(RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)
INFOG(3) (estimated real workspace for factors on all processors after analysis): 64
INFOG(4) (estimated integer workspace for factors on all processors after analysis): 35
INFOG(5) (estimated maximum front size in the complete tree): 8
INFOG(6) (number of nodes in the complete tree): 1
INFOG(7) (ordering option effectively use after analysis): 2
INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 64
INFOG(10) (total integer space store the matrix factors after factorization): 35
INFOG(11) (order of largest frontal matrix after factorization): 8
INFOG(12) (number of off-diagonal pivots): 0
INFOG(13) (number of delayed pivots after factorization): 0
INFOG(14) (number of memory compress after factorization): 0
INFOG(15) (number of steps of iterative refinement after solution): 0
INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 0
INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 0
INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 0
INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 0
INFOG(20) (estimated number of entries in the factors): 64
INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 0
INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 0
INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
INFOG(28) (after factorization: number of null pivots encountered): 0
INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 0
INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0
INFOG(32) (after analysis: type of analysis done): 1
INFOG(33) (value used for ICNTL(8)): 7
INFOG(34) (exponent of the determinant if determinant is requested): 0
INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 0
INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0
INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0
INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0
INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=8, cols=8, bs=4
total: nonzeros=64, allocated nonzeros=64
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 2 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=8, cols=8, bs=4
total: nonzeros=64, allocated nonzeros=64
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 2 nodes, limit used is 5
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (Options_ProjectionL2_0mg_levels_1_) 1 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0., max = 0.
eigenvalues estimate via gmres min 0., max 0.
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (Options_ProjectionL2_0mg_levels_1_esteig_) 1 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (Options_ProjectionL2_0mg_levels_1_) 1 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: (Options_ProjectionL2_0) 1 MPI processes
type: seqaij
rows=36, cols=36, bs=4
total: nonzeros=656, allocated nonzeros=656
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 9 nodes, limit used is 5
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (Options_ProjectionL2_0mg_levels_1_) 1 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: (Options_ProjectionL2_0) 1 MPI processes
type: seqaij
rows=36, cols=36, bs=4
total: nonzeros=656, allocated nonzeros=656
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 9 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Mat Object: (Options_ProjectionL2_0) 1 MPI processes
type: seqaij
rows=36, cols=36, bs=4
total: nonzeros=656, allocated nonzeros=656
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 9 nodes, limit used is 5
but I have this option left:
Option left:
name:-Options_ProjectionL2_0mg_coarse_sub_mat_mumps_icntl_24 value: 1
and, as you can see above, I end up with:
ICNTL(24) (detection of null pivot rows): 0
which is fatal in my case...
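To double-check a long view like the one above, the ICNTL values can be pulled out with a few lines of Python. This is only a sketch: `parse_icntl` and `SAMPLE_ICNTL` are made-up names, it is not part of PETSc or MUMPS, and it assumes the `ICNTL(n) (description): value` layout shown in the view, including entries that were wrapped onto two lines.

```python
import re

# Short excerpt in the same layout as the view above (two wrapped entries).
SAMPLE_ICNTL = """\
ICNTL(24) (detection of null pivot
rows): 0
ICNTL(27) (blocking size for multiple
RHS): -32
CNTL(1) (relative pivoting threshold): 0.01
INFOG(33) (value used for ICNTL(8)): 7
"""


def parse_icntl(view_text):
    """Return {n: value} for each 'ICNTL(n) (description): value' entry.

    Hypothetical helper: only the text layout is assumed, nothing about
    PETSc or MUMPS internals.
    """
    # Join physical lines so that wrapped entries become contiguous.
    flat = " ".join(line.strip() for line in view_text.splitlines())
    # ICNTL(n), a parenthesised description without a colon, ':', an integer.
    # Mentions such as "value used for ICNTL(8)" are skipped because no
    # opening parenthesis follows them.
    pattern = re.compile(r"ICNTL\((\d+)\)\s*\([^:]*\):\s*(-?\d+)")
    return {int(n): int(v) for n, v in pattern.findall(flat)}


values = parse_icntl(SAMPLE_ICNTL)
print("ICNTL(24) as run:", values.get(24))
```

Run on the full view, the entry for key 24 is the value MUMPS actually used, which can be compared with the value requested on the command line.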
Can you see what I did wrong?
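One thing that can be checked directly from the output is which objects are printed with an options prefix. The sketch below lists the prefix shown on each `KSP Object:`, `PC Object:` and `Mat Object:` header. `object_prefixes` and `SAMPLE_HEADERS` are hypothetical names, and the script only reads the text of the view. It says nothing about which prefix MUMPS actually uses when it looks up `mat_mumps_icntl_24`.

```python
import re

# Three headers copied from the view above.
SAMPLE_HEADERS = """\
PC Object: (Options_ProjectionL2_0mg_coarse_sub_) 1 MPI processes
Mat Object: 1 MPI processes
Mat Object: (Options_ProjectionL2_0) 1 MPI processes
"""


def object_prefixes(view_text):
    """Return a list of (kind, prefix) pairs, one per object header.

    The prefix is the empty string when the header shows none.
    """
    header = re.compile(r"^\s*(KSP|PC|Mat) Object:\s*(?:\(([^)]*)\))?")
    found = []
    for line in view_text.splitlines():
        match = header.match(line)
        if match:
            found.append((match.group(1), match.group(2) or ""))
    return found


for kind, prefix in object_prefixes(SAMPLE_HEADERS):
    print(kind, repr(prefix))
```

In the view above, the coarse-level PC carries the prefix `Options_ProjectionL2_0mg_coarse_sub_`, while the factored matrix and the 8x8 coarse matrix are printed without any prefix. Whether that difference explains the unused option is only a guess from this output.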
Thanks,
Eric