[petsc-users] Generalized eigenvalue problem using quad precision
Santiago Andres Triana
repepo at gmail.com
Mon Mar 5 04:50:35 CST 2018
Dear Jose,
Thanks for your reply. The problem I deal with (rotational fluid dynamics)
involves a very small parameter, the Ekman number, which needs to be as
small as possible, hopefully 10^-10 or smaller (typical of the molten core
of a planet). I have noticed (as have other authors before me) that round-off
errors become more noticeable as the Ekman number is made smaller. That is
why it would be nice to have some calculations done in quad precision; it
would also help to estimate the round-off errors incurred when working in
double precision.
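The loss of accuracy at small Ekman numbers can be seen in a toy double-precision calculation (illustrative only, not the actual fluid operator): once a parameter E falls below double-precision machine epsilon (about 1.1e-16), contributions of order E added to O(1) terms vanish entirely.

```shell
# Toy illustration: an O(E) contribution riding on an O(1) term.
python3 -c 'E = 1e-10; print((1.0 + E) - 1.0)'   # E survives (up to rounding error)
python3 -c 'E = 1e-17; print((1.0 + E) - 1.0)'   # prints 0.0 -- E is lost entirely
```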
As far as I understand, MUMPS doesn't support quad precision, but all native
methods in petsc/slepc do.
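For reference, a quad-precision PETSc/SLEPc build is typically configured along these lines (a sketch only; the executable name is hypothetical and option names vary with the PETSc version, so check ./configure --help):

```shell
# Quad-precision build: the reference BLAS/LAPACK must also be built with
# __float128 support, hence --download-f2cblaslapack.
./configure --with-scalar-type=complex \
            --with-precision=__float128 \
            --download-f2cblaslapack

# At run time, switch the shift-and-invert factorization from MUMPS
# (double precision only) to PETSc's native LU. In newer PETSc releases
# the option is spelled -st_pc_factor_mat_solver_type instead.
./solver -st_ksp_type preonly -st_pc_type lu \
         -st_pc_factor_mat_solver_package petsc
```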
Below is the output of -eps_view for a typical run. Thanks again!
Santiago
EPS Object: 12 MPI processes
type: krylovschur
50% of basis vectors kept after restart
using the locking variant
problem type: generalized non-hermitian eigenvalue problem
balancing enabled: one-sided Krylov, with its=5
selected portion of the spectrum: closest to target:
-0.000828692+1.00018i (in magnitude)
number of eigenvalues (nev): 2
number of column vectors (ncv): 17
maximum dimension of projected problem (mpd): 17
maximum number of iterations: 160
tolerance: 1e-18
convergence test: relative to the eigenvalue
BV Object: 12 MPI processes
type: svec
18 columns of global length 903168
vector orthogonalization method: classical Gram-Schmidt
orthogonalization refinement: if needed (eta: 0.7071)
block orthogonalization method: Gram-Schmidt
doing matmult as a single matrix-matrix product
DS Object: 12 MPI processes
type: nhep
parallel operation mode: REDUNDANT
ST Object: 12 MPI processes
type: sinvert
shift: -0.000828692+1.00018i
number of matrices: 2
all matrices have different nonzero pattern
KSP Object: (st_) 12 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (st_) 12 MPI processes
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 0., needed 0.
Factored matrix follows:
Mat Object: 12 MPI processes
type: mumps
rows=903168, cols=903168
package used to perform factorization: mumps
total: nonzeros=357531912, allocated nonzeros=357531912
total number of mallocs used during MatSetValues calls =0
MUMPS run parameters:
SYM (matrix type): 0
PAR (host participation): 1
ICNTL(1) (output for error): 6
ICNTL(2) (output of diagnostic msg): 0
ICNTL(3) (output for global info): 0
ICNTL(4) (level of printing): 0
ICNTL(5) (input mat struct): 0
ICNTL(6) (matrix prescaling): 7
ICNTL(7) (sequential matrix ordering):7
ICNTL(8) (scaling strategy): 77
ICNTL(10) (max num of refinements): 0
ICNTL(11) (error analysis): 0
ICNTL(12) (efficiency control): 1
ICNTL(13) (efficiency control): 0
ICNTL(14) (percentage of estimated workspace increase): 30
ICNTL(18) (input mat struct): 3
ICNTL(19) (Schur complement info): 0
ICNTL(20) (rhs sparse pattern): 0
ICNTL(21) (solution struct): 1
ICNTL(22) (in-core/out-of-core facility): 0
ICNTL(23) (max size of memory can be allocated locally):0
ICNTL(24) (detection of null pivot rows): 0
ICNTL(25) (computation of a null space basis): 0
ICNTL(26) (Schur options for rhs or solution): 0
ICNTL(27) (experimental parameter): -32
ICNTL(28) (use parallel or sequential ordering): 1
ICNTL(29) (parallel ordering): 0
ICNTL(30) (user-specified set of entries in inv(A)): 0
ICNTL(31) (factors is discarded in the solve phase): 0
ICNTL(33) (compute determinant): 0
CNTL(1) (relative pivoting threshold): 0.01
CNTL(2) (stopping criterion of refinement): 1.49012e-08
CNTL(3) (absolute pivoting threshold): 0.
CNTL(4) (value of static pivoting): -1.
CNTL(5) (fixation for null pivots): 0.
RINFO(1) (local estimated flops for the elimination after
analysis):
[0] 3.19646e+10
[1] 2.52681e+10
[2] 2.44386e+10
[3] 1.9843e+10
[4] 1.98101e+10
[5] 1.99033e+10
[6] 2.43184e+10
[7] 1.96892e+10
[8] 1.99212e+10
[9] 2.03623e+10
[10] 3.43984e+10
[11] 3.19496e+10
RINFO(2) (local estimated flops for the assembly after
factorization):
[0] 6.77645e+07
[1] 7.50768e+07
[2] 6.47743e+07
[3] 4.58456e+07
[4] 4.43134e+07
[5] 4.71606e+07
[6] 4.76135e+07
[7] 4.66102e+07
[8] 4.8116e+07
[9] 4.81349e+07
[10] 8.36226e+07
[11] 7.80849e+07
RINFO(3) (local estimated flops for the elimination after
factorization):
[0] 3.08628e+10
[1] 2.57087e+10
[2] 2.48956e+10
[3] 2.04365e+10
[4] 2.00618e+10
[5] 2.04788e+10
[6] 2.4473e+10
[7] 1.96377e+10
[8] 1.90936e+10
[9] 2.10852e+10
[10] 3.40174e+10
[11] 3.25389e+10
INFO(15) (estimated size of (in MB) MUMPS internal data for
running numerical factorization):
[0] 1023
[1] 1072
[2] 928
[3] 736
[4] 736
[5] 770
[6] 810
[7] 739
[8] 783
[9] 761
[10] 1163
[11] 1229
INFO(16) (size of (in MB) MUMPS internal data used during
numerical factorization):
[0] 1023
[1] 1072
[2] 928
[3] 736
[4] 736
[5] 770
[6] 810
[7] 739
[8] 783
[9] 761
[10] 1163
[11] 1229
INFO(23) (num of pivots eliminated on this processor after
factorization):
[0] 86694
[1] 110879
[2] 83326
[3] 56448
[4] 55775
[5] 57124
[6] 57115
[7] 56671
[8] 58017
[9] 57345
[10] 111545
[11] 112229
RINFOG(1) (global estimated flops for the elimination after
analysis): 2.91867e+11
RINFOG(2) (global estimated flops for the assembly after
factorization): 6.97117e+08
RINFOG(3) (global estimated flops for the elimination after
factorization): 2.9329e+11
(RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant):
(0.,0.)*(2^0)
INFOG(3) (estimated real workspace for factors on all
processors after analysis): 357531912
INFOG(4) (estimated integer workspace for factors on all
processors after analysis): 11477656
INFOG(5) (estimated maximum front size in the complete
tree): 2016
INFOG(6) (number of nodes in the complete tree): 67821
INFOG(7) (ordering option effectively use after analysis):
5
INFOG(8) (structural symmetry in percent of the permuted
matrix after analysis): 72
INFOG(9) (total real/complex workspace to store the matrix
factors after factorization): 358009746
INFOG(10) (total integer space store the matrix factors
after factorization): 11469827
INFOG(11) (order of largest frontal matrix after
factorization): 2032
INFOG(12) (number of off-diagonal pivots): 12941
INFOG(13) (number of delayed pivots after factorization):
617
INFOG(14) (number of memory compress after factorization):
0
INFOG(15) (number of steps of iterative refinement after
solution): 0
INFOG(16) (estimated size (in MB) of all MUMPS internal
data for factorization after analysis: value on the most memory consuming
processor): 1229
INFOG(17) (estimated size of all MUMPS internal data for
factorization after analysis: sum over all processors): 10750
INFOG(18) (size of all MUMPS internal data allocated during
factorization: value on the most memory consuming processor): 1229
INFOG(19) (size of all MUMPS internal data allocated during
factorization: sum over all processors): 10750
INFOG(20) (estimated number of entries in the factors):
357531912
INFOG(21) (size in MB of memory effectively used during
factorization - value on the most memory consuming processor): 945
INFOG(22) (size in MB of memory effectively used during
factorization - sum over all processors): 8386
INFOG(23) (after analysis: value of ICNTL(6) effectively
used): 0
INFOG(24) (after analysis: value of ICNTL(12) effectively
used): 1
INFOG(25) (after factorization: number of pivots modified
by static pivoting): 0
INFOG(28) (after factorization: number of null pivots
encountered): 0
INFOG(29) (after factorization: effective number of entries
in the factors (sum over all processors)): 358009746
INFOG(30, 31) (after solution: size in Mbytes of memory
used during solution phase): 1105, 9342
INFOG(32) (after analysis: type of analysis done): 1
INFOG(33) (value used for ICNTL(8)): 7
INFOG(34) (exponent of the determinant if determinant is
requested): 0
linear system matrix = precond matrix:
Mat Object: 12 MPI processes
type: mpiaij
rows=903168, cols=903168
total: nonzeros=17538393, allocated nonzeros=17538393
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
On Sun, Mar 4, 2018 at 10:12 PM, Jose E. Roman <jroman at dsic.upv.es> wrote:
> Why do you want to move to quad precision? Double precision is usually
> enough.
> The fact that B is singular should not be a problem, provided that you do
> shift-and-invert with a nonzero target value.
> Can you send the output of -eps_view so that I can get a better idea what
> you are doing?
>
> Jose
>
>
> > On 5 Mar 2018, at 00:50, Santiago Andres Triana <repepo at gmail.com>
> wrote:
> >
> > Dear all,
> >
> > A rather general question, is there any possibility of solving a
> complex-valued generalized eigenvalue problem using quad (or extended)
> precision when the 'B' matrix is singular? So far I have been using MUMPS
> with double precision with good results but I require eventually extended
> precision. Any comment or advice highly welcome. Thanks in advance!
> >
> > Santiago
>
>