[petsc-users] Krylov-Schur Tolerance

Christopher Pierce cmpierce at WPI.EDU
Sun Feb 19 04:00:41 CST 2017


Thanks,

Those changes did improve the accuracy of the solutions.  However, I
still have the same problem.  For certain matrices the error is up to
10^4 times as large as the requested tolerance, and when using the true
residual the solver gets stuck at a certain residual norm and does not
converge.  I dumped the settings that I used and am attaching them
here.
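
For reference, the kind of check being discussed can be sketched as
follows.  This is a minimal pure-Python illustration, not SLEPc code: the
helper names and the 2x2 matrices are hypothetical, and it assumes the
relative error is measured as ||A x - lambda B x|| / ||lambda x||, which is
the comparison made against the requested tolerance (1e-10 in the attached
settings).

```python
import math

def matvec(M, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(m * c for m, c in zip(row, x)) for row in M]

def norm2(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def relative_residual(A, B, lam, x):
    """Return ||A x - lam B x||_2 / ||lam x||_2 for an approximate eigenpair."""
    Ax = matvec(A, x)
    Bx = matvec(B, x)
    r = [a - lam * b for a, b in zip(Ax, Bx)]
    return norm2(r) / (abs(lam) * norm2(x))

# Hypothetical toy problem: A = diag(2, 6), B = diag(1, 2),
# so the generalized eigenvalues are 2 and 3.
A = [[2.0, 0.0], [0.0, 6.0]]
B = [[1.0, 0.0], [0.0, 2.0]]

# Exact eigenpair (lambda = 2, x = e1): the residual vanishes.
exact = relative_residual(A, B, 2.0, [1.0, 0.0])

# Same eigenvalue with a slightly contaminated eigenvector.
perturbed = relative_residual(A, B, 2.0, [1.0, 1e-3])

print(exact, perturbed)
```

In this toy case an eigenvector contaminated at the 1e-3 level gives a
relative residual of about 1e-3, far above a 1e-10 tolerance, even though
the eigenvalue itself is exact.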

Chris




On 02/17/17 04:42, Jose E. Roman wrote:
> For computing the eigenvalues with smallest real part of a generalized problem Ax=lambda Bx, it may be better to use a target value (instead of -eps_smallest_real). For instance, if you know that all eigenvalues are positive, use -eps_target 0 -eps_target_magnitude.
>
> What linear solvers are you using? In the default setting, the coefficient matrix for linear solves will be B, but with target=sigma the coefficient matrix will be A-sigma*B; this may make a difference. Also, in any case, if you are experiencing convergence problems I would suggest using MUMPS (see section 3.4.1 of SLEPc's users manual).
>
> Jose
>
>
>
>> On 17 Feb 2017, at 10:25, Christopher Pierce <cmpierce at WPI.EDU> wrote:
>>
>> Hello All,
>>
>> I'm trying to use the SLEPc Krylov-Schur implementation to solve a
>> generalized eigenvalue problem.  I have a monitor on my solver and the
>> solutions appear to converge correctly when using the approximation for
>> the residual norm in the algorithm.  However, when the solutions are
>> displayed and I retrieve the actual residual norm it is very large and
>> increases with the size of the matrices I am working with.  In some
>> cases it may be 10^17 times as large as the approximate norm.  I also
>> don't get the eigenvalues I would expect for the system I am studying.
>>
>> When I turn on the option "true residual" the solver fails to converge. 
>> The residual norm shrinks to some limit (~10^-3) and then sits there for
>> the remaining iterations.  As a note, I am solving for the eigenvalues
>> with the smallest real part.  I have also tried the RQCG solver on the
>> same problems and appear to get the correct results using it, but I'm
>> looking to use the better scaling of the Krylov-Schur solver.
>>
>> Does anyone know what could be causing this behavior?
>>
>> Thanks,
>>
>> Chris Pierce
>> WPI Center for Computational Nanoscience
>>
>>
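
The target-based approach suggested in the quoted reply relies on
shift-and-invert: linear solves use the coefficient matrix A - sigma*B, the
iteration finds eigenvalues theta of (A - sigma*B)^{-1} B, and the wanted
eigenvalue is recovered as lambda = sigma + 1/theta.  The sketch below
illustrates only that mapping.  It is an assumption-laden toy: the 2x2
matrices and function names are hypothetical, and plain power iteration
stands in for Krylov-Schur.

```python
import math

def matvec(M, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(m * c for m, c in zip(row, x)) for row in M]

def solve2(M, b):
    """Solve the 2x2 system M y = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]

def eigenvalue_near(A, B, sigma, iters=100):
    """Power iteration on (A - sigma*B)^{-1} B; returns the eigenvalue
    of A x = lambda B x closest to sigma (2x2 toy problems only)."""
    # Coefficient matrix of every linear solve: A - sigma*B.
    C = [[A[i][j] - sigma * B[i][j] for j in range(2)] for i in range(2)]
    x = [1.0, 1.0]
    theta = 0.0
    for _ in range(iters):
        nrm = math.sqrt(sum(c * c for c in x))
        x = [c / nrm for c in x]
        y = solve2(C, matvec(B, x))                 # y = C^{-1} B x
        theta = sum(a * b for a, b in zip(x, y))    # estimate of theta
        x = y
    # Map the transformed eigenvalue back to the original problem.
    return sigma + 1.0 / theta

# Hypothetical symmetric positive definite pair; the exact eigenvalues
# are (3 - sqrt(3))/2 and (3 + sqrt(3))/2.
A = [[2.0, -1.0], [-1.0, 2.0]]
B = [[2.0, 0.0], [0.0, 1.0]]

lam_small = eigenvalue_near(A, B, 0.0)   # target 0: smallest eigenvalue
lam_large = eigenvalue_near(A, B, 2.0)   # target 2: the other one
print(lam_small, lam_large)
```

With all eigenvalues positive, a target of 0 picks out the smallest one,
which is the situation the -eps_target 0 suggestion addresses.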

-------------- next part --------------
EPS Object: 4 MPI processes
  type: krylovschur
    Krylov-Schur: 50% of basis vectors kept after restart
    Krylov-Schur: using the locking variant
  problem type: generalized symmetric eigenvalue problem
  selected portion of the spectrum: closest to target: 0. (in magnitude)
  postprocessing eigenvectors with purification
  number of eigenvalues (nev): 10
  number of column vectors (ncv): 25
  maximum dimension of projected problem (mpd): 25
  maximum number of iterations: 1000
  tolerance: 1e-10
  convergence test: relative to the eigenvalue
BV Object: 4 MPI processes
  type: svec
  26 columns of global length 12513
  vector orthogonalization method: classical Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: Gram-Schmidt
  non-standard inner product
  Mat Object:   4 MPI processes
    type: mpiaij
    rows=12513, cols=12513
    total: nonzeros=177931, allocated nonzeros=177931
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines
  doing matmult as a single matrix-matrix product
DS Object: 4 MPI processes
  type: hep
  solving the problem with: Implicit QR method (_steqr)
ST Object: 4 MPI processes
  type: sinvert
  shift: 0.
  number of matrices: 2
  all matrices have different nonzero pattern
  KSP Object:  (st_)   4 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-08, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object:  (st_)   4 MPI processes
    type: lu
      LU: out-of-place factorization
      tolerance for zero pivot 2.22045e-14
      matrix ordering: natural
      factor fill ratio given 0., needed 0.
        Factored matrix follows:
          Mat Object:           4 MPI processes
            type: mpiaij
            rows=12513, cols=12513
            package used to perform factorization: mumps
            total: nonzeros=4234311, allocated nonzeros=4234311
            total number of mallocs used during MatSetValues calls =0
              MUMPS run parameters:
                SYM (matrix type):                   0 
                PAR (host participation):            1 
                ICNTL(1) (output for error):         6 
                ICNTL(2) (output of diagnostic msg): 0 
                ICNTL(3) (output for global info):   0 
                ICNTL(4) (level of printing):        0 
                ICNTL(5) (input mat struct):         0 
                ICNTL(6) (matrix prescaling):        7 
                ICNTL(7) (sequentia matrix ordering):7 
                ICNTL(8) (scalling strategy):        77 
                ICNTL(10) (max num of refinements):  0 
                ICNTL(11) (error analysis):          0 
                ICNTL(12) (efficiency control):                         1 
                ICNTL(13) (efficiency control):                         0 
                ICNTL(14) (percentage of estimated workspace increase): 20 
                ICNTL(18) (input mat struct):                           3 
                ICNTL(19) (Shur complement info):                       0 
                ICNTL(20) (rhs sparse pattern):                         0 
                ICNTL(21) (solution struct):                            1 
                ICNTL(22) (in-core/out-of-core facility):               0 
                ICNTL(23) (max size of memory can be allocated locally):0 
                ICNTL(24) (detection of null pivot rows):               0 
                ICNTL(25) (computation of a null space basis):          0 
                ICNTL(26) (Schur options for rhs or solution):          0 
                ICNTL(27) (experimental parameter):                     -24 
                ICNTL(28) (use parallel or sequential ordering):        1 
                ICNTL(29) (parallel ordering):                          0 
                ICNTL(30) (user-specified set of entries in inv(A)):    0 
                ICNTL(31) (factors is discarded in the solve phase):    0 
                ICNTL(33) (compute determinant):                        0 
                CNTL(1) (relative pivoting threshold):      0.01 
                CNTL(2) (stopping criterion of refinement): 1.49012e-08 
                CNTL(3) (absolute pivoting threshold):      0. 
                CNTL(4) (value of static pivoting):         -1. 
                CNTL(5) (fixation for null pivots):         0. 
                RINFO(1) (local estimated flops for the elimination after analysis): 
                  [0] 3.42689e+08 
                  [1] 5.94214e+08 
                  [2] 3.8211e+08 
                  [3] 3.4841e+08 
                RINFO(2) (local estimated flops for the assembly after factorization): 
                  [0]  1.5205e+06 
                  [1]  1.4933e+06 
                  [2]  1.4988e+06 
                  [3]  1.56079e+06 
                RINFO(3) (local estimated flops for the elimination after factorization): 
                  [0]  3.42689e+08 
                  [1]  5.94214e+08 
                  [2]  3.8211e+08 
                  [3]  3.4841e+08 
                INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): 
                [0] 37 
                [1] 44 
                [2] 40 
                [3] 38 
                INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): 
                  [0] 37 
                  [1] 44 
                  [2] 40 
                  [3] 38 
                INFO(23) (num of pivots eliminated on this processor after factorization): 
                  [0] 3977 
                  [1] 2646 
                  [2] 2409 
                  [3] 3481 
                RINFOG(1) (global estimated flops for the elimination after analysis): 1.66742e+09 
                RINFOG(2) (global estimated flops for the assembly after factorization): 6.0734e+06 
                RINFOG(3) (global estimated flops for the elimination after factorization): 1.66742e+09 
                (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)
                INFOG(3) (estimated real workspace for factors on all processors after analysis): 4234311 
                INFOG(4) (estimated integer workspace for factors on all processors after analysis): 169823 
                INFOG(5) (estimated maximum front size in the complete tree): 925 
                INFOG(6) (number of nodes in the complete tree): 2357 
                INFOG(7) (ordering option effectively use after analysis): 4 
                INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 
                INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 4234311 
                INFOG(10) (total integer space store the matrix factors after factorization): 169823 
                INFOG(11) (order of largest frontal matrix after factorization): 925 
                INFOG(12) (number of off-diagonal pivots): 0 
                INFOG(13) (number of delayed pivots after factorization): 0 
                INFOG(14) (number of memory compress after factorization): 0 
                INFOG(15) (number of steps of iterative refinement after solution): 0 
                INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 44 
                INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 159 
                INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 44 
                INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 159 
                INFOG(20) (estimated number of entries in the factors): 4234311 
                INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 38 
                INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 142 
                INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 
                INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 
                INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 
                INFOG(28) (after factorization: number of null pivots encountered): 0
                INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 4234311
                INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 22, 68
                INFOG(32) (after analysis: type of analysis done): 1
                INFOG(33) (value used for ICNTL(8)): 7
                INFOG(34) (exponent of the determinant if determinant is requested): 0
    linear system matrix = precond matrix:
    Mat Object:     4 MPI processes
      type: mpiaij
      rows=12513, cols=12513
      total: nonzeros=177931, allocated nonzeros=177931
      total number of mallocs used during MatSetValues calls =0
        not using I-node (on process 0) routines

