[petsc-users] [SLEPc] GD is not deterministic when using different number of cores

Denis Davydov davydden at gmail.com
Thu Nov 19 03:49:29 CST 2015


Dear all,

I was trying to get some scaling results for the GD eigensolver as applied to density functional theory.
Interestingly enough, the number of self-consistent iterations (solution of a coupled eigenvalue problem and Poisson equation)
depends on the number of MPI cores used. In my case the iteration count ranges from 19 to 24 for 2 to 160 MPI cores.
That makes the whole scaling check useless, as the eigenproblem is solved a different number of times.

That is **not** the case when I use the Krylov-Schur eigensolver with zero shift, which makes me believe that I am missing some settings on GD to make it fully deterministic. The only non-deterministic part I am currently aware of is the initial subspace for the first self-consistent iteration, but that is the case for both KS and GD. For subsequent iterations I provide the previously obtained eigenvectors as the initial subspace.
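
For reference, this is roughly how I hand the previously obtained eigenvectors back to the solver before the next self-consistent iteration (a minimal sketch; prev_evecs and nconv are placeholders for my actual data):

#include <slepceps.h>

/* Sketch: reuse the eigenvectors from the previous self-consistent
   iteration as the initial subspace of the next eigensolve. */
PetscErrorCode ReusePreviousEigenvectors(EPS eps, Vec *prev_evecs, PetscInt nconv)
{
  PetscErrorCode ierr;
  ierr = EPSSetInitialSpace(eps, nconv, prev_evecs);CHKERRQ(ierr);
  return 0;
}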

Certainly there will be some round-off error due to the different partitioning of DoFs for different numbers of MPI cores,
but I don't expect it to have such a strong influence, especially given that I don't see this problem with KS.

Below is the output of -eps_view for GD with the following options: -eps_type gd -eps_harmonic -st_pc_type bjacobi -eps_gd_krylov_start -eps_target -10.0
I would appreciate any suggestions on how to address the issue.
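
For completeness, the programmatic equivalent of those options would look roughly like this (a sketch; A and B stand for my actual operators):

#include <slepceps.h>

/* Sketch: programmatic equivalent of
   -eps_type gd -eps_harmonic -st_pc_type bjacobi -eps_gd_krylov_start -eps_target -10.0 */
PetscErrorCode SetupGD(EPS eps, Mat A, Mat B)
{
  PetscErrorCode ierr;
  ST             st;
  KSP            ksp;
  PC             pc;

  ierr = EPSSetOperators(eps, A, B);CHKERRQ(ierr);
  ierr = EPSSetProblemType(eps, EPS_GHEP);CHKERRQ(ierr);
  ierr = EPSSetType(eps, EPSGD);CHKERRQ(ierr);
  ierr = EPSSetExtraction(eps, EPS_HARMONIC);CHKERRQ(ierr);
  ierr = EPSSetTarget(eps, -10.0);CHKERRQ(ierr);
  ierr = EPSSetWhichEigenpairs(eps, EPS_TARGET_MAGNITUDE);CHKERRQ(ierr);
  ierr = EPSGDSetKrylovStart(eps, PETSC_TRUE);CHKERRQ(ierr);

  /* block-Jacobi preconditioner on the spectral transformation */
  ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
  ierr = STGetKSP(st, &ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCBJACOBI);CHKERRQ(ierr);

  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
  return 0;
}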

As a side question, why does GD use KSP preonly? It could just as well use a proper linear solver to apply K^{-1} in the expansion stage --
I assume the Olsen variant is the default in SLEPc?
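
If an inner iterative solve were honored there, this is roughly what I would try (just a sketch; I do not know whether GD would actually use anything other than preonly, which is exactly my question):

#include <slepceps.h>

/* Sketch: request an inexact inner solve on the ST's KSP instead of a
   single application of the preconditioner. */
PetscErrorCode TryInnerSolve(EPS eps)
{
  PetscErrorCode ierr;
  ST             st;
  KSP            ksp;

  ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
  ierr = STGetKSP(st, &ksp);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPGMRES);CHKERRQ(ierr);
  ierr = KSPSetTolerances(ksp, 1e-3, PETSC_DEFAULT, PETSC_DEFAULT, 20);CHKERRQ(ierr);
  return 0;
}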

Kind regards,
Denis


EPS Object: 4 MPI processes
  type: gd
    Davidson: search subspace is B-orthogonalized
    Davidson: block size=1
    Davidson: type of the initial subspace: Krylov
    Davidson: size of the subspace after restarting: 6
    Davidson: number of vectors after restarting from the previous iteration: 0
  problem type: generalized symmetric eigenvalue problem
  extraction type: harmonic Ritz
  selected portion of the spectrum: closest to target: -10 (in magnitude)
  postprocessing eigenvectors with purification
  number of eigenvalues (nev): 87
  number of column vectors (ncv): 175
  maximum dimension of projected problem (mpd): 175
  maximum number of iterations: 57575
  tolerance: 1e-10
  convergence test: absolute
  dimension of user-provided initial space: 87
BV Object: 4 MPI processes
  type: svec
  175 columns of global length 57575
  vector orthogonalization method: classical Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: Gram-Schmidt
  non-standard inner product
  Mat Object:   4 MPI processes
    type: mpiaij
    rows=57575, cols=57575
    total: nonzeros=1.51135e+06, allocated nonzeros=1.51135e+06
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines
  doing matmult as a single matrix-matrix product
DS Object: 4 MPI processes
  type: gnhep
ST Object: 4 MPI processes
  type: precond
  shift: -10
  number of matrices: 2
  all matrices have different nonzero pattern
  KSP Object:  (st_)   4 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
    left preconditioning
    using DEFAULT norm type for convergence test
  PC Object:  (st_)   4 MPI processes
    type: bjacobi
      block Jacobi: number of blocks = 4
      Local solve is same for all blocks, in the following KSP and PC objects:
    KSP Object:    (st_sub_)     1 MPI processes
      type: preonly
      maximum iterations=10000, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (st_sub_)     1 MPI processes
      type: ilu
        ILU: out-of-place factorization
        0 levels of fill
        tolerance for zero pivot 2.22045e-14
        matrix ordering: natural
        factor fill ratio given 1, needed 1
          Factored matrix follows:
            Mat Object:             1 MPI processes
              type: seqaij
              rows=15557, cols=15557
              package used to perform factorization: petsc
              total: nonzeros=388947, allocated nonzeros=388947
              total number of mallocs used during MatSetValues calls =0
                not using I-node routines
      linear system matrix = precond matrix:
      Mat Object:       1 MPI processes
        type: seqaij
        rows=15557, cols=15557
        total: nonzeros=388947, allocated nonzeros=388947
        total number of mallocs used during MatSetValues calls =0
          not using I-node routines
    linear system matrix = precond matrix:
    Mat Object:     4 MPI processes
      type: mpiaij
      rows=57575, cols=57575
      total: nonzeros=1.51135e+06, allocated nonzeros=1.51135e+06
      total number of mallocs used during MatSetValues calls =0
        not using I-node (on process 0) routines