[petsc-users] [SLEPc] GD is not deterministic when using different number of cores
Denis Davydov
davydden at gmail.com
Thu Nov 19 03:49:29 CST 2015
Dear all,
I was trying to get some scaling results for the GD eigensolver as applied to density functional theory.
Interestingly enough, the number of self-consistent iterations (solution of the coupled eigenvalue and Poisson problems)
depends on the number of MPI cores used. In my case the number of iterations ranges from 19 to 24 for 2 to 160 MPI cores.
That makes the whole scaling check useless, as the eigenproblem is solved a different number of times.
That is **not** the case when I use the Krylov-Schur eigensolver with zero shift, which makes me believe that I am missing some settings on GD to make it fully deterministic. The only non-deterministic part I am currently aware of is the initial subspace for the first SC iteration, but that is the case for both KS and GD. For subsequent iterations I provide the previously obtained eigenvectors as the initial subspace, as sketched below.
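For reference, here is a minimal sketch of how the previous eigenvectors are passed back in (plain C; `evecs` and `nconv` are illustrative names for the converged eigenvectors and their count from the previous SC step):

#include <slepceps.h>

/* Sketch (illustrative names): reuse eigenvectors from the previous
   self-consistency iteration as the initial subspace of the next solve. */
PetscErrorCode ReuseEigenvectors(EPS eps, Vec *evecs, PetscInt nconv)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* The given vectors seed the search subspace; any remaining columns are
     generated by the solver (randomly, or via -eps_gd_krylov_start). */
  ierr = EPSSetInitialSpace(eps, nconv, evecs);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}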
Certainly there will be some round-off error due to the different partitioning of DoFs for different numbers of MPI cores,
but I don't expect it to have such a strong influence, especially given the fact that I don't see this problem with KS.
Below is the output of -eps_view for GD with -eps_type gd -eps_harmonic -st_pc_type bjacobi -eps_gd_krylov_start -eps_target -10.0
I would appreciate any suggestions on how to address the issue.
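For completeness, here is a sketch of the equivalent programmatic setup (A and B are the two matrices of the generalized problem; this only mirrors the command-line options listed above):

#include <slepceps.h>

/* Sketch: programmatic equivalent of
   -eps_type gd -eps_harmonic -st_pc_type bjacobi
   -eps_gd_krylov_start -eps_target -10.0 */
static PetscErrorCode SetupGD(EPS eps, Mat A, Mat B)
{
  PetscErrorCode ierr;
  ST             st;
  KSP            ksp;
  PC             pc;

  PetscFunctionBeginUser;
  ierr = EPSSetOperators(eps, A, B);CHKERRQ(ierr);
  ierr = EPSSetProblemType(eps, EPS_GHEP);CHKERRQ(ierr);
  ierr = EPSSetType(eps, EPSGD);CHKERRQ(ierr);
  ierr = EPSSetExtraction(eps, EPS_HARMONIC);CHKERRQ(ierr);
  ierr = EPSSetTarget(eps, -10.0);CHKERRQ(ierr);
  ierr = EPSSetWhichEigenpairs(eps, EPS_TARGET_MAGNITUDE);CHKERRQ(ierr);
  ierr = EPSGDSetKrylovStart(eps, PETSC_TRUE);CHKERRQ(ierr);
  /* the preconditioner lives in the KSP held by the spectral transformation */
  ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
  ierr = STGetKSP(st, &ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCBJACOBI);CHKERRQ(ierr);
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}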
As a side question, why does GD use a preonly KSP? It could just as well use a proper linear solver to apply K^{-1} in the expansion phase --
I assume the Olsen variant is the default in SLEPc?
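To illustrate what I mean, something along these lines (a sketch only; I am assuming here that the Jacobi-Davidson solver EPSJD uses the ST's inner KSP to solve the correction equation, whereas GD applies only the preconditioner, and the tolerances are illustrative):

#include <slepceps.h>

/* Sketch (assumption on my part): switch to Jacobi-Davidson and run a few
   iterations of an actual Krylov solver inside the spectral transformation,
   instead of GD's preconditioner-only expansion. */
static PetscErrorCode UseJD(EPS eps)
{
  PetscErrorCode ierr;
  ST             st;
  KSP            ksp;

  PetscFunctionBeginUser;
  ierr = EPSSetType(eps, EPSJD);CHKERRQ(ierr);
  ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
  ierr = STGetKSP(st, &ksp);CHKERRQ(ierr);
  /* loose inner solve: GMRES with at most 10 iterations */
  ierr = KSPSetType(ksp, KSPGMRES);CHKERRQ(ierr);
  ierr = KSPSetTolerances(ksp, 1e-3, PETSC_DEFAULT, PETSC_DEFAULT, 10);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}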
Kind regards,
Denis
EPS Object: 4 MPI processes
type: gd
Davidson: search subspace is B-orthogonalized
Davidson: block size=1
Davidson: type of the initial subspace: Krylov
Davidson: size of the subspace after restarting: 6
Davidson: number of vectors after restarting from the previous iteration: 0
problem type: generalized symmetric eigenvalue problem
extraction type: harmonic Ritz
selected portion of the spectrum: closest to target: -10 (in magnitude)
postprocessing eigenvectors with purification
number of eigenvalues (nev): 87
number of column vectors (ncv): 175
maximum dimension of projected problem (mpd): 175
maximum number of iterations: 57575
tolerance: 1e-10
convergence test: absolute
dimension of user-provided initial space: 87
BV Object: 4 MPI processes
type: svec
175 columns of global length 57575
vector orthogonalization method: classical Gram-Schmidt
orthogonalization refinement: if needed (eta: 0.7071)
block orthogonalization method: Gram-Schmidt
non-standard inner product
Mat Object: 4 MPI processes
type: mpiaij
rows=57575, cols=57575
total: nonzeros=1.51135e+06, allocated nonzeros=1.51135e+06
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
doing matmult as a single matrix-matrix product
DS Object: 4 MPI processes
type: gnhep
ST Object: 4 MPI processes
type: precond
shift: -10
number of matrices: 2
all matrices have different nonzero pattern
KSP Object: (st_) 4 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-08, absolute=1e-50, divergence=10000
left preconditioning
using DEFAULT norm type for convergence test
PC Object: (st_) 4 MPI processes
type: bjacobi
block Jacobi: number of blocks = 4
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (st_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (st_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=15557, cols=15557
package used to perform factorization: petsc
total: nonzeros=388947, allocated nonzeros=388947
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=15557, cols=15557
total: nonzeros=388947, allocated nonzeros=388947
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 4 MPI processes
type: mpiaij
rows=57575, cols=57575
total: nonzeros=1.51135e+06, allocated nonzeros=1.51135e+06
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines