[petsc-users] [SLEPc] Performance of Krylov-Schur with MUMPS-based shift-and-invert

Jose E. Roman jroman at dsic.upv.es
Wed Feb 28 03:17:21 CST 2018


Balancing may reduce the norm of the matrix or the condition number of some eigenvalues. It applies to the ST operator, (A-sigma*B)^{-1}*B in the case of shift-and-invert. Having sigma close to an eigenvalue is usually not a problem, provided that you use a robust direct solver (MUMPS). In cases where both A and B are singular, balancing may improve residual errors.

I think it is not necessary in your case.
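If you want to try it anyway, balancing can be enabled with -eps_balance (none/oneside/twoside/user) on the command line or via EPSSetBalance(). A minimal Fortran sketch (the two-sided variant is only an illustration; the PETSC_DEFAULT_* arguments keep the default number of iterations and cutoff):

    CALL EPSSetBalance(eps,EPS_BALANCE_TWOSIDE,PETSC_DEFAULT_INTEGER,PETSC_DEFAULT_REAL,ierr)
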
Jose


> On 27 Feb 2018, at 19:03, Thibaut Appel <t.appel17 at imperial.ac.uk> wrote:
> 
> Good afternoon Mr Roman,
> 
> Thank you very much for your detailed and quick answer. I'll make use of -eps_view and -log_view and see if I can optimize the preallocation of my matrix.
> 
> Good to know that it is possible to play with the "mpd" parameter; I thought it was only for cases where large values of nev are requested, as suggested by the user guide.
> 
> My residual norms are decent (10^-6 to 10^-8), so I do not think my application code requires significant tuning.
> 
> When you say "EPSSetBalance() sometimes helps, even in the case of using spectral transformation", did you mean "particularly" instead of "even"? If so, why? If I misunderstood, what did you mean?
> 
> Regards,
> 
> 
> Thibaut
> 
> On 19/02/18 18:36, Jose E. Roman wrote:
>> 
>>> On 19 Feb 2018, at 19:15, Thibaut Appel <t.appel17 at imperial.ac.uk> wrote:
>>> 
>>> Good afternoon,
>>> 
>>> I am solving generalized eigenvalue problems Ax = omega*Bx in complex arithmetic, where A is non-Hermitian and B is singular. I think the only way to get around the singularity is to employ a shift-and-invert method, where I am using MUMPS to invert the shifted matrix.
>>> 
>>> I am using the Fortran interface of PETSc 3.8.3 and SLEPc 3.8.2 where my ./configure line was
>>> ./configure --with-fortran-kernels=1 --with-scalar-type=complex --with-blaslapack-dir=/home/linuxbrew/.linuxbrew/opt/openblas --PETSC_ARCH=cplx_dble_optim --with-cmake-dir=/home/linuxbrew/.linuxbrew/opt/cmake --with-mpi-dir=/home/linuxbrew/.linuxbrew/opt/openmpi --with-debugging=0 --download-scalapack --download-mumps --COPTFLAGS="-O3 -march=native" --CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 -march=native"
>>> 
>>> My matrices A and B are assembled correctly in parallel and my preallocation is quasi-optimal in the sense that I don't have any calls to malloc, though I may overestimate the required memory for some rows of the matrices. Here is how I set up the EPS problem and solve it:
>>> 
>>>     ! Generalized non-Hermitian eigenproblem, solved with Krylov-Schur
>>>     CALL EPSSetProblemType(eps,EPS_GNHEP,ierr)
>>>     CALL EPSSetOperators(eps,MatA,MatB,ierr)
>>>     CALL EPSSetType(eps,EPSKRYLOVSCHUR,ierr)
>>>     CALL EPSSetDimensions(eps,nev,ncv,PETSC_DECIDE,ierr)
>>>     CALL EPSSetTolerances(eps,tol_ev,PETSC_DECIDE,ierr)
>>> 
>>>     CALL EPSSetFromOptions(eps,ierr)
>>>     ! Look for the eigenvalues closest to the shift
>>>     CALL EPSSetTarget(eps,shift,ierr)
>>>     CALL EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE,ierr)
>>> 
>>>     CALL EPSGetST(eps,st,ierr)
>>>     CALL STGetKSP(st,ksp,ierr)
>>>     CALL KSPGetPC(ksp,pc,ierr)
>>> 
>>>     ! Shift-and-invert with an exact (direct) LU solve
>>>     CALL STSetType(st,STSINVERT,ierr)
>>>     CALL KSPSetType(ksp,KSPPREONLY,ierr)
>>>     CALL PCSetType(pc,PCLU,ierr)
>>> 
>>>     ! Use MUMPS for the parallel LU factorization
>>>     CALL PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr)
>>>     CALL PCSetFromOptions(pc,ierr)
>>> 
>>>     CALL EPSSolve(eps,ierr)
>>>     CALL EPSGetIterationNumber(eps,iter,ierr)
>>>     CALL EPSGetConverged(eps,nev_conv,ierr)
>> The settings seem ok. You can use -eps_view to make sure that everything is set as you want.
>> 
>>>     - Using one MPI process, it takes 1 hour and 22 minutes to retrieve 250 eigenvalues with a Krylov subspace of size 500 and a tolerance of 10^-12, when the leading dimension of the matrices is 405,000. My matrix A has 98,415,000 non-zero elements and B has 1,215,000 non-zero elements. Would you be shocked by that computation time? I would have expected something much lower given the values of nev and ncv I have, but I could be completely wrong in my understanding of the Krylov-Schur method.
>> If you run with -log_view you will see the breakdown of time spent in the different steps. Most probably a large percentage of the time is in the factorization of the matrix (MatLUFactorSym and MatLUFactorNum).
>> 
>> The matrix is quite dense (about 250 nonzero elements per row), so factorizing it is costly. You may want to try inexact shift-and-invert with an iterative method, but you will need a good preconditioner.
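>> 
>> A minimal sketch of the required changes, reusing the handles from your listing (GMRES and block Jacobi are placeholders, not a recommendation; finding a preconditioner that works for your problem is the difficult part, and the KSP tolerance must be tight enough not to spoil the eigensolver's accuracy):
>> 
>>     CALL STGetKSP(st,ksp,ierr)
>>     CALL KSPSetType(ksp,KSPGMRES,ierr)   ! iterative solver instead of KSPPREONLY
>>     CALL KSPGetPC(ksp,pc,ierr)
>>     CALL PCSetType(pc,PCBJACOBI,ierr)    ! placeholder preconditioner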
>> 
>> The time needed for the other steps may be reduced a little bit by setting a smaller subspace size, for instance with -eps_mpd 200
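>> 
>> The same can be set in the source code through the mpd argument of EPSSetDimensions(), e.g. with the value above:
>> 
>>     CALL EPSSetDimensions(eps,nev,ncv,200,ierr)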
>> 
>>>     - My goal is speed and reliability. Is there anything you notice in my EPS solver that could be improved or corrected? I remember an exchange with Jose E. Roman where he said that the parameters of MUMPS are not worth changing; however, I notice some people play with -mat_mumps_cntl_1 and -mat_mumps_cntl_3, which control the relative/absolute pivoting thresholds?
>> Yes, you can try tuning MUMPS options. Maybe they are relevant in your application.
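>> 
>> These options can be given on the command line, or set programmatically before the *SetFromOptions() calls; a sketch (the threshold 0.01 is purely illustrative):
>> 
>>     CALL PetscOptionsSetValue(PETSC_NULL_OPTIONS,'-mat_mumps_cntl_1','0.01',ierr)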
>> 
>>>     - Would you advise the use of EPSSetTrueResidual and EPSSetBalance since I am using a spectral transformation?
>> Are you getting large residual norms?
>> I would not suggest using EPSSetTrueResidual() because it may prevent convergence, especially if the target is not very close to the wanted eigenvalues.
>> EPSSetBalance() sometimes helps, even in the case of using spectral transformation. It is intended for ill-conditioned problems where the obtained residuals are not so good.
>> 
>>>     - Would you see anything that would prevent me from getting speedup in parallel executions?
>> I guess it will depend on how MUMPS scales for your problem.
>> 
>> Jose
>> 
>>> Thank you very much in advance; I look forward to discussing these different points with you,
>>> 
>>> Thibaut
>>> 
> 
