[petsc-users] Slepc JD and GD converge to wrong eigenpair

Toon Weyens toon.weyens at gmail.com
Fri Mar 31 09:45:00 CDT 2017


Dear both,

I have recompiled SLEPc and PETSc without debugging, as well as with the
recommended --with-fortran-kernels=1. In the attachment I show the scaling
for a typical "large" simulation with about 120,000 unknowns, using
Krylov-Schur.

There are two sets of data points there, as I do two EPS solves in one
simulation. The second solve is faster because it results from a grid
refinement of the first solve and uses the solution of the first solve as a
good initial guess. Note that there are two pages in the PDF; on the second
page I show time · n_procs.

As you can see, the scaling is better than before, especially up to 8
processes (which means about 15,000 unknowns per process, which is, as I
recall, cited as a good minimum on the website).

I am currently trying to run make streams NPMAX=8, but the cluster is
extraordinarily crowded today and it does not like my interactive jobs. I
will try to run them as soon as possible.

The main issue now, however, is again the first issue: the Generalized
Davidson method does not converge to the physically correct negative
eigenvalue (it should be about -0.05, as Krylov-Schur gives me). Instead it
stays stuck at some small positive eigenvalue of about +0.0002. It looks as
if the solver really does not like crossing the eigenvalue = 0 barrier, a
behavior I also see in smaller simulations, where convergence slows down
greatly when crossing it.

However, this time, for this big simulation, just increasing NCV does *not*
do the trick, at least not up to NCV=2048.

I also tried using a target magnitude, without success.
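For reference, the kind of GD setup I have been trying is roughly the
sketch below (in C, although my own code is in Fortran; A and B stand for
the assembled problem matrices). The target of -0.05 is simply the value
Krylov-Schur converges to, and the use of EPS_TARGET_REAL plus harmonic
extraction is only my guess at what might help for a near-zero interior
eigenvalue, not something I know to be the right recipe:

      EPS eps;
      EPSCreate(PETSC_COMM_WORLD, &eps);
      EPSSetOperators(eps, A, B);                  /* generalized problem A x = lambda B x */
      EPSSetType(eps, EPSGD);                      /* generalized Davidson */
      EPSSetTarget(eps, -0.05);                    /* eigenvalue expected from Krylov-Schur */
      EPSSetWhichEigenpairs(eps, EPS_TARGET_REAL); /* closest to the target along the real axis */
      EPSSetExtraction(eps, EPS_HARMONIC);         /* harmonic extraction for interior eigenvalues */
      EPSSetFromOptions(eps);
      EPSSolve(eps);

On the command line this should correspond to -eps_type gd -eps_target
-0.05 -eps_target_real -eps_harmonic.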

I started implementing the capability to start with Krylov-Schur and then
switch to GD with EPSSetInitialSpace when a certain precision has been
reached, but then realized it might be a bit of overkill, as the SLEPc
solution phase in my code generally takes no more than 15% of the time.
There are probably other places where I can gain more than a few percent.
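What I had in mind for the hand-over is roughly the following (again a C
sketch; eps_ks and eps_gd stand for the two solver contexts and x for a
vector with the right parallel layout, none of which are names from my
actual code):

      Vec      *V;
      PetscInt nconv, i;

      EPSSolve(eps_ks);                            /* rough Krylov-Schur solve first */
      EPSGetConverged(eps_ks, &nconv);
      VecDuplicateVecs(x, nconv, &V);
      for (i = 0; i < nconv; i++)
        EPSGetEigenvector(eps_ks, i, V[i], NULL);  /* keep only the (real) eigenvectors */
      EPSSetInitialSpace(eps_gd, nconv, V);        /* feed them to GD as the initial space */
      EPSSolve(eps_gd);
      VecDestroyVecs(nconv, &V);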

However, if there is another trick that can make GD work, it would
certainly be appreciated, as in my experience it is really about 5 times
faster than Krylov-Schur!

Thanks!

Toon

On Thu, Mar 30, 2017 at 2:47 PM Matthew Knepley <knepley at gmail.com> wrote:

> On Thu, Mar 30, 2017 at 3:05 AM, Jose E. Roman <jroman at dsic.upv.es> wrote:
>
>
> > On 30 Mar 2017, at 9:27, Toon Weyens <toon.weyens at gmail.com>
> wrote:
> >
> > Hi, thanks for the answer.
> >
> > I use MUMPS as a PC. The options -ksp_converged_reason,
> -ksp_monitor_true_residual and -ksp_view are not used.
> >
> > The difference between the log_view outputs of running a simple solution
> with 1, 2, 3 or 4 MPI procs is attached (debug version).
> >
> > I can see that with 2 procs it takes about 22 seconds, versus 7 seconds
> for 1 proc. For 3 and 4 the situation is worse: 29 and 37 seconds.
> >
> > Looks like the difference is mainly in the BVMult and especially in the
> BVOrthogonalize routines:
> >
> > BVMult takes 1, 6.5, 10 or even a whopping 17 seconds for the different
> numbers of processes;
> BVOrthogonalize takes 1, 4, 6 and 10 seconds.
> >
> > Calculating the preconditioner does not take more time for different
> numbers of processes, and the time to apply it increases only slightly. So
> it cannot be MUMPS' fault...
> >
> > Does this make sense? Is there any way to improve this?
> >
> > Thanks!
>
> Cannot trust performance data in a debug build:
>
>
> Yes, you should definitely make another build configured using
> --with-debugging=no.
>
> What do you get for STREAMS on this machine
>
>   make streams NP=4
>
> From this data, it looks like you have already saturated the bandwidth at
> 2 procs.
>
>   Thanks,
>
>     Matt
>
>
>
>       ##########################################################
>       #                                                        #
>       #                          WARNING!!!                    #
>       #                                                        #
>       #   This code was compiled with a debugging option,      #
>       #   To get timing results run ./configure                #
>       #   using --with-debugging=no, the performance will      #
>       #   be generally two or three times faster.              #
>       #                                                        #
>       ##########################################################
>
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: time_SLEPC.pdf
Type: application/pdf
Size: 11716 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20170331/753dd291/attachment-0001.pdf>
