[petsc-users] Multiple solves with PCMG fail

Mark Adams mfadams at lbl.gov
Mon Oct 27 13:31:45 CDT 2014


On Thu, Oct 23, 2014 at 2:32 AM, Filippo Leonardi <
filippo.leonardi at sam.math.ethz.ch> wrote:

> Tested on this branch: the program no longer crashes, but the solution is
> actually still wrong.
>
> To reproduce you can compile a version with the for cycle starting from i
> = 0
> and one from i = 1 and see the difference with:

mpirun -np 4 ./test1 -pc_type mg -pc_mg_levels 5 -draw_pause -1
> and without mg.
>
>
I'm not sure what you are asking.  starting i=0 or i=1, and with and
without mg.  So four possibilities?  The solutions will be different for
i=0 or 1 in this code:

  ierr = VecSet(ctx.rhs, 1); CHKERRQ(ierr);

  ierr = setup(&ctx); CHKERRQ(ierr);

  PetscViewer viewer;
  PetscViewerDrawOpen(PETSC_COMM_WORLD,NULL,NULL,0,0,300,300,&viewer);
  for(int i = 0; i < N; ++i) {

    ierr = VecSet(ctx.rhs, (float) i * mx); CHKERRQ(ierr);
    ierr = VecScale(ctx.rhs,  rank); CHKERRQ(ierr);
    ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
    ierr = KSPGetSolution(ksp, &phi); CHKERRQ(ierr);
    ierr = VecView(ctx.rhs, viewer); CHKERRQ(ierr);
    ierr = VecView(phi, viewer); CHKERRQ(ierr);
  }




> On Wednesday 22 October 2014 09:31:40 Mark Adams wrote:
> > I think we have a fix for this.  You can test by using the branch.  In
> > PETSC_DIR:
> >
> > $ git checkout mark/ksp-zero-eig
> > $ make
> >
> > Now remake your code and test,
> >
> >
> > On Wed, Oct 22, 2014 at 8:36 AM, Filippo Leonardi <
> >
> > filippo.leonardi at sam.math.ethz.ch> wrote:
> > > Granted that I solved my problem by **not** solving the system if the
> Rhs
> > > is
> > > zero. Here's some result:
> > >
> > > 1) Using Multigrid with debug crashes (see below)
> > > 2) Using MG without debug doesn't crash, but gives wrong solution
> > > 3) defaults or gamg or wathever*works both with and without debug
> > >
> > > Now weirder:
> > > 4) changing line 145 from
> > >
> > >   for(int i = 1; i < N; ++i) {
> > >
> > > to
> > >
> > >   for(int i = 0; i < N; ++i) {
> > >
> > > i.e. avoiding broken case 1) above (rhs never 0) makes 1), 2) and 3)
> work
> > > perfectly (and any combination thereof)
> > >
> > > (You can look at the differences in the pictures given by Draw to see
> that
> > > the
> > > solution is more or less wrong.)
> > >
> > > Details:
> > >
> > > Release build with no debug:
> > >
> > > mpirun -np 4 ./test1 -pc_type mg -pc_mg_levels 5 -draw_pause -1
> > > -ksp_monitor -
> > > ksp_monitor_true_residual
> > >
> > >   0 KSP Residual norm 0.000000000000e+00
> > >   0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm
> > >
> > > 0.000000000000e+00 ||r(i)||/||b||           -nan
> > >
> > >   0 KSP Residual norm 3.964678029172e+02
> > >   0 KSP preconditioned resid norm 3.964678029172e+02 true resid norm
> > >
> > > 1.118033988750e+00 ||r(i)||/||b|| 1.000000000000e+00
> > >
> > >   1 KSP Residual norm 6.290716828453e-13
> > >   1 KSP preconditioned resid norm 6.290716828453e-13 true resid norm
> > >
> > > 3.788861141557e+00 ||r(i)||/||b|| 3.388860427931e+00
> > >
> > >   0 KSP Residual norm 7.929356058344e+02
> > >   0 KSP preconditioned resid norm 7.929356058344e+02 true resid norm
> > >
> > > 2.236067977500e+00 ||r(i)||/||b|| 1.000000000000e+00
> > >
> > >   1 KSP Residual norm 1.258143365691e-12
> > >   1 KSP preconditioned resid norm 1.258143365691e-12 true resid norm
> > >
> > > 7.577722283114e+00 ||r(i)||/||b|| 3.388860427931e+00
> > >
> > > i.e. I don't see the residual to be zero each time.
> > >
> > > Debug build:
> > >
> > > mpirun -np 4 ./test1 -pc_type mg -pc_mg_levels 5 -draw_pause -1
> > > -ksp_monitor -
> > > ksp_monitor_true_residual
> > > [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error Message
> > > --------------------------------------------------------------
> > > [0]PETSC ERROR: Null argument, when expecting valid pointer
> > > [0]PETSC ERROR: Null Pointer: Parameter # 3
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html
> > > for
> > > trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.5.0, Jun, 30, 2014
> > > [0]PETSC ERROR: ./test1 on a arch-linux2-cxx-debug named
> Besikovitch-II by
> > > filippo Wed Oct 22 14:20:46 2014
> > > [0]PETSC ERROR: Configure options --with-clanguage=cxx
> > > --download-scalapack --
> > > download-mumps --download-parmetis --download-metis --download-fftw
> > > [0]PETSC ERROR: #1 KSPComputeEigenvalues() line 119 in
> > > /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/interface/itfunc.c
> > > [0]PETSC ERROR: #2 KSPChebyshevComputeExtremeEigenvalues_Private() line
> > > 328 in
> > > /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/impls/cheby/cheby.c
> > > [0]PETSC ERROR: #3 KSPSolve_Chebyshev() line 373 in
> > > /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/impls/cheby/cheby.c
> > > [0]PETSC ERROR: #4 KSPSolve() line 459 in
> > > /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/interface/itfunc.c
> > > [0]PETSC ERROR: #5 PCMGMCycle_Private() line 19 in
> > > /home/filippo/Workspace/petsc-3.5.0/src/ksp/pc/impls/mg/mg.c
> > >
> > > Debug build again with defaults:
> > >
> > > mpirun -np 4 ./test1 -draw_pause -1 -ksp_monitor
> > > -ksp_monitor_true_residual
> > >
> > >   0 KSP Residual norm 0.000000000000e+00
> > >   0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm
> > >
> > > 0.000000000000e+00 ||r(i)||/||b||           -nan
> > >
> > >   0 KSP Residual norm 1.839181443725e+00
> > >   0 KSP preconditioned resid norm 1.839181443725e+00 true resid norm
> > >
> > > 1.118033988750e+00 ||r(i)||/||b|| 1.000000000000e+00
> > >
> > >   1 KSP Residual norm 1.767908128952e+00
> > >   1 KSP preconditioned resid norm 1.767908128952e+00 true resid norm
> > >
> > > 1.182712814433e+00 ||r(i)||/||b|| 1.057850500373e+00
> > > [...]
> > >
> > >  82 KSP Residual norm 1.642117972681e-05
> > >  82 KSP preconditioned resid norm 1.642117972681e-05 true resid norm
> > >
> > > 1.556660502826e-05 ||r(i)||/||b|| 1.392319480883e-05
> > > [...]
> > > 0 KSP Residual norm 3.678362887450e+00
> > >
> > >   0 KSP preconditioned resid norm 3.678362887450e+00 true resid norm
> > >
> > > 2.236067977500e+00 ||r(i)||/||b|| 1.000000000000e+00
> > >
> > >   1 KSP Residual norm 3.535816257904e+00
> > >   1 KSP preconditioned resid norm 3.535816257904e+00 true resid norm
> > >
> > > 2.365425628867e+00 ||r(i)||/||b|| 1.057850500373e+00
> > >
> > > On Wednesday 22 October 2014 07:12:50 you wrote:
> > > > > On Oct 22, 2014, at 1:09 AM, Leonardi Filippo
> > > > > <filippo.leonardi at sam.math.ethz.ch> wrote:
> > > > >
> > > > > For 1): I must be missing something then: in main i set and scale
> > >
> > > ctx.rhs,
> > >
> > > > > then I call KSPSolve that calls ComputeRhs that further scales
> ctx.rhs
> > > > > and copy it to b, which is then used by the solve. To my
> understanding
> > >
> > > I
> > >
> > > > > never override ctx.rhs.
> > > > >
> > > >   Oh yes. I missed that.
> > > >
> > > >   What I observed was when I ran your code each solve had an initial
> > > >
> > > > residual norm of exactly zero. I’m not sure why that happens.
> > > >
> > > > > For 2): that was a small typo, problem is present no matter what
> > >
> > > number of
> > >
> > > > > levels I use.
> > > > >
> > > > > Thanks,
> > > > > Filippo
> > > > >
> > > > >> On 22-ott-2014, at 04:27, "Barry Smith" <bsmith at mcs.anl.gov>
> wrote:
> > > > >>
> > > > >>
> > > > >> There are a few issues with your code
> > > > >>
> > > > >> 1) If you use KSPSetComputeRHS() then the function you provide is
> > >
> > > called
> > >
> > > > >> immediately before the solve automatically for every solve. Thus
> your
> > > > >> code with the vector operations here
> > > > >>
> > > > >>  ierr = VecSet(ctx.rhs, (float) i * mx); CHKERRQ(ierr);
> > > > >>
> > > > >>   ierr = VecScale(ctx.rhs,  rank); CHKERRQ(ierr);
> > > > >>   ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
> > > > >>
> > > > >> does nothing. And each linear solve ends up being with a zero
> right
> > >
> > > hand
> > >
> > > > >> side.
> > > > >>
> > > > >> 2) -pc_type mg doesn’t give enough information for the multigrid
> to
> > >
> > > know
> > >
> > > > >> how many levels to use. In fact it is always using one hence it is
> > > > >> not
> > > > >> multigrid. You  need to use -p_mg_levels 2 or 3 or whatever to
> get it
> > >
> > > to
> > >
> > > > >> use more than one level.
> > > > >>
> > > > >> Barry
> > > > >>
> > > > >>> On Oct 21, 2014, at 1:50 AM, Filippo Leonardi
> > > > >>> <filippo.leonardi at sam.math.ethz.ch> wrote:
> > > > >>>
> > > > >>> Here,
> > > > >>>
> > > > >>> run this e.g. with
> > > > >>> mpirun -np 4 ./test1 -draw_pause -1 -pc_type mg -pc_mg_galerkin
> > > > >>>
> > > > >>> In Debug mode it crashes, in standard mode it fails. Using Petsc
> > > > >>> current.
> > > > >>>
> > > > >>> Run without mg and works!
> > > > >>>
> > > > >>> Hope it's not me being stupid and doing something terribly wrong!
> > > > >>>
> > > > >>> Best,
> > > > >>> Filippo
> > > > >>>
> > > > >>>> On Monday 20 October 2014 07:29:04 Barry Smith wrote:
> > > > >>>> Can you send us the code: to petsc-maint at mcs.anl.gov or
> > > > >>>> petsc-users at mcs.anl.gov ? Or something that reproduces the
> problem?
> > > > >>>>
> > > > >>>> Barry
> > > > >>>>
> > > > >>>>> On Oct 20, 2014, at 3:31 AM, Filippo Leonardi
> > > > >>>>> <filippo.leonardi at sam.math.ethz.ch> wrote:
> > > > >>>>>
> > > > >>>>> Hi,
> > > > >>>>>
> > > > >>>>> I have a very specific problem that I cannot figure out with
> PCMG
> > >
> > > and
> > >
> > > > >>>>> multiple solves. I got a linear system that I solve many times,
> > >
> > > with
> > >
> > > > >>>>> same
> > > > >>>>> matrix but different RHS. I can successfully solve the system
> with
> > > > >>>>> standard techniques, such as default solver or LU or PCGAMG.
> Even
> > >
> > > MG
> > >
> > > > >>>>> works if I destroy the ksp each time or I recompute the
> matrices
> > > > >>>>> at
> > > > >>>>> each
> > > > >>>>> time. But when I try and go to MG and not recomputing the
> matrices
> > > > >>>>> each
> > > > >>>>> time the solver fails. Any idea?
> > > > >>>>>
> > > > >>>>> Here some detail: the setup:
> > > > >>>>> ierr = KSPSetDMActive(ksp, PETSC_FALSE); CHKERRQ(ierr);
> > > > >>>>> ierr = KSPSetDM(ksp,  da); CHKERRQ(ierr);
> > > > >>>>> ierr = KSPSetComputeOperators(ksp, ComputeMatrix, ctx);
> > >
> > > CHKERRQ(ierr);
> > >
> > > > >>>>> ierr = KSPSetComputeRHS(ksp, ComputeRHS, ctx); CHKERRQ(ierr);
> > > > >>>>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);
> > > > >>>>>
> > > > >>>>> Then I solve as usual, for a large number of time steps:
> > > > >>>>> ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
> > > > >>>>> ierr = KSPGetSolution(ksp, &phi);
> > > > >>>>>
> > > > >>>>> The solver converges and does that in a reasonable number of
> > > > >>>>> iterations:
> > > > >>>>> Linear solve converged due to CONVERGED_RTOL iterations 7
> > > > >>>>> And ksp_view and ksp_monitor do not show any weird stuff.
> > > > >>>>>
> > > > >>>>> - Weirdly enough using any solver (for instance cg+bjacobi or
> > > > >>>>> gamg)
> > > > >>>>> everything works (So the matrix and RHS are working fine).
> > > > >>>>> - But the problem persists with Galerkin matrices
> > > > >>>>> (-pc_mg_galerkin)
> > > > >>>>> (So is
> > > > >>>>> not a ComputeMatrix problem).
> > > > >>>>> - If I do:
> > > > >>>>> ierr = KSPSetComputeOperators(ksp, ComputeMatrix, this);
> > > > >>>>> CHKERRQ(ierr);
> > > > >>>>> between each solve or destroy the ksp entirely each time the
> > >
> > > solution
> > >
> > > > >>>>> is
> > > > >>>>> also perfect (So is not a boundary scaling or other stuff
> > > > >>>>> problem).
> > > > >>>>> - If I run MG with only 2 levels (so just coarse) I also get a
> > > > >>>>> fine
> > > > >>>>> result.
> > > > >>>>>
> > > > >>>>> Setup RHS is called each time as expected, setup matrix is
> called
> > >
> > > just
> > >
> > > > >>>>> once, also as expected.
> > > > >>>>>
> > > > >>>>> The only thing I can think is that MG does not update some
> value
> > >
> > > that
> > >
> > > > >>>>> actually needs to be recomputed.
> > > > >>>>>
> > > > >>>>> Any idea?
> > >
> > > > >>>>> The solution is not that different from:
> > > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutori
> > >
> > > > >>>>> als/
> > > > >>>>> ex29.c.html
> > > > >>>>>
> > > > >>>>> Best,
> > > > >>>>> Filippo
> > > > >>>
> > > > >>> <test1.cpp>
> > >
> > > --
> > > Filippo Leonardi
> > > SAM ETH Zürich
> > > (HG J 45) Rämistrasse 101, Zürich (CH)
>
> --
> Filippo Leonardi
> SAM ETH Zürich
> (HG J 45) Rämistrasse 101, Zürich (CH)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20141027/338e7a54/attachment.html>


More information about the petsc-users mailing list