[petsc-users] Multiple solves with PCMG fail

Barry Smith bsmith at mcs.anl.gov
Thu Oct 23 07:32:56 CDT 2014


> On Oct 23, 2014, at 1:32 AM, Filippo Leonardi <filippo.leonardi at sam.math.ethz.ch> wrote:
> 
> Tested on this branch: the program no longer crashes, but the solution is 
> actually still wrong.

   Yes, we never said that this fixed the problem.

    Jed ?


> 
> To reproduce, you can compile one version with the for loop starting from i = 0 
> and one starting from i = 1, and see the difference with:
> mpirun -np 4 ./test1 -pc_type mg -pc_mg_levels 5 -draw_pause -1
> and without mg.
> 
> On Wednesday 22 October 2014 09:31:40 Mark Adams wrote:
>> I think we have a fix for this.  You can test by using the branch.  In
>> PETSC_DIR:
>> 
>> $ git checkout mark/ksp-zero-eig
>> $ make
>> 
>> Now remake your code and test.
>> 
>> 
>> On Wed, Oct 22, 2014 at 8:36 AM, Filippo Leonardi
>> <filippo.leonardi at sam.math.ethz.ch> wrote:
>>> Granted, I solved my problem by **not** solving the system when the RHS is
>>> zero (a rough sketch of that guard is further down, before the detailed
>>> output). Here are some results:
>>> 
>>> 1) Using Multigrid with debug crashes (see below)
>>> 2) Using MG without debug doesn't crash, but gives wrong solution
>>> 3) defaults or gamg or whatever works both with and without debug
>>> 
>>> Now weirder:
>>> 4) changing line 145 from
>>> 
>>>  for(int i = 1; i < N; ++i) {
>>> 
>>> to
>>> 
>>>  for(int i = 0; i < N; ++i) {
>>> 
>>> i.e. avoiding broken case 1) above (rhs never 0) makes 1), 2) and 3) work
>>> perfectly (and any combination thereof)
>>> 
>>> (You can look at the differences in the pictures given by Draw to see that
>>> the solution is more or less wrong.)
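>>> 
>>> For reference, the guard I mentioned at the top is just a check on the norm
>>> of the RHS before the solve; a rough sketch (mx, N, rank, ctx and ksp as in
>>> the test program; the exact details may differ from what I actually run) is:
>>> 
>>>  PetscReal nrm;
>>>  for (int i = 0; i < N; ++i) {
>>>    ierr = VecSet(ctx.rhs, (PetscScalar)i * mx); CHKERRQ(ierr);
>>>    ierr = VecScale(ctx.rhs, rank); CHKERRQ(ierr);
>>>    ierr = VecNorm(ctx.rhs, NORM_2, &nrm); CHKERRQ(ierr);
>>>    /* skip the solve entirely when the right hand side is identically zero */
>>>    if (nrm > 0.0) {
>>>      ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
>>>    }
>>>  }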
>>> 
>>> Details:
>>> 
>>> Release build with no debug:
>>> 
>>> mpirun -np 4 ./test1 -pc_type mg -pc_mg_levels 5 -draw_pause -1 -ksp_monitor -ksp_monitor_true_residual
>>> 
>>>  0 KSP Residual norm 0.000000000000e+00
>>>  0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b||           -nan
>>>  0 KSP Residual norm 3.964678029172e+02
>>>  0 KSP preconditioned resid norm 3.964678029172e+02 true resid norm 1.118033988750e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>  1 KSP Residual norm 6.290716828453e-13
>>>  1 KSP preconditioned resid norm 6.290716828453e-13 true resid norm 3.788861141557e+00 ||r(i)||/||b|| 3.388860427931e+00
>>>  0 KSP Residual norm 7.929356058344e+02
>>>  0 KSP preconditioned resid norm 7.929356058344e+02 true resid norm 2.236067977500e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>  1 KSP Residual norm 1.258143365691e-12
>>>  1 KSP preconditioned resid norm 1.258143365691e-12 true resid norm 7.577722283114e+00 ||r(i)||/||b|| 3.388860427931e+00
>>> 
>>> i.e. I don't see the residual being zero each time.
>>> 
>>> Debug build:
>>> 
>>> mpirun -np 4 ./test1 -pc_type mg -pc_mg_levels 5 -draw_pause -1 -ksp_monitor -ksp_monitor_true_residual
>>> [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>> [0]PETSC ERROR: Null argument, when expecting valid pointer
>>> [0]PETSC ERROR: Null Pointer: Parameter # 3
>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>>> [0]PETSC ERROR: Petsc Release Version 3.5.0, Jun, 30, 2014
>>> [0]PETSC ERROR: ./test1 on a arch-linux2-cxx-debug named Besikovitch-II by filippo Wed Oct 22 14:20:46 2014
>>> [0]PETSC ERROR: Configure options --with-clanguage=cxx --download-scalapack --download-mumps --download-parmetis --download-metis --download-fftw
>>> [0]PETSC ERROR: #1 KSPComputeEigenvalues() line 119 in /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/interface/itfunc.c
>>> [0]PETSC ERROR: #2 KSPChebyshevComputeExtremeEigenvalues_Private() line 328 in /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/impls/cheby/cheby.c
>>> [0]PETSC ERROR: #3 KSPSolve_Chebyshev() line 373 in /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/impls/cheby/cheby.c
>>> [0]PETSC ERROR: #4 KSPSolve() line 459 in /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/interface/itfunc.c
>>> [0]PETSC ERROR: #5 PCMGMCycle_Private() line 19 in /home/filippo/Workspace/petsc-3.5.0/src/ksp/pc/impls/mg/mg.c
>>> 
>>> Debug build again with defaults:
>>> 
>>> mpirun -np 4 ./test1 -draw_pause -1 -ksp_monitor -ksp_monitor_true_residual
>>> 
>>>  0 KSP Residual norm 0.000000000000e+00
>>>  0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b||           -nan
>>>  0 KSP Residual norm 1.839181443725e+00
>>>  0 KSP preconditioned resid norm 1.839181443725e+00 true resid norm 1.118033988750e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>  1 KSP Residual norm 1.767908128952e+00
>>>  1 KSP preconditioned resid norm 1.767908128952e+00 true resid norm 1.182712814433e+00 ||r(i)||/||b|| 1.057850500373e+00
>>> [...]
>>> 82 KSP Residual norm 1.642117972681e-05
>>> 82 KSP preconditioned resid norm 1.642117972681e-05 true resid norm 1.556660502826e-05 ||r(i)||/||b|| 1.392319480883e-05
>>> [...]
>>>  0 KSP Residual norm 3.678362887450e+00
>>>  0 KSP preconditioned resid norm 3.678362887450e+00 true resid norm 2.236067977500e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>  1 KSP Residual norm 3.535816257904e+00
>>>  1 KSP preconditioned resid norm 3.535816257904e+00 true resid norm 2.365425628867e+00 ||r(i)||/||b|| 1.057850500373e+00
>>> 
>>> On Wednesday 22 October 2014 07:12:50 you wrote:
>>>>> On Oct 22, 2014, at 1:09 AM, Leonardi Filippo
>>>>> <filippo.leonardi at sam.math.ethz.ch> wrote:
>>>>> 
>>>>> For 1): I must be missing something then: in main I set and scale ctx.rhs,
>>>>> then I call KSPSolve, which calls ComputeRHS, which further scales ctx.rhs
>>>>> and copies it to b, which is then used by the solve. To my understanding I
>>>>> never overwrite ctx.rhs.
>>>>> 
>>>>  Oh yes. I missed that.
>>>> 
>>>>  What I observed was that when I ran your code, each solve had an initial
>>>> residual norm of exactly zero. I’m not sure why that happens.
>>>> 
>>>>> For 2): that was a small typo; the problem is present no matter what
>>>>> number of levels I use.
>>>>> 
>>>>> Thanks,
>>>>> Filippo
>>>>> 
>>>>>> On 22-ott-2014, at 04:27, "Barry Smith" <bsmith at mcs.anl.gov> wrote:
>>>>>> 
>>>>>> 
>>>>>> There are a few issues with your code
>>>>>> 
>>>>>> 1) If you use KSPSetComputeRHS() then the function you provide is called
>>>>>> automatically immediately before every solve. Thus your code with the
>>>>>> vector operations here
>>>>>> 
>>>>>>  ierr = VecSet(ctx.rhs, (float) i * mx); CHKERRQ(ierr);
>>>>>>  ierr = VecScale(ctx.rhs,  rank); CHKERRQ(ierr);
>>>>>>  ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
>>>>>> 
>>>>>> does nothing, and each linear solve ends up having a zero right-hand
>>>>>> side.
>>>>>> 
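>>>>>> If you want a different right hand side for each solve, the place to
>>>>>> build it is inside the callback itself. A minimal sketch (the context
>>>>>> struct and the scaling here are only illustrative, not your actual code)
>>>>>> might look like:
>>>>>> 
>>>>>>  PetscErrorCode ComputeRHS(KSP ksp, Vec b, void *void_ctx)
>>>>>>  {
>>>>>>    AppCtx        *ctx = (AppCtx*)void_ctx;  /* hypothetical context struct */
>>>>>>    PetscErrorCode ierr;
>>>>>> 
>>>>>>    PetscFunctionBeginUser;
>>>>>>    /* copy the user-maintained RHS into the vector the KSP will actually use */
>>>>>>    ierr = VecCopy(ctx->rhs, b); CHKERRQ(ierr);
>>>>>>    ierr = VecScale(b, ctx->scale); CHKERRQ(ierr);
>>>>>>    PetscFunctionReturn(0);
>>>>>>  }
>>>>>> 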
>>>>>> 2) -pc_type mg doesn’t give enough information for the multigrid to know
>>>>>> how many levels to use. In fact it is always using one, hence it is not
>>>>>> multigrid. You need to use -pc_mg_levels 2 or 3 or whatever to get it to
>>>>>> use more than one level.
>>>>>> 
>>>>>> Barry
>>>>>> 
>>>>>>> On Oct 21, 2014, at 1:50 AM, Filippo Leonardi
>>>>>>> <filippo.leonardi at sam.math.ethz.ch> wrote:
>>>>>>> 
>>>>>>> Here,
>>>>>>> 
>>>>>>> run this e.g. with
>>>>>>> mpirun -np 4 ./test1 -draw_pause -1 -pc_type mg -pc_mg_galerkin
>>>>>>> 
>>>>>>> In Debug mode it crashes, in standard mode it fails. Using current
>>>>>>> PETSc.
>>>>>>> 
>>>>>>> Run without mg and it works!
>>>>>>> 
>>>>>>> Hope it's not me being stupid and doing something terribly wrong!
>>>>>>> 
>>>>>>> Best,
>>>>>>> Filippo
>>>>>>> 
>>>>>>>> On Monday 20 October 2014 07:29:04 Barry Smith wrote:
>>>>>>>> Can you send us the code: to petsc-maint at mcs.anl.gov or
>>>>>>>> petsc-users at mcs.anl.gov ? Or something that reproduces the problem?
>>>>>>>> 
>>>>>>>> Barry
>>>>>>>> 
>>>>>>>>> On Oct 20, 2014, at 3:31 AM, Filippo Leonardi
>>>>>>>>> <filippo.leonardi at sam.math.ethz.ch> wrote:
>>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> I have a very specific problem with PCMG and multiple solves that I
>>>>>>>>> cannot figure out. I have a linear system that I solve many times, with
>>>>>>>>> the same matrix but a different RHS each time. I can successfully solve
>>>>>>>>> the system with standard techniques, such as the default solver or LU
>>>>>>>>> or PCGAMG. Even MG works if I destroy the KSP each time or recompute
>>>>>>>>> the matrices at each step. But when I use MG without recomputing the
>>>>>>>>> matrices each time, the solver fails. Any idea?
>>>>>>>>> 
>>>>>>>>> Here are some details on the setup:
>>>>>>>>> ierr = KSPSetDMActive(ksp, PETSC_FALSE); CHKERRQ(ierr);
>>>>>>>>> ierr = KSPSetDM(ksp, da); CHKERRQ(ierr);
>>>>>>>>> ierr = KSPSetComputeOperators(ksp, ComputeMatrix, ctx); CHKERRQ(ierr);
>>>>>>>>> ierr = KSPSetComputeRHS(ksp, ComputeRHS, ctx); CHKERRQ(ierr);
>>>>>>>>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);
>>>>>>>>> 
>>>>>>>>> Then I solve as usual, for a large number of time steps:
>>>>>>>>> ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
>>>>>>>>> ierr = KSPGetSolution(ksp, &phi);
>>>>>>>>> 
>>>>>>>>> The solver converges and does that in a reasonable number of
>>>>>>>>> iterations:
>>>>>>>>> Linear solve converged due to CONVERGED_RTOL iterations 7
>>>>>>>>> And -ksp_view and -ksp_monitor do not show anything weird.
>>>>>>>>> 
>>>>>>>>> - Weirdly enough, using any solver (for instance cg+bjacobi or gamg)
>>>>>>>>> everything works (so the matrix and RHS are fine).
>>>>>>>>> - But the problem persists with Galerkin matrices (-pc_mg_galerkin)
>>>>>>>>> (so it is not a ComputeMatrix problem).
>>>>>>>>> - If I do:
>>>>>>>>> ierr = KSPSetComputeOperators(ksp, ComputeMatrix, this); CHKERRQ(ierr);
>>>>>>>>> between each solve, or destroy the KSP entirely each time, the solution
>>>>>>>>> is also perfect (so it is not a boundary scaling or similar problem);
>>>>>>>>> see the sketch after this list.
>>>>>>>>> - If I run MG with only 2 levels (so just coarse) I also get a fine
>>>>>>>>> result.
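>>>>>>>>> 
>>>>>>>>> The workaround mentioned above is just re-registering the operator
>>>>>>>>> callback before every solve, which makes the matrices get rebuilt; a
>>>>>>>>> rough sketch of the loop (nsteps is only a placeholder, ctx and phi as
>>>>>>>>> in my code) is:
>>>>>>>>> 
>>>>>>>>>  for (int step = 0; step < nsteps; ++step) {
>>>>>>>>>    /* re-registering the callback makes the KSP rebuild the operators
>>>>>>>>>       on the next solve (this is the workaround, not a fix) */
>>>>>>>>>    ierr = KSPSetComputeOperators(ksp, ComputeMatrix, ctx); CHKERRQ(ierr);
>>>>>>>>>    ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
>>>>>>>>>    ierr = KSPGetSolution(ksp, &phi); CHKERRQ(ierr);
>>>>>>>>>  }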
>>>>>>>>> 
>>>>>>>>> Setup RHS is called each time as expected, setup matrix is called just
>>>>>>>>> once, also as expected.
>>>>>>>>> 
>>>>>>>>> The only thing I can think of is that MG does not update some value
>>>>>>>>> that actually needs to be recomputed.
>>>>>>>>> 
>>>>>>>>> Any idea?
>>>>>>>>> 
>>>>>>>>> The solution is not that different from:
>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex29.c.html
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Filippo
>>>>>>> 
>>>>>>> <test1.cpp>
>>> 
>>> --
>>> Filippo Leonardi
>>> SAM ETH Zürich
>>> (HG J 45) Rämistrasse 101, Zürich (CH)
> 
> -- 
> Filippo Leonardi
> SAM ETH Zürich
> (HG J 45) Rämistrasse 101, Zürich (CH)<ETHZ.vcf>


