[petsc-users] Multiple solves with PCMG fail
Barry Smith
bsmith at mcs.anl.gov
Thu Oct 23 07:32:56 CDT 2014
> On Oct 23, 2014, at 1:32 AM, Filippo Leonardi <filippo.leonardi at sam.math.ethz.ch> wrote:
>
> Tested on this branch: the program no longer crashes, but the solution is
> actually still wrong.
Yes, we never said that this fixed the problem.
Jed ?
>
> To reproduce you can compile a version with the for cycle starting from i = 0
> and one from i = 1 and see the difference with:
> mpirun -np 4 ./test1 -pc_type mg -pc_mg_levels 5 -draw_pause -1
> and without mg.
>
> On Wednesday 22 October 2014 09:31:40 Mark Adams wrote:
>> I think we have a fix for this. You can test by using the branch. In
>> PETSC_DIR:
>>
>> $ git checkout mark/ksp-zero-eig
>> $ make
>>
>> Now remake your code and test,
>>
>>
>> On Wed, Oct 22, 2014 at 8:36 AM, Filippo Leonardi <
>>
>> filippo.leonardi at sam.math.ethz.ch> wrote:
>>> Granted that I solved my problem by **not** solving the system if the Rhs
>>> is
>>> zero. Here's some result:
>>>
>>> 1) Using Multigrid with debug crashes (see below)
>>> 2) Using MG without debug doesn't crash, but gives wrong solution
>>> 3) defaults or gamg or wathever*works both with and without debug
>>>
>>> Now weirder:
>>> 4) changing line 145 from
>>>
>>> for(int i = 1; i < N; ++i) {
>>>
>>> to
>>>
>>> for(int i = 0; i < N; ++i) {
>>>
>>> i.e. avoiding broken case 1) above (rhs never 0) makes 1), 2) and 3) work
>>> perfectly (and any combination thereof)
>>>
>>> (You can look at the differences in the pictures given by Draw to see that
>>> the
>>> solution is more or less wrong.)
>>>
>>> Details:
>>>
>>> Release build with no debug:
>>>
>>> mpirun -np 4 ./test1 -pc_type mg -pc_mg_levels 5 -draw_pause -1
>>> -ksp_monitor -
>>> ksp_monitor_true_residual
>>>
>>> 0 KSP Residual norm 0.000000000000e+00
>>> 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm
>>>
>>> 0.000000000000e+00 ||r(i)||/||b|| -nan
>>>
>>> 0 KSP Residual norm 3.964678029172e+02
>>> 0 KSP preconditioned resid norm 3.964678029172e+02 true resid norm
>>>
>>> 1.118033988750e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>
>>> 1 KSP Residual norm 6.290716828453e-13
>>> 1 KSP preconditioned resid norm 6.290716828453e-13 true resid norm
>>>
>>> 3.788861141557e+00 ||r(i)||/||b|| 3.388860427931e+00
>>>
>>> 0 KSP Residual norm 7.929356058344e+02
>>> 0 KSP preconditioned resid norm 7.929356058344e+02 true resid norm
>>>
>>> 2.236067977500e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>
>>> 1 KSP Residual norm 1.258143365691e-12
>>> 1 KSP preconditioned resid norm 1.258143365691e-12 true resid norm
>>>
>>> 7.577722283114e+00 ||r(i)||/||b|| 3.388860427931e+00
>>>
>>> i.e. I don't see the residual to be zero each time.
>>>
>>> Debug build:
>>>
>>> mpirun -np 4 ./test1 -pc_type mg -pc_mg_levels 5 -draw_pause -1
>>> -ksp_monitor -
>>> ksp_monitor_true_residual
>>> [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error Message
>>> --------------------------------------------------------------
>>> [0]PETSC ERROR: Null argument, when expecting valid pointer
>>> [0]PETSC ERROR: Null Pointer: Parameter # 3
>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>>> for
>>> trouble shooting.
>>> [0]PETSC ERROR: Petsc Release Version 3.5.0, Jun, 30, 2014
>>> [0]PETSC ERROR: ./test1 on a arch-linux2-cxx-debug named Besikovitch-II by
>>> filippo Wed Oct 22 14:20:46 2014
>>> [0]PETSC ERROR: Configure options --with-clanguage=cxx
>>> --download-scalapack --
>>> download-mumps --download-parmetis --download-metis --download-fftw
>>> [0]PETSC ERROR: #1 KSPComputeEigenvalues() line 119 in
>>> /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/interface/itfunc.c
>>> [0]PETSC ERROR: #2 KSPChebyshevComputeExtremeEigenvalues_Private() line
>>> 328 in
>>> /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/impls/cheby/cheby.c
>>> [0]PETSC ERROR: #3 KSPSolve_Chebyshev() line 373 in
>>> /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/impls/cheby/cheby.c
>>> [0]PETSC ERROR: #4 KSPSolve() line 459 in
>>> /home/filippo/Workspace/petsc-3.5.0/src/ksp/ksp/interface/itfunc.c
>>> [0]PETSC ERROR: #5 PCMGMCycle_Private() line 19 in
>>> /home/filippo/Workspace/petsc-3.5.0/src/ksp/pc/impls/mg/mg.c
>>>
>>> Debug build again with defaults:
>>>
>>> mpirun -np 4 ./test1 -draw_pause -1 -ksp_monitor
>>> -ksp_monitor_true_residual
>>>
>>> 0 KSP Residual norm 0.000000000000e+00
>>> 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm
>>>
>>> 0.000000000000e+00 ||r(i)||/||b|| -nan
>>>
>>> 0 KSP Residual norm 1.839181443725e+00
>>> 0 KSP preconditioned resid norm 1.839181443725e+00 true resid norm
>>>
>>> 1.118033988750e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>
>>> 1 KSP Residual norm 1.767908128952e+00
>>> 1 KSP preconditioned resid norm 1.767908128952e+00 true resid norm
>>>
>>> 1.182712814433e+00 ||r(i)||/||b|| 1.057850500373e+00
>>> [...]
>>>
>>> 82 KSP Residual norm 1.642117972681e-05
>>> 82 KSP preconditioned resid norm 1.642117972681e-05 true resid norm
>>>
>>> 1.556660502826e-05 ||r(i)||/||b|| 1.392319480883e-05
>>> [...]
>>> 0 KSP Residual norm 3.678362887450e+00
>>>
>>> 0 KSP preconditioned resid norm 3.678362887450e+00 true resid norm
>>>
>>> 2.236067977500e+00 ||r(i)||/||b|| 1.000000000000e+00
>>>
>>> 1 KSP Residual norm 3.535816257904e+00
>>> 1 KSP preconditioned resid norm 3.535816257904e+00 true resid norm
>>>
>>> 2.365425628867e+00 ||r(i)||/||b|| 1.057850500373e+00
>>>
>>> On Wednesday 22 October 2014 07:12:50 you wrote:
>>>>> On Oct 22, 2014, at 1:09 AM, Leonardi Filippo
>>>>> <filippo.leonardi at sam.math.ethz.ch> wrote:
>>>>>
>>>>> For 1): I must be missing something then: in main i set and scale
>>>
>>> ctx.rhs,
>>>
>>>>> then I call KSPSolve that calls ComputeRhs that further scales ctx.rhs
>>>>> and copy it to b, which is then used by the solve. To my understanding
>>>
>>> I
>>>
>>>>> never override ctx.rhs.
>>>>>
>>>> Oh yes. I missed that.
>>>>
>>>> What I observed was when I ran your code each solve had an initial
>>>>
>>>> residual norm of exactly zero. I’m not sure why that happens.
>>>>
>>>>> For 2): that was a small typo, problem is present no matter what
>>>
>>> number of
>>>
>>>>> levels I use.
>>>>>
>>>>> Thanks,
>>>>> Filippo
>>>>>
>>>>>> On 22-ott-2014, at 04:27, "Barry Smith" <bsmith at mcs.anl.gov> wrote:
>>>>>>
>>>>>>
>>>>>> There are a few issues with your code
>>>>>>
>>>>>> 1) If you use KSPSetComputeRHS() then the function you provide is
>>>
>>> called
>>>
>>>>>> immediately before the solve automatically for every solve. Thus your
>>>>>> code with the vector operations here
>>>>>>
>>>>>> ierr = VecSet(ctx.rhs, (float) i * mx); CHKERRQ(ierr);
>>>>>>
>>>>>> ierr = VecScale(ctx.rhs, rank); CHKERRQ(ierr);
>>>>>> ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
>>>>>>
>>>>>> does nothing. And each linear solve ends up being with a zero right
>>>
>>> hand
>>>
>>>>>> side.
>>>>>>
>>>>>> 2) -pc_type mg doesn’t give enough information for the multigrid to
>>>
>>> know
>>>
>>>>>> how many levels to use. In fact it is always using one hence it is
>>>>>> not
>>>>>> multigrid. You need to use -p_mg_levels 2 or 3 or whatever to get it
>>>
>>> to
>>>
>>>>>> use more than one level.
>>>>>>
>>>>>> Barry
>>>>>>
>>>>>>> On Oct 21, 2014, at 1:50 AM, Filippo Leonardi
>>>>>>> <filippo.leonardi at sam.math.ethz.ch> wrote:
>>>>>>>
>>>>>>> Here,
>>>>>>>
>>>>>>> run this e.g. with
>>>>>>> mpirun -np 4 ./test1 -draw_pause -1 -pc_type mg -pc_mg_galerkin
>>>>>>>
>>>>>>> In Debug mode it crashes, in standard mode it fails. Using Petsc
>>>>>>> current.
>>>>>>>
>>>>>>> Run without mg and works!
>>>>>>>
>>>>>>> Hope it's not me being stupid and doing something terribly wrong!
>>>>>>>
>>>>>>> Best,
>>>>>>> Filippo
>>>>>>>
>>>>>>>> On Monday 20 October 2014 07:29:04 Barry Smith wrote:
>>>>>>>> Can you send us the code: to petsc-maint at mcs.anl.gov or
>>>>>>>> petsc-users at mcs.anl.gov ? Or something that reproduces the problem?
>>>>>>>>
>>>>>>>> Barry
>>>>>>>>
>>>>>>>>> On Oct 20, 2014, at 3:31 AM, Filippo Leonardi
>>>>>>>>> <filippo.leonardi at sam.math.ethz.ch> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have a very specific problem that I cannot figure out with PCMG
>>>
>>> and
>>>
>>>>>>>>> multiple solves. I got a linear system that I solve many times,
>>>
>>> with
>>>
>>>>>>>>> same
>>>>>>>>> matrix but different RHS. I can successfully solve the system with
>>>>>>>>> standard techniques, such as default solver or LU or PCGAMG. Even
>>>
>>> MG
>>>
>>>>>>>>> works if I destroy the ksp each time or I recompute the matrices
>>>>>>>>> at
>>>>>>>>> each
>>>>>>>>> time. But when I try and go to MG and not recomputing the matrices
>>>>>>>>> each
>>>>>>>>> time the solver fails. Any idea?
>>>>>>>>>
>>>>>>>>> Here some detail: the setup:
>>>>>>>>> ierr = KSPSetDMActive(ksp, PETSC_FALSE); CHKERRQ(ierr);
>>>>>>>>> ierr = KSPSetDM(ksp, da); CHKERRQ(ierr);
>>>>>>>>> ierr = KSPSetComputeOperators(ksp, ComputeMatrix, ctx);
>>>
>>> CHKERRQ(ierr);
>>>
>>>>>>>>> ierr = KSPSetComputeRHS(ksp, ComputeRHS, ctx); CHKERRQ(ierr);
>>>>>>>>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);
>>>>>>>>>
>>>>>>>>> Then I solve as usual, for a large number of time steps:
>>>>>>>>> ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
>>>>>>>>> ierr = KSPGetSolution(ksp, &phi);
>>>>>>>>>
>>>>>>>>> The solver converges and does that in a reasonable number of
>>>>>>>>> iterations:
>>>>>>>>> Linear solve converged due to CONVERGED_RTOL iterations 7
>>>>>>>>> And ksp_view and ksp_monitor do not show any weird stuff.
>>>>>>>>>
>>>>>>>>> - Weirdly enough using any solver (for instance cg+bjacobi or
>>>>>>>>> gamg)
>>>>>>>>> everything works (So the matrix and RHS are working fine).
>>>>>>>>> - But the problem persists with Galerkin matrices
>>>>>>>>> (-pc_mg_galerkin)
>>>>>>>>> (So is
>>>>>>>>> not a ComputeMatrix problem).
>>>>>>>>> - If I do:
>>>>>>>>> ierr = KSPSetComputeOperators(ksp, ComputeMatrix, this);
>>>>>>>>> CHKERRQ(ierr);
>>>>>>>>> between each solve or destroy the ksp entirely each time the
>>>
>>> solution
>>>
>>>>>>>>> is
>>>>>>>>> also perfect (So is not a boundary scaling or other stuff
>>>>>>>>> problem).
>>>>>>>>> - If I run MG with only 2 levels (so just coarse) I also get a
>>>>>>>>> fine
>>>>>>>>> result.
>>>>>>>>>
>>>>>>>>> Setup RHS is called each time as expected, setup matrix is called
>>>
>>> just
>>>
>>>>>>>>> once, also as expected.
>>>>>>>>>
>>>>>>>>> The only thing I can think is that MG does not update some value
>>>
>>> that
>>>
>>>>>>>>> actually needs to be recomputed.
>>>>>>>>>
>>>>>>>>> Any idea?
>>>
>>>>>>>>> The solution is not that different from:
>>> http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutori
>>>
>>>>>>>>> als/
>>>>>>>>> ex29.c.html
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Filippo
>>>>>>>
>>>>>>> <test1.cpp>
>>>
>>> --
>>> Filippo Leonardi
>>> SAM ETH Zürich
>>> (HG J 45) Rämistrasse 101, Zürich (CH)
>
> --
> Filippo Leonardi
> SAM ETH Zürich
> (HG J 45) Rämistrasse 101, Zürich (CH)<ETHZ.vcf>
More information about the petsc-users
mailing list