[petsc-dev] [petsc-users] Multiple solves with PCMG fail

Barry Smith bsmith at mcs.anl.gov
Tue Oct 21 21:32:24 CDT 2014


  Jed

    There are some additional issues when the GMRES runs for zero iterations in computing eigenvalues for Cheby.

static PetscErrorCode KSPChebyshevComputeExtremeEigenvalues_Private(KSP kspest,PetscReal *emin,PetscReal *emax)
{
  PetscErrorCode ierr;
  PetscInt       n,neig;
  PetscReal      *re,*im,min,max;

  PetscFunctionBegin;
  ierr = KSPGetIterationNumber(kspest,&n);CHKERRQ(ierr);
  ierr = PetscMalloc2(n,&re,n,&im);CHKERRQ(ierr);
  ierr = KSPComputeEigenvalues(kspest,n,re,im,&neig);CHKERRQ(ierr);
  min  = PETSC_MAX_REAL;
  max  = PETSC_MIN_REAL;
  for (n=0; n<neig; n++) {
    min = PetscMin(min,re[n]);
    max = PetscMax(max,re[n]);
  }
  ierr  = PetscFree2(re,im);CHKERRQ(ierr);
  *emax = max;
  *emin = min;

* thread #1: tid = 0x3c16cf, 0x00000001041b6777 libpetsc.3.5.dylib`KSPSolve_Chebyshev(ksp=0x00007fb85197ac60) + 1495 at cheby.c:378, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x00000001041b6777 libpetsc.3.5.dylib`KSPSolve_Chebyshev(ksp=0x00007fb85197ac60) + 1495 at cheby.c:378
   375 	    cheb->emin = cheb->tform[0]*min + cheb->tform[1]*max;
   376 	    cheb->emax = cheb->tform[2]*min + cheb->tform[3]*max;
   377 	
-> 378 	    cheb->estimate_current = PETSC_TRUE;
   379 	  }
   380 	
   381 	  ksp->its = 0;
(lldb) p cheb->emin
(PetscReal) $11 = -1.7976931348623158E+307
(lldb) p cheb->emax
(PetscReal) $12 = -Inf

This puts giberish into the eigenvalue estimates. 

I am not sure how you want to handle this? 

The zero right hand side zero, initial guess case is a great corner case for testing complicated solvers :-) I’d forgotten about it for many years

  Barry


> On Oct 21, 2014, at 6:47 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
> 
>  Mark,
> 
>    Thanks for figuring it out. 
> 
>    It is legitimate to malloc 0 size memory in PETSc and to have arrays of zero length. So 
> 
> 1) the asserts
> PetscValidScalarPointer(r,3);
>  PetscValidScalarPointer(c,4);
> 
>  are over zealous. They should actually be if (n) PetscValidScalarPointer(r,3);
> 
>  BTW: we should have an additional assert that n >=0 
> 
> 2) that said I do not know if  all the later code will function correctly in this case. I see some bad stuff
> 
> PetscErrorCode KSPComputeEigenvalues_CG(KSP ksp,PetscInt nmax,PetscReal *r,PetscReal *c,PetscInt *neig)
> {
>  KSP_CG         *cgP = (KSP_CG*)ksp->data;
>  PetscScalar    *d,*e;
>  PetscReal      *ee;
>  PetscErrorCode ierr;
>  PetscInt       j,n = ksp->its;
> 
>  PetscFunctionBegin;
>  if (nmax < n) SETERRQ(PetscObjectComm((PetscObject)ksp),PETSC_ERR_ARG_SIZ,"Not enough room in work space r and c for eigenvalues");
>  *neig = n;
> 
>  ierr = PetscMemzero(c,nmax*sizeof(PetscReal));CHKERRQ(ierr);
>  if (!n) {
>    *r = 0.0;
>   ^^^^ assumes there is space in r for something even if the array dimension is zero which is not good.
> 
>  This likely requires some more work to insure the code can run through properly in this case. We don’t want to simply short circuit the process.
> 
>  Barry
> 
> 
> 
>> On Oct 21, 2014, at 5:53 PM, Mark Adams <mfadams at lbl.gov> wrote:
>> 
>> I am getting this in the output:
>> 
>> [0] VecNormalize(): Vector of zero norm can not be normalized; Returning only the zero norm
>> [0] KSPGMRESCycle(): Converged due to zero residual norm on entry
>> [0]PETSC ERROR: KSPComputeEigenvalues() line 119 in /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c Null Pointer: Parameter # 3
>> [0]PETSC ERROR: PETSC: Attaching /usr/bin/gdb to ./test1 of pid 50632 on display /tmp/launch-ikNsM1/org.macosforge.xquartz:0 on machine Macintosh-6.local
>> 
>> and in gdb I see this (appended).  The problem is that KSPComputeEigenvalues is called with NULL outputs.  This because KSPChebyshevComputeExtremeEigenvalues_Private get the number of iterations and it is zero, it mallocs zero, and this is an error.
>> 
>> Should there be a check like:
>> 
>> 326       ierr = KSPGetIterationNumber(kspest,&n);CHKERRQ(ierr);
>> if (!n) bail out
>> 327       ierr = PetscMalloc2(n,&re,n,&im);CHKERRQ(ierr);
>> 
>> or is something going wrong upstream?
>> 
>> Mark
>> 
>> #5  0x000000010fb86057 in PetscAttachDebuggerErrorHandler (comm=1140850689, line=119, fun=0x1115b0a25 "KSPComputeEigenvalues", file=0x1115b09b9 "/Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c", num=85, p=PETSC_ERROR_INITIAL, mess=0x7fff50102290 "Null Pointer: Parameter # 3", ctx=0x0) at /Users/markadams/Codes/petsc/src/sys/error/adebug.c:462
>> #6  0x000000010fb88d30 in PetscError (comm=1140850689, line=119, func=0x1115b0a25 "KSPComputeEigenvalues", file=0x1115b09b9 "/Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c", n=85, p=PETSC_ERROR_INITIAL, mess=0x1115518ca "Null Pointer: Parameter # %d") at /Users/markadams/Codes/petsc/src/sys/error/err.c:378
>> #7  0x0000000110bd3464 in KSPComputeEigenvalues (ksp=0x7fe92d813e60, n=0, r=0x0, c=0x0, neig=0x7fff50102c14) at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:119
>> #8  0x0000000110b6a915 in KSPChebyshevComputeExtremeEigenvalues_Private (kspest=0x7fe92d813e60, emin=0x7fff50102ca0, emax=0x7fff50102ca8) at /Users/markadams/Codes/petsc/src/ksp/ksp/impls/cheby/cheby.c:328
>> #9  0x0000000110b69043 in KSPSolve_Chebyshev (ksp=0x7fe92c869a60) at /Users/markadams/Codes/petsc/src/ksp/ksp/impls/cheby/cheby.c:373
>> #10 0x0000000110bd81ce in KSPSolve (ksp=0x7fe92c869a60, b=0x7fe92d002860, x=0x7fe92d007c60) at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:460
>> #11 0x0000000110a8e4b8 in PCMGMCycle_Private (pc=0x7fe92c862e60, mglevelsin=0x7fe92c868a60, reason=0x0) at /Users/markadams/Codes/petsc/src/ksp/pc/impls/mg/mg.c:19
>> #12 0x0000000110a9e1d5 in PCApply_MG (pc=0x7fe92c862e60, b=0x7fe92d002860, x=0x7fe92d007c60) at /Users/markadams/Codes/petsc/src/ksp/pc/impls/mg/mg.c:337
>> #13 0x0000000110ad6f2c in PCApply (pc=0x7fe92c862e60, x=0x7fe92d002860, y=0x7fe92d007c60) at /Users/markadams/Codes/petsc/src/ksp/pc/interface/precon.c:440
>> #14 0x0000000110affbd2 in KSP_PCApply (ksp=0x7fe92c833a60, x=0x7fe92d002860, y=0x7fe92d007c60) at kspimpl.h:230
>> #15 0x0000000110afcd31 in KSPSolve_CG (ksp=0x7fe92c833a60) at /Users/markadams/Codes/petsc/src/ksp/ksp/impls/cg/cg.c:139
>> #16 0x0000000110bd81ce in KSPSolve (ksp=0x7fe92c833a60, b=0x0, x=0x0) at /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c:460
>> #17 0x000000010fafea7d in main ()
>> (gdb) frame 8
>> #8  0x0000000110b6a915 in KSPChebyshevComputeExtremeEigenvalues_Private (kspest=0x7fe92d813e60, emin=0x7fff50102ca0, emax=0x7fff50102ca8) at /Users/markadams/Codes/petsc/src/ksp/ksp/impls/cheby/cheby.c:328
>> 328       ierr = KSPComputeEigenvalues(kspest,n,re,im,&neig);CHKERRQ(ierr);
>> Current language:  auto; currently minimal
>> (gdb) p re
>> $1 = (PetscReal *) 0x0
>> (gdb) list
>> 323       PetscReal      *re,*im,min,max;
>> 324     
>> 325       PetscFunctionBegin;
>> 326       ierr = KSPGetIterationNumber(kspest,&n);CHKERRQ(ierr);
>> 327       ierr = PetscMalloc2(n,&re,n,&im);CHKERRQ(ierr);
>> 328       ierr = KSPComputeEigenvalues(kspest,n,re,im,&neig);CHKERRQ(ierr);
>> 329       min  = PETSC_MAX_REAL;
>> 330       max  = PETSC_MIN_REAL;
>> 331       for (n=0; n<neig; n++) {
>> 332         min = PetscMin(min,re[n]);
>> 
>> 
>> 
>> On Tue, Oct 21, 2014 at 6:29 PM, Mark Adams <mfadams at lbl.gov> wrote:
>> Humm,  GAMG works fine form (like you said), but with MG (with or without Galerkin) on 1 or 4 procs I get:
>> 
>> [0]PETSC ERROR: KSPComputeEigenvalues() line 119 in /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c Null Pointer: Parameter # 3
>> 
>> Can you tell me exactly what you get with:
>> 
>> ./test1 -pc_type mg
>> 
>> I get:
>> 
>> [0]PETSC ERROR: KSPComputeEigenvalues() line 119 in /Users/markadams/Codes/petsc/src/ksp/ksp/interface/itfunc.c Null Pointer: Parameter # 3
>> [0]PETSC ERROR: PETSC: Attaching /usr/bin/gdb to ./test1 of pid 50433 on display /tmp/launch-ikNsM1/org.macosforge.xquartz:0 on machine Macintosh-6.local
>> 
>> And, it seems that the RHS is all zeros on one processor.
>> 
>> I am using a debug build.  Try to do the same.
>> 
>> 
>> 
>> 
>> On Tue, Oct 21, 2014 at 2:50 AM, Filippo Leonardi <filippo.leonardi at sam.math.ethz.ch> wrote:
>> Here,
>> 
>> run this e.g. with
>> mpirun -np 4 ./test1 -draw_pause -1 -pc_type mg -pc_mg_galerkin
>> 
>> In Debug mode it crashes, in standard mode it fails. Using Petsc current.
>> 
>> Run without mg and works!
>> 
>> Hope it's not me being stupid and doing something terribly wrong!
>> 
>> Best,
>> Filippo
>> 
>> On Monday 20 October 2014 07:29:04 Barry Smith wrote:
>>>  Can you send us the code: to petsc-maint at mcs.anl.gov or
>>> petsc-users at mcs.anl.gov ? Or something that reproduces the problem?
>>> 
>>>  Barry
>>> 
>>>> On Oct 20, 2014, at 3:31 AM, Filippo Leonardi
>>>> <filippo.leonardi at sam.math.ethz.ch> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I have a very specific problem that I cannot figure out with PCMG and
>>>> multiple solves. I got a linear system that I solve many times, with same
>>>> matrix but different RHS. I can successfully solve the system with
>>>> standard techniques, such as default solver or LU or PCGAMG. Even MG
>>>> works if I destroy the ksp each time or I recompute the matrices at each
>>>> time. But when I try and go to MG and not recomputing the matrices each
>>>> time the solver fails. Any idea?
>>>> 
>>>> Here some detail: the setup:
>>>> ierr = KSPSetDMActive(ksp, PETSC_FALSE); CHKERRQ(ierr);
>>>> ierr = KSPSetDM(ksp,  da); CHKERRQ(ierr);
>>>> ierr = KSPSetComputeOperators(ksp, ComputeMatrix, ctx); CHKERRQ(ierr);
>>>> ierr = KSPSetComputeRHS(ksp, ComputeRHS, ctx); CHKERRQ(ierr);
>>>> ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);
>>>> 
>>>> Then I solve as usual, for a large number of time steps:
>>>> ierr = KSPSolve(ksp, NULL, NULL); CHKERRQ(ierr);
>>>> ierr = KSPGetSolution(ksp, &phi);
>>>> 
>>>> The solver converges and does that in a reasonable number of iterations:
>>>> Linear solve converged due to CONVERGED_RTOL iterations 7
>>>> And ksp_view and ksp_monitor do not show any weird stuff.
>>>> 
>>>> - Weirdly enough using any solver (for instance cg+bjacobi or gamg)
>>>> everything works (So the matrix and RHS are working fine).
>>>> - But the problem persists with Galerkin matrices (-pc_mg_galerkin) (So is
>>>> not a ComputeMatrix problem).
>>>> - If I do:
>>>> ierr = KSPSetComputeOperators(ksp, ComputeMatrix, this); CHKERRQ(ierr);
>>>> between each solve or destroy the ksp entirely each time the solution is
>>>> also perfect (So is not a boundary scaling or other stuff problem).
>>>> - If I run MG with only 2 levels (so just coarse) I also get a fine
>>>> result.
>>>> 
>>>> Setup RHS is called each time as expected, setup matrix is called just
>>>> once, also as expected.
>>>> 
>>>> The only thing I can think is that MG does not update some value that
>>>> actually needs to be recomputed.
>>>> 
>>>> Any idea?
>>>> 
>>>> The solution is not that different from:
>>>> http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/
>>>> ex29.c.html
>>>> 
>>>> Best,
>>>> Filippo
>> 
>> 
> 




More information about the petsc-dev mailing list