[petsc-users] How to debug PETSc error

Barry Smith bsmith at mcs.anl.gov
Mon Jun 30 00:53:45 CDT 2014


On Jun 30, 2014, at 12:00 AM, TAY wee-beng <zonexo at gmail.com> wrote:

> Hi,
> 
> I have a CFD code which gives an error when solving the momentum equation at time step 1109. With the optimized build, KSPGetConvergedReason returns a value < 0.

   What value < 0? It is possible there is no bug. Bi-CG-stab (though it is stabilized) is not always stable and can come to grief even if the matrix and right-hand side are “reasonable”. Or the preconditioner may be generating inappropriately huge values (for example, if ILU is being used inside it).

   Yes, don’t try to print the matrix or anything like that.

   I would start by trying KSPBCGSL (manual page below). It is designed to be more stable than Bi-CG-stab. Try it with the default options; you can also increase ell (-ksp_bcgsl_ell) if it fails.

   GMRES is always a good bet, but I am guessing you are not using it because it requires too much memory due to the restart length.

  Barry


KSPBCGSL - Implements a slight variant of the Enhanced
                BiCGStab(L) algorithm in (3) and (2).  The variation
                concerns cases when either kappa0**2 or kappa1**2 is
                negative due to round-off. Kappa0 has also been pulled
                out of the denominator in the formula for ghat.

    References:
      1. G.L.G. Sleijpen, H.A. van der Vorst, "An overview of
         approaches for the stable computation of hybrid BiCG
         methods", Applied Numerical Mathematics: Transactions
         of IMACS, 19(3), pp 235-54, 1996.
      2. G.L.G. Sleijpen, H.A. van der Vorst, D.R. Fokkema,
         "BiCGStab(L) and other hybrid Bi-CG methods",
          Numerical Algorithms, 7, pp 75-109, 1994.
      3. D.R. Fokkema, "Enhanced implementation of BiCGStab(L)
         for solving linear systems of equations", preprint
         from www.citeseer.com.

   Contributed by: Joel M. Malard, email jm.malard at pnl.gov

   Options Database Keys:
+  -ksp_bcgsl_ell <ell> - Number of Krylov search directions, defaults to 2 -- KSPBCGSLSetEll()
.  -ksp_bcgsl_cxpoly - Use a convex function of the MinRes and OR polynomials after the BiCG step instead of default MinRes -- KSPBCGSLSetPol()
.  -ksp_bcgsl_mrpoly - Use the default MinRes polynomial after the BiCG step  -- KSPBCGSLSetPol()
.  -ksp_bcgsl_xres <res> - Threshold used to decide when to refresh computed residuals -- KSPBCGSLSetXRes()
-  -ksp_bcgsl_pinv <true/false> - (de)activate use of pseudoinverse -- KSPBCGSLSetUsePseudoinverse()

   Level: beginner

.seealso:  KSPCreate(), KSPSetType(), KSPType (for list of available types), KSP, KSPFGMRES, KSPBCGS, KSPSetPCSide(), KSPBCGSLSetEll(), KSPBCGSLSetXRes()

> 
> I retried using a debug build and it gives the error below. I submitted the job to a job scheduler on 32 procs. So what is the best way to debug? Should I print out the matrix? It is very big, since the grid size is 13 million.
> 
> Thanks. Regards.
> 
> [n12-10:13681] 31 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
> [n12-10:13681] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [17]PETSC ERROR: ------------------------------------------------------------------------
> [17]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero
> [17]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [17]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [17]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [17]PETSC ERROR: likely location of problem given in stack below
> [17]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
> [17]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [17]PETSC ERROR:       INSTEAD the line number of the start of the function
> [17]PETSC ERROR:       is given.
> [17]PETSC ERROR: [17] VecNorm_MPI line 57 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/impls/mpi/pvec2.c
> [17]PETSC ERROR: [17] VecNorm line 224 /home/wtay/Codes/petsc-3.4.4/src/vec/vec/interface/rvector.c
> [17]PETSC ERROR: [17] KSPSolve_BCGS line 39 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/impls/bcgs/bcgs.c
> [17]PETSC ERROR: [17] KSPSolve line 356 /home/wtay/Codes/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c
> 
> -- 
> Thank you
> 
> Yours sincerely,
> 
> TAY wee-beng
> 


