[petsc-users] Debugging suggestions: GAMG
Barry Smith
bsmith at petsc.dev
Sat Jun 13 11:04:33 CDT 2020
The LAPACK routine ieeeck_ intentionally divides by zero to check whether the system can handle it without generating an exception. This has nothing to do with the particular matrix data passed to LAPACK.
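To illustrate the kind of probe described above, here is a minimal sketch in C. It is not LAPACK's actual code; the function names are hypothetical and the checks are only in the spirit of what such a routine tests. With floating point traps off, the divisions quietly produce infinities and the probe returns 1. With a divide-by-zero trap enabled, the first division would raise SIGFPE instead, which matches the symptom reported here.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper, not LAPACK's ieeeck_: returns 1 if dividing by zero
   quietly yields +/-infinity and inf-inf yields NaN, 0 otherwise. */
int probe_ieee_infinity(void)
{
    /* volatile keeps the compiler from folding the divisions at compile time */
    volatile double zero = 0.0;
    volatile double one  = 1.0;

    volatile double posinf = one / zero;   /* deliberate divide by zero */
    volatile double neginf = -one / zero;
    if (!(posinf > DBL_MAX))  return 0;    /* must be larger than any finite value */
    if (!(neginf < -DBL_MAX)) return 0;

    /* (+inf) + (-inf) must be NaN, and NaN compares unequal to itself */
    volatile double nan_value = posinf + neginf;
    if (nan_value == nan_value) return 0;

    return 1;
}

/* Usage: abort early if the environment does not behave as expected. */
void require_quiet_infinity(void)
{
    assert(probe_ieee_infinity() == 1);
}
```

The inputs are constants, so the result depends only on the floating point environment, never on the matrix being solved.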
In KSPComputeExtremeSingularValues_GMRES() we have the following code structure:
  ierr = PetscFPTrapPush(PETSC_FP_TRAP_OFF);CHKERRQ(ierr);
#if !defined(PETSC_USE_COMPLEX)
  PetscStackCallBLAS("LAPACKgesvd",LAPACKgesvd_("N","N",&bn,&bn,R,&bN,realpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,&lierr));
#else
  PetscStackCallBLAS("LAPACKgesvd",LAPACKgesvd_("N","N",&bn,&bn,R,&bN,realpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,realpart+N,&lierr));
#endif
  if (lierr) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Error in SVD Lapack routine %d",(int)lierr);
  ierr = PetscFPTrapPop();CHKERRQ(ierr);
So PETSc tries to turn off trapping of floating point exceptions before calling the LAPACK routines that eventually lead to the exception.
PetscErrorCode PetscFPTrapPush(PetscFPTrap trap)
{
  PetscErrorCode ierr;
  struct PetscFPTrapLink *link;
  PetscFunctionBegin;
  ierr = PetscNew(&link);CHKERRQ(ierr);
  link->trapmode = _trapmode;
  link->next = _trapstack;
  _trapstack = link;
  if (trap != _trapmode) {ierr = PetscSetFPTrap(trap);CHKERRQ(ierr);}
  PetscFunctionReturn(0);
}
PetscErrorCode PetscSetFPTrap(PetscFPTrap flag)
{
  char *out;
  PetscFunctionBegin;
  /* Clear accumulated exceptions. Used to suppress meaningless messages from f77 programs */
  (void) ieee_flags("clear","exception","all",&out);
  if (flag == PETSC_FP_TRAP_ON) {
    /*
      To trap more fp exceptions, including underflow, change the line below to
      if (ieee_handler("set","all",PetscDefaultFPTrap)) {
    */
    if (ieee_handler("set","common",PetscDefaultFPTrap)) (*PetscErrorPrintf)("Can't set floatingpoint handler\n");
  } else if (ieee_handler("clear","common",PetscDefaultFPTrap)) (*PetscErrorPrintf)("Can't clear floatingpoint handler\n");
  _trapmode = flag;
  PetscFunctionReturn(0);
}
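The push/pop bookkeeping above can be sketched in isolation. This is a simplified illustration, not PETSc's implementation: every name below (TrapMode, trap_push, trap_pop, set_trap, call_with_traps_off) is hypothetical, and set_trap only records the requested mode where a real version would call the platform's handler routine.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { TRAP_OFF = 0, TRAP_ON = 1 } TrapMode;  /* hypothetical names */

struct TrapLink {
    TrapMode         saved;  /* mode that was active when this entry was pushed */
    struct TrapLink *next;
};

static TrapMode         current_mode = TRAP_ON;
static struct TrapLink *trap_stack   = NULL;

/* Stand-in for the platform-specific call; here it only records the mode. */
static void set_trap(TrapMode mode) { current_mode = mode; }

TrapMode trap_current(void) { return current_mode; }

/* Save the active mode on the stack, then switch to the requested one. */
int trap_push(TrapMode mode)
{
    struct TrapLink *link = malloc(sizeof *link);
    if (!link) return -1;
    link->saved = current_mode;
    link->next  = trap_stack;
    trap_stack  = link;
    if (mode != current_mode) set_trap(mode);
    return 0;
}

/* Restore the mode saved by the matching push; -1 if nothing was pushed. */
int trap_pop(void)
{
    struct TrapLink *link = trap_stack;
    if (!link) return -1;
    if (link->saved != current_mode) set_trap(link->saved);
    trap_stack = link->next;
    free(link);
    return 0;
}

/* Usage: run fn with trapping off, then restore whatever mode was active. */
int call_with_traps_off(void (*fn)(void))
{
    TrapMode before = current_mode;
    if (trap_push(TRAP_OFF) != 0) return -1;
    fn();
    if (trap_pop() != 0) return -1;
    assert(current_mode == before);
    return 0;
}
```

Note that the stack only tracks the mode it was asked to set. If other code changes the real handler without going through push/pop, the recorded mode and the actual trapping state can disagree.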
So either the ieee_handler clear is not working on your system, or some other code sets the ieee_handler to trap divide by zero AFTER PETSc calls ieee_handler.
A git grep -i ieee_handler shows that the reference BLAS/LAPACK and OpenBLAS never seem to call the ieee_handler.
We need to know which LAPACK/BLAS you are using and how they were compiled.
Some Fortran compilers/linkers set nonstandard exception handlers, but since PETSc clears them I don't know how they could get set again.
You could try putting a breakpoint on ieee_handler in gdb to find all the places it gets called; maybe this will lead to the location of the cause.
Barry
> On Jun 13, 2020, at 1:30 AM, Sanjay Govindjee <s_g at berkeley.edu> wrote:
>
> I have a FEA problem that I am trying to solve with GAMG. The problem solves
> just fine with direct solvers (mumps, superlu) and iterative solvers (gmres, ml, hypre-boomer) etc.
>
> However with GAMG I am getting a divide by zero that I am having trouble tracking down. Below
> is the gdb stack trace and the source lines going up the stack.
>
> When I run in valgrind the problem runs fine (and gets the correct answer).
> Valgrind reports nothing of note (just lots of indirectly lost blocks related to PMP_INIT).
>
> I'm only running on one processor.
>
> Any suggestions on where to start to trace the problem?
>
> -sanjay
>
> #0 0x00007fb262dc5be1 in ieeeck_ () from /lib64/liblapack.so.3
> #1 0x00007fb262dc5332 in ilaenv_ () from /lib64/liblapack.so.3
> #2 0x00007fb262dbbcef in dlasq2_ () from /lib64/liblapack.so.3
> #3 0x00007fb262dbb78c in dlasq1_ () from /lib64/liblapack.so.3
> #4 0x00007fb262da1e2e in dbdsqr_ () from /lib64/liblapack.so.3
> #5 0x00007fb262960110 in dgesvd_ () from /lib64/liblapack.so.3
> #6 0x00007fb264e74b66 in KSPComputeExtremeSingularValues_GMRES (ksp=0x1816560, emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at /home/sg/petsc-3.13.2/src/ksp/ksp/impls/gmres/gmreig.c:32
> #7 0x00007fb264dfe69a in KSPComputeExtremeSingularValues (ksp=0x1816560, emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:64
> #8 0x00007fb264b44a1f in PCGAMGOptProlongator_AGG (pc=0x12f3d30, Amat=0x11a2630, a_P=0x7ffc5010ebe0) at /home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/agg.c:1145
> #9 0x00007fb264b248a1 in PCSetUp_GAMG (pc=0x12f3d30) at /home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/gamg.c:557
> #10 0x00007fb264d8535b in PCSetUp (pc=0x12f3d30) at /home/sg/petsc-3.13.2/src/ksp/pc/interface/precon.c:898
> #11 0x00007fb264e01a93 in KSPSetUp (ksp=0x128dd80) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:376
> #12 0x00007fb264e057af in KSPSolve_Private (ksp=0x128dd80, b=0x1259f30, x=0x125d910) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:633
> #13 0x00007fb264e086b9 in KSPSolve (ksp=0x128dd80, b=0x1259f30, x=0x125d910) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:853
> #14 0x00007fb264e46216 in kspsolve_ (ksp=0x832670 <__pfeapc_MOD_kspsol>, b=0x832698 <__pfeapc_MOD_rhs>, x=0x8326a0 <__pfeapc_MOD_sol>, __ierr=0x7ffc5010f358)
> at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/ftn-auto/itfuncf.c:266
> #15 0x000000000043298d in usolve (flags=..., b=...) at usolve.F:313
> #16 0x000000000044afba in psolve (stype=-3, b=..., fp=..., factor=.TRUE., solve=.TRUE., cfr=.FALSE., prnt=.TRUE.) at psolve.f:212
> #17 0x00000000006b7393 in pmacr1 (lct=..., ct=..., j=3, _lct=_lct at entry=15) at pmacr1.f:578
> #18 0x00000000005c247b in pmacr (initf=.FALSE.) at pmacr.f:578
> #19 0x000000000044ff20 in pcontr () at pcontr.f:1307
> #20 0x0000000000404d9b in feap () at feap86.f:162
> #21 main (argc=<optimized out>, argv=<optimized out>) at feap86.f:168
> #22 0x00007fb261aaef43 in __libc_start_main () from /lib64/libc.so.6
> #23 0x0000000000404dde in _start ()
>
> (gdb) list
> 1 <built-in>: No such file or directory.
> (gdb) up
> #1 0x00007fb262dc5332 in ilaenv_ () from /lib64/liblapack.so.3
> (gdb) up
> #2 0x00007fb262dbbcef in dlasq2_ () from /lib64/liblapack.so.3
> (gdb) up
> #3 0x00007fb262dbb78c in dlasq1_ () from /lib64/liblapack.so.3
> (gdb) up
> #4 0x00007fb262da1e2e in dbdsqr_ () from /lib64/liblapack.so.3
> (gdb) up
> #5 0x00007fb262960110 in dgesvd_ () from /lib64/liblapack.so.3
> (gdb) up
> #6 0x00007fb264e74b66 in KSPComputeExtremeSingularValues_GMRES (ksp=0x1816560, emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at /home/sg/petsc-3.13.2/src/ksp/ksp/impls/gmres/gmreig.c:32
> 32 PetscStackCallBLAS("LAPACKgesvd",LAPACKgesvd_("N","N",&bn,&bn,R,&bN,realpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,&lierr));
> (gdb) up
> #7 0x00007fb264dfe69a in KSPComputeExtremeSingularValues (ksp=0x1816560, emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:64
> 64 ierr = (*ksp->ops->computeextremesingularvalues)(ksp,emax,emin);CHKERRQ(ierr);
> (gdb) up
> #8 0x00007fb264b44a1f in PCGAMGOptProlongator_AGG (pc=0x12f3d30, Amat=0x11a2630, a_P=0x7ffc5010ebe0) at /home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/agg.c:1145
> 1145 ierr = KSPComputeExtremeSingularValues(eksp, &emax, &emin);CHKERRQ(ierr);
> (gdb) up
> #9 0x00007fb264b248a1 in PCSetUp_GAMG (pc=0x12f3d30) at /home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/gamg.c:557
> 557 ierr = pc_gamg->ops->optprolongator(pc, Aarr[level], &Prol11);CHKERRQ(ierr);
> (gdb) up
> #10 0x00007fb264d8535b in PCSetUp (pc=0x12f3d30) at /home/sg/petsc-3.13.2/src/ksp/pc/interface/precon.c:898
> 898 ierr = (*pc->ops->setup)(pc);CHKERRQ(ierr);
> (gdb) up
> #11 0x00007fb264e01a93 in KSPSetUp (ksp=0x128dd80) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:376
> 376 ierr = PCSetUp(ksp->pc);CHKERRQ(ierr);
> (gdb) up
> #12 0x00007fb264e057af in KSPSolve_Private (ksp=0x128dd80, b=0x1259f30, x=0x125d910) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:633
> 633 ierr = KSPSetUp(ksp);CHKERRQ(ierr);
> (gdb) up
> #13 0x00007fb264e086b9 in KSPSolve (ksp=0x128dd80, b=0x1259f30, x=0x125d910) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:853
> 853 ierr = KSPSolve_Private(ksp,b,x);CHKERRQ(ierr);
> (gdb) up
> #14 0x00007fb264e46216 in kspsolve_ (ksp=0x832670 <__pfeapc_MOD_kspsol>, b=0x832698 <__pfeapc_MOD_rhs>, x=0x8326a0 <__pfeapc_MOD_sol>, __ierr=0x7ffc5010f358)
> at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/ftn-auto/itfuncf.c:266
> 266 *__ierr = KSPSolve(
> (gdb) up
> #15 0x000000000043298d in usolve (flags=..., b=...) at usolve.F:313
> 313 call KSPSolve (kspsol, rhs, sol, ierr)
>
>