[petsc-users] Debugging suggestions: GAMG

Barry Smith bsmith at petsc.dev
Sat Jun 13 11:04:33 CDT 2020


   The LAPACK routine ieeeck_ intentionally divides by zero to check whether the system can handle it without generating an exception. This has nothing to do
with the particular matrix data passed to LAPACK. 

    In KSPComputeExtremeSingularValues_GMRES() we have the code structure

  ierr = PetscFPTrapPush(PETSC_FP_TRAP_OFF);CHKERRQ(ierr);
#if !defined(PETSC_USE_COMPLEX)
  PetscStackCallBLAS("LAPACKgesvd",LAPACKgesvd_("N","N",&bn,&bn,R,&bN,realpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,&lierr));
#else
  PetscStackCallBLAS("LAPACKgesvd",LAPACKgesvd_("N","N",&bn,&bn,R,&bN,realpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,realpart+N,&lierr));
#endif
  if (lierr) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Error in SVD Lapack routine %d",(int)lierr);
  ierr = PetscFPTrapPop();CHKERRQ(ierr);

   So PETSc tries to turn off trapping of floating point exceptions before calling the LAPACK routines that eventually lead to the exception, and restores the previous trap mode afterwards with PetscFPTrapPop(). 

PetscErrorCode PetscFPTrapPush(PetscFPTrap trap)
{
  PetscErrorCode         ierr;
  struct PetscFPTrapLink *link;

  PetscFunctionBegin;
  ierr           = PetscNew(&link);CHKERRQ(ierr);
  link->trapmode = _trapmode;
  link->next     = _trapstack;
  _trapstack     = link;
  if (trap != _trapmode) {ierr = PetscSetFPTrap(trap);CHKERRQ(ierr);}
  PetscFunctionReturn(0);
}

PetscErrorCode PetscSetFPTrap(PetscFPTrap flag)
{
  char *out;

  PetscFunctionBegin;
  /* Clear accumulated exceptions.  Used to suppress meaningless messages from f77 programs */
  (void) ieee_flags("clear","exception","all",&out);
  if (flag == PETSC_FP_TRAP_ON) {
    /*
      To trap more fp exceptions, including underflow, change the line below to
      if (ieee_handler("set","all",PetscDefaultFPTrap)) {
    */
    if (ieee_handler("set","common",PetscDefaultFPTrap))        (*PetscErrorPrintf)("Can't set floatingpoint handler\n");
  } else if (ieee_handler("clear","common",PetscDefaultFPTrap)) (*PetscErrorPrintf)("Can't clear floatingpoint handler\n");

  _trapmode = flag;
  PetscFunctionReturn(0);
}

  So either the ieee_handler clear is not working on your system, or some other code sets the ieee_handler to trap divide by zero AFTER PETSc has called ieee_handler. 

  A git grep -i ieee_handler shows that neither the reference BLAS/LAPACK nor OpenBLAS ever seems to call ieee_handler. 

  We need to know which LAPACK/BLAS you are using and how they were compiled.

  Some Fortran compilers/linkers set nonstandard exception handlers, but since PETSc clears them I don't know how they could get set again.

  In gdb you could put a breakpoint on ieee_handler and find every place it gets called; this may lead to the location of the cause.

  Barry


> On Jun 13, 2020, at 1:30 AM, Sanjay Govindjee <s_g at berkeley.edu> wrote:
> 
> I have a FEA problem that I am trying to solve with GAMG.  The problem solves
> just fine with direct solvers (mumps, superlu) and iterative solvers (gmres, ml, hypre-boomer) etc.
> 
> However with GAMG I am getting a divide by zero that I am having trouble tracking down.  Below
> is the gdb stack trace and the source lines going up the stack.  
> 
> When I run in valgrind the problem runs fine (and gets the correct answer).
> Valgrind reports nothing of note (just lots of indirectly lost blocks  related to PMP_INIT).
> 
> I'm only running on one processor.
> 
> Any suggestions on where to start to trace the problem?
> 
> -sanjay
> 
> #0  0x00007fb262dc5be1 in ieeeck_ () from /lib64/liblapack.so.3
> #1  0x00007fb262dc5332 in ilaenv_ () from /lib64/liblapack.so.3
> #2  0x00007fb262dbbcef in dlasq2_ () from /lib64/liblapack.so.3
> #3  0x00007fb262dbb78c in dlasq1_ () from /lib64/liblapack.so.3
> #4  0x00007fb262da1e2e in dbdsqr_ () from /lib64/liblapack.so.3
> #5  0x00007fb262960110 in dgesvd_ () from /lib64/liblapack.so.3
> #6  0x00007fb264e74b66 in KSPComputeExtremeSingularValues_GMRES (ksp=0x1816560, emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at /home/sg/petsc-3.13.2/src/ksp/ksp/impls/gmres/gmreig.c:32
> #7  0x00007fb264dfe69a in KSPComputeExtremeSingularValues (ksp=0x1816560, emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:64
> #8  0x00007fb264b44a1f in PCGAMGOptProlongator_AGG (pc=0x12f3d30, Amat=0x11a2630, a_P=0x7ffc5010ebe0) at /home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/agg.c:1145
> #9  0x00007fb264b248a1 in PCSetUp_GAMG (pc=0x12f3d30) at /home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/gamg.c:557
> #10 0x00007fb264d8535b in PCSetUp (pc=0x12f3d30) at /home/sg/petsc-3.13.2/src/ksp/pc/interface/precon.c:898
> #11 0x00007fb264e01a93 in KSPSetUp (ksp=0x128dd80) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:376
> #12 0x00007fb264e057af in KSPSolve_Private (ksp=0x128dd80, b=0x1259f30, x=0x125d910) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:633
> #13 0x00007fb264e086b9 in KSPSolve (ksp=0x128dd80, b=0x1259f30, x=0x125d910) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:853
> #14 0x00007fb264e46216 in kspsolve_ (ksp=0x832670 <__pfeapc_MOD_kspsol>, b=0x832698 <__pfeapc_MOD_rhs>, x=0x8326a0 <__pfeapc_MOD_sol>, __ierr=0x7ffc5010f358)
>     at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/ftn-auto/itfuncf.c:266
> #15 0x000000000043298d in usolve (flags=..., b=...) at usolve.F:313
> #16 0x000000000044afba in psolve (stype=-3, b=..., fp=..., factor=.TRUE., solve=.TRUE., cfr=.FALSE., prnt=.TRUE.) at psolve.f:212
> #17 0x00000000006b7393 in pmacr1 (lct=..., ct=..., j=3, _lct=_lct at entry=15) at pmacr1.f:578
> #18 0x00000000005c247b in pmacr (initf=.FALSE.) at pmacr.f:578
> #19 0x000000000044ff20 in pcontr () at pcontr.f:1307
> #20 0x0000000000404d9b in feap () at feap86.f:162
> #21 main (argc=<optimized out>, argv=<optimized out>) at feap86.f:168
> #22 0x00007fb261aaef43 in __libc_start_main () from /lib64/libc.so.6
> #23 0x0000000000404dde in _start ()
> 
> (gdb) list
> 1       <built-in>: No such file or directory.
> (gdb) up
> #1  0x00007fb262dc5332 in ilaenv_ () from /lib64/liblapack.so.3
> (gdb) up
> #2  0x00007fb262dbbcef in dlasq2_ () from /lib64/liblapack.so.3
> (gdb) up
> #3  0x00007fb262dbb78c in dlasq1_ () from /lib64/liblapack.so.3
> (gdb) up
> #4  0x00007fb262da1e2e in dbdsqr_ () from /lib64/liblapack.so.3
> (gdb) up
> #5  0x00007fb262960110 in dgesvd_ () from /lib64/liblapack.so.3
> (gdb) up
> #6  0x00007fb264e74b66 in KSPComputeExtremeSingularValues_GMRES (ksp=0x1816560, emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at /home/sg/petsc-3.13.2/src/ksp/ksp/impls/gmres/gmreig.c:32
> 32        PetscStackCallBLAS("LAPACKgesvd",LAPACKgesvd_("N","N",&bn,&bn,R,&bN,realpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,&lierr));
> (gdb) up
> #7  0x00007fb264dfe69a in KSPComputeExtremeSingularValues (ksp=0x1816560, emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:64
> 64          ierr = (*ksp->ops->computeextremesingularvalues)(ksp,emax,emin);CHKERRQ(ierr);
> (gdb) up
> #8  0x00007fb264b44a1f in PCGAMGOptProlongator_AGG (pc=0x12f3d30, Amat=0x11a2630, a_P=0x7ffc5010ebe0) at /home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/agg.c:1145
> 1145          ierr = KSPComputeExtremeSingularValues(eksp, &emax, &emin);CHKERRQ(ierr);
> (gdb) up
> #9  0x00007fb264b248a1 in PCSetUp_GAMG (pc=0x12f3d30) at /home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/gamg.c:557
> 557               ierr = pc_gamg->ops->optprolongator(pc, Aarr[level], &Prol11);CHKERRQ(ierr);
> (gdb) up
> #10 0x00007fb264d8535b in PCSetUp (pc=0x12f3d30) at /home/sg/petsc-3.13.2/src/ksp/pc/interface/precon.c:898
> 898         ierr = (*pc->ops->setup)(pc);CHKERRQ(ierr);
> (gdb) up
> #11 0x00007fb264e01a93 in KSPSetUp (ksp=0x128dd80) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:376
> 376       ierr = PCSetUp(ksp->pc);CHKERRQ(ierr);
> (gdb) up
> #12 0x00007fb264e057af in KSPSolve_Private (ksp=0x128dd80, b=0x1259f30, x=0x125d910) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:633
> 633       ierr = KSPSetUp(ksp);CHKERRQ(ierr);
> (gdb) up
> #13 0x00007fb264e086b9 in KSPSolve (ksp=0x128dd80, b=0x1259f30, x=0x125d910) at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:853
> 853       ierr = KSPSolve_Private(ksp,b,x);CHKERRQ(ierr);
> (gdb) up
> #14 0x00007fb264e46216 in kspsolve_ (ksp=0x832670 <__pfeapc_MOD_kspsol>, b=0x832698 <__pfeapc_MOD_rhs>, x=0x8326a0 <__pfeapc_MOD_sol>, __ierr=0x7ffc5010f358)
>     at /home/sg/petsc-3.13.2/src/ksp/ksp/interface/ftn-auto/itfuncf.c:266
> 266     *__ierr = KSPSolve(
> (gdb) up
> #15 0x000000000043298d in usolve (flags=..., b=...) at usolve.F:313
> 313               call KSPSolve         (kspsol, rhs, sol, ierr)
> 
> 
