<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body>
Machine details: <br>
Fedora Core 30: 5.6.13-100.fc30.x86_64<br>
gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2)<br>
GNU Fortran (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2)<br>
<br>
LAPACK/BLAS: whatever came with the machine, installed in
/usr/lib64. I did not compile them myself.<br>
<br>
I'll try two things: (1) rebuild with a different BLAS/LAPACK and (2)
set a breakpoint in ieee_handler() to see when and where it is getting
called.<br>
<br>
Also, just for completeness, here are the rest of the error messages
from the run:<br>
<br>
<blockquote>Thread 1 "feap" received signal SIGFPE, Arithmetic
exception.<br>
0x00007f0fe77e5be1 in ieeeck_ () from /lib64/liblapack.so.3<br>
<br>
<br>
[0]PETSC ERROR:
------------------------------------------------------------------------<br>
[0]PETSC ERROR: Caught signal number 8 FPE: Floating Point
Exception,probably divide by zero<br>
[0]PETSC ERROR: Try option -start_in_debugger or
-on_error_attach_debugger<br>
[0]PETSC ERROR: or see
<a class="moz-txt-link-freetext" href="https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind">https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind</a><br>
[0]PETSC ERROR: or try <a class="moz-txt-link-freetext" href="http://valgrind.org">http://valgrind.org</a> on GNU/linux and Apple
Mac OS X to find memory corruption errors<br>
[0]PETSC ERROR: likely location of problem given in stack below<br>
[0]PETSC ERROR: --------------------- Stack Frames
------------------------------------<br>
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
available,<br>
[0]PETSC ERROR: INSTEAD the line number of the start of the
function<br>
[0]PETSC ERROR: is given.<br>
[0]PETSC ERROR: [0] LAPACKgesvd line 32
/home/sg/petsc-3.13.2/src/ksp/ksp/impls/gmres/gmreig.c<br>
[0]PETSC ERROR: [0] KSPComputeExtremeSingularValues_GMRES line 14
/home/sg/petsc-3.13.2/src/ksp/ksp/impls/gmres/gmreig.c<br>
[0]PETSC ERROR: [0] KSPComputeExtremeSingularValues line 57
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c<br>
[0]PETSC ERROR: [0] PCGAMGOptProlongator_AGG line 1107
/home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/agg.c<br>
[0]PETSC ERROR: User provided function() line 0 in unknown file
(null)<br>
<br>
Program received signal SIGABRT: Process abort signal.<br>
<br>
</blockquote>
<div class="moz-cite-prefix">On 6/13/20 9:04 AM, Barry Smith wrote:<br>
</div>
<blockquote type="cite"
cite="mid:24AC8444-94DD-4885-8E04-9107DD7DB673@petsc.dev">
<div class=""><br class="">
</div>
The LAPACK routine ieeeck_ intentionally does a divide by zero
to check if the system can handle it without generating an
exception. It doesn't have anything to do
<div class="">with the particular matrix data passed to LAPACK. <br
class="">
<div class=""><br class="">
</div>
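      <div class="">As an illustration of the kind of test ieeeck_ performs, here is a minimal C sketch. It is not the LAPACK source: the function name ieee_check_sketch and the particular set of checks are assumptions chosen for illustration. With trapping off, every check passes quietly; with divide-by-zero trapping on, the first division would raise SIGFPE regardless of the matrix passed to dgesvd_.</div>
      <div class=""><br class="">
      </div>
<pre class="">
```c
/* Sketch only: mimics the spirit of an IEEE sanity check such as
   LAPACK's ieeeck, not its exact source. Names are hypothetical. */

/* Returns 1 if infinity and NaN arithmetic behave as IEEE 754
   specifies, 0 otherwise. The divisions by zero are deliberate. */
int ieee_check_sketch(void)
{
    /* volatile keeps the compiler from folding the divisions away */
    volatile double zero = 0.0, one = 1.0;

    double posinf = one / zero;             /* deliberate: +inf */
    if (!(posinf > one)) return 0;

    double neginf = -one / zero;            /* deliberate: -inf */
    if (!(neginf < zero)) return 0;

    double negzero = one / (neginf + one);  /* 1 / -inf gives -0.0 */
    if (negzero != zero) return 0;          /* -0.0 compares equal to 0.0 */

    double nan1 = posinf + neginf;          /* inf + (-inf) gives NaN */
    if (nan1 == nan1) return 0;             /* a NaN never equals itself */

    return 1;
}
```
</pre>
      <div class="">In a default floating point environment (no traps enabled) ieee_check_sketch() returns 1, because the divisions simply produce infinities.</div>
      <div class=""><br class="">
      </div>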
<div class=""> In KSPComputeExtremeSingularValues_GMRES() we
have the code structure</div>
<div class=""><br class="">
</div>
<div class="">
<div class=""> ierr =
PetscFPTrapPush(PETSC_FP_TRAP_OFF);CHKERRQ(ierr);</div>
<div class="">#if !defined(PETSC_USE_COMPLEX)</div>
<div class="">
PetscStackCallBLAS("LAPACKgesvd",LAPACKgesvd_("N","N",&bn,&bn,R,&bN,realpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,&lierr));</div>
<div class="">#else</div>
<div class="">
PetscStackCallBLAS("LAPACKgesvd",LAPACKgesvd_("N","N",&bn,&bn,R,&bN,realpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,realpart+N,&lierr));</div>
<div class="">#endif</div>
<div class=""> if (lierr)
SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Error in SVD Lapack
routine %d",(int)lierr);</div>
<div class=""> ierr = PetscFPTrapPop();CHKERRQ(ierr);</div>
<div class=""><br class="">
</div>
<div class=""> So PETSc tries to turn off trapping of
floating point exceptions before calling the LAPACK routines
that eventually lead to the exception. </div>
<div class=""><br class="">
</div>
<div class="">
<div class="">PetscErrorCode PetscFPTrapPush(PetscFPTrap
trap)</div>
<div class="">{</div>
<div class=""> PetscErrorCode ierr;</div>
<div class=""> struct PetscFPTrapLink *link;</div>
<div class=""><br class="">
</div>
<div class=""> PetscFunctionBegin;</div>
<div class=""> ierr =
PetscNew(&link);CHKERRQ(ierr);</div>
<div class=""> link->trapmode = _trapmode;</div>
<div class=""> link->next = _trapstack;</div>
<div class=""> _trapstack = link;</div>
<div class=""> if (trap != _trapmode) {ierr =
PetscSetFPTrap(trap);CHKERRQ(ierr);}</div>
<div class=""> PetscFunctionReturn(0);</div>
<div class="">}</div>
</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">PetscErrorCode PetscSetFPTrap(PetscFPTrap
flag)</div>
<div class="">{</div>
<div class=""> char *out;</div>
<div class=""><br class="">
</div>
<div class=""> PetscFunctionBegin;</div>
<div class=""> /* Clear accumulated exceptions. Used to
suppress meaningless messages from f77 programs */</div>
<div class=""> (void)
ieee_flags("clear","exception","all",&out);</div>
<div class=""> if (flag == PETSC_FP_TRAP_ON) {</div>
<div class=""> /*</div>
<div class=""> To trap more fp exceptions, including
underflow, change the line below to</div>
<div class=""> if
(ieee_handler("set","all",PetscDefaultFPTrap)) {</div>
<div class=""> */</div>
<div class=""> if
(ieee_handler("set","common",PetscDefaultFPTrap))
(*PetscErrorPrintf)("Can't set floatingpoint handler\n");</div>
<div class=""> } else if
(ieee_handler("clear","common",PetscDefaultFPTrap))
(*PetscErrorPrintf)("Can't clear floatingpoint
handler\n");</div>
<div class=""><br class="">
</div>
<div class=""> _trapmode = flag;</div>
<div class=""> PetscFunctionReturn(0);</div>
<div class="">}</div>
</div>
<div class=""><br class="">
</div>
        <div class="">  So either the ieee_handler clear is not
          working on your system, or some other code, AFTER PETSc
          calls ieee_handler, sets the ieee_handler to trap divide by
          zero. </div>
<div class=""><br class="">
</div>
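        <div class="">A related possibility follows from the check if (trap != _trapmode) in PetscFPTrapPush above: if some other code enables trapping without going through PetscSetFPTrap, _trapmode still reads PETSC_FP_TRAP_OFF and the push never calls ieee_handler at all. The sketch below simulates only that bookkeeping. It never touches the real FPU, and every name in it (hw_trap, trap_push, trap_pop, external_enable_trap) is hypothetical; trap_pop is an assumed counterpart to the push, which is the only half shown above. Whether this is what happens in this particular run is not established here.</div>
        <div class=""><br class="">
        </div>
<pre class="">
```c
/* Sketch only: simulates the trap-mode bookkeeping, not the real FPU. */

enum trap_mode { TRAP_OFF = 0, TRAP_ON = 1 };   /* hypothetical names */

static int hw_trap = TRAP_OFF;      /* stands in for the real FPU trap state */
static int cached_mode = TRAP_OFF;  /* plays the role of _trapmode */
static int saved[16];               /* plays the role of _trapstack */
static int depth = 0;

/* Analogue of PetscSetFPTrap: updates the "hardware" and the cache. */
static void set_trap(int mode) { hw_trap = mode; cached_mode = mode; }

/* Analogue of PetscFPTrapPush: touches the hardware only when the
   requested mode differs from the cached one. Returns 0 on success. */
int trap_push(int mode)
{
    if (depth >= 16) return 1;
    saved[depth++] = cached_mode;
    if (mode != cached_mode) set_trap(mode);
    return 0;
}

/* Assumed analogue of PetscFPTrapPop: restores the saved mode. */
int trap_pop(void)
{
    if (depth <= 0) return 1;
    int mode = saved[--depth];
    if (mode != cached_mode) set_trap(mode);
    return 0;
}

/* Hypothetical outside code (for example a language runtime) that
   enables trapping directly, so the cache is never updated. */
void external_enable_trap(void) { hw_trap = TRAP_ON; }

/* Returns the simulated hardware state seen inside a guarded call. */
int trap_state_during_guarded_call(void)
{
    trap_push(TRAP_OFF);
    int seen = hw_trap;   /* the LAPACK call would run here */
    trap_pop();
    return seen;
}
```
</pre>
        <div class="">Before external_enable_trap() is called, trap_state_during_guarded_call() reports TRAP_OFF. Afterwards it reports TRAP_ON, because the push sees a cached TRAP_OFF and skips the clear. This design choice (comparing against a cached mode) is cheap but relies on nobody else changing the FPU state.</div>
        <div class=""><br class="">
        </div>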
        <div class="">  A <span style="font-family: Menlo; font-size:
            14px;" class="">git grep -i ieee_handler</span> shows that the
            reference BLAS/LAPACK and OpenBLAS never seem to call
            ieee_handler. </div>
<div class=""><br class="">
</div>
        <div class="">  We need to know which LAPACK/BLAS you are using
          and how they were compiled.</div>
<div class=""><br class="">
</div>
        <div class="">  Some Fortran compilers/linkers set nonstandard
          exception handlers, but since PETSc clears them, I don't know
          how they could get set again. </div>
<div class=""><br class="">
</div>
        <div class="">  You could try setting a breakpoint in gdb on
          ieee_handler to find all the places it gets called; maybe
          this will lead to the location of the cause.</div>
<div class=""><br class="">
</div>
<div class=""> Barry</div>
<div class=""><br class="">
</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Jun 13, 2020, at 1:30 AM, Sanjay
Govindjee <<a href="mailto:s_g@berkeley.edu" class=""
moz-do-not-send="true">s_g@berkeley.edu</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
              <div class=""> I have an FEA problem that I am trying to
                solve with GAMG. The problem solves<br class="">
                just fine with direct solvers (mumps, superlu) and
                iterative solvers (gmres, ml, hypre-boomer, etc.).<br
                  class="">
<br class="">
However with GAMG I am getting a divide by zero that I
am having trouble tracking down. Below<br class="">
is the gdb stack trace and the source lines going up
the stack. <br class="">
<br class="">
When I run in valgrind the problem runs fine (and gets
the correct answer).<br class="">
Valgrind reports nothing of note (just lots of
indirectly lost blocks related to PMP_INIT).<br
class="">
<br class="">
I'm only running on one processor.<br class="">
<br class="">
Any suggestions on where to start to trace the
problem?<br class="">
<br class="">
-sanjay<br class="">
<br class="">
<blockquote class="">#0 0x00007fb262dc5be1 in ieeeck_
() from /lib64/liblapack.so.3<br class="">
#1 0x00007fb262dc5332 in ilaenv_ () from
/lib64/liblapack.so.3<br class="">
#2 0x00007fb262dbbcef in dlasq2_ () from
/lib64/liblapack.so.3<br class="">
#3 0x00007fb262dbb78c in dlasq1_ () from
/lib64/liblapack.so.3<br class="">
#4 0x00007fb262da1e2e in dbdsqr_ () from
/lib64/liblapack.so.3<br class="">
#5 0x00007fb262960110 in dgesvd_ () from
/lib64/liblapack.so.3<br class="">
#6 0x00007fb264e74b66 in
KSPComputeExtremeSingularValues_GMRES
(ksp=0x1816560, emax=0x7ffc5010e7c8,
emin=0x7ffc5010e7d0) at
/home/sg/petsc-3.13.2/src/ksp/ksp/impls/gmres/gmreig.c:32<br
class="">
#7 0x00007fb264dfe69a in
KSPComputeExtremeSingularValues (ksp=0x1816560,
emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:64<br
class="">
#8 0x00007fb264b44a1f in PCGAMGOptProlongator_AGG
(pc=0x12f3d30, Amat=0x11a2630, a_P=0x7ffc5010ebe0)
at
/home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/agg.c:1145<br
class="">
#9 0x00007fb264b248a1 in PCSetUp_GAMG
(pc=0x12f3d30) at
/home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/gamg.c:557<br
class="">
#10 0x00007fb264d8535b in PCSetUp (pc=0x12f3d30) at
/home/sg/petsc-3.13.2/src/ksp/pc/interface/precon.c:898<br class="">
#11 0x00007fb264e01a93 in KSPSetUp (ksp=0x128dd80)
at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:376<br
class="">
#12 0x00007fb264e057af in KSPSolve_Private
(ksp=0x128dd80, b=0x1259f30, x=0x125d910) at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:633<br
class="">
#13 0x00007fb264e086b9 in KSPSolve (ksp=0x128dd80,
b=0x1259f30, x=0x125d910) at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:853<br
class="">
#14 0x00007fb264e46216 in kspsolve_ (ksp=0x832670
<__pfeapc_MOD_kspsol>, b=0x832698
<__pfeapc_MOD_rhs>, x=0x8326a0
<__pfeapc_MOD_sol>, __ierr=0x7ffc5010f358)<br
class="">
at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/ftn-auto/itfuncf.c:266<br
class="">
#15 0x000000000043298d in usolve (flags=..., b=...)
at usolve.F:313<br class="">
#16 0x000000000044afba in psolve (stype=-3, b=...,
fp=..., factor=.TRUE., solve=.TRUE., cfr=.FALSE.,
prnt=.TRUE.) at psolve.f:212<br class="">
#17 0x00000000006b7393 in pmacr1 (lct=..., ct=...,
j=3, _lct=_lct@entry=15) at pmacr1.f:578<br class="">
#18 0x00000000005c247b in pmacr (initf=.FALSE.) at
pmacr.f:578<br class="">
#19 0x000000000044ff20 in pcontr () at pcontr.f:1307<br
class="">
#20 0x0000000000404d9b in feap () at feap86.f:162<br
class="">
                  #21 main (argc=&lt;optimized out&gt;,
                  argv=&lt;optimized out&gt;) at feap86.f:168<br
                    class="">
#22 0x00007fb261aaef43 in __libc_start_main () from
/lib64/libc.so.6<br class="">
#23 0x0000000000404dde in _start ()<br class="">
<br class="">
(gdb) list<br class="">
                  1    &lt;built-in&gt;: No such file or directory.<br
                    class="">
(gdb) up<br class="">
#1 0x00007fb262dc5332 in ilaenv_ () from
/lib64/liblapack.so.3<br class="">
(gdb) up<br class="">
#2 0x00007fb262dbbcef in dlasq2_ () from
/lib64/liblapack.so.3<br class="">
(gdb) up<br class="">
#3 0x00007fb262dbb78c in dlasq1_ () from
/lib64/liblapack.so.3<br class="">
(gdb) up<br class="">
#4 0x00007fb262da1e2e in dbdsqr_ () from
/lib64/liblapack.so.3<br class="">
(gdb) up<br class="">
#5 0x00007fb262960110 in dgesvd_ () from
/lib64/liblapack.so.3<br class="">
(gdb) up<br class="">
#6 0x00007fb264e74b66 in
KSPComputeExtremeSingularValues_GMRES
(ksp=0x1816560, emax=0x7ffc5010e7c8,
emin=0x7ffc5010e7d0) at
/home/sg/petsc-3.13.2/src/ksp/ksp/impls/gmres/gmreig.c:32<br
class="">
32
PetscStackCallBLAS("LAPACKgesvd",LAPACKgesvd_("N","N",&bn,&bn,R,&bN,realpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,&lierr));<br
class="">
(gdb) up<br class="">
#7 0x00007fb264dfe69a in
KSPComputeExtremeSingularValues (ksp=0x1816560,
emax=0x7ffc5010e7c8, emin=0x7ffc5010e7d0) at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:64<br
class="">
64 ierr =
(*ksp->ops->computeextremesingularvalues)(ksp,emax,emin);CHKERRQ(ierr);<br
class="">
(gdb) up<br class="">
#8 0x00007fb264b44a1f in PCGAMGOptProlongator_AGG
(pc=0x12f3d30, Amat=0x11a2630, a_P=0x7ffc5010ebe0)
at
/home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/agg.c:1145<br
class="">
1145 ierr =
KSPComputeExtremeSingularValues(eksp, &emax,
&emin);CHKERRQ(ierr);<br class="">
(gdb) up<br class="">
#9 0x00007fb264b248a1 in PCSetUp_GAMG
(pc=0x12f3d30) at
/home/sg/petsc-3.13.2/src/ksp/pc/impls/gamg/gamg.c:557<br
class="">
557 ierr =
pc_gamg->ops->optprolongator(pc, Aarr[level],
&Prol11);CHKERRQ(ierr);<br class="">
(gdb) up<br class="">
#10 0x00007fb264d8535b in PCSetUp (pc=0x12f3d30) at
/home/sg/petsc-3.13.2/src/ksp/pc/interface/precon.c:898<br class="">
898 ierr =
(*pc->ops->setup)(pc);CHKERRQ(ierr);<br
class="">
(gdb) up<br class="">
#11 0x00007fb264e01a93 in KSPSetUp (ksp=0x128dd80)
at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:376<br
class="">
376 ierr = PCSetUp(ksp->pc);CHKERRQ(ierr);<br
class="">
(gdb) up<br class="">
#12 0x00007fb264e057af in KSPSolve_Private
(ksp=0x128dd80, b=0x1259f30, x=0x125d910) at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:633<br
class="">
633 ierr = KSPSetUp(ksp);CHKERRQ(ierr);<br
class="">
(gdb) up<br class="">
#13 0x00007fb264e086b9 in KSPSolve (ksp=0x128dd80,
b=0x1259f30, x=0x125d910) at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/itfunc.c:853<br
class="">
853 ierr =
KSPSolve_Private(ksp,b,x);CHKERRQ(ierr);<br class="">
(gdb) up<br class="">
#14 0x00007fb264e46216 in kspsolve_ (ksp=0x832670
<__pfeapc_MOD_kspsol>, b=0x832698
<__pfeapc_MOD_rhs>, x=0x8326a0
<__pfeapc_MOD_sol>, __ierr=0x7ffc5010f358)<br
class="">
at
/home/sg/petsc-3.13.2/src/ksp/ksp/interface/ftn-auto/itfuncf.c:266<br
class="">
266 *__ierr = KSPSolve(<br class="">
(gdb) up<br class="">
#15 0x000000000043298d in usolve (flags=..., b=...)
at usolve.F:313<br class="">
313 call KSPSolve (kspsol,
rhs, sol, ierr)<br class="">
</blockquote>
<br class="">
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
</blockquote>
<br>
</body>
</html>