[petsc-users] divide by zero coming in hypre
Barry Smith
bsmith at mcs.anl.gov
Tue Apr 28 15:11:59 CDT 2015
Trying again, hyper-support rejected my first email do to something in the subject line.
I am forwarding this on to the hypre support email list; hopefully they =
can have some advice on how to proceed.=20
hypre folks, this is with version hypre-2.10.0b (the previous version ha=
d the same behavior). We are getting a divide by zero in gselim() line 4363=
(this happens in a time dependent problem after many successful solves wit=
h time dependent matrix). I looked at the code, below, and note that there=
are some checks for the diagonal of A not being zero but not for the line =
that causes the divide by zero; is this perhaps an oversight in the hypre c=
ode? Any advice appreciated.
Thanks
Barry
HYPRE_Int gselim(A,x,n)
HYPRE_Real *A;
HYPRE_Real *x;
HYPRE_Int n;
{
HYPRE_Int err_flag =3D 0;
HYPRE_Int j,k,m;
HYPRE_Real factor;
=20
if (n=3D=3D1) /* A is 1x1 */ =20
{
if (A[0] !=3D 0.0)
{
x[0] =3D x[0]/A[0];
return(err_flag);
}
else
{
err_flag =3D 1;
return(err_flag);
}
}
else /* A is nxn. Forward elimination */=
=20
{
for (k =3D 0; k < n-1; k++)
{
if (A[k*n+k] !=3D 0.0)
{ =20
for (j =3D k+1; j < n; j++)
{
if (A[j*n+k] !=3D 0.0)
{
factor =3D A[j*n+k]/A[k*n+k];
for (m =3D k+1; m < n; m++)
{
A[j*n+m] -=3D factor * A[k*n+m];
}
/* Elimination step for rhs */=20
x[j] -=3D factor * x[k]; =20
}
}
}
}
/* Back Substitution */
for (k =3D n-1; k > 0; --k)
{
x[k] /=3D A[k*n+k];
for (j =3D 0; j < k; j++)
{
if (A[j*n+k] !=3D 0.0)
{
x[j] -=3D x[k] * A[j*n+k];
}
}
}
x[0] /=3D A[0];
return(err_flag);
}
}
> On Apr 28, 2015, at 12:55 PM, Danyang Su <danyang.su at gmail.com> wrote:
>=20
> Hi Barry,
>=20
> The development version of PETSc does not help to solve my problem. It st=
ill crashed due to the same error information.
>=20
> As Matthew mentioned, I checked the matrix and non of the diagonal entry =
is zero. But the diagonal values for different rows range from -1.0d-18 to =
1.0d33 and the matrix is not strict diagonal dominant. When using 1 process=
or, the matrix is also not strict diagonal dominant, but the diagonal value=
s have a much smaller range and the codes can run without error. Maybe ther=
e is something wrong in the matrix for this case and I need to check the ma=
trix first.
>=20
> Thanks,
>=20
> Danyang
>=20
> Program received signal SIGFPE, Arithmetic exception.
> 0x00007f668c1299d5 in gselim (A=3D0x2fe3330, x=3D0x2210680, n=3D9)
> at par_relax.c:4363
> 4363 x[k] /=3D A[k*n+k];
> (gdb)
>=20
> [1]PETSC ERROR: *** unknown floating point error occurred ***
> [1]PETSC ERROR: The specific exception can be determined by running in a =
debugger. When the
> [1]PETSC ERROR: debugger traps the signal, the exception can be found wit=
h fetestexcept(0x3d)
> [1]PETSC ERROR: where the result is a bitwise OR of the following flags:
> [1]PETSC ERROR: FE_INVALID=3D0x1 FE_DIVBYZERO=3D0x4 FE_OVERFLOW=3D0x8 FE_=
UNDERFLOW=3D0x10 FE_INEXACT=3D0x20
> [1]PETSC ERROR: Try option -start_in_debugger
> [1]PETSC ERROR: likely location of problem given in stack below
> [1]PETSC ERROR: --------------------- Stack Frames ---------------------=
---------------
> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not availab=
le,
> [1]PETSC ERROR: INSTEAD the line number of the start of the functio=
n
> [1]PETSC ERROR: is given.
> [1]PETSC ERROR: [1] PetscDefaultFPTrap line 379 /home/dsu/Soft/PETSc/pets=
c-dev/src/sys/error/fp.c
> [1]PETSC ERROR: [1] Hypre solve line 216 /home/dsu/Soft/PETSc/petsc-dev/s=
rc/ksp/pc/impls/hypre/hypre.c
> [1]PETSC ERROR: [1] PCApply_HYPRE line 203 /home/dsu/Soft/PETSc/petsc-dev=
/src/ksp/pc/impls/hypre/hypre.c
> [1]PETSC ERROR: [1] PCApply line 430 /home/dsu/Soft/PETSc/petsc-dev/src/k=
sp/pc/interface/precon.c
> [1]PETSC ERROR: [1] KSP_PCApply line 234 /home/dsu/Soft/PETSc/petsc-dev/i=
nclude/petsc/private/kspimpl.h
> [1]PETSC ERROR: [1] KSPInitialResidual line 44 /home/dsu/Soft/PETSc/petsc=
-dev/src/ksp/ksp/interface/itres.c
> [1]PETSC ERROR: [1] KSPSolve_GMRES line 224 /home/dsu/Soft/PETSc/petsc-de=
v/src/ksp/ksp/impls/gmres/gmres.c
> [1]PETSC ERROR: [1] KSPSolve line 506 /home/dsu/Soft/PETSc/petsc-dev/src/=
ksp/ksp/interface/itfunc.c
> [1]PETSC ERROR: User provided function() line 0 in Unknown file trapped f=
loating point error
>=20
>=20
>=20
> On 15-04-27 11:47 AM, Barry Smith wrote:
>> Danyang,
>>=20
>> I'm glad you were finally able to get to the problematic point; sorr=
y it took so long (I want to strangle whoever turned on catching of underfl=
ows in PETSc).
>>=20
>> The hypre folks have done a great deal of work on their relaxation c=
ode since the PETSc 3.5.3 release. There is a good chance they have fixed t=
he divide by zero problem you are hitting here. You will need to upgrade to=
the development version of PETSc (that uses the latest version of hypre), =
here are the instructions on how to obtain it http://www.mcs.anl.gov/petsc/=
developers/index.html
>>=20
>> Please let us know if this resolves the problem with hypre failing.
>>=20
>> Barry
>>=20
>>=20
>>=20
>>> On Apr 27, 2015, at 11:44 AM, Danyang Su <danyang.su at gmail.com> wrote:
>>>=20
>>> Hi Barry,
>>>=20
>>> I got the following arithemetic exception after the previous bug is fix=
ed.
>>>=20
>>> Loaded symbols for /lib/x86_64-linux-gnu/libnss_files.so.2
>>> 0x00007f3b23b98f20 in __nanosleep_nocancel ()
>>> at ../sysdeps/unix/syscall-template.S:81
>>> 81 ../sysdeps/unix/syscall-template.S: No such file or directory.
>>> (gdb) cont
>>> Continuing.
>>>=20
>>> Program received signal SIGFPE, Arithmetic exception.
>>> 0x00007f3b260df449 in gselim (A=3D0x1e4c580, x=3D0x1d30c40, n=3D9)
>>> at par_relax.c:3442
>>> 3442 x[k] /=3D A[k*n+k];
>>> (gdb)
>>>=20
>>> I tried both PETSc 3.5.2 and 3.5.3 and they return the same error as sh=
own above. For 3.5.3, i edited fp.c file and then configure and make.
>>>=20
>>> Thanks,
>>>=20
More information about the petsc-users
mailing list