[petsc-users] The PETSC ERROR: VecMAXPY() when using GMRES

Song Gao song.gao2 at mail.mcgill.ca
Wed Mar 11 20:35:39 CDT 2015


Thanks for your suggestion. Yes, but unfortunately I'm not allowed to
upgrade it.

As for the preconditioner, I tried with -pc_type none and still get the
same error message.

Sorry, I wasn't clear. The code crashes
at PetscValidLogicalCollectiveScalar(y,alpha[i],3), and I have seen that on
both processors b2 = {-0.013169008988739605, 0.013169008988739673}.
Forgive me if this is a silly question: is it because we compare two double
precision numbers with the operator '!='? Looking at line 309, of
course (-b2[0] != b2[1]) is true in that case, right?

Thanks for your help.

I pasted PetscValidLogicalCollectiveScalar here:

303 #define PetscValidLogicalCollectiveScalar(a,b,c)                                              \
304   do {                                                                                        \
305     PetscErrorCode _7_ierr;                                                                   \
306     PetscReal b1[2],b2[2];                                                                    \
307     b1[0] = -PetscRealPart(b); b1[1] = PetscRealPart(b);                                      \
308     _7_ierr = MPI_Allreduce(b1,b2,2,MPIU_REAL,MPIU_MAX,((PetscObject)a)->comm);CHKERRQ(_7_ierr); \
309     if (-b2[0] != b2[1]) SETERRQ1(((PetscObject)a)->comm,PETSC_ERR_ARG_WRONG,"Scalar value must be same on all processes, argument # %d",c); \
310   } while (0)
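
To check my understanding, here is a small standalone sketch (plain C, no MPI;
the rank0_alpha/rank1_alpha names are only illustrative) of what the reduction
in that macro computes for the two alpha values I see in gdb:

#include <stdio.h>

int main(void)
{
  /* The two alpha values printed by gdb on the two ranks (names are made up). */
  double rank0_alpha = 0.013169008988739605;
  double rank1_alpha = 0.013169008988739673;

  /* Each rank contributes b1 = {-alpha, alpha}; MPI_Allreduce with MPIU_MAX
     over the two ranks therefore yields:                                      */
  double b2[2];
  b2[0] = (-rank0_alpha > -rank1_alpha) ? -rank0_alpha : -rank1_alpha; /* = -min(alpha) */
  b2[1] = ( rank0_alpha >  rank1_alpha) ?  rank0_alpha :  rank1_alpha; /* =  max(alpha) */

  /* -b2[0] is the minimum and b2[1] the maximum over the ranks, so they only
     compare equal when every rank passed bitwise-identical values.            */
  printf("-b2[0] = %.17g\n b2[1] = %.17g\n", -b2[0], b2[1]);
  printf("test at line 309 fires: %s\n", (-b2[0] != b2[1]) ? "yes" : "no");
  return 0;
}

On these two values the test really is true, independent of any NaN, because
the ranks hold alphas that differ in the last couple of digits.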


Regards,
Song

2015-03-11 20:16 GMT-04:00 Barry Smith <bsmith at mcs.anl.gov>:

>
>   In the IEEE floating point standard, NaN is not equal to itself. This is what
> triggers these types of non-intuitive error messages.
>
>   My guess is that the preconditioner is producing a NaN; for example, ILU
> sometimes produces a NaN due to very small pivots.
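
A standalone illustration of that IEEE rule (plain C, nothing PETSc-specific):
for a double x holding NaN, x == x is false and x != x is true, so an equality
check reports a mismatch even when every rank holds the same NaN.

#include <math.h>
#include <stdio.h>

int main(void)
{
  double x = NAN;                    /* quiet NaN from <math.h> (C99) */
  printf("x == x : %d\n", x == x);   /* prints 0 */
  printf("x != x : %d\n", x != x);   /* prints 1 */
  return 0;
}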
>
>   You are using an old version of PETSc; the first step I recommend is to
> upgrade (http://www.mcs.anl.gov/petsc/documentation/changes/index.html), since
> the more recent versions of PETSc have more checks for NaN etc. in the code.
>
>   Barry
>
>
>
> > On Mar 11, 2015, at 7:05 PM, Song Gao <song.gao2 at mail.mcgill.ca> wrote:
> >
> > Thank you. That's cool. Sorry, I'm not good at gdb.
> >
> > I did that. They are very nearly the same. One gave
> > (gdb) p *alpha
> > $1 = 0.013169008988739605
> >
> > and another gave
> > (gdb) p *alpha
> > $1 = 0.013169008988739673
> >
> > 2015-03-11 17:35 GMT-04:00 Matthew Knepley <knepley at gmail.com>:
> > On Wed, Mar 11, 2015 at 3:39 PM, Song Gao <song.gao2 at mail.mcgill.ca>
> wrote:
> > Thanks.
> >
> > I ran with two processes. When the code stops, I'm in raise()
> and alpha is not in the current context.
> >
> > Here you would use:
> >
> >   (gdb) up 4
> >
> >   (gdb) p *alpha
> >
> >    Matt
> >
> > (gdb)p alpha
> > No symbol "alpha" in current context.
> > (gdb)  bt
> > #0  0x0000003764432625 in raise () from /lib64/libc.so.6
> > #1  0x0000003764433e05 in abort () from /lib64/libc.so.6
> > #2  0x00000000015d02f5 in PetscAbortErrorHandler (comm=0x36e17a0,
> line=1186,
> >     fun=0x279cad4 "VecMAXPY", file=0x279c404 "rvector.c",
> >     dir=0x279c1c8 "src/vec/vec/interface/", n=62, p=PETSC_ERROR_INITIAL,
> >     mess=0x7fff33ffa4c0 "Scalar value must be same on all processes,
> argument # 3", ctx=0x0) at errabort.c:62
> > #3  0x000000000130cf44 in PetscError (comm=0x36e17a0, line=1186,
> >     func=0x279cad4 "VecMAXPY", file=0x279c404 "rvector.c",
> >     dir=0x279c1c8 "src/vec/vec/interface/", n=62, p=PETSC_ERROR_INITIAL,
> >     mess=0x279c720 "Scalar value must be same on all processes, argument
> # %d")
> >     at err.c:356
> > #4  0x00000000013f8184 in VecMAXPY (y=0x3b35000, nv=20, alpha=0x3b31840,
> >     x=0x3b33080) at rvector.c:1186
> > #5  0x0000000001581062 in KSPGMRESBuildSoln (nrs=0x3b31840, vs=0x3ab2090,
> >     vdest=0x3ab2090, ksp=0x39a9700, it=19) at gmres.c:345
> >
> >
> > But I set a breakpoint at VecMAXPY and then printed out alpha on both processes.
> For the first few times the breakpoint was hit, I checked the values on both
> processes and they were the same.
> >
> >
> > (gdb) b VecMAXPY
> > Breakpoint 1 at 0x13f73e0: file rvector.c, line 1174.
> > (gdb) c
> > Continuing.
> > Breakpoint 1, VecMAXPY (y=0x3f2b790, nv=1, alpha=0x3f374e0, x=0x3f1fde0)
> >     at rvector.c:1174
> > 1174      PetscFunctionBegin;
> > (gdb) p alpha
> > $1 = (const PetscScalar *) 0x3f374e0
> > (gdb) p *alpha
> > $2 = -0.54285016977140765
> > (gdb)
> >
> > 2015-03-11 15:52 GMT-04:00 Matthew Knepley <knepley at gmail.com>:
> > On Wed, Mar 11, 2015 at 2:33 PM, Song Gao <song.gao2 at mail.mcgill.ca>
> wrote:
> > Hello,
> >
> > I'm solving the Navier-Stokes equations by the finite element method. I use KSP
> as the linear system solver and run with 2 CPUs. The code runs fine in the
> non-debug version, but when I switch to the debug version, the code gives
> the following error.
> >
> > I output the matrix and RHS before calling KSPSolve to make sure there is no NaN
> or Inf in them. The condition number of the matrix is ~2e4, which seems okay.
> >
> > I also ran the code with valgrind but didn't find any other errors. The
> valgrind output is attached. Any ideas on what I can do next?
> >
> > Is there any chance you could spawn the debugger, -start_in_debugger,
> and when you get the error,
> > print out the value of 'alpha' on each process?
> >
> > Otherwise the best thing to do is output your Mat and RHS in binary and
> send them so we can try to reproduce.
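
A minimal sketch of writing the Mat and RHS in PETSc binary format, assuming an
assembled Mat A and Vec b in the code (the file name "system.bin" is arbitrary):

  PetscErrorCode ierr;
  PetscViewer    viewer;

  /* Write the matrix followed by the right-hand side into one binary file. */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"system.bin",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
  ierr = MatView(A,viewer);CHKERRQ(ierr);
  ierr = VecView(b,viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);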
> >
> >     Matt
> >
> >   Thanks,
> >
> >      Matt
> >
> > Thanks in advance.
> >
> >
> > [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> > [0]PETSC ERROR: Invalid argument!
> > [0]PETSC ERROR: Scalar value must be same on all processes, argument # 3!
> > [0]PETSC ERROR:
> ------------------------------------------------------------------------
> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11
> 22:15:24 CDT 2013
> > [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> > [0]PETSC ERROR: See docs/index.html for manual pages.
> > [0]PETSC ERROR:
> ------------------------------------------------------------------------
> > [0]PETSC ERROR:
> /home/cfd/sgao/mycodes/fensap_new_edge_coefficient/fensapng-mf-newmuscl-overledg_org/bin/fensapMPI_LINUX64_DEBUG
> on a linux named anakin by sgao Wed Mar 11 15:07:53 2015
> > [0]PETSC ERROR: Libraries linked from /tmp/PETSC33/petsc-3.3-p7/linux/lib
> > [0]PETSC ERROR: Configure run at Wed Jan 15 12:04:54 2014
> > [0]PETSC ERROR: Configure options
> --with-mpi-dir=/usr/local.linux64/lib64/MPI-openmpi-1.4.5/
> --with-shared-libraries=0 --COPTFLAGS=-g --FOPTFLAGS=-g --with-debugging=yes
> > [0]PETSC ERROR:
> ------------------------------------------------------------------------
> > [0]PETSC ERROR: VecMAXPY() line 1186 in src/vec/vec/interface/rvector.c
> > [0]PETSC ERROR: KSPGMRESBuildSoln() line 345 in
> src/ksp/ksp/impls/gmres/gmres.c
> > [0]PETSC ERROR: KSPGMRESCycle() line 206 in
> src/ksp/ksp/impls/gmres/gmres.c
> > [0]PETSC ERROR: KSPSolve_GMRES() line 231 in
> src/ksp/ksp/impls/gmres/gmres.c
> > [0]PETSC ERROR: KSPSolve() line 446 in src/ksp/ksp/interface/itfunc.c
> >
> >
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> > -- Norbert Wiener
> >
> >
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> > -- Norbert Wiener
> >
>
>