[petsc-users] The PETSC ERROR: VecMAXPY() when using GMRES

Barry Smith bsmith at mcs.anl.gov
Wed Mar 11 19:16:17 CDT 2015


  In the IEEE floating-point standard, NaN is not equal to itself. This is what triggers this kind of non-intuitive error message.
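
  As a small illustration of that point (a sketch only, not PETSc code): the helper same_on_all_ranks below is a hypothetical stand-in for a debug-mode check that a scalar agrees across processes. It is not PETSc's actual implementation; it only shows why a NaN held identically on every process can still fail an equality-based test. Python is used for brevity; the comparison rule is the same for C doubles.

```python
import math

nan = float("nan")

# NaN compares unequal to everything, including itself.
print(nan == nan)        # False
print(nan != nan)        # True
print(math.isnan(nan))   # True: the reliable way to detect a NaN


def same_on_all_ranks(values):
    """Hypothetical stand-in for a debug-mode consistency check.

    `values` holds the scalar as seen by each process; the check passes
    only if every entry compares equal to the first one.
    """
    return all(v == values[0] for v in values)


print(same_on_all_ranks([0.5, 0.5]))   # True
print(same_on_all_ranks([nan, nan]))   # False, although both ranks hold a NaN
```

  So an equality-based "same on all processes" test can fail when the scalar is NaN everywhere, which would explain the wording of the error message.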

  My guess is that the preconditioner is producing a NaN; for example, ILU sometimes produces NaNs because of very small pivots.
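
  To make the small-pivot mechanism concrete, here is a toy sketch (assumptions: a hand-written 2x2 LU elimination without pivoting and made-up matrix entries; this is not PETSc's ILU and not the matrix from this thread). A tiny pivot makes the multiplier overflow to Inf, and the subsequent Inf/Inf turns into NaN.

```python
import math


def lu_solve_2x2_no_pivoting(a, b):
    """Solve a 2x2 system by LU elimination without pivoting (illustration only)."""
    (a11, a12), (a21, a22) = a
    l21 = a21 / a11              # multiplier; overflows to inf when a11 is tiny
    u22 = a22 - l21 * a12        # becomes -inf once l21 is inf
    y1 = b[0]
    y2 = b[1] - l21 * y1         # forward substitution
    x2 = y2 / u22                # inf / inf produces NaN
    x1 = (y1 - a12 * x2) / a11   # the NaN propagates
    return x1, x2


# Made-up entries: the pivot 1e-200 is tiny relative to the 1e200 below it.
bad = [[1e-200, 1.0], [1e200, 1.0]]
x1, x2 = lu_solve_2x2_no_pivoting(bad, [1.0, 1.0])
print(math.isnan(x1), math.isnan(x2))

# A well-scaled system goes through the same code without trouble.
good = [[2.0, 1.0], [1.0, 3.0]]
print(lu_solve_2x2_no_pivoting(good, [3.0, 5.0]))
```

  Once a NaN like this enters the preconditioned residual, it spreads to every quantity computed from it, including the coefficients GMRES passes to VecMAXPY().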

  You are using an old version of PETSc. The first step I recommend is to upgrade (see http://www.mcs.anl.gov/petsc/documentation/changes/index.html); more recent versions of PETSc have more checks for NaN etc. in the code.

  Barry



> On Mar 11, 2015, at 7:05 PM, Song Gao <song.gao2 at mail.mcgill.ca> wrote:
> 
> Thank you. That's cool. Sorry, I'm not good at gdb.
> 
> I did that. They are the same. One gave
> (gdb) p *alpha
> $1 = 0.013169008988739605
> 
> and another gave
> (gdb) p *alpha
> $1 = 0.013169008988739673
> 
> 2015-03-11 17:35 GMT-04:00 Matthew Knepley <knepley at gmail.com>:
> On Wed, Mar 11, 2015 at 3:39 PM, Song Gao <song.gao2 at mail.mcgill.ca> wrote:
> Thanks.
> 
> I ran with two processes. When the code stops, I'm in raise() and alpha is not in the current context.
> 
> Here you would use:
> 
>   (gdb) up 4
> 
>   (gdb) p *alpha
> 
>    Matt
>  
> (gdb)p alpha
> No symbol "alpha" in current context.
> (gdb)  bt
> #0  0x0000003764432625 in raise () from /lib64/libc.so.6
> #1  0x0000003764433e05 in abort () from /lib64/libc.so.6
> #2  0x00000000015d02f5 in PetscAbortErrorHandler (comm=0x36e17a0, line=1186, 
>     fun=0x279cad4 "VecMAXPY", file=0x279c404 "rvector.c", 
>     dir=0x279c1c8 "src/vec/vec/interface/", n=62, p=PETSC_ERROR_INITIAL, 
>     mess=0x7fff33ffa4c0 "Scalar value must be same on all processes, argument # 3", ctx=0x0) at errabort.c:62
> #3  0x000000000130cf44 in PetscError (comm=0x36e17a0, line=1186, 
>     func=0x279cad4 "VecMAXPY", file=0x279c404 "rvector.c", 
>     dir=0x279c1c8 "src/vec/vec/interface/", n=62, p=PETSC_ERROR_INITIAL, 
>     mess=0x279c720 "Scalar value must be same on all processes, argument # %d")
>     at err.c:356
> #4  0x00000000013f8184 in VecMAXPY (y=0x3b35000, nv=20, alpha=0x3b31840, 
>     x=0x3b33080) at rvector.c:1186
> #5  0x0000000001581062 in KSPGMRESBuildSoln (nrs=0x3b31840, vs=0x3ab2090, 
>     vdest=0x3ab2090, ksp=0x39a9700, it=19) at gmres.c:345
> 
> 
> But I set a breakpoint at VecMAXPY, then printed alpha on both processes. For the first few times the breakpoint was hit, I checked the values on both processes and they were the same.
> 
> 
> (gdb) b VecMAXPY
> Breakpoint 1 at 0x13f73e0: file rvector.c, line 1174.
> (gdb) c
> Continuing.
> Breakpoint 1, VecMAXPY (y=0x3f2b790, nv=1, alpha=0x3f374e0, x=0x3f1fde0)
>     at rvector.c:1174
> 1174      PetscFunctionBegin;
> (gdb) p alpha
> $1 = (const PetscScalar *) 0x3f374e0
> (gdb) p *alpha
> $2 = -0.54285016977140765
> (gdb) 
> 
> 2015-03-11 15:52 GMT-04:00 Matthew Knepley <knepley at gmail.com>:
> On Wed, Mar 11, 2015 at 2:33 PM, Song Gao <song.gao2 at mail.mcgill.ca> wrote:
> Hello,
> 
> I'm solving the Navier-Stokes equations with the finite element method. I use KSP as the linear solver and run on 2 CPUs. The code runs fine in the non-debug version, but when I switch to the debug version, it gives the following error.
> 
> I output the matrix and RHS before calling KSPSolve to make sure there is no NaN or Inf in them. The condition number of the matrix is ~2e4, which seems okay.
> 
> I also ran the code with valgrind, but it didn't find any other errors. The valgrind output is attached. Any ideas about what I can do next?
> 
> Is there any chance you could spawn the debugger, -start_in_debugger, and when you get the error,
> print out the value of 'alpha' on each process?
> 
> Otherwise the best thing to do is output your Mat and RHS in binary and send them so we can try to reproduce.
> 
>     Matt
> 
>   Thanks,
> 
>      Matt
>  
> Thanks in advance.
> 
> 
> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
> [0]PETSC ERROR: Invalid argument!
> [0]PETSC ERROR: Scalar value must be same on all processes, argument # 3!
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: /home/cfd/sgao/mycodes/fensap_new_edge_coefficient/fensapng-mf-newmuscl-overledg_org/bin/fensapMPI_LINUX64_DEBUG on a linux named anakin by sgao Wed Mar 11 15:07:53 2015
> [0]PETSC ERROR: Libraries linked from /tmp/PETSC33/petsc-3.3-p7/linux/lib
> [0]PETSC ERROR: Configure run at Wed Jan 15 12:04:54 2014
> [0]PETSC ERROR: Configure options --with-mpi-dir=/usr/local.linux64/lib64/MPI-openmpi-1.4.5/ --with-shared-libraries=0 --COPTFLAGS=-g --FOPTFLAGS=-g --with-debugging=yes
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: VecMAXPY() line 1186 in src/vec/vec/interface/rvector.c
> [0]PETSC ERROR: KSPGMRESBuildSoln() line 345 in src/ksp/ksp/impls/gmres/gmres.c
> [0]PETSC ERROR: KSPGMRESCycle() line 206 in src/ksp/ksp/impls/gmres/gmres.c
> [0]PETSC ERROR: KSPSolve_GMRES() line 231 in src/ksp/ksp/impls/gmres/gmres.c
> [0]PETSC ERROR: KSPSolve() line 446 in src/ksp/ksp/interface/itfunc.c
> 
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
> 
> 
> 
> 
> 


