<div dir="ltr">Thanks for your suggestion. Unfortunately, I'm not allowed to upgrade PETSc.  <div><br></div><div>As for the preconditioner, I tried -pc_type none and still get the same error message.</div><div><br></div><div>Sorry, I was not clear. The code crashes at PetscValidLogicalCollectiveScalar(y,alpha[i],3), and I have seen that on both processes  <span style="font-size:12.6666669845581px">b2 = {-0.013169008988739605, 0.013169008988739673}. Forgive me if this is a silly question: is it because two double-precision numbers are compared with the '!=' operator? Looking at line 309, (-b2[0] != b2[1]) is of course true, right? </span></div><div><span style="font-size:12.6666669845581px"><br></span></div><div><span style="font-size:12.6666669845581px">Thanks for your help.</span></div><div><div style="font-size:12.6666669845581px"><br></div></div><div>I have pasted PetscValidLogicalCollectiveScalar here:</div><div><br></div><div><div>303 #define PetscValidLogicalCollectiveScalar(a,b,c)                        \</div><div>304   do {                                                                  \</div><div>305     PetscErrorCode _7_ierr;                                             \</div><div>306     PetscReal b1[2],b2[2];                                              \</div><div>307     b1[0] = -PetscRealPart(b); b1[1] = PetscRealPart(b);                \</div><div>308     _7_ierr = MPI_Allreduce(b1,b2,2,MPIU_REAL,MPIU_MAX,((PetscObject)a)->comm);CHKERRQ(_7_ierr); \</div><div>309     if (-b2[0] != b2[1]) SETERRQ1(((PetscObject)a)->comm,PETSC_ERR_ARG_WRONG,"Scalar value must be same on all processes, argument # %d",c); \</div><div>310   } while (0)</div></div><div><br></div><div><br></div><div><div><span style="font-size:12.6666669845581px">Regards,</span></div><div><span style="font-size:12.6666669845581px">Song </span></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">2015-03-11 20:16 GMT-04:00 Barry Smith <span 
dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

  In the IEEE floating-point standard, NaN is not equal to itself. This is what triggers these kinds of non-intuitive error messages.<br>
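  A minimal sketch of that point, in plain C without PETSc or MPI. The helper names differs_from_itself and reduced_values_disagree are hypothetical and exist only for this illustration; the second one repeats the comparison from line 309 of the pasted macro on values assumed to have already been reduced.<br>

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true when x compares unequal to itself.
   Under IEEE 754 semantics this happens only when x is NaN. */
static bool differs_from_itself(double x)
{
    volatile double v = x; /* volatile discourages folding the comparison away */
    return v != v;
}

/* Hypothetical helper: the comparison from line 309 of the macro,
   applied to two values assumed to be the already-reduced b2[0], b2[1]. */
static bool reduced_values_disagree(double b2_0, double b2_1)
{
    return -b2_0 != b2_1;
}
```

  With these definitions, reduced_values_disagree(-NAN, NAN) is true even though every process would hold the "same" NaN, which is why the error text can be misleading in that case.<br>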

<br>

  My guess is that the preconditioner is triggering a NaN; for example, ILU sometimes produces a NaN due to very small pivots.<br>

<br>

  You are using an old version of PETSc, so the first step I recommend is to upgrade: <a href="http://www.mcs.anl.gov/petsc/documentation/changes/index.html" target="_blank">http://www.mcs.anl.gov/petsc/documentation/changes/index.html</a>. More recent versions of PETSc have more checks for NaN etc. in the code.<br>

<span class="HOEnZb"><font color="#888888"><br>

  Barry<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

<br>

<br>

> On Mar 11, 2015, at 7:05 PM, Song Gao <<a href="mailto:song.gao2@mail.mcgill.ca">song.gao2@mail.mcgill.ca</a>> wrote:<br>

><br>

> Thank you. That's cool. Sorry, I'm not good at gdb.<br>

><br>

> I did that. They are the same. One gave<br>

> (gdb) p *alpha<br>

> $1 = 0.013169008988739605<br>

><br>

> and another gave<br>

> (gdb) p *alpha<br>

> $1 = 0.013169008988739673<br>

><br>

> 2015-03-11 17:35 GMT-04:00 Matthew Knepley <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>>:<br>

> On Wed, Mar 11, 2015 at 3:39 PM, Song Gao <<a href="mailto:song.gao2@mail.mcgill.ca">song.gao2@mail.mcgill.ca</a>> wrote:<br>

> Thanks.<br>

><br>

> I ran with two processes. When the code stops, I'm in raise() and alpha is not in the current context.<br>

><br>

> Here you would use:<br>

><br>

>   (gdb) up 4<br>

><br>

>   (gdb) p *alpha<br>

><br>

>    Matt<br>

><br>

> (gdb)p alpha<br>

> No symbol "alpha" in current context.<br>

> (gdb)  bt<br>

> #0  0x0000003764432625 in raise () from /lib64/libc.so.6<br>

> #1  0x0000003764433e05 in abort () from /lib64/libc.so.6<br>

> #2  0x00000000015d02f5 in PetscAbortErrorHandler (comm=0x36e17a0, line=1186,<br>

>     fun=0x279cad4 "VecMAXPY", file=0x279c404 "rvector.c",<br>

>     dir=0x279c1c8 "src/vec/vec/interface/", n=62, p=PETSC_ERROR_INITIAL,<br>

>     mess=0x7fff33ffa4c0 "Scalar value must be same on all processes, argument # 3", ctx=0x0) at errabort.c:62<br>

> #3  0x000000000130cf44 in PetscError (comm=0x36e17a0, line=1186,<br>

>     func=0x279cad4 "VecMAXPY", file=0x279c404 "rvector.c",<br>

>     dir=0x279c1c8 "src/vec/vec/interface/", n=62, p=PETSC_ERROR_INITIAL,<br>

>     mess=0x279c720 "Scalar value must be same on all processes, argument # %d")<br>

>     at err.c:356<br>

> #4  0x00000000013f8184 in VecMAXPY (y=0x3b35000, nv=20, alpha=0x3b31840,<br>

>     x=0x3b33080) at rvector.c:1186<br>

> #5  0x0000000001581062 in KSPGMRESBuildSoln (nrs=0x3b31840, vs=0x3ab2090,<br>

>     vdest=0x3ab2090, ksp=0x39a9700, it=19) at gmres.c:345<br>

><br>

><br>

> But when I break at VecMAXPY and print out alpha on both processes, the values are the same for the first few times the breakpoint is hit.<br>

><br>

><br>

> (gdb) b VecMAXPY<br>

> Breakpoint 1 at 0x13f73e0: file rvector.c, line 1174.<br>

> (gdb) c<br>

> Continuing.<br>

> Breakpoint 1, VecMAXPY (y=0x3f2b790, nv=1, alpha=0x3f374e0, x=0x3f1fde0)<br>

>     at rvector.c:1174<br>

> 1174      PetscFunctionBegin;<br>

> (gdb) p alpha<br>

> $1 = (const PetscScalar *) 0x3f374e0<br>

> (gdb) p *alpha<br>

> $2 = -0.54285016977140765<br>

> (gdb)<br>

><br>

> 2015-03-11 15:52 GMT-04:00 Matthew Knepley <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>>:<br>

> On Wed, Mar 11, 2015 at 2:33 PM, Song Gao <<a href="mailto:song.gao2@mail.mcgill.ca">song.gao2@mail.mcgill.ca</a>> wrote:<br>

> Hello,<br>

><br>

> I'm solving the Navier-Stokes equations with the finite element method, using KSP as the linear system solver, on 2 CPUs. The code runs fine in the non-debug version, but when I switch to the debug version it gives the following error.<br>

><br>

> I output the matrix and RHS before calling KSPSolve to make sure there are no NaN or Inf entries in them. The condition number of the matrix is ~2e4, which seems okay.<br>

><br>

> I also ran the code with valgrind but didn't find any other errors. The valgrind output is attached. Any ideas of what I can do next?<br>

><br>

> Is there any chance you could spawn the debugger, -start_in_debugger, and when you get the error,<br>

> print out the value of 'alpha' on each process?<br>

><br>

> Otherwise the best thing to do is output your Mat and RHS in binary and send them so we can try to reproduce.<br>

><br>

>     Matt<br>

><br>

>   Thanks,<br>

><br>

>      Matt<br>

><br>

> Thanks in advance.<br>

><br>

><br>

> [0]PETSC ERROR: --------------------- Error Message ------------------------------------<br>

> [0]PETSC ERROR: Invalid argument!<br>

> [0]PETSC ERROR: Scalar value must be same on all processes, argument # 3!<br>

> [0]PETSC ERROR: ------------------------------------------------------------------------<br>

> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013<br>

> [0]PETSC ERROR: See docs/changes/index.html for recent updates.<br>

> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.<br>

> [0]PETSC ERROR: See docs/index.html for manual pages.<br>

> [0]PETSC ERROR: ------------------------------------------------------------------------<br>

> [0]PETSC ERROR: /home/cfd/sgao/mycodes/fensap_new_edge_coefficient/fensapng-mf-newmuscl-overledg_org/bin/fensapMPI_LINUX64_DEBUG on a linux named anakin by sgao Wed Mar 11 15:07:53 2015<br>

> [0]PETSC ERROR: Libraries linked from /tmp/PETSC33/petsc-3.3-p7/linux/lib<br>

> [0]PETSC ERROR: Configure run at Wed Jan 15 12:04:54 2014<br>

> [0]PETSC ERROR: Configure options --with-mpi-dir=/usr/local.linux64/lib64/MPI-openmpi-1.4.5/ --with-shared-libraries=0 --COPTFLAGS=-g --FOPTFLAGS=-g --with-debugging=yes<br>

> [0]PETSC ERROR: ------------------------------------------------------------------------<br>

> [0]PETSC ERROR: VecMAXPY() line 1186 in src/vec/vec/interface/rvector.c<br>

> [0]PETSC ERROR: KSPGMRESBuildSoln() line 345 in src/ksp/ksp/impls/gmres/gmres.c<br>

> [0]PETSC ERROR: KSPGMRESCycle() line 206 in src/ksp/ksp/impls/gmres/gmres.c<br>

> [0]PETSC ERROR: KSPSolve_GMRES() line 231 in src/ksp/ksp/impls/gmres/gmres.c<br>

> [0]PETSC ERROR: KSPSolve() line 446 in src/ksp/ksp/interface/itfunc.c<br>

><br>

><br>

><br>

><br>

> --<br>

> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>

> -- Norbert Wiener<br>

><br>

><br>

><br>

><br>


><br>

<br>

</div></div></blockquote></div><br></div>
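For reference, here is a minimal sketch of what the pasted macro checks, written in plain C so it can be compiled without PETSc or MPI. It is an illustration under stated assumptions, not PETSc code: the function same_on_all_processes is hypothetical, and the loop stands in for MPI_Allreduce with a MAX operation, with vals[i] playing the role of the scalar held by process i. Taking the maximum of both -value and value yields -(min) and max, so the test on line 309 passes only when the minimum and the maximum coincide exactly.<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the macro's logic. vals[i] is the scalar on
   "process" i; the loop emulates a MAX reduction over {-value, value}. */
static bool same_on_all_processes(const double *vals, size_t n)
{
    assert(n > 0);
    double max_neg = -vals[0]; /* max over processes of -value, i.e. -(min value) */
    double max_pos = vals[0];  /* max over processes of value */
    for (size_t i = 1; i < n; ++i) {
        if (-vals[i] > max_neg) max_neg = -vals[i];
        if (vals[i] > max_pos) max_pos = vals[i];
    }
    /* Same comparison as line 309 of the macro, negated:
       the values agree only if min == max bit for bit. */
    return !(-max_neg != max_pos);
}
```

Called with the two values printed by gdb above, 0.013169008988739605 and 0.013169008988739673, this returns false: they are distinct doubles that differ in the last digits, so an exact '!=' comparison flags them.<br>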