[petsc-users] issue with NullSpaceRemove in parallel

Barry Smith bsmith at mcs.anl.gov
Wed Oct 5 22:18:15 CDT 2016


  The message "Scalar value must be same on all processes, argument # 2" often comes up when a NaN or Inf has gotten into the computation. The IEEE standard for floating-point operations defines NaN != NaN, so a NaN never compares equal, even to itself, and the consistency check fails.
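
  To illustrate the NaN != NaN point, here is a minimal C sketch. The helper names (scalars_agree, scalars_agree_or_both_nan, demo) are made up for this example and are not PETSc functions; the first one only mimics the kind of equality test a "same on all processes" check might perform, and is an assumption about the general idea rather than a copy of the PETSc source.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper (not a PETSc function): compares the value held on
   this rank with the value held on another rank using plain equality. */
static bool scalars_agree(double local, double remote)
{
    return local == remote;   /* false whenever either side is NaN */
}

/* Hypothetical NaN-aware variant: treats two NaNs as "the same". */
static bool scalars_agree_or_both_nan(double local, double remote)
{
    return (isnan(local) && isnan(remote)) || local == remote;
}

/* Small self-check showing the behaviour described above. */
static void demo(void)
{
    assert(scalars_agree(1.5, 1.5));              /* ordinary values match   */
    assert(!scalars_agree(NAN, NAN));             /* NaN != NaN under IEEE   */
    assert(scalars_agree_or_both_nan(NAN, NAN));  /* explicit isnan() needed */
}
```

  Calling demo() from a main() passes all three assertions. So even if every process holds a NaN, a plain equality test reports a mismatch, which is why this error tends to point at a NaN upstream rather than at a genuine disagreement between processes.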

  I recommend running again with -fp_trap; this should cause the code to stop with an error message as soon as the NaN or Inf is generated.

  Barry




> On Oct 5, 2016, at 9:21 PM, Mohammad Mirzadeh <mirzadeh at gmail.com> wrote:
> 
> Hi folks,
> 
> I am trying to track down a bug that is sometimes triggered when solving a singular system (Poisson + Neumann). It only seems to happen in parallel and halfway through the run. I can provide detailed information about the actual problem, but the error message I get boils down to this:
> 
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Invalid argument
> [0]PETSC ERROR: Scalar value must be same on all processes, argument # 2
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 
> [0]PETSC ERROR: ./two_fluid_2d on a linux named bazantserver1 by mohammad Wed Oct  5 21:14:47 2016
> [0]PETSC ERROR: Configure options PETSC_ARCH=linux --prefix=/usr/local --with-clanguage=cxx --with-c-support --with-shared-libraries --download-hypre --download-metis --download-parmetis --download-ml --download-superlu_dist COPTFLAGS=" -O3 -march=native" CXXOPTFLAGS=" -O3 -march=native" FOPTFLAGS=" -O3 -march=native"
> [0]PETSC ERROR: #1 VecShift() line 1480 in /tmp/petsc-3.6.3/src/vec/vec/utils/vinv.c
> [0]PETSC ERROR: #2 MatNullSpaceRemove() line 348 in /tmp/petsc-3.6.3/src/mat/interface/matnull.c
> [0]PETSC ERROR: #3 KSP_RemoveNullSpace() line 207 in /tmp/petsc-3.6.3/include/petsc/private/kspimpl.h
> [0]PETSC ERROR: #4 KSP_PCApply() line 243 in /tmp/petsc-3.6.3/include/petsc/private/kspimpl.h
> [0]PETSC ERROR: #5 KSPInitialResidual() line 63 in /tmp/petsc-3.6.3/src/ksp/ksp/interface/itres.c
> [0]PETSC ERROR: #6 KSPSolve_BCGS() line 50 in /tmp/petsc-3.6.3/src/ksp/ksp/impls/bcgs/bcgs.c
> [0]PETSC ERROR: #7 KSPSolve() line 604 in /tmp/petsc-3.6.3/src/ksp/ksp/interface/itfunc.c
> 
> I understand this is a somewhat vague question, but any idea what could cause this sort of problem? This was on 2 processors. The same code runs fine on a single processor. Also, the solution seems to converge fine on previous iterations, e.g. this is the convergence info from the last iteration before the code breaks:
> 
>   0 KSP preconditioned resid norm 6.814085878146e+01 true resid norm 2.885308600701e+00 ||r(i)||/||b|| 1.000000000000e+00
>   1 KSP preconditioned resid norm 3.067319980814e-01 true resid norm 8.480307326867e-02 ||r(i)||/||b|| 2.939133555699e-02
>   2 KSP preconditioned resid norm 1.526405979843e-03 true resid norm 1.125228519827e-03 ||r(i)||/||b|| 3.899855008762e-04
>   3 KSP preconditioned resid norm 2.199423175998e-05 true resid norm 4.232832916628e-05 ||r(i)||/||b|| 1.467029528695e-05
>   4 KSP preconditioned resid norm 5.382291463582e-07 true resid norm 8.438732856334e-07 ||r(i)||/||b|| 2.924724535283e-07
>   5 KSP preconditioned resid norm 9.495525177398e-09 true resid norm 1.408250768598e-08 ||r(i)||/||b|| 4.880763077669e-09
>   6 KSP preconditioned resid norm 9.249233376169e-11 true resid norm 2.795840275267e-10 ||r(i)||/||b|| 9.689917655907e-11
>   7 KSP preconditioned resid norm 1.138293762641e-12 true resid norm 2.559058680281e-12 ||r(i)||/||b|| 8.869272006674e-13
> 
> Also, if it matters, this is using hypre as PC and bcgs as KSP.
> 
> Thanks


