[petsc-users] PETSc 3.2 gmres error

Satish Balay balay at mcs.anl.gov
Sat Sep 10 10:50:09 CDT 2011


You can disable these checks by building PETSc with --with-debugging=0.
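
For example [reusing the configure options from your error trace below;
adjust paths as needed]:

./configure --prefix=/opt/petsc-3.2 --with-mpi-dir=/opt/mpich2-gcc \
  --with-parmetis=1 --download-parmetis=1 --with-shared-libraries \
  --with-debugging=0

and then rebuild/reinstall the libraries on both nodes.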

In a debug build, PETSc checks that scalar arguments (here the
coefficients GMRES passes to VecMAXPY()) are identical on all processes.
The slight differences in the numerical values computed by the two
different compiler/glibc versions [or slight differences in the fpu
hardware of the two machines] are the likely reason this check fails.

Satish

On Sat, 10 Sep 2011, Tabrez Ali wrote:

> Jed
> 
> Thanks for your reply. Yes, I should have tested with a PETSc example. I do
> get the same error (see below) with ex2 in src/ksp/ksp/examples/tutorials
> when using GMRES. Again, CG works fine, though.
> 
> So I guess my code is fine and something is wrong with my setup.
> 
> Satish
> 
> Yes, everything runs fine with 2 processes on a single machine, and that is
> how I usually test. I was just experimenting here.
> 
> Tabrez
> 
> stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ mpiexec -f hosts -n 2
> ./ex2 -ksp_type cg -ksp_monitor
>   0 KSP Residual norm 3.562148313266e+00
>   1 KSP Residual norm 1.215355568718e+00
>   2 KSP Residual norm 5.908378943191e-01
>   3 KSP Residual norm 2.388447476613e-01
>   4 KSP Residual norm 5.291449320146e-02
>   5 KSP Residual norm 1.227766600895e-02
>   6 KSP Residual norm 2.190918491891e-03
>   7 KSP Residual norm 3.758527933277e-04
> Norm of error 0.000432115 iterations 7
> 
> stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ mpiexec -f hosts -n 2
> ./ex2 -ksp_type gmres -ksp_monitor
>   0 KSP Residual norm 3.562148313266e+00
>   1 KSP Residual norm 1.215348368658e+00
>   2 KSP Residual norm 5.599263969157e-01
>   3 KSP Residual norm 2.185276631601e-01
>   4 KSP Residual norm 5.060212909332e-02
>   5 KSP Residual norm 1.172638597604e-02
>   6 KSP Residual norm 2.158149739691e-03
>   7 KSP Residual norm 3.696833900173e-04
> [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0]PETSC ERROR: Invalid argument!
> [0]PETSC ERROR: Scalar value must be same on all processes, argument # 3!
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Development HG revision:
> a8941623c0b6225ff3688949b01271e9ae85a545  HG Date: Fri Sep 09 19:37:41 2011
> -0500
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: ./ex2 on a arch-linu named i5 by stali Sat Sep 10 10:27:41
> 2011
> [0]PETSC ERROR: Libraries linked from /opt/petsc-3.2/lib
> [0]PETSC ERROR: Configure run at Sat Sep 10 10:02:38 2011
> [0]PETSC ERROR: Configure options --prefix=/opt/petsc-3.2
> --with-mpi-dir=/opt/mpich2-gcc --with-parmetis=1 --download-parmetis=1
> --with-shared-libraries
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: VecMAXPY() line 1190 in src/vec/vec/interface/rvector.c
> [0]PETSC ERROR: BuildGmresSoln() line 345 in src/ksp/ksp/impls/gmres/gmres.c
> [0]PETSC ERROR: GMREScycle() line 206 in src/ksp/ksp/impls/gmres/gmres.c
> [0]PETSC ERROR: KSPSolve_GMRES() line 231 in src/ksp/ksp/impls/gmres/gmres.c
> [0]PETSC ERROR: KSPSolve() line 423 in src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: main() line 199 in src/ksp/ksp/examples/tutorials/ex2.c
> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0
> [cli_0]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0
> 
> 
> On 09/10/2011 06:38 AM, Jed Brown wrote:
> > On Sat, Sep 10, 2011 at 13:09, Tabrez Ali <stali at geology.wisc.edu> wrote:
> > 
> >     Hello
> > 
> >     I am running an application using PETSc 3.2 RC on a poor man's
> >     cluster at my home (for testing only), which has two nodes running
> >     different versions of Debian (they also have different versions of
> >     gcc/gfortran) but the same MPICH2 1.4 and PETSc 3.2 installed on
> >     them.
> > 
> >     Also, they do not share the same file system, but I make sure that
> >     the input file/executable paths are exactly the same on both
> >     machines. After compiling the code separately on the two nodes, I
> >     launch the parallel program from node 1 using mpiexec -f hosts -n 2
> >     .... (hydra process manager).
> > 
> >     With PETSc 3.1 the application runs fine, both with CG and GMRES
> >     and correct output is generated on both nodes.
> > 
> >     With PETSc 3.2 the application runs fine with CG.
> > 
> >     But whenever I use GMRES in 3.2 I get an error (listed below)
> >     during KSPSolve.
> > 
> > 
> > Can you reproduce this with any of the examples? For example
> > 
> > cd src/ksp/ksp/examples/tutorials
> > make ex2
> > mpiexec -f hosts -n 2 ./ex2 -ksp_type gmres
> > 
> > or, to use your matrix, run (any version that works, including 3.1) with
> > -ksp_view_binary and then
> > 
> > cd src/ksp/ksp/examples/tutorials
> > make ex10
> > mpiexec -f hosts -n 2 ./ex10 -f binaryoutput -ksp_type gmres
> > 
> > If these work, there might be memory corruption somewhere in your code
> > causing this.
> > 
> > 
> > You can also run with -start_in_debugger and check what is in the "alpha"
> > array on each process.
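> > For example [assuming the same hosts file and the ex2 reproducer above]:
> > 
> > mpiexec -f hosts -n 2 ./ex2 -ksp_type gmres -start_in_debugger
> > 
> > which should attach a debugger to each process so you can break inside
> > VecMAXPY() and compare the values on the two ranks.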
> > 
> 
> 


