[petsc-users] PETSc 3.2 gmres error
Satish Balay
balay at mcs.anl.gov
Sat Sep 10 10:50:09 CDT 2011
You can disable these checks by building PETSc with --with-debugging=0.
Slight differences in the numerical values computed by the 2 different
compiler/glibc versions [or slight differences in the FPU hardware of the 2
machines] could be the reason this check fails.
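
To illustrate the point, here is a minimal standalone sketch (plain Python, no
PETSc or MPI involved). It assumes that the two machines end up evaluating the
same sum in a different order, which stands in for whatever the compilers,
glibc versions, or FPUs actually do differently. The names sum_in_order and
values_agree are made up for this example, and values_agree only mimics the
kind of consistency check a debugging build performs; it is not PETSc's code.

```python
import math


def sum_in_order(values):
    """Accumulate left to right, the way one compiled loop might."""
    total = 0.0
    for v in values:
        total += v
    return total


def values_agree(per_rank_values):
    """Hypothetical stand-in for a debug-mode check that a scalar is
    bitwise identical on every process.

    It compares max(-v) with max(v) over all ranks, so a real
    implementation would need only one max-reduction of two numbers.
    """
    max_neg = max(-v for v in per_rank_values)
    max_pos = max(per_rank_values)
    return -max_neg == max_pos


terms = [0.1, 0.2, 0.3]
rank0 = sum_in_order(terms)            # (0.1 + 0.2) + 0.3
rank1 = sum_in_order(reversed(terms))  # (0.3 + 0.2) + 0.1

print("rank 0:", repr(rank0))
print("rank 1:", repr(rank1))
print("close to rounding error:", math.isclose(rank0, rank1))
print("passes the exact check: ", values_agree([rank0, rank1]))
```

The two results agree to rounding error but are not bitwise equal, so an exact
comparison across processes rejects them even though the solve itself is fine.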
Satish
On Sat, 10 Sep 2011, Tabrez Ali wrote:
> Jed
>
> Thanks for your reply. Yes, I should have tested with a PETSc example. I do get
> the same error (see below) with ex2 in src/ksp/ksp/examples/tutorials with
> GMRES. Again, CG works fine, though.
>
> So I guess my code is fine and something is wrong with my setup.
>
> Satish
>
> Yes, everything runs fine as 2 jobs on a single machine, and that is how I
> usually test. I was just experimenting here.
>
> Tabrez
>
> stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ mpiexec -f hosts -n 2
> ./ex2 -ksp_type cg -ksp_monitor
> 0 KSP Residual norm 3.562148313266e+00
> 1 KSP Residual norm 1.215355568718e+00
> 2 KSP Residual norm 5.908378943191e-01
> 3 KSP Residual norm 2.388447476613e-01
> 4 KSP Residual norm 5.291449320146e-02
> 5 KSP Residual norm 1.227766600895e-02
> 6 KSP Residual norm 2.190918491891e-03
> 7 KSP Residual norm 3.758527933277e-04
> Norm of error 0.000432115 iterations 7
>
> stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ mpiexec -f hosts -n 2
> ./ex2 -ksp_type gmres -ksp_monitor
> 0 KSP Residual norm 3.562148313266e+00
> 1 KSP Residual norm 1.215348368658e+00
> 2 KSP Residual norm 5.599263969157e-01
> 3 KSP Residual norm 2.185276631601e-01
> 4 KSP Residual norm 5.060212909332e-02
> 5 KSP Residual norm 1.172638597604e-02
> 6 KSP Residual norm 2.158149739691e-03
> 7 KSP Residual norm 3.696833900173e-04
> [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0]PETSC ERROR: Invalid argument!
> [0]PETSC ERROR: Scalar value must be same on all processes, argument # 3!
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Development HG revision:
> a8941623c0b6225ff3688949b01271e9ae85a545 HG Date: Fri Sep 09 19:37:41 2011
> -0500
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: ./ex2 on a arch-linu named i5 by stali Sat Sep 10 10:27:41
> 2011
> [0]PETSC ERROR: Libraries linked from /opt/petsc-3.2/lib
> [0]PETSC ERROR: Configure run at Sat Sep 10 10:02:38 2011
> [0]PETSC ERROR: Configure options --prefix=/opt/petsc-3.2
> --with-mpi-dir=/opt/mpich2-gcc --with-parmetis=1 --download-parmetis=1
> --with-shared-libraries
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: VecMAXPY() line 1190 in src/vec/vec/interface/rvector.c
> [0]PETSC ERROR: BuildGmresSoln() line 345 in src/ksp/ksp/impls/gmres/gmres.c
> [0]PETSC ERROR: GMREScycle() line 206 in src/ksp/ksp/impls/gmres/gmres.c
> [0]PETSC ERROR: KSPSolve_GMRES() line 231 in src/ksp/ksp/impls/gmres/gmres.c
> [0]PETSC ERROR: KSPSolve() line 423 in src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: main() line 199 in src/ksp/ksp/examples/tutorials/ex2.c
> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0
> [cli_0]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0
>
>
> On 09/10/2011 06:38 AM, Jed Brown wrote:
> > On Sat, Sep 10, 2011 at 13:09, Tabrez Ali <stali at geology.wisc.edu> wrote:
> >
> > Hello
> >
> > I am running an application using PETSc 3.2 RC on a poor man's
> > cluster at my home (for testing only) which has two nodes running
> > different versions of Debian (they also have different versions of
> > gcc/gfortran) but have the same MPICH2 1.4 and PETSc 3.2 installed
> > on them.
> >
> > Also, they do not share the same file system, but I make sure that
> > input file/executable paths are exactly the same on both machines.
> > After compiling the code separately on the two nodes I launch the
> > parallel program from node 1 using mpiexec -f hosts -n 2 ....
> > (hydra process manager).
> >
> > With PETSc 3.1 the application runs fine, both with CG and GMRES
> > and correct output is generated on both nodes.
> >
> > With PETSc 3.2 the application runs fine with CG.
> >
> > But whenever I use GMRES in 3.2 I get an error (listed below)
> > during KSPSolve.
> >
> >
> > Can you reproduce this with any of the examples? For example
> >
> > cd src/ksp/ksp/examples/tutorials
> > make ex2
> > mpiexec -f hosts -n 2 ./ex2 -ksp_type gmres
> >
> > or, to use your matrix, run (any version that works, including 3.1) with
> > -ksp_view_binary and then
> >
> > cd src/ksp/ksp/examples/tutorials
> > make ex10
> > mpiexec -f hosts -n 2 ./ex10 -f binaryoutput -ksp_type gmres
> >
> > If these work, there might be memory corruption somewhere in your code
> > causing this.
> >
> >
> > You can also run with -start_in_debugger and check what is in the "alpha"
> > array on each process.
> >
>
>