Problems porting code to an IBM power5+ machine

Knut Erik Teigen knutert at stud.ntnu.no
Tue Mar 20 03:55:09 CDT 2007


On Mon, 2007-03-19 at 08:39 -0500, Matthew Knepley wrote:
>   This smells like a memory overwrite or use of an uninitialized
> variable. The initial norm is 1e28.
> 
> 1) Try Jacobi instead of ICC to see if it is localized to the PC

With jacobi I get:
[0] PCSetUpSetting up new PC
  0 KSP Residual norm 1.466015468749e+13
  1 KSP Residual norm 2.720083022075e+23
[0] KSPDefaultConvergedLinear solver is diverging. Initial right hand
size norm 1.46602e+13, current residual norm 2.72008e+23 at iteration 1

And without a preconditioner at all:
  0 KSP Residual norm 7.719804763794e+00
  1 KSP Residual norm 1.137387752533e+01
[0] KSPSolve_CGdiverging due to indefinite or negative definite matrix

The matrix definitely isn't indefinite or negative definite, as is clear
from the output in my previous post.
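
The claim can be checked directly from the printed Jacobian without any solver. The sketch below (Python, standard library only; the names ROWS, is_symmetric, gershgorin_bounds and row_sums are illustrative and not part of PETSc) copies the nine rows from the original post and applies the Gershgorin circle theorem. Under that theorem every eigenvalue lies in [0, 80], so the matrix cannot be indefinite or negative definite. Note that every row sums to zero, so strictly speaking the matrix is positive semidefinite (singular, with the constant vector in its null space) rather than positive definite.

```python
# The 3x3-grid Jacobian exactly as printed in the original post:
# one {column: value} dict per matrix row.
ROWS = [
    {0: 20, 1: -10, 3: -10},
    {0: -10, 1: 30, 2: -10, 4: -10},
    {1: -10, 2: 20, 5: -10},
    {0: -10, 3: 30, 4: -10, 6: -10},
    {1: -10, 3: -10, 4: 40, 5: -10, 7: -10},
    {2: -10, 4: -10, 5: 30, 8: -10},
    {3: -10, 6: 20, 7: -10},
    {4: -10, 6: -10, 7: 30, 8: -10},
    {5: -10, 7: -10, 8: 20},
]


def is_symmetric(rows):
    """True if every stored entry (i, j) has an equal entry (j, i)."""
    return all(
        rows[j].get(i) == v
        for i, row in enumerate(rows)
        for j, v in row.items()
    )


def gershgorin_bounds(rows):
    """Return (lower, upper) bounds on the eigenvalues of a symmetric matrix.

    Each eigenvalue lies in some interval [a_ii - r_i, a_ii + r_i], where
    r_i is the sum of the absolute off-diagonal entries of row i.
    """
    lows, highs = [], []
    for i, row in enumerate(rows):
        radius = sum(abs(v) for j, v in row.items() if j != i)
        lows.append(row[i] - radius)
        highs.append(row[i] + radius)
    return min(lows), max(highs)


def row_sums(rows):
    """Sum of each row; all zeros means constant vectors are in the null space."""
    return [sum(row.values()) for row in rows]


if __name__ == "__main__":
    print("symmetric:", is_symmetric(ROWS))
    print("eigenvalue bounds:", gershgorin_bounds(ROWS))
    print("row sums:", row_sums(ROWS))
```

A lower bound of zero rules out negative eigenvalues, which is all the CG "indefinite or negative definite" message is concerned with here.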

> 2) Run with valgrind or something similar to check for a memory overwrite
> 
> 3) Maybe insert CHKMEMQ statements into the code

I've run with CHKMEMQ statements and with -malloc_debug, but got no
complaints from either.
Unfortunately, Valgrind isn't installed on the cluster.
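
One check that needs no extra tooling is to work out by hand what the first monitor line should be. A minimal sketch, assuming that KSP reports the preconditioned residual norm, that the initial guess is zero (as in the -ksp_view output quoted below), and that the Jacobi preconditioner is simply the inverse of the matrix diagonal; RHS and DIAG are copied from the 3x3 example in the original post, and the function names are illustrative only:

```python
import math

# Right-hand side and matrix diagonal of the 3x3 example, as printed
# in the original post.
RHS = [980.0, 980.0, 980.0, 0.0, 0.0, 0.0, -980.0, -980.0, -980.0]
DIAG = [20.0, 30.0, 20.0, 30.0, 40.0, 30.0, 20.0, 30.0, 20.0]


def norm2(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def expected_initial_norms(rhs, diag):
    """Expected iteration-0 residual norms for a zero initial guess.

    With x0 = 0 the initial residual is the right-hand side b itself.
    Assumption: the monitor prints the preconditioned norm, i.e.
    ||b|| with no preconditioner and ||D^-1 b|| with Jacobi.
    """
    no_pc = norm2(rhs)
    jacobi = norm2([b / d for b, d in zip(rhs, diag)])
    return no_pc, jacobi


if __name__ == "__main__":
    no_pc, jacobi = expected_initial_norms(RHS, DIAG)
    print("expected norm, no preconditioner: %.4f" % no_pc)
    print("expected norm, Jacobi:            %.4f" % jacobi)
```

Under those assumptions the iteration-0 norm should be about 2400.5 with no preconditioner and about 108.3 with Jacobi. The power5+ values above (7.72 and 1.47e+13) are off before CG has taken a single step, which would suggest looking at the vector and norm computations rather than at the CG iteration itself.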

-Knut Erik-
> 
>   Thanks,
> 
>     Matt
> 
> On 3/19/07, Knut Erik Teigen <knutert at stud.ntnu.no> wrote:
> > Hello
> >
> > I have working code on my local machine (Pentium 4), but when I
> > try to run it on a power5+ machine, the equation solver won't
> > converge. It seems to calculate the wrong right-hand-side norm.
> > Below is the result with the run-time options
> > "-ksp_type cg -pc_type icc -ksp_monitor -ksp_view -info",
> > first with the code running on the power5+ machine, then on my local
> > machine. I've also printed the right-hand side, Jacobian matrix and
> > solution for a small 3x3 grid.
> >
> > Can anyone help me figure out what's wrong?
> >
> > Regards,
> > Knut Erik Teigen
> >
> > Code running on Power5+:
> > rhs:
> > 980
> > 980
> > 980
> > -0
> > -0
> > -0
> > -980
> > -980
> > -980
> > jacobian:
> > row 0: (0, 20)  (1, -10)  (3, -10)
> > row 1: (0, -10)  (1, 30)  (2, -10)  (4, -10)
> > row 2: (1, -10)  (2, 20)  (5, -10)
> > row 3: (0, -10)  (3, 30)  (4, -10)  (6, -10)
> > row 4: (1, -10)  (3, -10)  (4, 40)  (5, -10)  (7, -10)
> > row 5: (2, -10)  (4, -10)  (5, 30)  (8, -10)
> > row 6: (3, -10)  (6, 20)  (7, -10)
> > row 7: (4, -10)  (6, -10)  (7, 30)  (8, -10)
> > row 8: (5, -10)  (7, -10)  (8, 20)
> > [0] PCSetUpSetting up new PC
> > [0] PetscCommDuplicateDuplicating a communicator 1 4 max tags =
> > 1073741823
> > [0] PetscCommDuplicateUsing internal PETSc communicator 1 4
> > [0] PetscCommDuplicateUsing internal PETSc communicator 1 4
> >   0 KSP Residual norm 7.410163701832e+28
> >   1 KSP Residual norm 6.464393707520e+11
> > [0] KSPDefaultConvergedLinear solver has converged. Residual norm
> > 6.46439e+11 is less than relative tolerance 1e-07 times initial right
> > hand side norm 7.41016e+28 at iteration 1
> > KSP Object:
> >   type: cg
> >   maximum iterations=10000, initial guess is zero
> >   tolerances:  relative=1e-07, absolute=1e-50, divergence=10000
> >   left preconditioning
> > PC Object:
> >   type: icc
> >     ICC: 0 levels of fill
> >     ICC: factor fill ratio allocated 1
> >     ICC: factor fill ratio needed 0.636364
> >          Factored matrix follows
> >         Matrix Object:
> >           type=seqsbaij, rows=9, cols=9
> >           total: nonzeros=21, allocated nonzeros=21
> >               block size is 1
> >   linear system matrix = precond matrix:
> >   Matrix Object:
> >     type=seqaij, rows=9, cols=9
> >     total: nonzeros=33, allocated nonzeros=45
> >       not using I-node routines
> > solution:
> > 6.37205e-09
> > 7.13167e-09
> > 7.49911e-09
> > -2.48277e-09
> > -4.56885e-10
> > 0
> > 0
> > 0
> > 0
> >
> > Code running on local machine:
> > rhs:
> > 980
> > 980
> > 980
> > -0
> > -0
> > -0
> > -980
> > -980
> > -980
> > jacobian:
> > row 0: (0, 20)  (1, -10)  (3, -10)
> > row 1: (0, -10)  (1, 30)  (2, -10)  (4, -10)
> > row 2: (1, -10)  (2, 20)  (5, -10)
> > row 3: (0, -10)  (3, 30)  (4, -10)  (6, -10)
> > row 4: (1, -10)  (3, -10)  (4, 40)  (5, -10)  (7, -10)
> > row 5: (2, -10)  (4, -10)  (5, 30)  (8, -10)
> > row 6: (3, -10)  (6, 20)  (7, -10)
> > row 7: (4, -10)  (6, -10)  (7, 30)  (8, -10)
> > row 8: (5, -10)  (7, -10)  (8, 20)
> > [0] PCSetUp(): Setting up new PC
> > [0] PetscCommDuplicate(): Duplicating a communicator 1140850689
> > -2080374783 max tags = 2147483647
> > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
> > -2080374783
> > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
> > -2080374783
> >   0 KSP Residual norm 2.505507810276e+02
> >   1 KSP Residual norm 3.596555656581e+01
> >   2 KSP Residual norm 2.632672485513e+00
> >   3 KSP Residual norm 1.888285055287e-01
> >   4 KSP Residual norm 7.029433008806e-03
> >   5 KSP Residual norm 3.635267067420e-14
> > [0] KSPDefaultConverged(): Linear solver has converged. Residual norm
> > 3.63527e-14 is less than relative tolerance 1e-07 times initial right
> > hand side norm 250.551 at iteration 5
> > KSP Object:
> >   type: cg
> >   maximum iterations=10000, initial guess is zero
> >   tolerances:  relative=1e-07, absolute=1e-50, divergence=10000
> >   left preconditioning
> > PC Object:
> >   type: icc
> >     ICC: 0 levels of fill
> >     ICC: factor fill ratio allocated 1
> >     ICC: factor fill ratio needed 0.636364
> >          Factored matrix follows
> >         Matrix Object:
> >           type=seqsbaij, rows=9, cols=9
> >           total: nonzeros=21, allocated nonzeros=21
> >               block size is 1
> >   linear system matrix = precond matrix:
> >   Matrix Object:
> >     type=seqaij, rows=9, cols=9
> >     total: nonzeros=33, allocated nonzeros=45
> >       not using I-node routines
> > solution:
> > 92.3023
> > 92.3023
> > 92.3023
> > -5.69767
> > -5.69767
> > -5.69767
> > -103.698
> > -103.698
> > -103.698
> >
> >
> >
> >
> 
> 



