Non-repeatability issue and difference between 2.3.0 and 2.3.3
Etienne PERCHAT
etienne.perchat at transvalor.com
Thu Sep 25 05:09:42 CDT 2008
Hi Matt,
I am sure that the partitioning is exactly the same:
I have an external tool that partitions the mesh before launching the FE code, so for all the runs the mesh partition was created only once and then reused.
For the case where I wanted every ghost node to be shared by two and only two processors, I used simple geometries like rings or bars with structured meshes. Once again, the partitions were created once and then reused.
The initial residuals and the initial matrix are exactly the same.
I have added some lines to my code:
After calling MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
I computed a matrix-vector product between A and a vector of ones, then computed the norm of the resulting vector. You will see below the results for 4 linear system solves (two with 2.3.0 and two with 2.3.3p8).
In summary, for all runs:
1/ the results of the matrix * ones-vector product are the same: 6838.31173987650
2/ the initial residual also: 1.50972105381228e+006
3/ at iteration 40 all runs give exactly the same residual:
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
4/ with 2.3.0 the final residual is always the same: 3.19392726797939e+000
5/ with 2.3.3p8 the final residual varies after iteration 40.
Some statistics over 12 successive runs:
We obtained 3.19515221050523 five times, 3.19369843187027 twice, 3.19373947848208 three times, and two other values for the last two runs.
RUN1: 3.19515221050523e+000
RUN2: 3.19515221050523e+000
RUN3: 3.19369843187027e+000
RUN4: 3.19588480582213e+000
RUN5: 3.19515221050523e+000
RUN6: 3.19373947848208e+000
RUN7: 3.19515221050523e+000
RUN8: 3.19384417350916e+000
RUN9: 3.19515221050523e+000
RUN10: 3.19373947848208e+000
RUN11: 3.19369843187027e+000
RUN12: 3.19373947848208e+000
So: same initial residual, same result for the matrix * ones-vector product, same residual at iteration 40.
I always used the options:
OptionTable: -ksp_truemonitor
OptionTable: -log_summary
Any ideas would be very welcome; don't hesitate to ask if you need additional tests.
Could it perhaps be reuse of a buffer that has not been properly released?
Best regards,
Etienne
------------------------------------------------------------------------
With 2.3.0: Using Petsc Release Version 2.3.0, Patch 44, April, 26, 2005
RUN1:
Norm A*One = 6838.31173987650
* Resolution method : Preconditioned Conjugate Residual
* Preconditioner : BJACOBI with ILU, Blocks of 1
*
* Initial Residual : 1.50972105381228e+006
Iteration= 1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration= 2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 66 residual= 3.73014307e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75579067e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19392727e+000 tolerance= 5.13305158e+000 *
* Number of iterations : 68
* Convergence code : 3
* Final Residual Norm : 3.19392726797939e+000
* PETSC : Resolution time : 1.000389 seconds
RUN2:
Norm A*One = 6838.31173987650
* Resolution method : Preconditioned Conjugate Residual
* Preconditioner : BJACOBI with ILU, Blocks of 1
*
* Initial Residual : 1.50972105381228e+006
Iteration= 1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration= 2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 10 residual= 2.73382943e+004 tolerance= 7.54860527e-001
Iteration= 20 residual= 7.27122933e+003 tolerance= 1.50972105e+000
Iteration= 30 residual= 8.42209039e+003 tolerance= 2.26458158e+000
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
Iteration= 50 residual= 3.17446784e+002 tolerance= 3.77430263e+000
Iteration= 60 residual= 3.53234217e+001 tolerance= 4.52916316e+000
Iteration= 66 residual= 3.73014307e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75579067e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19392727e+000 tolerance= 5.13305158e+000
*
* Number of iterations : 68
* Convergence code : 3
* Final Residual Norm : 3.19392726797939e+000
* PETSC : Resolution time : 0.888913 seconds
********************************************************************************************************************************************************
WITH 2.3.3p8:
Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b
RUN1:
Norm A*One = 6838.31173987650
* Resolution method : Preconditioned Conjugate Residual
* Preconditioner : BJACOBI with ILU, Blocks of 1
*
* Initial Residual : 1.50972105381228e+006
Iteration= 1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration= 2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 10 residual= 2.73382943e+004 tolerance= 7.54860527e-001
Iteration= 20 residual= 7.27122933e+003 tolerance= 1.50972105e+000
Iteration= 30 residual= 8.42209039e+003 tolerance= 2.26458158e+000
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
Iteration= 50 residual= 3.17446756e+002 tolerance= 3.77430263e+000
Iteration= 60 residual= 3.53234489e+001 tolerance= 4.52916316e+000
Iteration= 65 residual= 7.12874932e+000 tolerance= 4.90659342e+000
Iteration= 66 residual= 3.72396571e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75096723e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19515221e+000 tolerance= 5.13305158e+000
*
* Number of iterations : 68
* Convergence code : 3
* Final Residual Norm : 3.19515221050523e+000
* PETSC : Resolution time : 0.928915 seconds
RUN2:
Norm A*One = 6838.31173987650
* Resolution method : Preconditioned Conjugate Residual
* Preconditioner : BJACOBI with ILU, Blocks of 1
*
* Initial Residual : 1.50972105381228e+006
Iteration= 1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration= 2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 10 residual= 2.73382943e+004 tolerance= 7.54860527e-001
Iteration= 20 residual= 7.27122933e+003 tolerance= 1.50972105e+000
Iteration= 30 residual= 8.42209039e+003 tolerance= 2.26458158e+000
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
Iteration= 50 residual= 3.17446774e+002 tolerance= 3.77430263e+000
Iteration= 60 residual= 3.53233608e+001 tolerance= 4.52916316e+000
Iteration= 65 residual= 7.12937602e+000 tolerance= 4.90659342e+000
Iteration= 66 residual= 3.72832632e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75447170e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19369843e+000 tolerance= 5.13305158e+000
*
* Number of iterations : 68
* Convergence code : 3
* Final Residual Norm : 3.19369843187027e+000
* PETSC : Resolution time : 0.872702 seconds
Etienne
-----Original Message-----
From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley
Sent: Wednesday, September 24, 2008 7:15 PM
To: petsc-users at mcs.anl.gov
Subject: Re: Non-repeatability issue and difference between 2.3.0 and 2.3.3
On Wed, Sep 24, 2008 at 11:21 AM, Etienne PERCHAT
<etienne.perchat at transvalor.com> wrote:
> Dear Petsc users,
>
> I come again with my comparisons between v2.3.0 and v2.3.3p8.
>
> I face a non-repeatability issue with v2.3.3 that I didn't have with v2.3.0.
>
> I have read the exchanges made in March on a related subject but in my case
> it is at the first linear system solution that two successive runs differ.
>
> It happens when the number of processors used is greater than 2, even on a
> standard PC.
>
> I am solving symmetric MPIBAIJ systems with the Conjugate Residual method,
> preconditioned with ILU(1) and Block Jacobi between subdomains.
>
> This system results from an FE assembly on an unstructured mesh.
>
> I made all the runs using -log_summary and -ksp_truemonitor.
>
> Starting with the same initial matrix and RHS, each run using 2.3.3p8
> provides slightly different results while we obtain exactly the same
> solution with v2.3.0.
>
> With Petsc 2.3.3p8:
>
> Run1: Iteration= 68 residual= 3.19515221e+000 tolerance= 5.13305158e+000 0
>
> Run2: Iteration= 68 residual= 3.19588481e+000 tolerance= 5.13305158e+000 0
>
> Run3: Iteration= 68 residual= 3.19384417e+000 tolerance= 5.13305158e+000 0
>
> With Petsc 2.3.0:
>
> Run1: Iteration= 68 residual= 3.19369843e+000 tolerance= 5.13305158e+000 0
>
> Run2: Iteration= 68 residual= 3.19369843e+000 tolerance= 5.13305158e+000 0
> If I made a 4-proc run with a mesh partitioning such that any node could be
> located on more than 2 procs, I did not face the problem.
It is not clear whether you have verified that, on different runs, the
partitioning is exactly the same.
Matt
> I first thought about an MPI problem related to the order in which messages
> are received and then summed.
>
> But wouldn't it have been exactly the same with 2.3.0?
>
> Any tips/ideas?
>
> Thanks in advance.
>
> Best regards,
>
> Etienne Perchat
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener