Non-repeatability issue and difference between 2.3.0 and 2.3.3
Etienne PERCHAT
etienne.perchat at transvalor.com
Thu Sep 25 05:09:42 CDT 2008
Hi Matt,
I am sure that the partitioning is exactly the same:
I have an external tool that partitions the mesh before launching the FE code, so for all the runs the mesh partition was created only once and then reused.
For the case where I wanted every ghost node to be shared by two and only two processors, I used simple geometries like rings or bars with structured meshes. Once again, the partitions were created once and then reused.
The initial residuals and the initial matrix are exactly the same.
I have added some lines to my code:
After calling MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
I computed a matrix-vector product between A and a vector of ones, then computed the norm of the resulting vector. You will see below the results for 4 linear system solves (two with 2.3.0 and two with 2.3.3p8).
In summary, for all runs:
1/ the results of the matrix * ones-vector product are the same: 6838.31173987650
2/ the initial residual also: 1.50972105381228e+006
3/ at iteration 40 all runs give exactly the same residual:
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
4/ with 2.3.0 the final residual is always the same: 3.19392726797939e+000
5/ with 2.3.3p8 the final residual varies after iteration 40.
Some statistics over 12 successive runs:
We obtained 3.19515221050523 five times, 3.19369843187027 twice, 3.19373947848208 three times, and two other values for the last two runs.
RUN1: 3.19515221050523e+000
RUN2: 3.19515221050523e+000
RUN3: 3.19369843187027e+000
RUN4: 3.19588480582213e+000
RUN5: 3.19515221050523e+000
RUN6: 3.19373947848208e+000
RUN7: 3.19515221050523e+000
RUN8: 3.19384417350916e+000
RUN9: 3.19515221050523e+000
RUN10: 3.19373947848208e+000
RUN11: 3.19369843187027e+000
RUN12: 3.19373947848208e+000
So: same initial residual, same result for the matrix * ones-vector product, same residual at iteration 40.
I always used the options:
OptionTable: -ksp_truemonitor
OptionTable: -log_summary
Any ideas would be very welcome; don't hesitate to ask if you need additional tests.
Could it perhaps be reuse of a buffer that has not been properly released?
Best regards,
Etienne
------------------------------------------------------------------------
With 2.3.0: Using Petsc Release Version 2.3.0, Patch 44, April, 26, 2005
RUN1:
Norm A*One = 6838.31173987650
* Resolution method : Preconditioned Conjugate Residual
* Preconditioner : BJACOBI with ILU, Blocks of 1
*
* Initial Residual : 1.50972105381228e+006
Iteration= 1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration= 2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 66 residual= 3.73014307e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75579067e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19392727e+000 tolerance= 5.13305158e+000 *
* Number of iterations : 68
* Convergence code : 3
* Final Residual Norm : 3.19392726797939e+000
* PETSC : Resolution time : 1.000389 seconds
RUN2:
Norm A*One = 6838.31173987650
* Resolution method : Preconditioned Conjugate Residual
* Preconditioner : BJACOBI with ILU, Blocks of 1
*
* Initial Residual : 1.50972105381228e+006
Iteration= 1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration= 2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 10 residual= 2.73382943e+004 tolerance= 7.54860527e-001
Iteration= 20 residual= 7.27122933e+003 tolerance= 1.50972105e+000
Iteration= 30 residual= 8.42209039e+003 tolerance= 2.26458158e+000
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
Iteration= 50 residual= 3.17446784e+002 tolerance= 3.77430263e+000
Iteration= 60 residual= 3.53234217e+001 tolerance= 4.52916316e+000
Iteration= 66 residual= 3.73014307e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75579067e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19392727e+000 tolerance= 5.13305158e+000
*
* Number of iterations : 68
* Convergence code : 3
* Final Residual Norm : 3.19392726797939e+000
* PETSC : Resolution time : 0.888913 seconds
********************************************************************************************************************************************************
WITH 2.3.3p8:
Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b
RUN1:
Norm A*One = 6838.31173987650
* Resolution method : Preconditioned Conjugate Residual
* Preconditioner : BJACOBI with ILU, Blocks of 1
*
* Initial Residual : 1.50972105381228e+006
Iteration= 1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration= 2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 10 residual= 2.73382943e+004 tolerance= 7.54860527e-001
Iteration= 20 residual= 7.27122933e+003 tolerance= 1.50972105e+000
Iteration= 30 residual= 8.42209039e+003 tolerance= 2.26458158e+000
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
Iteration= 50 residual= 3.17446756e+002 tolerance= 3.77430263e+000
Iteration= 60 residual= 3.53234489e+001 tolerance= 4.52916316e+000
Iteration= 65 residual= 7.12874932e+000 tolerance= 4.90659342e+000
Iteration= 66 residual= 3.72396571e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75096723e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19515221e+000 tolerance= 5.13305158e+000
*
* Number of iterations : 68
* Convergence code : 3
* Final Residual Norm : 3.19515221050523e+000
* PETSC : Resolution time : 0.928915 seconds
RUN2:
Norm A*One = 6838.31173987650
* Resolution method : Preconditioned Conjugate Residual
* Preconditioner : BJACOBI with ILU, Blocks of 1
*
* Initial Residual : 1.50972105381228e+006
Iteration= 1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration= 2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 10 residual= 2.73382943e+004 tolerance= 7.54860527e-001
Iteration= 20 residual= 7.27122933e+003 tolerance= 1.50972105e+000
Iteration= 30 residual= 8.42209039e+003 tolerance= 2.26458158e+000
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
Iteration= 50 residual= 3.17446774e+002 tolerance= 3.77430263e+000
Iteration= 60 residual= 3.53233608e+001 tolerance= 4.52916316e+000
Iteration= 65 residual= 7.12937602e+000 tolerance= 4.90659342e+000
Iteration= 66 residual= 3.72832632e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75447170e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19369843e+000 tolerance= 5.13305158e+000
*
* Number of iterations : 68
* Convergence code : 3
* Final Residual Norm : 3.19369843187027e+000
* PETSC : Resolution time : 0.872702 seconds
Etienne
-----Original Message-----
From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley
Sent: Wednesday, September 24, 2008 7:15 PM
To: petsc-users at mcs.anl.gov
Subject: Re: Non-repeatability issue and difference between 2.3.0 and 2.3.3
On Wed, Sep 24, 2008 at 11:21 AM, Etienne PERCHAT
<etienne.perchat at transvalor.com> wrote:
> Dear Petsc users,
>
> I come again with my comparisons between v2.3.0 and v2.3.3p8.
>
> I face a non-repeatability issue with v2.3.3 that I didn't have with v2.3.0.
>
> I have read the exchanges made in March on a related subject but in my case
> it is at the first linear system solution that two successive runs differ.
>
> It happens when the number of processors used is greater than 2, even on a
> standard PC.
>
> I am solving symmetric MPIBAIJ systems with the Conjugate Residual method,
> preconditioned with ILU(1) and Block Jacobi between subdomains.
>
> This system results from an FE assembly on an unstructured mesh.
>
> I made all the runs using -log_summary and -ksp_truemonitor.
>
> Starting with the same initial matrix and RHS, each run using 2.3.3p8
> provides slightly different results while we obtain exactly the same
> solution with v2.3.0.
>
> With Petsc 2.3.3p8:
>
> Run1: Iteration= 68 residual= 3.19515221e+000 tolerance= 5.13305158e+000 0
>
> Run2: Iteration= 68 residual= 3.19588481e+000 tolerance= 5.13305158e+000 0
>
> Run3: Iteration= 68 residual= 3.19384417e+000 tolerance= 5.13305158e+000 0
>
> With Petsc 2.3.0:
>
> Run1: Iteration= 68 residual= 3.19369843e+000 tolerance= 5.13305158e+000 0
>
> Run2: Iteration= 68 residual= 3.19369843e+000 tolerance= 5.13305158e+000 0
> If I made a 4-proc run with a mesh partitioning such that any node could be
> located on more than 2 procs, I did not face the problem.
It is not clear whether you have verified that, on different runs, the
partitioning is exactly the same.
Matt
> I first thought about an MPI problem related to the order in which messages
> are received and then summed.
>
> But wouldn't it have been exactly the same with 2.3.0?
>
> Any tips/ideas?
>
> Thanks in advance.
>
> Best regards,
>
> Etienne Perchat
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener