[petsc-users] roundoff differences
Boyce Griffith
griffith at cims.nyu.edu
Thu Aug 18 20:17:14 CDT 2011
On 8/18/11 12:49 PM, Kevin Green wrote:
> On that note, do PETSc routines take into account ideas such as those in
> that paper for matrix-matrix, matrix-vector, etc. operations? I assume
> that by default they don't (since they take more time), but are
> there flags/functions that do?
One source of this lack of reproducibility is parallel accumulation,
e.g., calling VecSetValues() with ADD_VALUES with input values coming
from different processors. If efficiency is not an issue, it is not too
hard to modify VecAssemblyBegin/End so that the final accumulation is
done in extended-precision arithmetic (e.g., with the QD library) or
with a more accurate summation algorithm, such as the one sketched below.
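
For illustration, here is a minimal, self-contained C sketch of Kahan
(compensated) summation, one example of such a summation algorithm. The
values are made up and this is not PETSc source code; the point is that
the naive sum depends on the order of the terms, while for these values
the compensated sum does not:

#include <stdio.h>

/* Kahan compensated summation: carry the low-order bits lost in each
   addition along in a correction term c.  (Compile without value-unsafe
   optimizations such as -ffast-math, which would optimize c away.) */
static double kahan_sum(const double *x, int n)
{
  double sum = 0.0, c = 0.0;
  for (int i = 0; i < n; i++) {
    double y = x[i] - c;   /* apply the correction from the last step */
    double t = sum + y;    /* low-order bits of y may be lost here... */
    c = (t - sum) - y;     /* ...recover them algebraically */
    sum = t;
  }
  return sum;
}

int main(void)
{
  /* The same six values in two different orders; the exact sum is 4. */
  double a[6] = {1e16, 1.0, 1.0, 1.0, 1.0, -1e16};
  double b[6] = {-1e16, 1e16, 1.0, 1.0, 1.0, 1.0};
  double na = 0.0, nb = 0.0;
  for (int i = 0; i < 6; i++) { na += a[i]; nb += b[i]; }
  printf("naive: %g vs %g\n", na, nb);                           /* 0 vs 4 */
  printf("kahan: %g vs %g\n", kahan_sum(a, 6), kahan_sum(b, 6)); /* 4 vs 4 */
  return 0;
}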
-- Boyce
>
> Kevin
>
> On Thu, Aug 18, 2011 at 4:18 AM, Henning Sauerland <uerland at gmail.com> wrote:
>
> Here is an interesting paper on the topic of reproducibility of
> parallel summation:
>
> Y. He and C. H. Q. Ding. Using accurate arithmetics to improve
> numerical reproducibility and stability in parallel applications. J.
> Supercomput., 18:259–277, 2001.
>
>
> Henning
>
>
> On 18.08.2011, at 09:49, Harald Pfeiffer wrote:
>
>> Hello,
>>
>> We use PETSc to solve the nonlinear system arising from the
>> pseudo-spectral discretization of certain elliptic PDEs in
>> Einstein's equations. When running the same job multiple times on
>> the same number of processors on the same workstation, we find
>> roundoff differences. Is this expected, e.g., because MPI
>> reduction calls may behave differently depending on the load of
>> the machine? Or should we be concerned and investigate further?
>>
>> Thanks,
>> Harald
>>
>> -------- Original Message --------
>> Subject: Re: Quick question about derivatives in SpEC
>> Date: Tue, 16 Aug 2011 09:45:27 -0400
>> From: Gregory B. Cook <cookgb at wfu.edu>
>> To: Harald Pfeiffer <pfeiffer at cita.utoronto.ca>
>> CC: Larry Kidder <kidder at astro.cornell.edu>, Mark Scheel
>> <scheel at tapir.caltech.edu>
>>
>> Hi Harald,
>>
>> All of the tests I was doing were on the same 8 cores on my office
>> workstation. It is running Ubuntu 11 and uses the default OpenMPI
>> communication approach. To make sure it wasn't something I was doing, I
>> ran two elliptic solves of the ExtendedConformalThinSandwich() volume
>> terms. Here are the outputs of snes.dat for the different levels:
>>
>> Run 1 Run 2
>> Six0/snes.dat
>> 0 7.3385297958166698 0 7.3385297958166698
>> 1 5.1229060531500723 1 5.1229060531500723
>> 2 0.32616852761238285 2 0.32616852761238285
>> 3 0.012351417186533147 3 0.012351417186800266 <*****
>> 4 9.7478354935351385e-06 4 9.7478351511500114e-06
>> Six1/snes.dat
>> 0 0.13405558402489681 0 0.13405558402540407
>> 1 0.00068002100028642610 1 0.00068002089609322440
>> 2 6.8764357250058596e-08 2 6.3738394418031232e-08
>> Six2/snes.dat
>> 0 0.0063028244769771681 0 0.0063028058475922306
>> 1 1.4538921141731714e-06 1 1.4545032695605256e-06
>> Six3/snes.dat
>> 0 0.00061476105672438877 0 0.00061476093499534406
>> 1 6.0267672358059814e-08 1 5.4897793428123648e-08
>> Six4/snes.dat
>> 0 0.00053059501859595651 0 0.00053059591479892143
>> 1 4.8003269489205705e-08 1 4.8079799390886591e-08
>> Six5/snes.dat
>> 0 3.6402372419546429e-05 0 3.6402169997838670e-05
>> 1 5.3117360561476420e-09 1 5.2732089856727503e-09
>>
>> The differences are clearly at the level of roundoff, but it is
>> "strange" that you cannot reproduce identical results.
>>
>> I've attached all of the .input files for this run in case you want to
>> try to reproduce my findings.
>>
>> Greg
>>
>> On 08/16/2011 06:21 AM, Harald Pfeiffer wrote:
>> > Hi Greg,
>> >
>> > some thoughts:
>> >
>> > PETSc uses standard MPI reduction calls, which may give results that
>> > differ by roundoff. We have definitely seen this happen for
>> > different numbers of processes, but perhaps it also happens depending
>> > on where in a cluster your jobs run (different network topology
>> > depending on whether all processors are on the same rack or split
>> > among racks; dynamic load-balancing of network communication).
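>> >
>> > As a small illustration (plain C with made-up per-rank partial sums,
>> > not anything MPI actually executes): two different reduction orders
>> > over the same values can round differently, because floating-point
>> > addition is not associative.
>> >
>> > #include <stdio.h>
>> >
>> > int main(void)
>> > {
>> >   double part[4] = {1e16, 1.0, -1e16, 1.0}; /* per-rank partial sums */
>> >   double tree = (part[0] + part[1]) + (part[2] + part[3]); /* pairwise */
>> >   double line = ((part[0] + part[1]) + part[2]) + part[3]; /* in order */
>> >   printf("%g vs %g\n", tree, line); /* prints "0 vs 1"; exact sum is 2 */
>> >   return 0;
>> > }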
>> >
>> > You might want to try reserving a few nodes interactively, and then
>> > running the elliptic solver multiple times on this same set of nodes.
>> >
>> > The Mover does indeed do load-balanced interpolation, but the MPI
>> > communication involved should not prevent identical results.
>> >
>> > Once there are roundoff differences, they are typically amplified during
>> > a PETSc linear solve: the iterative algorithm takes different paths
>> > toward the solution, and a difference of 1e-10 doesn't seem excessive.
>> >
>> > Harald
>> >
>> > PS: Preconditioning is done differently on-processor and off-processor,
>> > and therefore depends highly on the processor count. So if you were to
>> > change the number of processors, the iterative solve would proceed very
>> > differently.
>> >
>> > On 8/15/11 10:27 PM, Gregory B. Cook wrote:
>> >> Hi Larry,
>> >>
>> >> I ran a check using the default ExtendedConformalThinSandwich() volume
>> >> terms and this also produced roundoff-error differences between
>> >> identical runs, so I feel better about that. I am using the same
>> >> number of processors, but if there is any kind of dynamic load
>> >> balancing for interpolation, communication, etc., then I can see that
>> >> different runs might end up using different boundary communications.
>> >> Maybe that's all there is to it?
>> >>
>> >> Greg
>> >>
>> >> On 08/15/2011 04:16 PM, Larry Kidder wrote:
>> >>> Hi Greg,
>> >>>
>> >>> Harald is traveling, so I am not sure when he will answer.
>> >>> My vague recollection is that there is something about how PETSc does
>> >>> preconditioning in parallel that leads to it not producing the same result,
>> >>> but I don't recall if this happens in general or only if you change the
>> >>> distribution of processes.
>> >>>
>> >>> Larry
>> >>>
>> >>> Gregory B. Cook wrote:
>> >>>> Hi Guys,
>> >>>>
>> >>>> I have a follow-up question that may be tangentially related to my
>> >>>> original question about derivatives. This one is targeted at Harald.
>> >>>>
>> >>>> When I run a version of my code where the very small errors in the
>> >>>> derivative of the metric are not present (I code them in differently),
>> >>>> I find that running the exact same input files successively does not
>> >>>> produce exactly the same results. This is a multi-level elliptic solve
>> >>>> on a complex domain for binary black holes. On Level-0, the
>> >>>> results returned in snes.dat are identical. On Level-1, the first
>> >>>> and second SNES norms are identical, but the third differs. After
>> >>>> this, all SNES norms differ.
>> >>>>
>> >>>> Is this to be expected? Does PETSc not produce identical results on
>> >>>> consecutive solves with the same starting point? Is there something in
>> >>>> the MPI communication that means that the results should differ? The
>> >>>> differences start at the order of 10^-13, but grow by the 6th level to
>> >>>> be of order 10^-10.
>> >>>>
>> >>>> Greg
>> >>>>
>> >>>> On 08/15/2011 01:02 PM, Larry Kidder wrote:
>> >>>>> Hi Greg,
>> >>>>>
>> >>>>> Did you compute the norm of the metric itself?
>> >>>>> What domain did you use?
>> >>>>>
>> >>>>> Larry
>> >>>>>
>> >>>>> Gregory B. Cook wrote:
>> >>>>>> Hi Guys,
>> >>>>>>
>> >>>>>> I was doing a simple test as part of debugging some code I'm writing.
>> >>>>>> I ended up placing the following relevant lines of code into the
>> >>>>>> EllipticItems.input and EllipticObservers.input files:
>> >>>>>>
>> >>>>>> ---EllipticItems.input---
>> >>>>>> EvaluateMatrixFormula(Output=InvConformalMetric; Dim=3; Symm=11;
>> >>>>>> M[0,0]=1; M[1,1]=1; M[2,2]=1),
>> >>>>>> FirstDeriv(Input=InvConformalMetric; Output=dInvConformalMetric),
>> >>>>>> SecondDeriv(Input=InvConformalMetric; Output=ddInvConformalMetric),
>> >>>>>>
>> >>>>>> FlattenDeriv(Input=dInvConformalMetric;
>> >>>>>> Output=fdInvConformalMetric;DerivPosition=Last),
>> >>>>>> FlattenDeriv(Input=ddInvConformalMetric;
>> >>>>>> Output=fddInvConformalMetric;DerivPosition=Last),
>> >>>>>>
>> >>>>>> ---EllipticObservers.input---
>> >>>>>> NormOfTensor(Input=fdInvConformalMetric, fddInvConformalMetric;
>> >>>>>> Filename=dInvCM_L2.dat;Op=L2; MetricForTensors=None),
>> >>>>>> NormOfTensor(Input=fdInvConformalMetric, fddInvConformalMetric;
>> >>>>>> Filename=dInvCM_Linf.dat;Op=Linf; MetricForTensors=None),
>> >>>>>>
>> >>>>>>
>> >>>>>> The odd thing is that the norms that I get out are not exactly zero.
>> >>>>>> They are very small, but I'm taking the first and second derivatives
>> >>>>>> of the identity matrix, so I would expect them to evaluate to exactly
>> >>>>>> zero. The fact that they don't leads me to think that there is
>> >>>>>> something wrong either in my code or in how I have written the input
>> >>>>>> files.
>> >>>>>>
>> >>>>>> Should these derivatives evaluate to exactly zero?
>> >>>>>>
>> >>>>>> Greg