> Hi Matt,
> i don't understand how it can be the condition of the system.
> After all, the matrix as well as the RHS vector is EXACTLY the same between
> the 2 runs. This is what puzzles me so much.
The order of floating-point operations is different in the serial and
parallel cases. With a very ill-conditioned matrix, which it sounds like
you have, the reordering produces noticeably different residuals, even
though they both satisfy the error bound.
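
As a rough illustration of the reordering effect only (plain Python
floats, not PETSc; the values and the helper names serial_sum and
chunked_sum are made up for this sketch), summing the same numbers
as one chunk versus as two partial sums gives different answers:

```python
import math

def serial_sum(values):
    """Left-to-right accumulation, as a single process would do it."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, nchunks):
    """Sum each contiguous chunk separately, then add the partial sums.

    This only mimics the shape of a parallel reduction; it is not how
    any particular MPI implementation orders its operations.
    """
    size = (len(values) + nchunks - 1) // nchunks
    partials = [serial_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return serial_sum(partials)

# Hypothetical data with heavy cancellation.
values = [1e16, 1.0, -1e16, 1.0]

one_chunk = serial_sum(values)       # 1.0
two_chunks = chunked_sum(values, 2)  # 0.0
reference = math.fsum(values)        # 2.0, correctly rounded sum

print(one_chunk, two_chunks, reference)
```

The inputs are bit-for-bit identical in both cases; only the grouping
of the additions changes. In a Krylov solver the same thing happens in
every dot product and norm.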

Matt

> The only difference is whether i solve it using 1 core vs 4 cores.
> Of course i could be missing something.
> matt
> > This sounds like it has to do with the condition of your system, not any
> > parallel problem. Errors in the solution can only be reduced to about
> > (condition number) * (residual norm).
> > Matt
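
To make the bound quoted above concrete, here is a small sketch with a
made-up 2x2 symmetric system (not the poster's matrix; eps, d and all
names are assumptions chosen so the arithmetic is exact in double
precision). The residual is around 1e-14 while the error in the
solution is around 1e-4:

```python
import math

# Hypothetical symmetric 2x2 system with nearly parallel rows.
# Powers of two keep the products and sums below exact in doubles.
eps = 2.0 ** -33
A = [[1.0, 1.0],
     [1.0, 1.0 + eps]]
x_exact = [1.0, 1.0]
b = [2.0, 2.0 + eps]  # equals A times x_exact

def matvec(M, v):
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

def norm2(v):
    return math.sqrt(sum(vi * vi for vi in v))

# A candidate solution that is off by d in each component.
d = 2.0 ** -13
x_approx = [1.0 + d, 1.0 - d]

residual = [bi - ai for bi, ai in zip(b, matvec(A, x_approx))]
error = [xa - xe for xa, xe in zip(x_approx, x_exact)]
residual_norm = norm2(residual)
error_norm = norm2(error)

# Eigenvalues of the symmetric 2x2 matrix from trace and determinant.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
lam_max = 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))
lam_min = det / lam_max
cond = lam_max / lam_min

# For symmetric positive definite A: ||error|| <= ||residual|| / lam_min.
bound = residual_norm / lam_min

print("condition number:", cond)
print("residual norm:   ", residual_norm)
print("error norm:      ", error_norm)
print("error bound:     ", bound)
```

Two iterates that both meet a residual tolerance of 1e-12 on a system
like this could still differ from each other at the 1e-4 level.
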
> > > i have a problem for which i am not exactly sure about what to do.
> > > I set up a simple 2D rectangular domain and decompose it into four equal
> > > boxes. I then build the PETSc matrix based on this layout as well as the
> > > corresponding RHS vector.
> > > I print out the matrix and RHS vector right before my KSPSolve call, and
> > > right after that call i print out the solution vector 'x'.
> > > I do this for 2 runs.
> > > 1) 1 processor
> > > 2) 4 processors.
> > > For both runs i do a difference (i.e. on the output files using diff) on
> > > all 3 quantities (the matrix, the RHS vector and the solution vector).
> > > The 'diff' command reports no difference between the files for the matrix
> > > and RHS vector.
> > > However, the solution vector is different between the 2 runs. How
> > > different depends a little on what precond/solver combination i use and
> > > the tolerances.
> > > However, for example for BJacobi/GMRES with reltol=abstol=1e-12 the
> > > vector element with the maximum difference is on the order 1e-05. This is
> > > only after the first timestep. My problem has some nonlinearity to it
> > > such that this will become a problem later on.
> > > The worst difference i have seen is if i use hypre's euclid. It was on
> > > the order of 1e-02.
> > >
> > > So my question is whether someone has an idea why this is happening (i
> > > suppose it is related to the parallel communication) and if there is a way
> > > to fix it.
> > > thanks
> > > matt
