[petsc-users] superlu_dist produces random results
Kong, Fande
fande.kong at inl.gov
Wed Nov 15 17:17:51 CST 2017
Thanks, Barry,
On Wed, Nov 15, 2017 at 4:04 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>
> Do the ASM runs for thousands of time steps produce the same final
> "physical results" as the MUMPS run for thousands of time steps, while with
> SuperLU you get very different "physical results"?
>
Let me add a bit more detail. The simulation with SuperLU may fail at a
certain time step, yet sometimes it runs successfully over the whole time
range. Whether it fails appears to be completely random.
We will try ASM and MUMPS.
Fande,
>
> Barry
>
>
> > On Nov 15, 2017, at 4:52 PM, Kong, Fande <fande.kong at inl.gov> wrote:
> >
> >
> >
> > On Wed, Nov 15, 2017 at 3:35 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> >
> > Since the convergence labeled linear does not converge to 14 digits in
> > one iteration, I am assuming you are using lagged preconditioning and/or
> > a lagged Jacobian?
> >
> > We are using Jacobian-free Newton, so the Jacobian is different from the
> > preconditioning matrix.
> >
> >
> > What happens if you do no lagging and solve each linear solve with a
> new LU factorization?
> >
> > We have the following results without using Jacobian-free Newton. Again,
> superlu_dist produces differences, while MUMPS gives the same results in
> terms of the residual norms.
> >
> >
> > Fande,
> >
> >
> > Superlu_dist run1:
> >
> > 0 Nonlinear |R| = 9.447423e+03
> > 0 Linear |R| = 9.447423e+03
> > 1 Linear |R| = 1.322285e-11
> > 1 Nonlinear |R| = 1.666987e-11
> >
> >
> > Superlu_dist run2:
> >
> > 0 Nonlinear |R| = 9.447423e+03
> > 0 Linear |R| = 9.447423e+03
> > 1 Linear |R| = 1.322171e-11
> > 1 Nonlinear |R| = 1.666977e-11
> >
> >
> > Superlu_dist run3:
> >
> > 0 Nonlinear |R| = 9.447423e+03
> > 0 Linear |R| = 9.447423e+03
> > 1 Linear |R| = 1.321964e-11
> > 1 Nonlinear |R| = 1.666959e-11
> >
> >
> > Superlu_dist run4:
> >
> > 0 Nonlinear |R| = 9.447423e+03
> > 0 Linear |R| = 9.447423e+03
> > 1 Linear |R| = 1.321978e-11
> > 1 Nonlinear |R| = 1.668688e-11
> >
> >
> > MUMPS run1:
> >
> > 0 Nonlinear |R| = 9.447423e+03
> > 0 Linear |R| = 9.447423e+03
> > 1 Linear |R| = 1.360637e-11
> > 1 Nonlinear |R| = 1.654334e-11
> >
> > MUMPS run2:
> >
> > 0 Nonlinear |R| = 9.447423e+03
> > 0 Linear |R| = 9.447423e+03
> > 1 Linear |R| = 1.360637e-11
> > 1 Nonlinear |R| = 1.654334e-11
> >
> > MUMPS run3:
> >
> > 0 Nonlinear |R| = 9.447423e+03
> > 0 Linear |R| = 9.447423e+03
> > 1 Linear |R| = 1.360637e-11
> > 1 Nonlinear |R| = 1.654334e-11
> >
> > MUMPS run4:
> >
> > 0 Nonlinear |R| = 9.447423e+03
> > 0 Linear |R| = 9.447423e+03
> > 1 Linear |R| = 1.360637e-11
> > 1 Nonlinear |R| = 1.654334e-11
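
To put the run-to-run differences quoted above in proportion, here is a minimal sketch that computes the spread of the final residual norms for each solver. The numbers are copied from the runs above; the helper name `spread` is only illustrative and is not part of PETSc or MOOSE.

```python
# Final residual norms copied from the four runs of each solver quoted above.
superlu_linear = [1.322285e-11, 1.322171e-11, 1.321964e-11, 1.321978e-11]
superlu_nonlinear = [1.666987e-11, 1.666977e-11, 1.666959e-11, 1.668688e-11]
mumps_linear = [1.360637e-11] * 4
mumps_nonlinear = [1.654334e-11] * 4
initial_norm = 9.447423e+03  # the "0 Nonlinear |R|" value shared by all runs


def spread(values):
    """Difference between the largest and smallest value (illustrative helper)."""
    return max(values) - min(values)


for label, values in [
    ("superlu_dist linear", superlu_linear),
    ("superlu_dist nonlinear", superlu_nonlinear),
    ("MUMPS linear", mumps_linear),
    ("MUMPS nonlinear", mumps_nonlinear),
]:
    s = spread(values)
    # Report the absolute spread and its size relative to the initial residual.
    print(f"{label}: spread={s:.3e}, relative to initial norm={s / initial_norm:.3e}")
```

The spread only measures how far the printed norms differ between runs; it says nothing about which solver is more accurate.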
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > Barry
> >
> >
> > > On Nov 15, 2017, at 4:24 PM, Kong, Fande <fande.kong at inl.gov> wrote:
> > >
> > >
> > >
> > > On Wed, Nov 15, 2017 at 2:52 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> > >
> > >
> > > > On Nov 15, 2017, at 3:36 PM, Kong, Fande <fande.kong at inl.gov> wrote:
> > > >
> > > > Hi Barry,
> > > >
> > > > Thanks for your reply. I was wondering why this happens only when we
> use superlu_dist. I am trying to understand the algorithm in superlu_dist.
> If we use ASM or MUMPS, we do not produce these differences.
> > > >
> > > > The differences are actually NOT meaningless. In fact, we have a
> > > > real transient application that exhibits this issue. When we run the
> > > > simulation with superlu_dist in parallel for thousands of time steps,
> > > > the final physics solution differs completely from one run to another.
> > > > Differences of that size are not acceptable. For a steady problem, the
> > > > difference may be meaningless, but it is significant for the transient
> > > > problem.
> > >
> > > I submit that the "physics solution" of all of these runs is equally
> > > right and equally wrong. If the solutions are very different due to a
> > > small perturbation, then something is wrong with the model or the
> > > integrator; I don't think you can blame the linear solver (see below).
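
As a toy illustration of the point above about small perturbations, the following sketch iterates two simple maps from starting values that differ by 1e-12. Neither map is the heat-conduction model from this thread; both are assumptions chosen only to contrast an iteration that damps a perturbation with one that amplifies it.

```python
def max_gap(step, x0, perturbation, n_steps):
    """Largest gap seen between two trajectories started `perturbation` apart."""
    a, b = x0, x0 + perturbation
    gap = 0.0
    for _ in range(n_steps):
        a, b = step(a), step(b)
        gap = max(gap, abs(a - b))
    return gap


def contracting(x):
    # Each step halves any difference between two trajectories.
    return 0.5 * x + 1.0


def chaotic(x):
    # Logistic map with r = 3.9, a standard example of sensitive dependence.
    return 3.9 * x * (1.0 - x)


stable_gap = max_gap(contracting, 0.3, 1e-12, 300)
chaotic_gap = max_gap(chaotic, 0.3, 1e-12, 300)
print(f"contracting map: largest gap = {stable_gap:.3e}")
print(f"chaotic map:     largest gap = {chaotic_gap:.3e}")
```

If a time integrator behaves like the second map, round-off-sized differences in any linear solver would eventually show up as visibly different final solutions.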
> > > >
> > > > This makes the solution not reproducible, and we cannot even set a
> > > > target solution in the test system because the solution is so
> > > > different from one run to another. I guess there may be a tiny bug in
> > > > superlu_dist or in the PETSc interface to superlu_dist.
> > >
> > > This is possible, but it is also possible that this is due to normal
> > > round-off inside SuperLU_DIST.
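
One plausible mechanism for run-to-run round-off differences, assumed here rather than confirmed anywhere in this thread, is that a parallel code accumulates sums in an order that varies between runs. Floating-point addition is not associative, so the same three numbers can give two slightly different totals:

```python
import math

values = [0.1, 0.2, 0.3]

# Same numbers, two different association orders.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(repr(left_to_right))
print(repr(right_to_left))
print("bitwise identical:", left_to_right == right_to_left)
print("equal to ~15 digits:", math.isclose(left_to_right, right_to_left, rel_tol=1e-15))
```

The two totals agree to about 16 significant digits but are not bitwise identical, which is the kind of difference that can then propagate through later iterations.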
> > >
> > > Since you have SuperLU_DIST inside a nonlinear iteration, it
> > > shouldn't really matter exactly how well SuperLU_DIST does. The nonlinear
> > > iteration essentially does defect correction for you; are you making sure
> > > that the nonlinear iteration always works for every time step? For
> > > example, confirm that SNESGetConvergedReason() is always positive.
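
The defect-correction remark above can be sketched on a scalar problem. In this minimal example, which is an assumption-laden toy and not PETSc code, the `jac_scale` parameter is a hypothetical stand-in for a slightly inaccurate factorization: each Newton correction is computed with a Jacobian that is off by 0.1%, yet the iteration still converges to the same root as long as it is run until the residual is small.

```python
def newton(f, df, x0, jac_scale=1.0, tol=1e-12, max_iter=50):
    """Scalar Newton iteration; jac_scale != 1 mimics an inexact linear solve."""
    x = x0
    for iteration in range(max_iter):
        residual = f(x)
        if abs(residual) <= tol:
            return x, iteration
        # The correction uses a deliberately perturbed derivative.
        x -= residual / (jac_scale * df(x))
    return x, max_iter


def f(x):
    return x**3 - 2.0


def df(x):
    return 3.0 * x**2


exact_root, exact_its = newton(f, df, 1.5)
inexact_root, inexact_its = newton(f, df, 1.5, jac_scale=1.001)
print(f"exact Jacobian:     root={exact_root:.15f} after {exact_its} iterations")
print(f"perturbed Jacobian: root={inexact_root:.15f} after {inexact_its} iterations")
```

The check on the residual plays the role that a positive converged reason plays in the nonlinear solver: it is what guarantees that an imperfect correction does not leak into the answer.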
> > >
> > > Definitely it could be something wrong on my side. But let us focus
> on the simple question first.
> > >
> > > To make the discussion a little simpler, let us go back to the simple
> > > problem (heat conduction). I want to understand why this happens with
> > > superlu_dist only. When we use ASM or MUMPS, why do we not see any
> > > differences from one run to another? I have posted the residual
> > > histories for MUMPS and ASM below. They show no differences in the
> > > residual norms. Does superlu_dist have higher round-off than other
> > > solvers?
> > >
> > >
> > >
> > > MUMPS run1:
> > >
> > > 0 Nonlinear |R| = 9.447423e+03
> > > 0 Linear |R| = 9.447423e+03
> > > 1 Linear |R| = 1.013384e-02
> > > 2 Linear |R| = 4.020993e-08
> > > 1 Nonlinear |R| = 1.404678e-02
> > > 0 Linear |R| = 1.404678e-02
> > > 1 Linear |R| = 4.836162e-08
> > > 2 Linear |R| = 7.055620e-14
> > > 2 Nonlinear |R| = 4.836392e-08
> > >
> > > MUMPS run2:
> > >
> > > 0 Nonlinear |R| = 9.447423e+03
> > > 0 Linear |R| = 9.447423e+03
> > > 1 Linear |R| = 1.013384e-02
> > > 2 Linear |R| = 4.020993e-08
> > > 1 Nonlinear |R| = 1.404678e-02
> > > 0 Linear |R| = 1.404678e-02
> > > 1 Linear |R| = 4.836162e-08
> > > 2 Linear |R| = 7.055620e-14
> > > 2 Nonlinear |R| = 4.836392e-08
> > >
> > > MUMPS run3:
> > >
> > > 0 Nonlinear |R| = 9.447423e+03
> > > 0 Linear |R| = 9.447423e+03
> > > 1 Linear |R| = 1.013384e-02
> > > 2 Linear |R| = 4.020993e-08
> > > 1 Nonlinear |R| = 1.404678e-02
> > > 0 Linear |R| = 1.404678e-02
> > > 1 Linear |R| = 4.836162e-08
> > > 2 Linear |R| = 7.055620e-14
> > > 2 Nonlinear |R| = 4.836392e-08
> > >
> > > MUMPS run4:
> > >
> > > 0 Nonlinear |R| = 9.447423e+03
> > > 0 Linear |R| = 9.447423e+03
> > > 1 Linear |R| = 1.013384e-02
> > > 2 Linear |R| = 4.020993e-08
> > > 1 Nonlinear |R| = 1.404678e-02
> > > 0 Linear |R| = 1.404678e-02
> > > 1 Linear |R| = 4.836162e-08
> > > 2 Linear |R| = 7.055620e-14
> > > 2 Nonlinear |R| = 4.836392e-08
> > >
> > >
> > >
> > > ASM run1:
> > >
> > > 0 Nonlinear |R| = 9.447423e+03
> > > 0 Linear |R| = 9.447423e+03
> > > 1 Linear |R| = 6.189229e+03
> > > 2 Linear |R| = 3.252487e+02
> > > 3 Linear |R| = 3.485174e+01
> > > 4 Linear |R| = 8.600695e+00
> > > 5 Linear |R| = 3.333942e+00
> > > 6 Linear |R| = 1.706112e+00
> > > 7 Linear |R| = 5.047863e-01
> > > 8 Linear |R| = 2.337297e-01
> > > 9 Linear |R| = 1.071627e-01
> > > 10 Linear |R| = 4.692177e-02
> > > 11 Linear |R| = 1.340717e-02
> > > 12 Linear |R| = 4.753951e-03
> > > 1 Nonlinear |R| = 2.320271e-02
> > > 0 Linear |R| = 2.320271e-02
> > > 1 Linear |R| = 4.367880e-03
> > > 2 Linear |R| = 1.407852e-03
> > > 3 Linear |R| = 6.036360e-04
> > > 4 Linear |R| = 1.867661e-04
> > > 5 Linear |R| = 8.760076e-05
> > > 6 Linear |R| = 3.260519e-05
> > > 7 Linear |R| = 1.435418e-05
> > > 8 Linear |R| = 4.532875e-06
> > > 9 Linear |R| = 2.439053e-06
> > > 10 Linear |R| = 7.998549e-07
> > > 11 Linear |R| = 2.428064e-07
> > > 12 Linear |R| = 4.766918e-08
> > > 13 Linear |R| = 1.713748e-08
> > > 2 Nonlinear |R| = 3.671573e-07
> > >
> > >
> > > ASM run2:
> > >
> > > 0 Nonlinear |R| = 9.447423e+03
> > > 0 Linear |R| = 9.447423e+03
> > > 1 Linear |R| = 6.189229e+03
> > > 2 Linear |R| = 3.252487e+02
> > > 3 Linear |R| = 3.485174e+01
> > > 4 Linear |R| = 8.600695e+00
> > > 5 Linear |R| = 3.333942e+00
> > > 6 Linear |R| = 1.706112e+00
> > > 7 Linear |R| = 5.047863e-01
> > > 8 Linear |R| = 2.337297e-01
> > > 9 Linear |R| = 1.071627e-01
> > > 10 Linear |R| = 4.692177e-02
> > > 11 Linear |R| = 1.340717e-02
> > > 12 Linear |R| = 4.753951e-03
> > > 1 Nonlinear |R| = 2.320271e-02
> > > 0 Linear |R| = 2.320271e-02
> > > 1 Linear |R| = 4.367880e-03
> > > 2 Linear |R| = 1.407852e-03
> > > 3 Linear |R| = 6.036360e-04
> > > 4 Linear |R| = 1.867661e-04
> > > 5 Linear |R| = 8.760076e-05
> > > 6 Linear |R| = 3.260519e-05
> > > 7 Linear |R| = 1.435418e-05
> > > 8 Linear |R| = 4.532875e-06
> > > 9 Linear |R| = 2.439053e-06
> > > 10 Linear |R| = 7.998549e-07
> > > 11 Linear |R| = 2.428064e-07
> > > 12 Linear |R| = 4.766918e-08
> > > 13 Linear |R| = 1.713748e-08
> > > 2 Nonlinear |R| = 3.671573e-07
> > >
> > > ASM run3:
> > >
> > > 0 Nonlinear |R| = 9.447423e+03
> > > 0 Linear |R| = 9.447423e+03
> > > 1 Linear |R| = 6.189229e+03
> > > 2 Linear |R| = 3.252487e+02
> > > 3 Linear |R| = 3.485174e+01
> > > 4 Linear |R| = 8.600695e+00
> > > 5 Linear |R| = 3.333942e+00
> > > 6 Linear |R| = 1.706112e+00
> > > 7 Linear |R| = 5.047863e-01
> > > 8 Linear |R| = 2.337297e-01
> > > 9 Linear |R| = 1.071627e-01
> > > 10 Linear |R| = 4.692177e-02
> > > 11 Linear |R| = 1.340717e-02
> > > 12 Linear |R| = 4.753951e-03
> > > 1 Nonlinear |R| = 2.320271e-02
> > > 0 Linear |R| = 2.320271e-02
> > > 1 Linear |R| = 4.367880e-03
> > > 2 Linear |R| = 1.407852e-03
> > > 3 Linear |R| = 6.036360e-04
> > > 4 Linear |R| = 1.867661e-04
> > > 5 Linear |R| = 8.760076e-05
> > > 6 Linear |R| = 3.260519e-05
> > > 7 Linear |R| = 1.435418e-05
> > > 8 Linear |R| = 4.532875e-06
> > > 9 Linear |R| = 2.439053e-06
> > > 10 Linear |R| = 7.998549e-07
> > > 11 Linear |R| = 2.428064e-07
> > > 12 Linear |R| = 4.766918e-08
> > > 13 Linear |R| = 1.713748e-08
> > > 2 Nonlinear |R| = 3.671573e-07
> > >
> > >
> > >
> > > ASM run4:
> > > 0 Nonlinear |R| = 9.447423e+03
> > > 0 Linear |R| = 9.447423e+03
> > > 1 Linear |R| = 6.189229e+03
> > > 2 Linear |R| = 3.252487e+02
> > > 3 Linear |R| = 3.485174e+01
> > > 4 Linear |R| = 8.600695e+00
> > > 5 Linear |R| = 3.333942e+00
> > > 6 Linear |R| = 1.706112e+00
> > > 7 Linear |R| = 5.047863e-01
> > > 8 Linear |R| = 2.337297e-01
> > > 9 Linear |R| = 1.071627e-01
> > > 10 Linear |R| = 4.692177e-02
> > > 11 Linear |R| = 1.340717e-02
> > > 12 Linear |R| = 4.753951e-03
> > > 1 Nonlinear |R| = 2.320271e-02
> > > 0 Linear |R| = 2.320271e-02
> > > 1 Linear |R| = 4.367880e-03
> > > 2 Linear |R| = 1.407852e-03
> > > 3 Linear |R| = 6.036360e-04
> > > 4 Linear |R| = 1.867661e-04
> > > 5 Linear |R| = 8.760076e-05
> > > 6 Linear |R| = 3.260519e-05
> > > 7 Linear |R| = 1.435418e-05
> > > 8 Linear |R| = 4.532875e-06
> > > 9 Linear |R| = 2.439053e-06
> > > 10 Linear |R| = 7.998549e-07
> > > 11 Linear |R| = 2.428064e-07
> > > 12 Linear |R| = 4.766918e-08
> > > 13 Linear |R| = 1.713748e-08
> > > 2 Nonlinear |R| = 3.671573e-07
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > >
> > > >
> > > > Fande,
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Nov 15, 2017 at 1:59 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> > > >
> > > > Meaningless differences
> > > >
> > > >
> > > > > On Nov 15, 2017, at 2:26 PM, Kong, Fande <fande.kong at inl.gov> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > There is a heat conduction problem. When superlu_dist is used as a
> preconditioner, we have random results from different runs. Is there a
> random algorithm in superlu_dist? If we use ASM or MUMPS as the
> preconditioner, we then don't have this issue.
> > > > >
> > > > > run 1:
> > > > >
> > > > > 0 Nonlinear |R| = 9.447423e+03
> > > > > 0 Linear |R| = 9.447423e+03
> > > > > 1 Linear |R| = 1.013384e-02
> > > > > 2 Linear |R| = 4.020995e-08
> > > > > 1 Nonlinear |R| = 1.404678e-02
> > > > > 0 Linear |R| = 1.404678e-02
> > > > > 1 Linear |R| = 5.104757e-08
> > > > > 2 Linear |R| = 7.699637e-14
> > > > > 2 Nonlinear |R| = 5.106418e-08
> > > > >
> > > > >
> > > > > run 2:
> > > > >
> > > > > 0 Nonlinear |R| = 9.447423e+03
> > > > > 0 Linear |R| = 9.447423e+03
> > > > > 1 Linear |R| = 1.013384e-02
> > > > > 2 Linear |R| = 4.020995e-08
> > > > > 1 Nonlinear |R| = 1.404678e-02
> > > > > 0 Linear |R| = 1.404678e-02
> > > > > 1 Linear |R| = 5.109913e-08
> > > > > 2 Linear |R| = 7.189091e-14
> > > > > 2 Nonlinear |R| = 5.111591e-08
> > > > >
> > > > > run 3:
> > > > >
> > > > > 0 Nonlinear |R| = 9.447423e+03
> > > > > 0 Linear |R| = 9.447423e+03
> > > > > 1 Linear |R| = 1.013384e-02
> > > > > 2 Linear |R| = 4.020995e-08
> > > > > 1 Nonlinear |R| = 1.404678e-02
> > > > > 0 Linear |R| = 1.404678e-02
> > > > > 1 Linear |R| = 5.104942e-08
> > > > > 2 Linear |R| = 7.465572e-14
> > > > > 2 Nonlinear |R| = 5.106642e-08
> > > > >
> > > > > run 4:
> > > > >
> > > > > 0 Nonlinear |R| = 9.447423e+03
> > > > > 0 Linear |R| = 9.447423e+03
> > > > > 1 Linear |R| = 1.013384e-02
> > > > > 2 Linear |R| = 4.020995e-08
> > > > > 1 Nonlinear |R| = 1.404678e-02
> > > > > 0 Linear |R| = 1.404678e-02
> > > > > 1 Linear |R| = 5.102730e-08
> > > > > 2 Linear |R| = 7.132220e-14
> > > > > 2 Nonlinear |R| = 5.104442e-08
> > > > >
> > > > > Solver details:
> > > > >
> > > > > SNES Object: 8 MPI processes
> > > > > type: newtonls
> > > > > maximum iterations=15, maximum function evaluations=10000
> > > > > tolerances: relative=1e-08, absolute=1e-11, solution=1e-50
> > > > > total number of linear solver iterations=4
> > > > > total number of function evaluations=7
> > > > > norm schedule ALWAYS
> > > > > SNESLineSearch Object: 8 MPI processes
> > > > > type: basic
> > > > > maxstep=1.000000e+08, minlambda=1.000000e-12
> > > > > > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
> > > > > maximum iterations=40
> > > > > KSP Object: 8 MPI processes
> > > > > type: gmres
> > > > > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> > > > > happy breakdown tolerance 1e-30
> > > > > maximum iterations=100, initial guess is zero
> > > > > tolerances: relative=1e-06, absolute=1e-50, divergence=10000.
> > > > > right preconditioning
> > > > > using UNPRECONDITIONED norm type for convergence test
> > > > > PC Object: 8 MPI processes
> > > > > type: lu
> > > > > out-of-place factorization
> > > > > tolerance for zero pivot 2.22045e-14
> > > > > matrix ordering: natural
> > > > > factor fill ratio given 0., needed 0.
> > > > > Factored matrix follows:
> > > > > Mat Object: 8 MPI processes
> > > > > type: superlu_dist
> > > > > rows=7925, cols=7925
> > > > > package used to perform factorization: superlu_dist
> > > > > total: nonzeros=0, allocated nonzeros=0
> > > > > > total number of mallocs used during MatSetValues calls =0
> > > > > SuperLU_DIST run parameters:
> > > > > Process grid nprow 4 x npcol 2
> > > > > Equilibrate matrix TRUE
> > > > > Matrix input mode 1
> > > > > Replace tiny pivots FALSE
> > > > > Use iterative refinement TRUE
> > > > > Processors in row 4 col partition 2
> > > > > Row permutation LargeDiag
> > > > > Column permutation METIS_AT_PLUS_A
> > > > > Parallel symbolic factorization FALSE
> > > > > Repeated factorization SamePattern
> > > > > linear system matrix followed by preconditioner matrix:
> > > > > Mat Object: 8 MPI processes
> > > > > type: mffd
> > > > > rows=7925, cols=7925
> > > > > Matrix-free approximation:
> > > > > err=1.49012e-08 (relative error in function evaluation)
> > > > > Using wp compute h routine
> > > > > Does not compute normU
> > > > > Mat Object: () 8 MPI processes
> > > > > type: mpiaij
> > > > > rows=7925, cols=7925
> > > > > total: nonzeros=63587, allocated nonzeros=63865
> > > > > total number of mallocs used during MatSetValues calls =0
> > > > > not using I-node (on process 0) routines
> > > > >
> > > > >
> > > > > Fande,
> > > > >
> > > > >
> > > >
> > > >
> >
> >
>
>