[petsc-users] sources of floating point randomness in JFNK in serial
Mark Lohry
mlohry at gmail.com
Fri May 5 16:02:21 CDT 2023
are there any safe subsets of -march=whatever? i had it on to take
advantage of simd ops on avx512 chips but never looked so close at the
exact results.
On Fri, May 5, 2023 at 4:58 PM Barry Smith <bsmith at petsc.dev> wrote:
>
>
> On May 5, 2023, at 4:45 PM, Mark Lohry <mlohry at gmail.com> wrote:
>
> wow. leaving -O3 and turning off -march=native seems to have made it
> repeatable. this is on an avx2 cpu if it matters.
>
> out-of-order instructions may be performed thus, two runs may have
>> different order of operations
>>
>>
> this is terrifying if true. the source code path is exactly the same every
> time but the cpu does different things?
>
>
> Sure. And you will see more of it in the future, not less. It is not so
> much the CPU does different things each time but that the same things
> happen in a different order (and different order for floating point
> arithmetic means different results).
>
>
> On Fri, May 5, 2023 at 10:55 AM Barry Smith <bsmith at petsc.dev> wrote:
>
>>
>> Mark,
>>
>> Thank you. You do have aggressive optimizations: -O3 -march=native,
>> which means out-of-order instructions may be performed thus, two runs may
>> have different order of operations and possibly different round-off values.
>>
>> You could try turning off all of this with -O0 for an experiment and
>> see what happens. My guess is that you will see much smaller differences in
>> the residuals.
>>
>> Barry
>>
>>
>> On May 5, 2023, at 8:11 AM, Mark Lohry <mlohry at gmail.com> wrote:
>>
>>
>>
>> On Thu, May 4, 2023 at 9:51 PM Barry Smith <bsmith at petsc.dev> wrote:
>>
>>>
>>> Send configure.log
>>>
>>>
>>> On May 4, 2023, at 5:35 PM, Mark Lohry <mlohry at gmail.com> wrote:
>>>
>>> Sure, but why only once and why save to disk? Why not just use that
>>>> computed approximate Jacobian at each Newton step to drive the Newton
>>>> solves along for a bunch of time steps?
>>>
>>>
>>> Ah I get what you mean. Okay I did three newton steps with the same LHS,
>>> with a few repeated manual tests. 3 out of 4 times i got the same exact
>>> history. is it in the realm of possibility that a hardware error could
>>> cause something this subtle, bad memory bit or something?
>>>
>>> 2 runs of 3 newton solves below, ever-so-slightly different.
>>>
>>>
>>> 0 SNES Function norm 3.424003312857e+04
>>> 0 KSP Residual norm 3.424003312857e+04
>>> 1 KSP Residual norm 2.886124328003e+04
>>> 2 KSP Residual norm 2.504664994246e+04
>>> 3 KSP Residual norm 2.104615835161e+04
>>> 4 KSP Residual norm 1.938102896632e+04
>>> 5 KSP Residual norm 1.793774642408e+04
>>> 6 KSP Residual norm 1.671392566980e+04
>>> 7 KSP Residual norm 1.501504103873e+04
>>> 8 KSP Residual norm 1.366362900747e+04
>>> 9 KSP Residual norm 1.240398500429e+04
>>> 10 KSP Residual norm 1.156293733914e+04
>>> 11 KSP Residual norm 1.066296477958e+04
>>> 12 KSP Residual norm 9.835601966950e+03
>>> 13 KSP Residual norm 9.017480191491e+03
>>> 14 KSP Residual norm 8.415336139780e+03
>>> 15 KSP Residual norm 7.807497808435e+03
>>> 16 KSP Residual norm 7.341703768294e+03
>>> 17 KSP Residual norm 6.979298049282e+03
>>> 18 KSP Residual norm 6.521277772081e+03
>>> 19 KSP Residual norm 6.174842408773e+03
>>> 20 KSP Residual norm 5.889819665003e+03
>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 1 SNES Function norm 1.000525348433e+04
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>> type: newtonls
>>> maximum iterations=1, maximum function evaluations=-1
>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>> total number of linear solver iterations=20
>>> total number of function evaluations=2
>>> norm schedule ALWAYS
>>> Jacobian is never rebuilt
>>> Jacobian is built using finite differences with coloring
>>> SNESLineSearch Object: 1 MPI process
>>> type: basic
>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>> lambda=1.000000e-08
>>> maximum iterations=40
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 0 SNES Function norm 1.000525348433e+04
>>> 0 KSP Residual norm 1.000525348433e+04
>>> 1 KSP Residual norm 7.908741564765e+03
>>> 2 KSP Residual norm 6.825263536686e+03
>>> 3 KSP Residual norm 6.224930664968e+03
>>> 4 KSP Residual norm 6.095547180532e+03
>>> 5 KSP Residual norm 5.952968230430e+03
>>> 6 KSP Residual norm 5.861251998116e+03
>>> 7 KSP Residual norm 5.712439327755e+03
>>> 8 KSP Residual norm 5.583056913266e+03
>>> 9 KSP Residual norm 5.461768804626e+03
>>> 10 KSP Residual norm 5.351937611098e+03
>>> 11 KSP Residual norm 5.224288337578e+03
>>> 12 KSP Residual norm 5.129863847081e+03
>>> 13 KSP Residual norm 5.010818237218e+03
>>> 14 KSP Residual norm 4.907162936199e+03
>>> 15 KSP Residual norm 4.789564773955e+03
>>> 16 KSP Residual norm 4.695173370720e+03
>>> 17 KSP Residual norm 4.584070962171e+03
>>> 18 KSP Residual norm 4.483061424742e+03
>>> 19 KSP Residual norm 4.373384070745e+03
>>> 20 KSP Residual norm 4.260704657592e+03
>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 1 SNES Function norm 4.662386014882e+03
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>> type: newtonls
>>> maximum iterations=1, maximum function evaluations=-1
>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>> total number of linear solver iterations=20
>>> total number of function evaluations=2
>>> norm schedule ALWAYS
>>> Jacobian is never rebuilt
>>> Jacobian is built using finite differences with coloring
>>> SNESLineSearch Object: 1 MPI process
>>> type: basic
>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>> lambda=1.000000e-08
>>> maximum iterations=40
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 0 SNES Function norm 4.662386014882e+03
>>> 0 KSP Residual norm 4.662386014882e+03
>>> 1 KSP Residual norm 4.408316259864e+03
>>> 2 KSP Residual norm 4.184867769829e+03
>>> 3 KSP Residual norm 4.079091244351e+03
>>> 4 KSP Residual norm 4.009247390166e+03
>>> 5 KSP Residual norm 3.928417371428e+03
>>> 6 KSP Residual norm 3.865152075780e+03
>>> 7 KSP Residual norm 3.795606446033e+03
>>> 8 KSP Residual norm 3.735294554158e+03
>>> 9 KSP Residual norm 3.674393726487e+03
>>> 10 KSP Residual norm 3.617795166786e+03
>>> 11 KSP Residual norm 3.563807982274e+03
>>> 12 KSP Residual norm 3.512269444921e+03
>>> 13 KSP Residual norm 3.455110223236e+03
>>> 14 KSP Residual norm 3.407141247372e+03
>>> 15 KSP Residual norm 3.356562415982e+03
>>> 16 KSP Residual norm 3.312720047685e+03
>>> 17 KSP Residual norm 3.263690150810e+03
>>> 18 KSP Residual norm 3.219359862444e+03
>>> 19 KSP Residual norm 3.173500955995e+03
>>> 20 KSP Residual norm 3.127528790155e+03
>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 1 SNES Function norm 3.186752172556e+03
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>> type: newtonls
>>> maximum iterations=1, maximum function evaluations=-1
>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>> total number of linear solver iterations=20
>>> total number of function evaluations=2
>>> norm schedule ALWAYS
>>> Jacobian is never rebuilt
>>> Jacobian is built using finite differences with coloring
>>> SNESLineSearch Object: 1 MPI process
>>> type: basic
>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>> lambda=1.000000e-08
>>> maximum iterations=40
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>>
>>>
>>>
>>> 0 SNES Function norm 3.424003312857e+04
>>> 0 KSP Residual norm 3.424003312857e+04
>>> 1 KSP Residual norm 2.886124328003e+04
>>> 2 KSP Residual norm 2.504664994221e+04
>>> 3 KSP Residual norm 2.104615835130e+04
>>> 4 KSP Residual norm 1.938102896610e+04
>>> 5 KSP Residual norm 1.793774642406e+04
>>> 6 KSP Residual norm 1.671392566981e+04
>>> 7 KSP Residual norm 1.501504103854e+04
>>> 8 KSP Residual norm 1.366362900726e+04
>>> 9 KSP Residual norm 1.240398500414e+04
>>> 10 KSP Residual norm 1.156293733914e+04
>>> 11 KSP Residual norm 1.066296477972e+04
>>> 12 KSP Residual norm 9.835601967036e+03
>>> 13 KSP Residual norm 9.017480191500e+03
>>> 14 KSP Residual norm 8.415336139732e+03
>>> 15 KSP Residual norm 7.807497808414e+03
>>> 16 KSP Residual norm 7.341703768300e+03
>>> 17 KSP Residual norm 6.979298049244e+03
>>> 18 KSP Residual norm 6.521277772042e+03
>>> 19 KSP Residual norm 6.174842408713e+03
>>> 20 KSP Residual norm 5.889819664983e+03
>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 1 SNES Function norm 1.000525348435e+04
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>> type: newtonls
>>> maximum iterations=1, maximum function evaluations=-1
>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>> total number of linear solver iterations=20
>>> total number of function evaluations=2
>>> norm schedule ALWAYS
>>> Jacobian is never rebuilt
>>> Jacobian is built using finite differences with coloring
>>> SNESLineSearch Object: 1 MPI process
>>> type: basic
>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>> lambda=1.000000e-08
>>> maximum iterations=40
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 0 SNES Function norm 1.000525348435e+04
>>> 0 KSP Residual norm 1.000525348435e+04
>>> 1 KSP Residual norm 7.908741565645e+03
>>> 2 KSP Residual norm 6.825263536988e+03
>>> 3 KSP Residual norm 6.224930664967e+03
>>> 4 KSP Residual norm 6.095547180474e+03
>>> 5 KSP Residual norm 5.952968230397e+03
>>> 6 KSP Residual norm 5.861251998127e+03
>>> 7 KSP Residual norm 5.712439327726e+03
>>> 8 KSP Residual norm 5.583056913167e+03
>>> 9 KSP Residual norm 5.461768804526e+03
>>> 10 KSP Residual norm 5.351937611030e+03
>>> 11 KSP Residual norm 5.224288337536e+03
>>> 12 KSP Residual norm 5.129863847028e+03
>>> 13 KSP Residual norm 5.010818237161e+03
>>> 14 KSP Residual norm 4.907162936143e+03
>>> 15 KSP Residual norm 4.789564773923e+03
>>> 16 KSP Residual norm 4.695173370709e+03
>>> 17 KSP Residual norm 4.584070962145e+03
>>> 18 KSP Residual norm 4.483061424714e+03
>>> 19 KSP Residual norm 4.373384070713e+03
>>> 20 KSP Residual norm 4.260704657576e+03
>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 1 SNES Function norm 4.662386014874e+03
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>> type: newtonls
>>> maximum iterations=1, maximum function evaluations=-1
>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>> total number of linear solver iterations=20
>>> total number of function evaluations=2
>>> norm schedule ALWAYS
>>> Jacobian is never rebuilt
>>> Jacobian is built using finite differences with coloring
>>> SNESLineSearch Object: 1 MPI process
>>> type: basic
>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>> lambda=1.000000e-08
>>> maximum iterations=40
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 0 SNES Function norm 4.662386014874e+03
>>> 0 KSP Residual norm 4.662386014874e+03
>>> 1 KSP Residual norm 4.408316259834e+03
>>> 2 KSP Residual norm 4.184867769891e+03
>>> 3 KSP Residual norm 4.079091244367e+03
>>> 4 KSP Residual norm 4.009247390184e+03
>>> 5 KSP Residual norm 3.928417371457e+03
>>> 6 KSP Residual norm 3.865152075802e+03
>>> 7 KSP Residual norm 3.795606446041e+03
>>> 8 KSP Residual norm 3.735294554160e+03
>>> 9 KSP Residual norm 3.674393726485e+03
>>> 10 KSP Residual norm 3.617795166775e+03
>>> 11 KSP Residual norm 3.563807982249e+03
>>> 12 KSP Residual norm 3.512269444873e+03
>>> 13 KSP Residual norm 3.455110223193e+03
>>> 14 KSP Residual norm 3.407141247334e+03
>>> 15 KSP Residual norm 3.356562415949e+03
>>> 16 KSP Residual norm 3.312720047652e+03
>>> 17 KSP Residual norm 3.263690150782e+03
>>> 18 KSP Residual norm 3.219359862425e+03
>>> 19 KSP Residual norm 3.173500955997e+03
>>> 20 KSP Residual norm 3.127528790156e+03
>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>> 1 SNES Function norm 3.186752172503e+03
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>> type: newtonls
>>> maximum iterations=1, maximum function evaluations=-1
>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>> total number of linear solver iterations=20
>>> total number of function evaluations=2
>>> norm schedule ALWAYS
>>> Jacobian is never rebuilt
>>> Jacobian is built using finite differences with coloring
>>> SNESLineSearch Object: 1 MPI process
>>> type: basic
>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>> lambda=1.000000e-08
>>> maximum iterations=40
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning
>>> using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>> type: none
>>> linear system matrix = precond matrix:
>>> Mat Object: 1 MPI process
>>> type: seqbaij
>>> rows=16384, cols=16384, bs=16
>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>> total number of mallocs used during MatSetValues calls=0
>>> block size is 16
>>>
>>> On Thu, May 4, 2023 at 5:22 PM Matthew Knepley <knepley at gmail.com>
>>> wrote:
>>>
>>>> On Thu, May 4, 2023 at 5:03 PM Mark Lohry <mlohry at gmail.com> wrote:
>>>>
>>>>> Do you get different results (in different runs) without
>>>>>> -snes_mf_operator? So just using an explicit matrix?
>>>>>
>>>>>
>>>>> Unfortunately I don't have an explicit matrix available for this,
>>>>> hence the MFFD/JFNK.
>>>>>
>>>>
>>>> I don't mean the actual matrix, I mean a representative matrix.
>>>>
>>>>
>>>>>
>>>>>> (Note: I am not convinced there is even a problem and think it may
>>>>>> be simply different order of floating point operations in different runs.)
>>>>>>
>>>>>
>>>>> I'm not convinced either, but running explicit RK for 10,000
>>>>> iterations i get exactly the same results every time so i'm fairly
>>>>> confident it's not the residual evaluation.
>>>>> How would there be a different order of floating point ops in
>>>>> different runs in serial?
>>>>>
>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running
>>>>>> that solver with a sparse matrix. This would give me confidence
>>>>>> that nothing in the solver is variable.
>>>>>>
>>>>>> I could do the sparse finite difference jacobian once, save it to
>>>>> disk, and then use that system each time.
>>>>>
>>>>
>>>> Yes. That would work.
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>>
>>>>> On Thu, May 4, 2023 at 4:57 PM Matthew Knepley <knepley at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Thu, May 4, 2023 at 4:44 PM Mark Lohry <mlohry at gmail.com> wrote:
>>>>>>
>>>>>>> Is your code valgrind clean?
>>>>>>>>
>>>>>>>
>>>>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not
>>>>>>> using anything uninitialized.
>>>>>>>
>>>>>>>
>>>>>>>> We can try and test this. Replace your MatMFFD with an actual
>>>>>>>> matrix and run. Do you see any variability?
>>>>>>>>
>>>>>>>
>>>>>>> I think I did what you're asking. I have -snes_mf_operator set, and
>>>>>>> then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where
>>>>>>> diag_ones is a matrix with ones on the diagonal. Two runs below, still with
>>>>>>> differences but sometimes identical.
>>>>>>>
>>>>>>
>>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running
>>>>>> that solver with a sparse matrix. This would give me confidence
>>>>>> that nothing in the solver is variable.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>>
>>>>>>> 0 SNES Function norm 3.424003312857e+04
>>>>>>> 0 KSP Residual norm 3.424003312857e+04
>>>>>>> 1 KSP Residual norm 2.871734444536e+04
>>>>>>> 2 KSP Residual norm 2.490276930242e+04
>>>>>>> 3 KSP Residual norm 2.131675872968e+04
>>>>>>> 4 KSP Residual norm 1.973129814235e+04
>>>>>>> 5 KSP Residual norm 1.832377856317e+04
>>>>>>> 6 KSP Residual norm 1.716783617436e+04
>>>>>>> 7 KSP Residual norm 1.583963149542e+04
>>>>>>> 8 KSP Residual norm 1.482272170304e+04
>>>>>>> 9 KSP Residual norm 1.380312106742e+04
>>>>>>> 10 KSP Residual norm 1.297793480658e+04
>>>>>>> 11 KSP Residual norm 1.208599123244e+04
>>>>>>> 12 KSP Residual norm 1.137345655227e+04
>>>>>>> 13 KSP Residual norm 1.059676909366e+04
>>>>>>> 14 KSP Residual norm 1.003823862398e+04
>>>>>>> 15 KSP Residual norm 9.425879221354e+03
>>>>>>> 16 KSP Residual norm 8.954805890038e+03
>>>>>>> 17 KSP Residual norm 8.592372470456e+03
>>>>>>> 18 KSP Residual norm 8.060707175821e+03
>>>>>>> 19 KSP Residual norm 7.782057728723e+03
>>>>>>> 20 KSP Residual norm 7.449686095424e+03
>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>> KSP Object: 1 MPI process
>>>>>>> type: gmres
>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>>> Orthogonalization with no iterative refinement
>>>>>>> happy breakdown tolerance 1e-30
>>>>>>> maximum iterations=20, initial guess is zero
>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>>>>> left preconditioning
>>>>>>> using PRECONDITIONED norm type for convergence test
>>>>>>> PC Object: 1 MPI process
>>>>>>> type: none
>>>>>>> linear system matrix followed by preconditioner matrix:
>>>>>>> Mat Object: 1 MPI process
>>>>>>> type: mffd
>>>>>>> rows=16384, cols=16384
>>>>>>> Matrix-free approximation:
>>>>>>> err=1.49012e-08 (relative error in function evaluation)
>>>>>>> Using wp compute h routine
>>>>>>> Does not compute normU
>>>>>>> Mat Object: 1 MPI process
>>>>>>> type: seqaij
>>>>>>> rows=16384, cols=16384
>>>>>>> total: nonzeros=16384, allocated nonzeros=16384
>>>>>>> total number of mallocs used during MatSetValues calls=0
>>>>>>> not using I-node routines
>>>>>>> 1 SNES Function norm 1.085015646971e+04
>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>> SNES Object: 1 MPI process
>>>>>>> type: newtonls
>>>>>>> maximum iterations=1, maximum function evaluations=-1
>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>> total number of linear solver iterations=20
>>>>>>> total number of function evaluations=23
>>>>>>> norm schedule ALWAYS
>>>>>>> Jacobian is never rebuilt
>>>>>>> Jacobian is applied matrix-free with differencing
>>>>>>> Preconditioning Jacobian is built using finite differences with
>>>>>>> coloring
>>>>>>> SNESLineSearch Object: 1 MPI process
>>>>>>> type: basic
>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>>>>>> lambda=1.000000e-08
>>>>>>> maximum iterations=40
>>>>>>> KSP Object: 1 MPI process
>>>>>>> type: gmres
>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>>> Orthogonalization with no iterative refinement
>>>>>>> happy breakdown tolerance 1e-30
>>>>>>> maximum iterations=20, initial guess is zero
>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>>>>> left preconditioning
>>>>>>> using PRECONDITIONED norm type for convergence test
>>>>>>> PC Object: 1 MPI process
>>>>>>> type: none
>>>>>>> linear system matrix followed by preconditioner matrix:
>>>>>>> Mat Object: 1 MPI process
>>>>>>> type: mffd
>>>>>>> rows=16384, cols=16384
>>>>>>> Matrix-free approximation:
>>>>>>> err=1.49012e-08 (relative error in function evaluation)
>>>>>>> Using wp compute h routine
>>>>>>> Does not compute normU
>>>>>>> Mat Object: 1 MPI process
>>>>>>> type: seqaij
>>>>>>> rows=16384, cols=16384
>>>>>>> total: nonzeros=16384, allocated nonzeros=16384
>>>>>>> total number of mallocs used during MatSetValues calls=0
>>>>>>> not using I-node routines
>>>>>>>
>>>>>>> 0 SNES Function norm 3.424003312857e+04
>>>>>>> 0 KSP Residual norm 3.424003312857e+04
>>>>>>> 1 KSP Residual norm 2.871734444536e+04
>>>>>>> 2 KSP Residual norm 2.490276931041e+04
>>>>>>> 3 KSP Residual norm 2.131675873776e+04
>>>>>>> 4 KSP Residual norm 1.973129814908e+04
>>>>>>> 5 KSP Residual norm 1.832377852186e+04
>>>>>>> 6 KSP Residual norm 1.716783608174e+04
>>>>>>> 7 KSP Residual norm 1.583963128956e+04
>>>>>>> 8 KSP Residual norm 1.482272160069e+04
>>>>>>> 9 KSP Residual norm 1.380312087005e+04
>>>>>>> 10 KSP Residual norm 1.297793458796e+04
>>>>>>> 11 KSP Residual norm 1.208599115602e+04
>>>>>>> 12 KSP Residual norm 1.137345657533e+04
>>>>>>> 13 KSP Residual norm 1.059676906197e+04
>>>>>>> 14 KSP Residual norm 1.003823857515e+04
>>>>>>> 15 KSP Residual norm 9.425879177747e+03
>>>>>>> 16 KSP Residual norm 8.954805850825e+03
>>>>>>> 17 KSP Residual norm 8.592372413320e+03
>>>>>>> 18 KSP Residual norm 8.060706994110e+03
>>>>>>> 19 KSP Residual norm 7.782057560782e+03
>>>>>>> 20 KSP Residual norm 7.449686034356e+03
>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>> KSP Object: 1 MPI process
>>>>>>> type: gmres
>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>>> Orthogonalization with no iterative refinement
>>>>>>> happy breakdown tolerance 1e-30
>>>>>>> maximum iterations=20, initial guess is zero
>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>>>>> left preconditioning
>>>>>>> using PRECONDITIONED norm type for convergence test
>>>>>>> PC Object: 1 MPI process
>>>>>>> type: none
>>>>>>> linear system matrix followed by preconditioner matrix:
>>>>>>> Mat Object: 1 MPI process
>>>>>>> type: mffd
>>>>>>> rows=16384, cols=16384
>>>>>>> Matrix-free approximation:
>>>>>>> err=1.49012e-08 (relative error in function evaluation)
>>>>>>> Using wp compute h routine
>>>>>>> Does not compute normU
>>>>>>> Mat Object: 1 MPI process
>>>>>>> type: seqaij
>>>>>>> rows=16384, cols=16384
>>>>>>> total: nonzeros=16384, allocated nonzeros=16384
>>>>>>> total number of mallocs used during MatSetValues calls=0
>>>>>>> not using I-node routines
>>>>>>> 1 SNES Function norm 1.085015821006e+04
>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>> SNES Object: 1 MPI process
>>>>>>> type: newtonls
>>>>>>> maximum iterations=1, maximum function evaluations=-1
>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>> total number of linear solver iterations=20
>>>>>>> total number of function evaluations=23
>>>>>>> norm schedule ALWAYS
>>>>>>> Jacobian is never rebuilt
>>>>>>> Jacobian is applied matrix-free with differencing
>>>>>>> Preconditioning Jacobian is built using finite differences with
>>>>>>> coloring
>>>>>>> SNESLineSearch Object: 1 MPI process
>>>>>>> type: basic
>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>>>>>> lambda=1.000000e-08
>>>>>>> maximum iterations=40
>>>>>>> KSP Object: 1 MPI process
>>>>>>> type: gmres
>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>>> Orthogonalization with no iterative refinement
>>>>>>> happy breakdown tolerance 1e-30
>>>>>>> maximum iterations=20, initial guess is zero
>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>>>>> left preconditioning
>>>>>>> using PRECONDITIONED norm type for convergence test
>>>>>>> PC Object: 1 MPI process
>>>>>>> type: none
>>>>>>> linear system matrix followed by preconditioner matrix:
>>>>>>> Mat Object: 1 MPI process
>>>>>>> type: mffd
>>>>>>> rows=16384, cols=16384
>>>>>>> Matrix-free approximation:
>>>>>>> err=1.49012e-08 (relative error in function evaluation)
>>>>>>> Using wp compute h routine
>>>>>>> Does not compute normU
>>>>>>> Mat Object: 1 MPI process
>>>>>>> type: seqaij
>>>>>>> rows=16384, cols=16384
>>>>>>> total: nonzeros=16384, allocated nonzeros=16384
>>>>>>> total number of mallocs used during MatSetValues calls=0
>>>>>>> not using I-node routines
>>>>>>>
>>>>>>> On Thu, May 4, 2023 at 10:10 AM Matthew Knepley <knepley at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Thu, May 4, 2023 at 8:54 AM Mark Lohry <mlohry at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Try -pc_type none.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> With -pc_type none the 0 KSP residual looks identical. But
>>>>>>>>> *sometimes* it's producing exactly the same history and others it's
>>>>>>>>> gradually changing. I'm reasonably confident my residual evaluation has no
>>>>>>>>> randomness, see info after the petsc output.
>>>>>>>>>
>>>>>>>>
>>>>>>>> We can try and test this. Replace your MatMFFD with an actual
>>>>>>>> matrix and run. Do you see any variability?
>>>>>>>>
>>>>>>>> If not, then it could be your routine, or it could be MatMFFD. So
>>>>>>>> run a few with -snes_view, and we can see if the
>>>>>>>> "w" parameter changes.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>>
>>>>>>>>> solve history 1:
>>>>>>>>>
>>>>>>>>> 0 SNES Function norm 3.424003312857e+04
>>>>>>>>> 0 KSP Residual norm 3.424003312857e+04
>>>>>>>>> 1 KSP Residual norm 2.871734444536e+04
>>>>>>>>> 2 KSP Residual norm 2.490276931041e+04
>>>>>>>>> ...
>>>>>>>>> 20 KSP Residual norm 7.449686034356e+03
>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>> 1 SNES Function norm 1.085015821006e+04
>>>>>>>>>
>>>>>>>>> solve history 2, identical to 1:
>>>>>>>>>
>>>>>>>>> 0 SNES Function norm 3.424003312857e+04
>>>>>>>>> 0 KSP Residual norm 3.424003312857e+04
>>>>>>>>> 1 KSP Residual norm 2.871734444536e+04
>>>>>>>>> 2 KSP Residual norm 2.490276931041e+04
>>>>>>>>> ...
>>>>>>>>> 20 KSP Residual norm 7.449686034356e+03
>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>> 1 SNES Function norm 1.085015821006e+04
>>>>>>>>>
>>>>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2,
>>>>>>>>> growing difference to the end:
>>>>>>>>> 0 SNES Function norm 3.424003312857e+04
>>>>>>>>> 0 KSP Residual norm 3.424003312857e+04
>>>>>>>>> 1 KSP Residual norm 2.871734444536e+04
>>>>>>>>> 2 KSP Residual norm 2.490276930242e+04
>>>>>>>>> ...
>>>>>>>>> 20 KSP Residual norm 7.449686095424e+03
>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>> 1 SNES Function norm 1.085015646971e+04
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ths is using a standard explicit 3-stage Runge-Kutta smoother for
>>>>>>>>> 10 iterations, so 30 calls of the same residual evaluation, identical
>>>>>>>>> residuals every time
>>>>>>>>>
>>>>>>>>> run 1:
>>>>>>>>>
>>>>>>>>> # iteration rho rhou
>>>>>>>>> rhov rhoE abs_res rel_res
>>>>>>>>> umin vmax vmin
>>>>>>>>> elapsed_time
>>>>>>>>> #
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02
>>>>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02
>>>>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14
>>>>>>>>> 6.34834e-01
>>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02
>>>>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02
>>>>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14
>>>>>>>>> 6.40063e-01
>>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01
>>>>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02
>>>>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14
>>>>>>>>> 6.45166e-01
>>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01
>>>>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02
>>>>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14
>>>>>>>>> 6.50494e-01
>>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01
>>>>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02
>>>>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14
>>>>>>>>> 6.55656e-01
>>>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01
>>>>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02
>>>>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14
>>>>>>>>> 6.60872e-01
>>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01
>>>>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02
>>>>>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14
>>>>>>>>> 6.66041e-01
>>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01
>>>>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02
>>>>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14
>>>>>>>>> 6.71316e-01
>>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01
>>>>>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02
>>>>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13
>>>>>>>>> 6.76447e-01
>>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01
>>>>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02
>>>>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13
>>>>>>>>> 6.81716e-01
>>>>>>>>>
>>>>>>>>> run N:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> #
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> # iteration rho rhou
>>>>>>>>> rhov rhoE abs_res rel_res
>>>>>>>>> umin vmax vmin
>>>>>>>>> elapsed_time
>>>>>>>>> #
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02
>>>>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02
>>>>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 -1.35563e-14
>>>>>>>>> 6.23316e-01
>>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02
>>>>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02
>>>>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 -1.69925e-14
>>>>>>>>> 6.28510e-01
>>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01
>>>>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02
>>>>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 -1.59733e-14
>>>>>>>>> 6.33558e-01
>>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01
>>>>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02
>>>>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 -1.46238e-14
>>>>>>>>> 6.38773e-01
>>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01
>>>>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02
>>>>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 -1.06666e-14
>>>>>>>>> 6.43887e-01
>>>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01
>>>>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02
>>>>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 -1.39713e-14
>>>>>>>>> 6.49073e-01
>>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01
>>>>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02
>>>>>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 -2.39351e-14
>>>>>>>>> 6.54167e-01
>>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01
>>>>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02
>>>>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14
>>>>>>>>> 6.59394e-01
>>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01
>>>>>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02
>>>>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13
>>>>>>>>> 6.64516e-01
>>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01
>>>>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02
>>>>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 -2.28736e-13
>>>>>>>>> 6.69677e-01
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, May 4, 2023 at 8:41 AM Mark Adams <mfadams at lbl.gov> wrote:
>>>>>>>>>
>>>>>>>>>> ASM is just the sub PC with one proc but gets weaker with more
>>>>>>>>>> procs unless you use jacobi. (maybe I am missing something).
>>>>>>>>>>
>>>>>>>>>> On Thu, May 4, 2023 at 8:31 AM Mark Lohry <mlohry at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Please send the output of -snes_view.
>>>>>>>>>>>>
>>>>>>>>>>> pasted below. anything stand out?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> SNES Object: 1 MPI process
>>>>>>>>>>> type: newtonls
>>>>>>>>>>> maximum iterations=1, maximum function evaluations=-1
>>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>>>>>> total number of linear solver iterations=20
>>>>>>>>>>> total number of function evaluations=22
>>>>>>>>>>> norm schedule ALWAYS
>>>>>>>>>>> Jacobian is never rebuilt
>>>>>>>>>>> Jacobian is applied matrix-free with differencing
>>>>>>>>>>> Preconditioning Jacobian is built using finite differences
>>>>>>>>>>> with coloring
>>>>>>>>>>> SNESLineSearch Object: 1 MPI process
>>>>>>>>>>> type: basic
>>>>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>>>>>>>>>>> lambda=1.000000e-08
>>>>>>>>>>> maximum iterations=40
>>>>>>>>>>> KSP Object: 1 MPI process
>>>>>>>>>>> type: gmres
>>>>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>>>>>>>>>> Orthogonalization with no iterative refinement
>>>>>>>>>>> happy breakdown tolerance 1e-30
>>>>>>>>>>> maximum iterations=20, initial guess is zero
>>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>>>>>>>>> left preconditioning
>>>>>>>>>>> using PRECONDITIONED norm type for convergence test
>>>>>>>>>>> PC Object: 1 MPI process
>>>>>>>>>>> type: asm
>>>>>>>>>>> total subdomain blocks = 1, amount of overlap = 0
>>>>>>>>>>> restriction/interpolation type - RESTRICT
>>>>>>>>>>> Local solver information for first block is in the
>>>>>>>>>>> following KSP and PC objects on rank 0:
>>>>>>>>>>> Use -ksp_view ::ascii_info_detail to display information
>>>>>>>>>>> for all blocks
>>>>>>>>>>> KSP Object: (sub_) 1 MPI process
>>>>>>>>>>> type: preonly
>>>>>>>>>>> maximum iterations=10000, initial guess is zero
>>>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50,
>>>>>>>>>>> divergence=10000.
>>>>>>>>>>> left preconditioning
>>>>>>>>>>> using NONE norm type for convergence test
>>>>>>>>>>> PC Object: (sub_) 1 MPI process
>>>>>>>>>>> type: ilu
>>>>>>>>>>> out-of-place factorization
>>>>>>>>>>> 0 levels of fill
>>>>>>>>>>> tolerance for zero pivot 2.22045e-14
>>>>>>>>>>> matrix ordering: natural
>>>>>>>>>>> factor fill ratio given 1., needed 1.
>>>>>>>>>>> Factored matrix follows:
>>>>>>>>>>> Mat Object: (sub_) 1 MPI process
>>>>>>>>>>> type: seqbaij
>>>>>>>>>>> rows=16384, cols=16384, bs=16
>>>>>>>>>>> package used to perform factorization: petsc
>>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>>>>> block size is 16
>>>>>>>>>>> linear system matrix = precond matrix:
>>>>>>>>>>> Mat Object: (sub_) 1 MPI process
>>>>>>>>>>> type: seqbaij
>>>>>>>>>>> rows=16384, cols=16384, bs=16
>>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>>>>> total number of mallocs used during MatSetValues calls=0
>>>>>>>>>>> block size is 16
>>>>>>>>>>> linear system matrix followed by preconditioner matrix:
>>>>>>>>>>> Mat Object: 1 MPI process
>>>>>>>>>>> type: mffd
>>>>>>>>>>> rows=16384, cols=16384
>>>>>>>>>>> Matrix-free approximation:
>>>>>>>>>>> err=1.49012e-08 (relative error in function evaluation)
>>>>>>>>>>> Using wp compute h routine
>>>>>>>>>>> Does not compute normU
>>>>>>>>>>> Mat Object: 1 MPI process
>>>>>>>>>>> type: seqbaij
>>>>>>>>>>> rows=16384, cols=16384, bs=16
>>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>>>>> total number of mallocs used during MatSetValues calls=0
>>>>>>>>>>> block size is 16
>>>>>>>>>>>
>>>>>>>>>>> On Thu, May 4, 2023 at 8:30 AM Mark Adams <mfadams at lbl.gov>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If you are using MG what is the coarse grid solver?
>>>>>>>>>>>> -snes_view might give you that.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, May 4, 2023 at 8:25 AM Matthew Knepley <
>>>>>>>>>>>> knepley at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, May 4, 2023 at 8:21 AM Mark Lohry <mlohry at gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Do they start very similarly and then slowly drift further
>>>>>>>>>>>>>>> apart?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, this. I take it this sounds familiar?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the
>>>>>>>>>>>>>> end. The difference for one solve is slight (final SNES norm is identical
>>>>>>>>>>>>>> to 5 digits), but in the context I'm using it in (repeated applications to
>>>>>>>>>>>>>> solve a steady state multigrid problem, though here just one level) the
>>>>>>>>>>>>>> differences add up such that I might reach global convergence in 35
>>>>>>>>>>>>>> iterations or 38. It's not the end of the world, but I was expecting that
>>>>>>>>>>>>>> with -np 1 these would be identical and I'm not sure where the root cause
>>>>>>>>>>>>>> would be.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The initial KSP residual is different, so its the PC.
>>>>>>>>>>>>> Please send the output of -snes_view. If your ASM is using direct
>>>>>>>>>>>>> factorization, then it
>>>>>>>>>>>>> could be randomness in whatever LU you are using.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04
>>>>>>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01
>>>>>>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01
>>>>>>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01
>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01
>>>>>>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01
>>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>>>>>> 1 SNES Function norm 1.801309983345e+03
>>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly
>>>>>>>>>>>>>> different
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04
>>>>>>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01
>>>>>>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01
>>>>>>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01
>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01
>>>>>>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01
>>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>>>>>> 1 SNES Function norm 1.801311320322e+03
>>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, May 3, 2023 at 11:05 PM Barry Smith <bsmith at petsc.dev>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do they start very similarly and then slowly drift further
>>>>>>>>>>>>>>> apart? That is the first couple of KSP iterations they are almost identical
>>>>>>>>>>>>>>> but then for each iteration get a bit further. Similar for the SNES
>>>>>>>>>>>>>>> iterations, starting close and then for more iterations and more solves
>>>>>>>>>>>>>>> they start moving apart. Or do they suddenly jump to be very different? You
>>>>>>>>>>>>>>> can run with -snes_monitor -ksp_monitor
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry <mlohry at gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the
>>>>>>>>>>>>>>> coloring, was just guessing there. But the solutions/residuals are slightly
>>>>>>>>>>>>>>> different from run to run.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should
>>>>>>>>>>>>>>> expect bitwise identical results?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith <bsmith at petsc.dev>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> No, the coloring should be identical every time. Do you
>>>>>>>>>>>>>>>> see differences with 1 MPI rank? (Or much smaller ones?).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry <mlohry at gmail.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an
>>>>>>>>>>>>>>>> MFFD/JFNK nonlinear solver where I give it the sparsity. PC asm, KSP gmres,
>>>>>>>>>>>>>>>> with SNESSetLagJacobian -2 (compute once and then frozen jacobian).
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in
>>>>>>>>>>>>>>>> residuals from run to run. I'm wondering where randomness might enter here
>>>>>>>>>>>>>>>> -- does the jacobian coloring use a random seed?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> What most experimenters take for granted before they begin
>>>>>>>>>>>>> their experiments is infinitely more interesting than any results to which
>>>>>>>>>>>>> their experiments lead.
>>>>>>>>>>>>> -- Norbert Wiener
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> What most experimenters take for granted before they begin their
>>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>>> experiments lead.
>>>>>>>> -- Norbert Wiener
>>>>>>>>
>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> What most experimenters take for granted before they begin their
>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>> experiments lead.
>>>>>> -- Norbert Wiener
>>>>>>
>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which their
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>
>>>
>>> <configure.log>
>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20230505/4a994515/attachment-0001.html>
More information about the petsc-users
mailing list