[petsc-users] MOOSE_SNES_ISSUES
Patrick Sanan
patrick.sanan at gmail.com
Tue Mar 1 03:11:07 CST 2016
On Tue, Mar 01, 2016 at 09:58:18AM +0100, alena kopanicakova wrote:
> Hello,
>
> I am developing my own nonlinear solver, which aims to solve simulations
> from MOOSE. Communication with MOOSE is done via the SNES interface.
>
> I am obtaining the Jacobian and residual in the following way:
>
> SNESComputeFunction(snes, X, R);
>
> SNESSetJacobian(snes, jac, jac, SNESComputeJacobianDefault,
> NULL);
> SNESComputeJacobian(snes, X, jac, jac);

As described here, http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESComputeJacobianDefault.html ,
passing SNESComputeJacobianDefault means you are computing the Jacobian in a brute-force way with finite differences.
Do you have an expression to compute the Jacobian?
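
To make the cost of a finite-difference Jacobian concrete, here is a minimal sketch in plain C of a dense forward-difference Jacobian. It is not PETSc's implementation of SNESComputeJacobianDefault; the names residual, residual_calls and fd_jacobian, the toy residual F_i(x) = x_i^2 - 1, and the step-size rule are all assumptions chosen for illustration. The point is only that building the Jacobian this way takes one residual evaluation per unknown, plus one at the base point.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts residual evaluations (a stand-in for compute_residual() calls). */
static long residual_calls = 0;

/* Hypothetical toy residual F_i(x) = x_i^2 - 1, for illustration only. */
static void residual(int n, const double *x, double *f)
{
    residual_calls++;
    for (int i = 0; i < n; i++) f[i] = x[i] * x[i] - 1.0;
}

/* Dense forward-difference Jacobian, built column by column:
 *   J[:, j] = (F(x + h e_j) - F(x)) / h
 * J is stored row-major with n*n entries.
 * Returns 0 on success, -1 if memory allocation fails. */
static int fd_jacobian(int n, double *x, double *J)
{
    assert(n > 0);
    double *f0 = malloc((size_t)n * sizeof *f0);
    double *f1 = malloc((size_t)n * sizeof *f1);
    if (!f0 || !f1) { free(f0); free(f1); return -1; }

    residual(n, x, f0);                        /* one base evaluation */
    for (int j = 0; j < n; j++) {
        const double xj = x[j];
        const double ax = xj < 0.0 ? -xj : xj;
        const double h = 1e-7 * (ax + 1.0);    /* assumed step-size rule */
        x[j] = xj + h;
        residual(n, x, f1);                    /* one evaluation per column */
        x[j] = xj;                             /* restore the perturbed entry */
        for (int i = 0; i < n; i++)
            J[(size_t)i * n + j] = (f1[i] - f0[i]) / h;
    }
    free(f0);
    free(f1);
    return 0;
}
```

A call to fd_jacobian() for n unknowns increases residual_calls by n + 1. With the 3993 DOFs reported in your output, that would be about 3994 residual evaluations per Jacobian, so four Newton steps would give roughly 4 x 3994 = 15976, which is close to the 15985 compute_residual() calls in your second perf log. This assumes one finite-difference Jacobian per Newton step, which is an inference from the numbers and not something the log states.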
>
> Unfortunately, with this setup it takes an incredible amount of time to obtain
> the Jacobian.
> Taking a closer look at the perf log, it seems that the difference between my
> solver and the MOOSE solver is that my executioner calls the compute_residual()
> function ridiculously many times.
> I have no idea what could be causing such behavior.
>
> Do you have any suggestions on how to set up the interface properly, or which
> change is needed to obtain more or less the same performance as the MOOSE
> executioner?
>
>
> many thanks,
> alena
>
> Framework Information:
> MOOSE version: git commit e79292e on 2016-01-05
> PETSc Version: 3.6.1
> Current Time: Tue Mar 1 00:51:45 2016
> Executable Timestamp: Tue Mar 1 00:36:30 2016
>
> Parallelism:
> Num Processors: 1
> Num Threads: 1
>
> Mesh:
> Distribution: serial
> Mesh Dimension: 3
> Spatial Dimension: 3
> Nodes:
> Total: 1331
> Local: 1331
> Elems:
> Total: 1000
> Local: 1000
> Num Subdomains: 1
> Num Partitions: 1
>
> Nonlinear System:
> Num DOFs: 3993
> Num Local DOFs: 3993
> Variables: { "disp_x" "disp_y" "disp_z" }
> Finite Element Types: "LAGRANGE"
> Approximation Orders: "FIRST"
>
> Execution Information:
> Executioner: Steady
> Solver Mode: NEWTON
>
>
>
> 0 Nonlinear |R| = 2.850000e-02
> 0 Linear |R| = 2.850000e-02
> 1 Linear |R| = 1.653670e-02
> 2 Linear |R| = 1.447168e-02
> 3 Linear |R| = 1.392965e-02
> 4 Linear |R| = 1.258440e-02
> 5 Linear |R| = 1.007181e-02
> 6 Linear |R| = 8.264315e-03
> 7 Linear |R| = 6.541897e-03
> 8 Linear |R| = 4.371900e-03
> 9 Linear |R| = 2.100406e-03
> 10 Linear |R| = 1.227539e-03
> 11 Linear |R| = 1.026286e-03
> 12 Linear |R| = 9.180101e-04
> 13 Linear |R| = 8.087598e-04
> 14 Linear |R| = 6.435247e-04
> 15 Linear |R| = 5.358688e-04
> 16 Linear |R| = 4.551657e-04
> 17 Linear |R| = 4.090276e-04
> 18 Linear |R| = 3.359290e-04
> 19 Linear |R| = 2.417643e-04
> 20 Linear |R| = 1.710452e-04
> 21 Linear |R| = 1.261996e-04
> 22 Linear |R| = 9.384052e-05
> 23 Linear |R| = 6.070637e-05
> 24 Linear |R| = 4.283233e-05
> 25 Linear |R| = 3.383792e-05
> 26 Linear |R| = 2.342289e-05
> 27 Linear |R| = 1.700855e-05
> 28 Linear |R| = 9.814278e-06
> 29 Linear |R| = 4.398519e-06
> 30 Linear |R| = 2.161205e-06
> 31 Linear |R| = 1.289206e-06
> 32 Linear |R| = 6.548007e-07
> 33 Linear |R| = 3.677894e-07
> 34 Linear |R| = 2.640006e-07
> 1 Nonlinear |R| = 2.400310e-02
> 0 Linear |R| = 2.400310e-02
> 1 Linear |R| = 9.102075e-03
> 2 Linear |R| = 5.017381e-03
> 3 Linear |R| = 3.840732e-03
> 4 Linear |R| = 2.990685e-03
> 5 Linear |R| = 1.990203e-03
> 6 Linear |R| = 1.085764e-03
> 7 Linear |R| = 4.657779e-04
> 8 Linear |R| = 3.049692e-04
> 9 Linear |R| = 1.625839e-04
> 10 Linear |R| = 1.124700e-04
> 11 Linear |R| = 7.764153e-05
> 12 Linear |R| = 5.698577e-05
> 13 Linear |R| = 4.581843e-05
> 14 Linear |R| = 4.262610e-05
> 15 Linear |R| = 3.792804e-05
> 16 Linear |R| = 3.404168e-05
> 17 Linear |R| = 2.536004e-05
> 18 Linear |R| = 1.577559e-05
> 19 Linear |R| = 9.099392e-06
> 20 Linear |R| = 6.140685e-06
> 21 Linear |R| = 5.083606e-06
> 22 Linear |R| = 4.521560e-06
> 23 Linear |R| = 3.601845e-06
> 24 Linear |R| = 2.776090e-06
> 25 Linear |R| = 2.252274e-06
> 26 Linear |R| = 1.898090e-06
> 27 Linear |R| = 1.620684e-06
> 28 Linear |R| = 1.395574e-06
> 29 Linear |R| = 1.157953e-06
> 30 Linear |R| = 9.540738e-07
> 31 Linear |R| = 8.487724e-07
> 32 Linear |R| = 7.634710e-07
> 33 Linear |R| = 6.254549e-07
> 34 Linear |R| = 4.811588e-07
> 35 Linear |R| = 3.930739e-07
> 36 Linear |R| = 3.340577e-07
> 37 Linear |R| = 2.873430e-07
> 38 Linear |R| = 2.407606e-07
> 39 Linear |R| = 1.978818e-07
> 2 Nonlinear |R| = 4.185813e-04
> 0 Linear |R| = 4.185813e-04
> 1 Linear |R| = 1.406808e-04
> 2 Linear |R| = 7.266714e-05
> 3 Linear |R| = 5.734138e-05
> 4 Linear |R| = 4.524739e-05
> 5 Linear |R| = 3.025661e-05
> 6 Linear |R| = 1.946626e-05
> 7 Linear |R| = 1.005809e-05
> 8 Linear |R| = 7.639142e-06
> 9 Linear |R| = 6.668613e-06
> 10 Linear |R| = 6.070601e-06
> 11 Linear |R| = 5.496769e-06
> 12 Linear |R| = 4.388115e-06
> 13 Linear |R| = 2.966258e-06
> 14 Linear |R| = 1.838201e-06
> 15 Linear |R| = 9.709174e-07
> 16 Linear |R| = 6.743766e-07
> 17 Linear |R| = 5.531138e-07
> 18 Linear |R| = 4.649969e-07
> 19 Linear |R| = 3.982799e-07
> 20 Linear |R| = 3.662679e-07
> 21 Linear |R| = 3.309140e-07
> 22 Linear |R| = 2.652039e-07
> 23 Linear |R| = 1.728911e-07
> 24 Linear |R| = 1.005779e-07
> 25 Linear |R| = 5.747041e-08
> 26 Linear |R| = 4.185011e-08
> 27 Linear |R| = 3.394446e-08
> 28 Linear |R| = 2.788435e-08
> 29 Linear |R| = 2.046992e-08
> 30 Linear |R| = 1.231943e-08
> 31 Linear |R| = 8.724911e-09
> 32 Linear |R| = 6.390162e-09
> 33 Linear |R| = 5.060595e-09
> 34 Linear |R| = 4.216656e-09
> 35 Linear |R| = 2.969865e-09
> 3 Nonlinear |R| = 2.494055e-07
> 0 Linear |R| = 2.494055e-07
> 1 Linear |R| = 8.559637e-08
> 2 Linear |R| = 4.335101e-08
> 3 Linear |R| = 3.214303e-08
> 4 Linear |R| = 2.549409e-08
> 5 Linear |R| = 1.899624e-08
> 6 Linear |R| = 1.522624e-08
> 7 Linear |R| = 1.258408e-08
> 8 Linear |R| = 1.098545e-08
> 9 Linear |R| = 1.009481e-08
> 10 Linear |R| = 8.423983e-09
> 11 Linear |R| = 6.946144e-09
> 12 Linear |R| = 5.624875e-09
> 13 Linear |R| = 4.448760e-09
> 14 Linear |R| = 2.834320e-09
> 15 Linear |R| = 1.614722e-09
> 16 Linear |R| = 9.409384e-10
> 17 Linear |R| = 7.775851e-10
> 18 Linear |R| = 6.905971e-10
> 19 Linear |R| = 6.129201e-10
> 20 Linear |R| = 5.438935e-10
> 21 Linear |R| = 4.435519e-10
> 22 Linear |R| = 3.486621e-10
> 23 Linear |R| = 2.811928e-10
> 24 Linear |R| = 2.159800e-10
> 25 Linear |R| = 1.670940e-10
> 26 Linear |R| = 1.338889e-10
> 27 Linear |R| = 9.926067e-11
> 28 Linear |R| = 7.483221e-11
> 29 Linear |R| = 5.045662e-11
> 30 Linear |R| = 2.772335e-11
> 31 Linear |R| = 1.814968e-11
> 32 Linear |R| = 1.264268e-11
> 33 Linear |R| = 9.856586e-12
> 34 Linear |R| = 7.802757e-12
> 35 Linear |R| = 6.092276e-12
> 36 Linear |R| = 4.785005e-12
> 37 Linear |R| = 3.887554e-12
> 38 Linear |R| = 3.125756e-12
> 39 Linear |R| = 2.543989e-12
> 40 Linear |R| = 2.062100e-12
> 4 Nonlinear |R| = 2.522382e-12
>
> -----------------------------------------------------------------------------------------------------------
> | Whale Performance: Alive time=4.25229, Active time=4.12034                                              |
> -----------------------------------------------------------------------------------------------------------
> | Event                         nCalls  Total Time  Avg Time    Total Time  Avg Time    % of Active Time  |
> |                                       w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S   |
> |---------------------------------------------------------------------------------------------------------|
> |                                                                                                         |
> |                                                                                                         |
> | Exodus                                                                                                  |
> |   output()                    2       0.0126      0.006300    0.0126      0.006300    0.31     0.31     |
> |                                                                                                         |
> | Setup                                                                                                   |
> |   computeAuxiliaryKernels()   2       0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
> |   computeControls()           2       0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
> |   computeUserObjects()        4       0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
> |                                                                                                         |
> | Solve                                                                                                   |
> |   ComputeResidualThread       330     3.8288      0.011603    3.8288      0.011603    92.93    92.93    |
> |   computeDiracContributions() 331     0.0002      0.000001    0.0002      0.000001    0.00     0.00     |
> |   compute_jacobian()          1       0.1302      0.130242    0.1302      0.130243    3.16     3.16     |
> |   compute_residual()          330     0.0394      0.000119    3.8724      0.011735    0.96     93.98    |
> |   residual.close3()           330     0.0020      0.000006    0.0020      0.000006    0.05     0.05     |
> |   residual.close4()           330     0.0019      0.000006    0.0019      0.000006    0.05     0.05     |
> |   solve()                     1       0.1052      0.105226    4.1079      4.107900    2.55     99.70    |
> -----------------------------------------------------------------------------------------------------------
> | Totals:                       1663    4.1203                                          100.00            |
> -----------------------------------------------------------------------------------------------------------
> -------------------------------------------------------------------------------------------------------------------------
> | Setup Performance: Alive time=4.25186, Active time=0.060553                                                           |
> -------------------------------------------------------------------------------------------------------------------------
> | Event                                       nCalls  Total Time  Avg Time    Total Time  Avg Time    % of Active Time  |
> |                                                     w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S   |
> |-----------------------------------------------------------------------------------------------------------------------|
> |                                                                                                                       |
> |                                                                                                                       |
> | Setup                                                                                                                 |
> |   Create Executioner                        1       0.0003      0.000313    0.0003      0.000313    0.52     0.52     |
> |   FEProblem::init::meshChanged()            1       0.0016      0.001626    0.0016      0.001626    2.69     2.69     |
> |   Initial computeUserObjects()              1       0.0000      0.000005    0.0000      0.000005    0.01     0.01     |
> |   Initial execMultiApps()                   1       0.0000      0.000003    0.0000      0.000003    0.00     0.00     |
> |   Initial execTransfers()                   1       0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
> |   Initial updateActiveSemiLocalNodeRange()  1       0.0004      0.000396    0.0004      0.000396    0.65     0.65     |
> |   Initial updateGeomSearch()                2       0.0000      0.000002    0.0000      0.000002    0.00     0.00     |
> |   NonlinearSystem::update()                 1       0.0000      0.000036    0.0000      0.000036    0.06     0.06     |
> |   Output Initial Condition                  1       0.0107      0.010671    0.0107      0.010671    17.62    17.62    |
> |   Prepare Mesh                              1       0.0017      0.001737    0.0017      0.001737    2.87     2.87     |
> |   copySolutionsBackwards()                  1       0.0000      0.000023    0.0000      0.000023    0.04     0.04     |
> |   eq.init()                                 1       0.0455      0.045469    0.0455      0.045469    75.09    75.09    |
> |   getMinQuadratureOrder()                   1       0.0000      0.000004    0.0000      0.000004    0.01     0.01     |
> |   initial adaptivity                        1       0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
> |   maxQps()                                  1       0.0003      0.000263    0.0003      0.000263    0.43     0.43     |
> |   reinit() after updateGeomSearch()         1       0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
> |                                                                                                                       |
> | ghostGhostedBoundaries                                                                                                |
> |   eq.init()                                 1       0.0000      0.000002    0.0000      0.000002    0.00     0.00     |
> -------------------------------------------------------------------------------------------------------------------------
> | Totals:                                     18      0.0606                                          100.00            |
> -------------------------------------------------------------------------------------------------------------------------
>
> Framework Information:
> MOOSE version: git commit e79292e on 2016-01-05
> PETSc Version: 3.6.1
> Current Time: Tue Mar 1 00:41:08 2016
> Executable Timestamp: Tue Mar 1 00:36:30 2016
>
> Parallelism:
> Num Processors: 1
> Num Threads: 1
>
> Mesh:
> Distribution: serial
> Mesh Dimension: 3
> Spatial Dimension: 3
> Nodes:
> Total: 1331
> Local: 1331
> Elems:
> Total: 1000
> Local: 1000
> Num Subdomains: 1
> Num Partitions: 1
>
> Nonlinear System:
> Num DOFs: 3993
> Num Local DOFs: 3993
> Variables: { "disp_x" "disp_y" "disp_z" }
> Finite Element Types: "LAGRANGE"
> Approximation Orders: "FIRST"
>
> Execution Information:
> Executioner: PassoSteady
> Solver Mode: NEWTON
>
>
>
> In Function SNESCreate_passo_Newton_Solver
>
> it. || g ||
> ------- -------------------
> 1 0.0240028
> 2 0.000418569
> 3 2.49436e-07
> 4 1.52966e-12
>
> Solver converged in 4 iterations.
>
> Outlier Variable Residual Norms:
> disp_z: 2.850000e-02
>
> -----------------------------------------------------------------------------------------------------------
> | Whale Performance: Alive time=199.422, Active time=199.285                                              |
> -----------------------------------------------------------------------------------------------------------
> | Event                         nCalls  Total Time  Avg Time    Total Time  Avg Time    % of Active Time  |
> |                                       w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S   |
> |---------------------------------------------------------------------------------------------------------|
> |                                                                                                         |
> |                                                                                                         |
> | Exodus                                                                                                  |
> |   output()                    2       0.0110      0.005503    0.0110      0.005503    0.01     0.01     |
> |                                                                                                         |
> | Setup                                                                                                   |
> |   computeAuxiliaryKernels()   2       0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
> |   computeControls()           2       0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
> |   computeUserObjects()        4       0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
> |                                                                                                         |
> | Solve                                                                                                   |
> |   ComputeResidualThread       15985   189.8671    0.011878    189.8671    0.011878    95.27    95.27    |
> |   computeDiracContributions() 15986   0.0106      0.000001    0.0106      0.000001    0.01     0.01     |
> |   compute_jacobian()          1       0.1265      0.126484    0.1265      0.126485    0.06     0.06     |
> |   compute_residual()          15985   2.1702      0.000136    192.3009    0.012030    1.09     96.50    |
> |   residual.close3()           15985   0.1200      0.000008    0.1200      0.000008    0.06     0.06     |
> |   residual.close4()           15985   0.1263      0.000008    0.1263      0.000008    0.06     0.06     |
> |   solve()                     1       6.8537      6.853733    199.2830    199.282969  3.44     100.00   |
> -----------------------------------------------------------------------------------------------------------
> | Totals:                       79938   199.2854                                        100.00            |
> -----------------------------------------------------------------------------------------------------------
> -------------------------------------------------------------------------------------------------------------------------
> | Setup Performance: Alive time=199.422, Active time=0.055161                                                           |
> -------------------------------------------------------------------------------------------------------------------------
> | Event                                       nCalls  Total Time  Avg Time    Total Time  Avg Time    % of Active Time  |
> |                                                     w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S   |
> |-----------------------------------------------------------------------------------------------------------------------|
> |                                                                                                                       |
> |                                                                                                                       |
> | Setup                                                                                                                 |
> |   Create Executioner                        1       0.0003      0.000315    0.0003      0.000315    0.57     0.57     |
> |   FEProblem::init::meshChanged()            1       0.0016      0.001593    0.0016      0.001593    2.89     2.89     |
> |   Initial computeUserObjects()              1       0.0000      0.000005    0.0000      0.000005    0.01     0.01     |
> |   Initial execMultiApps()                   1       0.0000      0.000003    0.0000      0.000003    0.01     0.01     |
> |   Initial execTransfers()                   1       0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
> |   Initial updateActiveSemiLocalNodeRange()  1       0.0004      0.000380    0.0004      0.000380    0.69     0.69     |
> |   Initial updateGeomSearch()                2       0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
> |   NonlinearSystem::update()                 1       0.0000      0.000034    0.0000      0.000034    0.06     0.06     |
> |   Output Initial Condition                  1       0.0091      0.009094    0.0091      0.009094    16.49    16.49    |
> |   Prepare Mesh                              1       0.0018      0.001766    0.0018      0.001766    3.20     3.20     |
> |   copySolutionsBackwards()                  1       0.0000      0.000025    0.0000      0.000025    0.05     0.05     |
> |   eq.init()                                 1       0.0417      0.041668    0.0417      0.041668    75.54    75.54    |
> |   getMinQuadratureOrder()                   1       0.0000      0.000004    0.0000      0.000004    0.01     0.01     |
> |   initial adaptivity                        1       0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
> |   maxQps()                                  1       0.0003      0.000270    0.0003      0.000270    0.49     0.49     |
> |   reinit() after updateGeomSearch()         1       0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
> |                                                                                                                       |
> | ghostGhostedBoundaries                                                                                                |
> |   eq.init()                                 1       0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
> -------------------------------------------------------------------------------------------------------------------------
> | Totals:                                     18      0.0552                                          100.00            |
> -------------------------------------------------------------------------------------------------------------------------