[petsc-users] MOOSE_SNES_ISSUES
alena kopanicakova
alena.kopanicakova13 at gmail.com
Tue Mar 1 02:58:18 CST 2016
Hello,
I am developing my own nonlinear solver for simulations run with MOOSE.
It communicates with MOOSE through the SNES interface.
I obtain the residual and the Jacobian in the following way:
    SNESComputeFunction(snes, X, R);
    SNESSetJacobian(snes, jac, jac, SNESComputeJacobianDefault, NULL);
    SNESComputeJacobian(snes, X, jac, jac);
Unfortunately, with this setup it takes an enormous amount of time to obtain
the Jacobian.
Looking more closely at the attached perf logs, the difference between my
solver and the MOOSE solver seems to be that my executioner calls
compute_residual() far more often: 15985 times versus 330 times.
I have no idea what could be causing this behavior.
Do you have any suggestions on how to set up the interface properly, or on
which change is needed to get more or less the same performance as the MOOSE
executioner?
Many thanks,
alena
-------------- next part --------------
Framework Information:
  MOOSE version: git commit e79292e on 2016-01-05
  PETSc Version: 3.6.1
  Current Time: Tue Mar 1 00:51:45 2016
  Executable Timestamp: Tue Mar 1 00:36:30 2016

Parallelism:
  Num Processors: 1
  Num Threads: 1

Mesh:
  Distribution: serial
  Mesh Dimension: 3
  Spatial Dimension: 3
  Nodes:
    Total: 1331
    Local: 1331
  Elems:
    Total: 1000
    Local: 1000
  Num Subdomains: 1
  Num Partitions: 1

Nonlinear System:
  Num DOFs: 3993
  Num Local DOFs: 3993
  Variables: { "disp_x" "disp_y" "disp_z" }
  Finite Element Types: "LAGRANGE"
  Approximation Orders: "FIRST"

Execution Information:
  Executioner: Steady
  Solver Mode: NEWTON

0 Nonlinear |R| = 2.850000e-02
    0 Linear |R| = 2.850000e-02
    1 Linear |R| = 1.653670e-02
    2 Linear |R| = 1.447168e-02
    3 Linear |R| = 1.392965e-02
    4 Linear |R| = 1.258440e-02
    5 Linear |R| = 1.007181e-02
    6 Linear |R| = 8.264315e-03
    7 Linear |R| = 6.541897e-03
    8 Linear |R| = 4.371900e-03
    9 Linear |R| = 2.100406e-03
    10 Linear |R| = 1.227539e-03
    11 Linear |R| = 1.026286e-03
    12 Linear |R| = 9.180101e-04
    13 Linear |R| = 8.087598e-04
    14 Linear |R| = 6.435247e-04
    15 Linear |R| = 5.358688e-04
    16 Linear |R| = 4.551657e-04
    17 Linear |R| = 4.090276e-04
    18 Linear |R| = 3.359290e-04
    19 Linear |R| = 2.417643e-04
    20 Linear |R| = 1.710452e-04
    21 Linear |R| = 1.261996e-04
    22 Linear |R| = 9.384052e-05
    23 Linear |R| = 6.070637e-05
    24 Linear |R| = 4.283233e-05
    25 Linear |R| = 3.383792e-05
    26 Linear |R| = 2.342289e-05
    27 Linear |R| = 1.700855e-05
    28 Linear |R| = 9.814278e-06
    29 Linear |R| = 4.398519e-06
    30 Linear |R| = 2.161205e-06
    31 Linear |R| = 1.289206e-06
    32 Linear |R| = 6.548007e-07
    33 Linear |R| = 3.677894e-07
    34 Linear |R| = 2.640006e-07
1 Nonlinear |R| = 2.400310e-02
    0 Linear |R| = 2.400310e-02
    1 Linear |R| = 9.102075e-03
    2 Linear |R| = 5.017381e-03
    3 Linear |R| = 3.840732e-03
    4 Linear |R| = 2.990685e-03
    5 Linear |R| = 1.990203e-03
    6 Linear |R| = 1.085764e-03
    7 Linear |R| = 4.657779e-04
    8 Linear |R| = 3.049692e-04
    9 Linear |R| = 1.625839e-04
    10 Linear |R| = 1.124700e-04
    11 Linear |R| = 7.764153e-05
    12 Linear |R| = 5.698577e-05
    13 Linear |R| = 4.581843e-05
    14 Linear |R| = 4.262610e-05
    15 Linear |R| = 3.792804e-05
    16 Linear |R| = 3.404168e-05
    17 Linear |R| = 2.536004e-05
    18 Linear |R| = 1.577559e-05
    19 Linear |R| = 9.099392e-06
    20 Linear |R| = 6.140685e-06
    21 Linear |R| = 5.083606e-06
    22 Linear |R| = 4.521560e-06
    23 Linear |R| = 3.601845e-06
    24 Linear |R| = 2.776090e-06
    25 Linear |R| = 2.252274e-06
    26 Linear |R| = 1.898090e-06
    27 Linear |R| = 1.620684e-06
    28 Linear |R| = 1.395574e-06
    29 Linear |R| = 1.157953e-06
    30 Linear |R| = 9.540738e-07
    31 Linear |R| = 8.487724e-07
    32 Linear |R| = 7.634710e-07
    33 Linear |R| = 6.254549e-07
    34 Linear |R| = 4.811588e-07
    35 Linear |R| = 3.930739e-07
    36 Linear |R| = 3.340577e-07
    37 Linear |R| = 2.873430e-07
    38 Linear |R| = 2.407606e-07
    39 Linear |R| = 1.978818e-07
2 Nonlinear |R| = 4.185813e-04
    0 Linear |R| = 4.185813e-04
    1 Linear |R| = 1.406808e-04
    2 Linear |R| = 7.266714e-05
    3 Linear |R| = 5.734138e-05
    4 Linear |R| = 4.524739e-05
    5 Linear |R| = 3.025661e-05
    6 Linear |R| = 1.946626e-05
    7 Linear |R| = 1.005809e-05
    8 Linear |R| = 7.639142e-06
    9 Linear |R| = 6.668613e-06
    10 Linear |R| = 6.070601e-06
    11 Linear |R| = 5.496769e-06
    12 Linear |R| = 4.388115e-06
    13 Linear |R| = 2.966258e-06
    14 Linear |R| = 1.838201e-06
    15 Linear |R| = 9.709174e-07
    16 Linear |R| = 6.743766e-07
    17 Linear |R| = 5.531138e-07
    18 Linear |R| = 4.649969e-07
    19 Linear |R| = 3.982799e-07
    20 Linear |R| = 3.662679e-07
    21 Linear |R| = 3.309140e-07
    22 Linear |R| = 2.652039e-07
    23 Linear |R| = 1.728911e-07
    24 Linear |R| = 1.005779e-07
    25 Linear |R| = 5.747041e-08
    26 Linear |R| = 4.185011e-08
    27 Linear |R| = 3.394446e-08
    28 Linear |R| = 2.788435e-08
    29 Linear |R| = 2.046992e-08
    30 Linear |R| = 1.231943e-08
    31 Linear |R| = 8.724911e-09
    32 Linear |R| = 6.390162e-09
    33 Linear |R| = 5.060595e-09
    34 Linear |R| = 4.216656e-09
    35 Linear |R| = 2.969865e-09
3 Nonlinear |R| = 2.494055e-07
    0 Linear |R| = 2.494055e-07
    1 Linear |R| = 8.559637e-08
    2 Linear |R| = 4.335101e-08
    3 Linear |R| = 3.214303e-08
    4 Linear |R| = 2.549409e-08
    5 Linear |R| = 1.899624e-08
    6 Linear |R| = 1.522624e-08
    7 Linear |R| = 1.258408e-08
    8 Linear |R| = 1.098545e-08
    9 Linear |R| = 1.009481e-08
    10 Linear |R| = 8.423983e-09
    11 Linear |R| = 6.946144e-09
    12 Linear |R| = 5.624875e-09
    13 Linear |R| = 4.448760e-09
    14 Linear |R| = 2.834320e-09
    15 Linear |R| = 1.614722e-09
    16 Linear |R| = 9.409384e-10
    17 Linear |R| = 7.775851e-10
    18 Linear |R| = 6.905971e-10
    19 Linear |R| = 6.129201e-10
    20 Linear |R| = 5.438935e-10
    21 Linear |R| = 4.435519e-10
    22 Linear |R| = 3.486621e-10
    23 Linear |R| = 2.811928e-10
    24 Linear |R| = 2.159800e-10
    25 Linear |R| = 1.670940e-10
    26 Linear |R| = 1.338889e-10
    27 Linear |R| = 9.926067e-11
    28 Linear |R| = 7.483221e-11
    29 Linear |R| = 5.045662e-11
    30 Linear |R| = 2.772335e-11
    31 Linear |R| = 1.814968e-11
    32 Linear |R| = 1.264268e-11
    33 Linear |R| = 9.856586e-12
    34 Linear |R| = 7.802757e-12
    35 Linear |R| = 6.092276e-12
    36 Linear |R| = 4.785005e-12
    37 Linear |R| = 3.887554e-12
    38 Linear |R| = 3.125756e-12
    39 Linear |R| = 2.543989e-12
    40 Linear |R| = 2.062100e-12
4 Nonlinear |R| = 2.522382e-12
------------------------------------------------------------------------------------------------------------
| Whale Performance: Alive time=4.25229, Active time=4.12034                                               |
------------------------------------------------------------------------------------------------------------
| Event                           nCalls  Total Time    Avg Time  Total Time    Avg Time  % of Active Time |
|                                            w/o Sub     w/o Sub    With Sub    With Sub    w/o S   With S |
|----------------------------------------------------------------------------------------------------------|
|                                                                                                          |
| Exodus                                                                                                   |
|   output()                           2      0.0126    0.006300      0.0126    0.006300     0.31     0.31 |
|                                                                                                          |
| Setup                                                                                                    |
|   computeAuxiliaryKernels()          2      0.0000    0.000000      0.0000    0.000000     0.00     0.00 |
|   computeControls()                  2      0.0000    0.000001      0.0000    0.000001     0.00     0.00 |
|   computeUserObjects()               4      0.0000    0.000001      0.0000    0.000001     0.00     0.00 |
|                                                                                                          |
| Solve                                                                                                    |
|   ComputeResidualThread            330      3.8288    0.011603      3.8288    0.011603    92.93    92.93 |
|   computeDiracContributions()      331      0.0002    0.000001      0.0002    0.000001     0.00     0.00 |
|   compute_jacobian()                 1      0.1302    0.130242      0.1302    0.130243     3.16     3.16 |
|   compute_residual()               330      0.0394    0.000119      3.8724    0.011735     0.96    93.98 |
|   residual.close3()                330      0.0020    0.000006      0.0020    0.000006     0.05     0.05 |
|   residual.close4()                330      0.0019    0.000006      0.0019    0.000006     0.05     0.05 |
|   solve()                            1      0.1052    0.105226      4.1079    4.107900     2.55    99.70 |
------------------------------------------------------------------------------------------------------------
| Totals:                           1663      4.1203                                       100.00          |
------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------
| Setup Performance: Alive time=4.25186, Active time=0.060553                                                          |
------------------------------------------------------------------------------------------------------------------------
| Event                                       nCalls  Total Time    Avg Time  Total Time    Avg Time  % of Active Time |
|                                                        w/o Sub     w/o Sub    With Sub    With Sub    w/o S   With S |
|----------------------------------------------------------------------------------------------------------------------|
|                                                                                                                      |
| Setup                                                                                                                |
|   Create Executioner                             1      0.0003    0.000313      0.0003    0.000313     0.52     0.52 |
|   FEProblem::init::meshChanged()                 1      0.0016    0.001626      0.0016    0.001626     2.69     2.69 |
|   Initial computeUserObjects()                   1      0.0000    0.000005      0.0000    0.000005     0.01     0.01 |
|   Initial execMultiApps()                        1      0.0000    0.000003      0.0000    0.000003     0.00     0.00 |
|   Initial execTransfers()                        1      0.0000    0.000001      0.0000    0.000001     0.00     0.00 |
|   Initial updateActiveSemiLocalNodeRange()       1      0.0004    0.000396      0.0004    0.000396     0.65     0.65 |
|   Initial updateGeomSearch()                     2      0.0000    0.000002      0.0000    0.000002     0.00     0.00 |
|   NonlinearSystem::update()                      1      0.0000    0.000036      0.0000    0.000036     0.06     0.06 |
|   Output Initial Condition                       1      0.0107    0.010671      0.0107    0.010671    17.62    17.62 |
|   Prepare Mesh                                   1      0.0017    0.001737      0.0017    0.001737     2.87     2.87 |
|   copySolutionsBackwards()                       1      0.0000    0.000023      0.0000    0.000023     0.04     0.04 |
|   eq.init()                                      1      0.0455    0.045469      0.0455    0.045469    75.09    75.09 |
|   getMinQuadratureOrder()                        1      0.0000    0.000004      0.0000    0.000004     0.01     0.01 |
|   initial adaptivity                             1      0.0000    0.000000      0.0000    0.000000     0.00     0.00 |
|   maxQps()                                       1      0.0003    0.000263      0.0003    0.000263     0.43     0.43 |
|   reinit() after updateGeomSearch()              1      0.0000    0.000001      0.0000    0.000001     0.00     0.00 |
|                                                                                                                      |
| ghostGhostedBoundaries                                                                                               |
|   eq.init()                                      1      0.0000    0.000002      0.0000    0.000002     0.00     0.00 |
------------------------------------------------------------------------------------------------------------------------
| Totals:                                         18      0.0606                                       100.00          |
------------------------------------------------------------------------------------------------------------------------
-------------- next part --------------
Framework Information:
  MOOSE version: git commit e79292e on 2016-01-05
  PETSc Version: 3.6.1
  Current Time: Tue Mar 1 00:41:08 2016
  Executable Timestamp: Tue Mar 1 00:36:30 2016

Parallelism:
  Num Processors: 1
  Num Threads: 1

Mesh:
  Distribution: serial
  Mesh Dimension: 3
  Spatial Dimension: 3
  Nodes:
    Total: 1331
    Local: 1331
  Elems:
    Total: 1000
    Local: 1000
  Num Subdomains: 1
  Num Partitions: 1

Nonlinear System:
  Num DOFs: 3993
  Num Local DOFs: 3993
  Variables: { "disp_x" "disp_y" "disp_z" }
  Finite Element Types: "LAGRANGE"
  Approximation Orders: "FIRST"

Execution Information:
  Executioner: PassoSteady
  Solver Mode: NEWTON

In Function SNESCreate_passo_Newton_Solver
it.      || g ||
-------  -------------------
1        0.0240028
2        0.000418569
3        2.49436e-07
4        1.52966e-12
Solver converged in 4 iterations.
Outlier Variable Residual Norms:
  disp_z: 2.850000e-02
------------------------------------------------------------------------------------------------------------
| Whale Performance: Alive time=199.422, Active time=199.285                                               |
------------------------------------------------------------------------------------------------------------
| Event                           nCalls  Total Time    Avg Time  Total Time    Avg Time  % of Active Time |
|                                            w/o Sub     w/o Sub    With Sub    With Sub    w/o S   With S |
|----------------------------------------------------------------------------------------------------------|
|                                                                                                          |
| Exodus                                                                                                   |
|   output()                           2      0.0110    0.005503      0.0110    0.005503     0.01     0.01 |
|                                                                                                          |
| Setup                                                                                                    |
|   computeAuxiliaryKernels()          2      0.0000    0.000001      0.0000    0.000001     0.00     0.00 |
|   computeControls()                  2      0.0000    0.000000      0.0000    0.000000     0.00     0.00 |
|   computeUserObjects()               4      0.0000    0.000001      0.0000    0.000001     0.00     0.00 |
|                                                                                                          |
| Solve                                                                                                    |
|   ComputeResidualThread          15985    189.8671    0.011878    189.8671    0.011878    95.27    95.27 |
|   computeDiracContributions()    15986      0.0106    0.000001      0.0106    0.000001     0.01     0.01 |
|   compute_jacobian()                 1      0.1265    0.126484      0.1265    0.126485     0.06     0.06 |
|   compute_residual()             15985      2.1702    0.000136    192.3009    0.012030     1.09    96.50 |
|   residual.close3()              15985      0.1200    0.000008      0.1200    0.000008     0.06     0.06 |
|   residual.close4()              15985      0.1263    0.000008      0.1263    0.000008     0.06     0.06 |
|   solve()                            1      6.8537    6.853733    199.2830  199.282969     3.44   100.00 |
------------------------------------------------------------------------------------------------------------
| Totals:                          79938    199.2854                                       100.00          |
------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------
| Setup Performance: Alive time=199.422, Active time=0.055161                                                          |
------------------------------------------------------------------------------------------------------------------------
| Event                                       nCalls  Total Time    Avg Time  Total Time    Avg Time  % of Active Time |
|                                                        w/o Sub     w/o Sub    With Sub    With Sub    w/o S   With S |
|----------------------------------------------------------------------------------------------------------------------|
|                                                                                                                      |
| Setup                                                                                                                |
|   Create Executioner                             1      0.0003    0.000315      0.0003    0.000315     0.57     0.57 |
|   FEProblem::init::meshChanged()                 1      0.0016    0.001593      0.0016    0.001593     2.89     2.89 |
|   Initial computeUserObjects()                   1      0.0000    0.000005      0.0000    0.000005     0.01     0.01 |
|   Initial execMultiApps()                        1      0.0000    0.000003      0.0000    0.000003     0.01     0.01 |
|   Initial execTransfers()                        1      0.0000    0.000000      0.0000    0.000000     0.00     0.00 |
|   Initial updateActiveSemiLocalNodeRange()       1      0.0004    0.000380      0.0004    0.000380     0.69     0.69 |
|   Initial updateGeomSearch()                     2      0.0000    0.000001      0.0000    0.000001     0.00     0.00 |
|   NonlinearSystem::update()                      1      0.0000    0.000034      0.0000    0.000034     0.06     0.06 |
|   Output Initial Condition                       1      0.0091    0.009094      0.0091    0.009094    16.49    16.49 |
|   Prepare Mesh                                   1      0.0018    0.001766      0.0018    0.001766     3.20     3.20 |
|   copySolutionsBackwards()                       1      0.0000    0.000025      0.0000    0.000025     0.05     0.05 |
|   eq.init()                                      1      0.0417    0.041668      0.0417    0.041668    75.54    75.54 |
|   getMinQuadratureOrder()                        1      0.0000    0.000004      0.0000    0.000004     0.01     0.01 |
|   initial adaptivity                             1      0.0000    0.000000      0.0000    0.000000     0.00     0.00 |
|   maxQps()                                       1      0.0003    0.000270      0.0003    0.000270     0.49     0.49 |
|   reinit() after updateGeomSearch()              1      0.0000    0.000001      0.0000    0.000001     0.00     0.00 |
|                                                                                                                      |
| ghostGhostedBoundaries                                                                                               |
|   eq.init()                                      1      0.0000    0.000001      0.0000    0.000001     0.00     0.00 |
------------------------------------------------------------------------------------------------------------------------
| Totals:                                         18      0.0552                                       100.00          |
------------------------------------------------------------------------------------------------------------------------
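
One possible reading of the compute_residual() counts in the two logs above, offered as an assumption rather than a confirmed diagnosis: PETSc documents SNESComputeJacobianDefault as a routine that approximates the Jacobian by finite differences, which costs roughly one residual evaluation per unknown for every Jacobian. The C sketch below uses a toy residual and hypothetical helper names (residual, fd_jacobian, expected_residual_calls); it makes no PETSc or MOOSE calls and only illustrates where such a count would come from.

```c
#include <stddef.h>

/* Counts residual evaluations so the cost of a finite-difference
   Jacobian becomes visible. */
long residual_calls = 0;

/* Toy residual, purely illustrative (NOT the MOOSE residual):
   F_i(x) = x_i^2 - (i + 1). */
void residual(const double *x, double *f, size_t n)
{
    residual_calls++;
    for (size_t i = 0; i < n; i++)
        f[i] = x[i] * x[i] - (double)(i + 1);
}

/* Dense forward-difference Jacobian, stored row-major in jac (n*n).
   It needs one base evaluation plus one perturbed evaluation per
   column, i.e. n + 1 residual calls in total. f0 and f1 are work
   arrays of length n. */
void fd_jacobian(double *x, double *jac, double *f0, double *f1, size_t n)
{
    const double h = 1e-7; /* assumed step size, for illustration only */

    residual(x, f0, n);
    for (size_t j = 0; j < n; j++) {
        const double xj = x[j];
        x[j] = xj + h;          /* perturb one unknown */
        residual(x, f1, n);
        x[j] = xj;              /* restore it */
        for (size_t i = 0; i < n; i++)
            jac[i * n + j] = (f1[i] - f0[i]) / h;
    }
}

/* Residual evaluations expected when one such Jacobian is built in
   each of newton_its nonlinear iterations. */
long expected_residual_calls(long ndofs, long newton_its)
{
    return newton_its * (ndofs + 1);
}
```

With the 3993 DOFs and 4 nonlinear iterations reported in the PassoSteady log, expected_residual_calls(3993, 4) gives 15976, which is close to the 15985 compute_residual() calls recorded there. The standard Steady run, by contrast, records 330 calls.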