[petsc-users] MOOSE_SNES_ISSUES

alena kopanicakova alena.kopanicakova13 at gmail.com
Tue Mar 1 02:58:18 CST 2016


Hello,

I am developing my own nonlinear solver, which is intended to solve
problems set up in MOOSE. Communication with MOOSE goes through the SNES
interface.

I obtain the residual and the Jacobian in the following way:

    SNESComputeFunction(snes, X, R);

    SNESSetJacobian(snes, jac, jac, SNESComputeJacobianDefault, NULL);
    SNESComputeJacobian(snes, X, jac, jac);

Unfortunately, with this setup it takes an enormous amount of time to
obtain the Jacobian.
Looking more closely at the perf logs (attached below), the difference
between my solver and the MOOSE solver seems to be that my executioner
calls compute_residual() far more often: 15985 times versus 330.
I have no idea what could be causing this behavior.

Do you have any suggestions on how to set up the interface properly, or on
which change is needed to obtain more or less the same performance as the
MOOSE executioner?


many thanks,
alena
-------------- next part --------------

Framework Information:
MOOSE version:           git commit e79292e on 2016-01-05
PETSc Version:           3.6.1
Current Time:            Tue Mar  1 00:51:45 2016
Executable Timestamp:    Tue Mar  1 00:36:30 2016

Parallelism:
  Num Processors:          1
  Num Threads:             1

Mesh: 
  Distribution:            serial
  Mesh Dimension:          3
  Spatial Dimension:       3
  Nodes:                   
    Total:                 1331
    Local:                 1331
  Elems:                   
    Total:                 1000
    Local:                 1000
  Num Subdomains:          1
  Num Partitions:          1

Nonlinear System:
  Num DOFs:                3993
  Num Local DOFs:          3993
  Variables:               { "disp_x" "disp_y" "disp_z" } 
  Finite Element Types:    "LAGRANGE" 
  Approximation Orders:    "FIRST" 

Execution Information:
  Executioner:             Steady
  Solver Mode:             NEWTON



 0 Nonlinear |R| = 2.850000e-02
      0 Linear |R| = 2.850000e-02
      1 Linear |R| = 1.653670e-02
      2 Linear |R| = 1.447168e-02
      3 Linear |R| = 1.392965e-02
      4 Linear |R| = 1.258440e-02
      5 Linear |R| = 1.007181e-02
      6 Linear |R| = 8.264315e-03
      7 Linear |R| = 6.541897e-03
      8 Linear |R| = 4.371900e-03
      9 Linear |R| = 2.100406e-03
     10 Linear |R| = 1.227539e-03
     11 Linear |R| = 1.026286e-03
     12 Linear |R| = 9.180101e-04
     13 Linear |R| = 8.087598e-04
     14 Linear |R| = 6.435247e-04
     15 Linear |R| = 5.358688e-04
     16 Linear |R| = 4.551657e-04
     17 Linear |R| = 4.090276e-04
     18 Linear |R| = 3.359290e-04
     19 Linear |R| = 2.417643e-04
     20 Linear |R| = 1.710452e-04
     21 Linear |R| = 1.261996e-04
     22 Linear |R| = 9.384052e-05
     23 Linear |R| = 6.070637e-05
     24 Linear |R| = 4.283233e-05
     25 Linear |R| = 3.383792e-05
     26 Linear |R| = 2.342289e-05
     27 Linear |R| = 1.700855e-05
     28 Linear |R| = 9.814278e-06
     29 Linear |R| = 4.398519e-06
     30 Linear |R| = 2.161205e-06
     31 Linear |R| = 1.289206e-06
     32 Linear |R| = 6.548007e-07
     33 Linear |R| = 3.677894e-07
     34 Linear |R| = 2.640006e-07
 1 Nonlinear |R| = 2.400310e-02
      0 Linear |R| = 2.400310e-02
      1 Linear |R| = 9.102075e-03
      2 Linear |R| = 5.017381e-03
      3 Linear |R| = 3.840732e-03
      4 Linear |R| = 2.990685e-03
      5 Linear |R| = 1.990203e-03
      6 Linear |R| = 1.085764e-03
      7 Linear |R| = 4.657779e-04
      8 Linear |R| = 3.049692e-04
      9 Linear |R| = 1.625839e-04
     10 Linear |R| = 1.124700e-04
     11 Linear |R| = 7.764153e-05
     12 Linear |R| = 5.698577e-05
     13 Linear |R| = 4.581843e-05
     14 Linear |R| = 4.262610e-05
     15 Linear |R| = 3.792804e-05
     16 Linear |R| = 3.404168e-05
     17 Linear |R| = 2.536004e-05
     18 Linear |R| = 1.577559e-05
     19 Linear |R| = 9.099392e-06
     20 Linear |R| = 6.140685e-06
     21 Linear |R| = 5.083606e-06
     22 Linear |R| = 4.521560e-06
     23 Linear |R| = 3.601845e-06
     24 Linear |R| = 2.776090e-06
     25 Linear |R| = 2.252274e-06
     26 Linear |R| = 1.898090e-06
     27 Linear |R| = 1.620684e-06
     28 Linear |R| = 1.395574e-06
     29 Linear |R| = 1.157953e-06
     30 Linear |R| = 9.540738e-07
     31 Linear |R| = 8.487724e-07
     32 Linear |R| = 7.634710e-07
     33 Linear |R| = 6.254549e-07
     34 Linear |R| = 4.811588e-07
     35 Linear |R| = 3.930739e-07
     36 Linear |R| = 3.340577e-07
     37 Linear |R| = 2.873430e-07
     38 Linear |R| = 2.407606e-07
     39 Linear |R| = 1.978818e-07
 2 Nonlinear |R| = 4.185813e-04
      0 Linear |R| = 4.185813e-04
      1 Linear |R| = 1.406808e-04
      2 Linear |R| = 7.266714e-05
      3 Linear |R| = 5.734138e-05
      4 Linear |R| = 4.524739e-05
      5 Linear |R| = 3.025661e-05
      6 Linear |R| = 1.946626e-05
      7 Linear |R| = 1.005809e-05
      8 Linear |R| = 7.639142e-06
      9 Linear |R| = 6.668613e-06
     10 Linear |R| = 6.070601e-06
     11 Linear |R| = 5.496769e-06
     12 Linear |R| = 4.388115e-06
     13 Linear |R| = 2.966258e-06
     14 Linear |R| = 1.838201e-06
     15 Linear |R| = 9.709174e-07
     16 Linear |R| = 6.743766e-07
     17 Linear |R| = 5.531138e-07
     18 Linear |R| = 4.649969e-07
     19 Linear |R| = 3.982799e-07
     20 Linear |R| = 3.662679e-07
     21 Linear |R| = 3.309140e-07
     22 Linear |R| = 2.652039e-07
     23 Linear |R| = 1.728911e-07
     24 Linear |R| = 1.005779e-07
     25 Linear |R| = 5.747041e-08
     26 Linear |R| = 4.185011e-08
     27 Linear |R| = 3.394446e-08
     28 Linear |R| = 2.788435e-08
     29 Linear |R| = 2.046992e-08
     30 Linear |R| = 1.231943e-08
     31 Linear |R| = 8.724911e-09
     32 Linear |R| = 6.390162e-09
     33 Linear |R| = 5.060595e-09
     34 Linear |R| = 4.216656e-09
     35 Linear |R| = 2.969865e-09
 3 Nonlinear |R| = 2.494055e-07
      0 Linear |R| = 2.494055e-07
      1 Linear |R| = 8.559637e-08
      2 Linear |R| = 4.335101e-08
      3 Linear |R| = 3.214303e-08
      4 Linear |R| = 2.549409e-08
      5 Linear |R| = 1.899624e-08
      6 Linear |R| = 1.522624e-08
      7 Linear |R| = 1.258408e-08
      8 Linear |R| = 1.098545e-08
      9 Linear |R| = 1.009481e-08
     10 Linear |R| = 8.423983e-09
     11 Linear |R| = 6.946144e-09
     12 Linear |R| = 5.624875e-09
     13 Linear |R| = 4.448760e-09
     14 Linear |R| = 2.834320e-09
     15 Linear |R| = 1.614722e-09
     16 Linear |R| = 9.409384e-10
     17 Linear |R| = 7.775851e-10
     18 Linear |R| = 6.905971e-10
     19 Linear |R| = 6.129201e-10
     20 Linear |R| = 5.438935e-10
     21 Linear |R| = 4.435519e-10
     22 Linear |R| = 3.486621e-10
     23 Linear |R| = 2.811928e-10
     24 Linear |R| = 2.159800e-10
     25 Linear |R| = 1.670940e-10
     26 Linear |R| = 1.338889e-10
     27 Linear |R| = 9.926067e-11
     28 Linear |R| = 7.483221e-11
     29 Linear |R| = 5.045662e-11
     30 Linear |R| = 2.772335e-11
     31 Linear |R| = 1.814968e-11
     32 Linear |R| = 1.264268e-11
     33 Linear |R| = 9.856586e-12
     34 Linear |R| = 7.802757e-12
     35 Linear |R| = 6.092276e-12
     36 Linear |R| = 4.785005e-12
     37 Linear |R| = 3.887554e-12
     38 Linear |R| = 3.125756e-12
     39 Linear |R| = 2.543989e-12
     40 Linear |R| = 2.062100e-12
 4 Nonlinear |R| = 2.522382e-12

 ------------------------------------------------------------------------------------------------------------
| Whale Performance: Alive time=4.25229, Active time=4.12034                                                 |
 ------------------------------------------------------------------------------------------------------------
| Event                         nCalls     Total Time  Avg Time    Total Time  Avg Time    % of Active Time  |
|                                          w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S   |
|------------------------------------------------------------------------------------------------------------|
|                                                                                                            |
|                                                                                                            |
| Exodus                                                                                                     |
|   output()                    2          0.0126      0.006300    0.0126      0.006300    0.31     0.31     |
|                                                                                                            |
| Setup                                                                                                      |
|   computeAuxiliaryKernels()   2          0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
|   computeControls()           2          0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
|   computeUserObjects()        4          0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
|                                                                                                            |
| Solve                                                                                                      |
|   ComputeResidualThread       330        3.8288      0.011603    3.8288      0.011603    92.93    92.93    |
|   computeDiracContributions() 331        0.0002      0.000001    0.0002      0.000001    0.00     0.00     |
|   compute_jacobian()          1          0.1302      0.130242    0.1302      0.130243    3.16     3.16     |
|   compute_residual()          330        0.0394      0.000119    3.8724      0.011735    0.96     93.98    |
|   residual.close3()           330        0.0020      0.000006    0.0020      0.000006    0.05     0.05     |
|   residual.close4()           330        0.0019      0.000006    0.0019      0.000006    0.05     0.05     |
|   solve()                     1          0.1052      0.105226    4.1079      4.107900    2.55     99.70    |
 ------------------------------------------------------------------------------------------------------------
| Totals:                       1663       4.1203                                          100.00            |
 ------------------------------------------------------------------------------------------------------------
 -------------------------------------------------------------------------------------------------------------------------
| Setup Performance: Alive time=4.25186, Active time=0.060553                                                             |
 -------------------------------------------------------------------------------------------------------------------------
| Event                                      nCalls     Total Time  Avg Time    Total Time  Avg Time    % of Active Time  |
|                                                       w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S   |
|-------------------------------------------------------------------------------------------------------------------------|
|                                                                                                                         |
|                                                                                                                         |
| Setup                                                                                                                   |
|   Create Executioner                       1          0.0003      0.000313    0.0003      0.000313    0.52     0.52     |
|   FEProblem::init::meshChanged()           1          0.0016      0.001626    0.0016      0.001626    2.69     2.69     |
|   Initial computeUserObjects()             1          0.0000      0.000005    0.0000      0.000005    0.01     0.01     |
|   Initial execMultiApps()                  1          0.0000      0.000003    0.0000      0.000003    0.00     0.00     |
|   Initial execTransfers()                  1          0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
|   Initial updateActiveSemiLocalNodeRange() 1          0.0004      0.000396    0.0004      0.000396    0.65     0.65     |
|   Initial updateGeomSearch()               2          0.0000      0.000002    0.0000      0.000002    0.00     0.00     |
|   NonlinearSystem::update()                1          0.0000      0.000036    0.0000      0.000036    0.06     0.06     |
|   Output Initial Condition                 1          0.0107      0.010671    0.0107      0.010671    17.62    17.62    |
|   Prepare Mesh                             1          0.0017      0.001737    0.0017      0.001737    2.87     2.87     |
|   copySolutionsBackwards()                 1          0.0000      0.000023    0.0000      0.000023    0.04     0.04     |
|   eq.init()                                1          0.0455      0.045469    0.0455      0.045469    75.09    75.09    |
|   getMinQuadratureOrder()                  1          0.0000      0.000004    0.0000      0.000004    0.01     0.01     |
|   initial adaptivity                       1          0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
|   maxQps()                                 1          0.0003      0.000263    0.0003      0.000263    0.43     0.43     |
|   reinit() after updateGeomSearch()        1          0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
|                                                                                                                         |
| ghostGhostedBoundaries                                                                                                  |
|   eq.init()                                1          0.0000      0.000002    0.0000      0.000002    0.00     0.00     |
 -------------------------------------------------------------------------------------------------------------------------
| Totals:                                    18         0.0606                                          100.00            |
 -------------------------------------------------------------------------------------------------------------------------
-------------- next part --------------

Framework Information:
MOOSE version:           git commit e79292e on 2016-01-05
PETSc Version:           3.6.1
Current Time:            Tue Mar  1 00:41:08 2016
Executable Timestamp:    Tue Mar  1 00:36:30 2016

Parallelism:
  Num Processors:          1
  Num Threads:             1

Mesh: 
  Distribution:            serial
  Mesh Dimension:          3
  Spatial Dimension:       3
  Nodes:                   
    Total:                 1331
    Local:                 1331
  Elems:                   
    Total:                 1000
    Local:                 1000
  Num Subdomains:          1
  Num Partitions:          1

Nonlinear System:
  Num DOFs:                3993
  Num Local DOFs:          3993
  Variables:               { "disp_x" "disp_y" "disp_z" } 
  Finite Element Types:    "LAGRANGE" 
  Approximation Orders:    "FIRST" 

Execution Information:
  Executioner:             PassoSteady
  Solver Mode:             NEWTON



In Function SNESCreate_passo_Newton_Solver 

      it.          || g || 
   -------  ------------------- 
         1        0.0240028
         2      0.000418569
         3      2.49436e-07
         4      1.52966e-12

Solver converged in 4 iterations.

Outlier Variable Residual Norms:
  disp_z: 2.850000e-02

 ------------------------------------------------------------------------------------------------------------
| Whale Performance: Alive time=199.422, Active time=199.285                                                 |
 ------------------------------------------------------------------------------------------------------------
| Event                         nCalls     Total Time  Avg Time    Total Time  Avg Time    % of Active Time  |
|                                          w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S   |
|------------------------------------------------------------------------------------------------------------|
|                                                                                                            |
|                                                                                                            |
| Exodus                                                                                                     |
|   output()                    2          0.0110      0.005503    0.0110      0.005503    0.01     0.01     |
|                                                                                                            |
| Setup                                                                                                      |
|   computeAuxiliaryKernels()   2          0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
|   computeControls()           2          0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
|   computeUserObjects()        4          0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
|                                                                                                            |
| Solve                                                                                                      |
|   ComputeResidualThread       15985      189.8671    0.011878    189.8671    0.011878    95.27    95.27    |
|   computeDiracContributions() 15986      0.0106      0.000001    0.0106      0.000001    0.01     0.01     |
|   compute_jacobian()          1          0.1265      0.126484    0.1265      0.126485    0.06     0.06     |
|   compute_residual()          15985      2.1702      0.000136    192.3009    0.012030    1.09     96.50    |
|   residual.close3()           15985      0.1200      0.000008    0.1200      0.000008    0.06     0.06     |
|   residual.close4()           15985      0.1263      0.000008    0.1263      0.000008    0.06     0.06     |
|   solve()                     1          6.8537      6.853733    199.2830    199.282969  3.44     100.00   |
 ------------------------------------------------------------------------------------------------------------
| Totals:                       79938      199.2854                                        100.00            |
 ------------------------------------------------------------------------------------------------------------
 -------------------------------------------------------------------------------------------------------------------------
| Setup Performance: Alive time=199.422, Active time=0.055161                                                             |
 -------------------------------------------------------------------------------------------------------------------------
| Event                                      nCalls     Total Time  Avg Time    Total Time  Avg Time    % of Active Time  |
|                                                       w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S   |
|-------------------------------------------------------------------------------------------------------------------------|
|                                                                                                                         |
|                                                                                                                         |
| Setup                                                                                                                   |
|   Create Executioner                       1          0.0003      0.000315    0.0003      0.000315    0.57     0.57     |
|   FEProblem::init::meshChanged()           1          0.0016      0.001593    0.0016      0.001593    2.89     2.89     |
|   Initial computeUserObjects()             1          0.0000      0.000005    0.0000      0.000005    0.01     0.01     |
|   Initial execMultiApps()                  1          0.0000      0.000003    0.0000      0.000003    0.01     0.01     |
|   Initial execTransfers()                  1          0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
|   Initial updateActiveSemiLocalNodeRange() 1          0.0004      0.000380    0.0004      0.000380    0.69     0.69     |
|   Initial updateGeomSearch()               2          0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
|   NonlinearSystem::update()                1          0.0000      0.000034    0.0000      0.000034    0.06     0.06     |
|   Output Initial Condition                 1          0.0091      0.009094    0.0091      0.009094    16.49    16.49    |
|   Prepare Mesh                             1          0.0018      0.001766    0.0018      0.001766    3.20     3.20     |
|   copySolutionsBackwards()                 1          0.0000      0.000025    0.0000      0.000025    0.05     0.05     |
|   eq.init()                                1          0.0417      0.041668    0.0417      0.041668    75.54    75.54    |
|   getMinQuadratureOrder()                  1          0.0000      0.000004    0.0000      0.000004    0.01     0.01     |
|   initial adaptivity                       1          0.0000      0.000000    0.0000      0.000000    0.00     0.00     |
|   maxQps()                                 1          0.0003      0.000270    0.0003      0.000270    0.49     0.49     |
|   reinit() after updateGeomSearch()        1          0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
|                                                                                                                         |
| ghostGhostedBoundaries                                                                                                  |
|   eq.init()                                1          0.0000      0.000001    0.0000      0.000001    0.00     0.00     |
 -------------------------------------------------------------------------------------------------------------------------
| Totals:                                    18         0.0552                                          100.00            |
 -------------------------------------------------------------------------------------------------------------------------
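
A quick arithmetic check on the two Solve tables above may help narrow
things down. The sketch below rests on an assumption that is not confirmed
anywhere in this post: that SNESComputeJacobianDefault approximates the
Jacobian by finite differences, at roughly one residual evaluation per
degree of freedom, and that the custom solver rebuilds the Jacobian once in
each of its 4 iterations. The function names are hypothetical helpers, not
part of PETSc or MOOSE. Under that assumption the estimate of
4 * 3993 = 15972 residual calls lies close to the 15985 calls reported for
PassoSteady, while the average cost of a single ComputeResidualThread call
is nearly the same in both runs.

```c
/* Sanity check on the perf-log numbers quoted above.
 * Assumption (not stated in the post): each Jacobian is built by finite
 * differences, costing roughly one residual evaluation per DOF. */
#include <assert.h>
#include <stdio.h>

/* Estimated residual evaluations needed for n_jac finite-difference
 * Jacobians of a system with n_dofs unknowns (one per column). */
long fd_residual_calls(long n_dofs, long n_jac)
{
    return n_dofs * n_jac;
}

/* Average wall time per call, in seconds. */
double avg_time(double total_seconds, long n_calls)
{
    assert(n_calls > 0);
    return total_seconds / (double)n_calls;
}

/* Print the comparison using the values copied from the two logs. */
void report(void)
{
    const long n_dofs = 3993;        /* "Num DOFs" in both logs          */
    const long n_newton = 4;         /* iterations of the custom solver  */
    const long calls_moose = 330;    /* ComputeResidualThread, Steady    */
    const long calls_passo = 15985;  /* ComputeResidualThread, Passo     */

    printf("estimated FD residual calls: %ld\n",
           fd_residual_calls(n_dofs, n_newton));
    printf("reported residual calls:     %ld\n", calls_passo);
    printf("avg time per call (Steady):  %.6f s\n",
           avg_time(3.8288, calls_moose));
    printf("avg time per call (Passo):   %.6f s\n",
           avg_time(189.8671, calls_passo));
}
```

Calling report() from a main() prints the four figures side by side. If the
assumption holds, the slowdown would come from the number of residual
evaluations rather than from the cost of each one.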

