[petsc-users] Memory problem
Rongliang Chen
rongliang.chan at gmail.com
Wed Oct 5 17:56:53 CDT 2011
Hello Everyone,
I am testing a nonlinear problem using SNESSolve(). My test case has about
1 million degrees of freedom, so the Jacobian matrix in SNESSolve() is a
1 million by 1 million matrix, and it should be sparse.
My question is about the "-log_summary" output, where I find a strange
line: "Matrix 39 39 18446744074642894848 0". Does this line mean that the
matrices use about 1.8x10^19 bytes of memory? I have no idea why a
1 million by 1 million sparse matrix would use so much memory. Is this
possible? The "-log_summary" output follows. Thanks.
Best,
Rongliang
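The figure quoted above can be sanity-checked with a little arithmetic. The sketch below compares the reported "Matrix" memory with the storage a fully dense matrix of the same global size would need, and with 2^64. It assumes 8-byte real scalars; the helper names are illustrative and are not part of PETSc. Note that the log reports the sum over 39 matrix objects on process 0 only.

```python
# Sanity check on the "Matrix" memory figure from the -log_summary table.
REPORTED_BYTES = 18446744074642894848  # value printed in the "Matrix" row
N_DOF = 1096658                        # "Number of unknowns" from the run output
BYTES_PER_SCALAR = 8                   # assumption: real double-precision scalars


def dense_bytes(n, bytes_per_entry=BYTES_PER_SCALAR):
    """Bytes needed to store a full n-by-n matrix (an extreme upper bound)."""
    return n * n * bytes_per_entry


def excess_over_2_64(value):
    """How far the value lies above 2**64 (negative if it lies below)."""
    return value - 2**64


dense = dense_bytes(N_DOF)
print(f"dense global matrix : {dense:.3e} bytes")
print(f"reported / dense    : {REPORTED_BYTES / dense:.3e}")
print(f"reported - 2**64    : {excess_over_2_64(REPORTED_BYTES)} bytes")
```

Even a dense matrix of this size would need on the order of 10^13 bytes, far less than the reported figure, so the number cannot be real memory use. The reported value exceeds 2^64 by 933343232 bytes (roughly 0.9 GB), which suggests an artifact in the memory accounting rather than an actual allocation, although this check alone does not identify the cause.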
-------------------------------------------------------------------------------------------------------------------------
Starting to load grid...
Nodes on moving boundary: coarse 199, fine 799, Gridratio 0.250000.
Setupping Interpolation matrix......
Interpolation matrix done......Time spent: 0.405431
finished.
Grid has 32000 elements, 1096658 degrees of freedom.
Coarse grid has 2000 elements, 70170 degrees of freedom.
[0] has 35380 degrees of freedom (matrix), 35380 degrees of freedom
(including shared points).
[0] coarse grid has 2194 degrees of freedom (matrix), 2194 degrees of
freedom (including shared points).
[31] has 32466 degrees of freedom (matrix), 34428 degrees of freedom
(including shared points).
[31] coarse grid has 2250 degrees of freedom (matrix), 2826 degrees of
freedom (including shared points).
Time spend on the load grid and create matrix etc.: 3.577862.
Solving fixed mesh (steady-state problem)
Solving coarse problem......
0 SNES norm 3.1224989992e+01, 0 KSP its last norm 0.0000000000e+00.
1 SNES norm 1.3987219837e+00, 25 KSP its last norm 2.4915963656e-01.
2 SNES norm 5.1898321541e-01, 59 KSP its last norm 1.3451744761e-02.
3 SNES norm 4.0024228221e-02, 56 KSP its last norm 4.9036146089e-03.
4 SNES norm 6.7641787439e-04, 59 KSP its last norm 3.6925683196e-04.
Coarse solver done......
Initial value of object function (Energy dissipation) (Coarse):
38.9341108701
0 SNES norm 7.4575110699e+00, 0 KSP its last norm 0.0000000000e+00.
1 SNES norm 6.4497565921e-02, 51 KSP its last norm 7.4277453141e-03.
2 SNES norm 9.2093642958e-04, 90 KSP its last norm 5.4331380112e-05.
3 SNES norm 8.1283574549e-07, 103 KSP its last norm 7.5974191049e-07.
Initial value of object function (Energy dissipation) (Fine): 42.5134271399
Solution time of 17.180358 sec.
Fixed mesh (Steady-state) solver done.
Total number of nonlinear iterations = 3
Total number of linear iterations = 244
Average number of linear iterations = 81.333336
Time computing: 17.180358 sec, Time outputting: 0.000000 sec.
Time spent in coarse nonlinear solve: 0.793436 sec, 0.046183 fraction of
total compute time.
Solving Shape Optimization problem (steady-state problem)
Solving coarse problem......
0 SNES norm 4.1963166116e+01, 0 KSP its last norm 0.0000000000e+00.
1 SNES norm 3.2749386875e+01, 132 KSP its last norm 4.0966334477e-01.
2 SNES norm 2.2874504408e+01, 130 KSP its last norm 3.2526355310e-01.
3 SNES norm 1.4327187891e+01, 132 KSP its last norm 2.1213029400e-01.
4 SNES norm 1.7283643754e+00, 81 KSP its last norm 1.4233338128e-01.
5 SNES norm 3.6703566918e-01, 133 KSP its last norm 1.6069896349e-02.
6 SNES norm 3.6554528686e-03, 77 KSP its last norm 3.5379167356e-03.
Coarse solver done......
Optimized value of object function (Energy dissipation) (Coarse):
29.9743062939
The reduction of the energy dissipation (Coarse): 23.012737%
The optimized curve (Coarse):
a = (4.500000, -0.042893, -0.002030, 0.043721, -0.018798, 0.001824)
Solving moving mesh equation......
KSP norm 2.3040219081e-07, KSP its. 741. Time spent 8.481956
Moving mesh solver done.
0 SNES norm 4.7843968670e+02, 0 KSP its last norm 0.0000000000e+00.
1 SNES norm 1.0148854085e+02, 49 KSP its last norm 4.7373180511e-01.
2 SNES norm 1.8312214030e+00, 46 KSP its last norm 1.0133332840e-01.
3 SNES norm 3.3101970861e-03, 212 KSP its last norm 1.7753271069e-03.
4 SNES norm 4.9552614008e-06, 249 KSP its last norm 3.2293284103e-06.
Optimized value of object function (Energy dissipation) (Fine):
33.2754372645
Solution time of 4053.227456 sec.
Number of unknowns = 1096658
Parameters: kinematic viscosity = 0.01
inlet velocity: u = 5, v = 0
Total number of nonlinear iterations = 4
Total number of linear iterations = 556
Average number of linear iterations = 139.000000
Time computing: 4053.227456 sec, Time outputting: 0.000001 sec.
Time spent in coarse nonlinear solve: 24.239526 sec, 0.005980 fraction of
total compute time.
The optimized curve (fine):
a = (4.500000, -0.046468, -0.001963, 0.045736, -0.019141, 0.001789)
The reduction of the energy dissipation (Fine): 21.729582%
Time spend on fixed mesh solving: 17.296872
Time spend on shape opt. solving: 4053.250126
Latex command line:
np Newton GMRES Time(Total) Time(Coarse) Ratio
32 & 4 & 139.00 & 4053.23 & 24.24 & 0.6\%
Running finished on: Wed Oct 5 11:32:04 2011
Total running time: 4070.644329
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r
-fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary:
----------------------------------------------
./joab on a Janus-nod named node1751 with 32 processors, by ronglian Wed
Oct 5 11:32:04 2011
Using Petsc Release Version 3.2.0, Patch 1, Mon Sep 12 16:01:51 CDT 2011
Max Max/Min Avg Total
Time (sec): 4.074e+03 1.00000 4.074e+03
Objects: 1.011e+03 1.00000 1.011e+03
Flops: 2.255e+11 2.27275 1.471e+11 4.706e+12
Flops/sec: 5.535e+07 2.27275 3.609e+07 1.155e+09
MPI Messages: 1.103e+05 5.41392 3.665e+04 1.173e+06
MPI Message Lengths: 1.326e+09 2.60531 2.416e+04 2.833e+10
MPI Reductions: 5.969e+03 1.00000
Flop counting convention: 1 flop = 1 real number operation of type
(multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N -->
2N flops
and VecAXPY() for complex vectors of length N
--> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages ---
-- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts
%Total Avg %Total counts %Total
0: Main Stage: 4.0743e+03 100.0% 4.7058e+12 100.0% 1.173e+06 100.0%
2.416e+04 100.0% 5.968e+03 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting
output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and
PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this
phase
%M - percent messages in this phase %L - percent message lengths
in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over
all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec)
Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len
Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 2493 1.0 1.2225e+0218.4 4.37e+09 1.1 3.9e+05 2.2e+03
0.0e+00 2 3 33 3 0 2 3 33 3 0 1084
MatMultTranspose 6 1.0 3.3590e-02 2.2 7.38e+06 1.1 8.0e+02 1.5e+03
0.0e+00 0 0 0 0 0 0 0 0 0 0 6727
MatSolve 2467 1.0 1.1270e+02 1.7 5.95e+10 1.7 0.0e+00 0.0e+00
0.0e+00 2 33 0 0 0 2 33 0 0 0 13775
MatLUFactorSym 4 1.0 3.4774e+00 3.1 0.00e+00 0.0 0.0e+00 0.0e+00
1.2e+01 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 18 1.0 2.0832e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00
0.0e+00 2 56 0 0 0 2 56 0 0 0 12746
MatILUFactorSym 1 1.0 8.3280e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00
1.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 103 1.0 7.6879e+0215.4 0.00e+00 0.0 1.6e+04 6.2e+04
1.7e+02 7 0 1 4 3 7 0 1 4 3 0
MatAssemblyEnd 103 1.0 3.7818e+01 1.0 0.00e+00 0.0 3.0e+03 5.3e+02
1.6e+02 1 0 0 0 3 1 0 0 0 3 0
MatGetRowIJ 5 1.0 4.8716e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetSubMatrice 18 1.0 4.3095e+00 2.5 0.00e+00 0.0 1.6e+04 3.5e+05
7.4e+01 0 0 1 20 1 0 0 1 20 1 0
MatGetOrdering 5 1.0 1.4656e+00 2.8 0.00e+00 0.0 0.0e+00 0.0e+00
1.4e+01 0 0 0 0 0 0 0 0 0 0 0
MatPartitioning 1 1.0 1.4356e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
1.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatZeroEntries 42 1.0 2.0939e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecDot 17 1.0 1.2719e-02 6.8 5.47e+05 1.1 0.0e+00 0.0e+00
1.7e+01 0 0 0 0 0 0 0 0 0 0 1317
VecMDot 2425 1.0 1.7196e+01 2.2 5.82e+09 1.1 0.0e+00 0.0e+00
2.4e+03 0 4 0 0 41 0 4 0 0 41 10353
VecNorm 2503 1.0 2.7923e+00 3.4 1.18e+08 1.1 0.0e+00 0.0e+00
2.5e+03 0 0 0 0 42 0 0 0 0 42 1293
VecScale 2467 1.0 7.3112e-02 1.7 5.84e+07 1.1 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 24453
VecCopy 153 1.0 1.1636e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 5031 1.0 6.0423e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 137 1.0 1.1462e-02 1.5 6.33e+06 1.1 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 16902
VecWAXPY 19 1.0 1.7784e-03 1.4 2.83e+05 1.1 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 4869
VecMAXPY 2467 1.0 8.5820e+00 1.3 5.93e+09 1.1 0.0e+00 0.0e+00
0.0e+00 0 4 0 0 0 0 4 0 0 0 21153
VecAssemblyBegin 69 1.0 1.0341e+0018.2 0.00e+00 0.0 4.9e+03 5.4e+02
2.1e+02 0 0 0 0 3 0 0 0 0 3 0
VecAssemblyEnd 69 1.0 2.4939e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 7491 1.0 1.3734e+00 1.7 0.00e+00 0.0 1.1e+06 1.9e+04
0.0e+00 0 0 96 76 0 0 0 96 76 0 0
VecScatterEnd 7491 1.0 2.0055e+02 8.7 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 3 0 0 0 0 3 0 0 0 0 0
VecReduceArith 8 1.0 1.4977e-03 2.0 3.05e+05 1.1 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 6232
VecReduceComm 4 1.0 8.9908e-0412.2 0.00e+00 0.0 0.0e+00 0.0e+00
4.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 2467 1.0 2.8067e+00 3.4 1.75e+08 1.1 0.0e+00 0.0e+00
2.4e+03 0 0 0 0 41 0 0 0 0 41 1905
SNESSolve 4 1.0 4.0619e+03 1.0 2.23e+11 2.3 9.4e+05 2.3e+04
4.1e+03100 98 80 77 68 100 98 80 77 68 1136
SNESLineSearch 17 1.0 1.1423e+01 1.0 5.23e+07 1.1 1.8e+04 1.7e+04
3.3e+02 0 0 2 1 6 0 0 2 1 6 140
SNESFunctionEval 23 1.0 2.9742e+01 1.0 2.60e+07 1.1 1.9e+04 1.9e+04
3.5e+02 1 0 2 1 6 1 0 2 1 6 27
SNESJacobianEval 17 1.0 3.6786e+03 1.0 0.00e+00 0.0 9.8e+03 6.4e+04
1.4e+02 90 0 1 2 2 90 0 1 2 2 0
KSPGMRESOrthog 2425 1.0 2.5150e+01 1.6 1.16e+10 1.1 0.0e+00 0.0e+00
2.4e+03 0 8 0 0 41 0 8 0 0 41 14157
KSPSetup 36 1.0 2.5388e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 18 1.0 3.6141e+02 1.0 2.25e+11 2.3 1.1e+06 2.4e+04
5.0e+03 9100 97 96 84 9100 97 96 84 13015
PCSetUp 36 1.0 2.1635e+02 3.6 1.55e+11 3.2 1.8e+04 3.2e+05
1.5e+02 3 56 2 20 3 3 56 2 20 3 12274
PCSetUpOnBlocks 18 1.0 2.1293e+02 3.7 1.55e+11 3.2 0.0e+00 0.0e+00
2.7e+01 2 56 0 0 0 2 56 0 0 0 12471
PCApply 2467 1.0 2.5616e+02 2.5 5.95e+10 1.7 7.3e+05 2.8e+04
0.0e+00 4 33 62 73 0 4 33 62 73 0 6060
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type           Creations   Destructions   Memory                 Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix                       39             39   18446744074642894848   0
Matrix Partitioning           1              1                    640   0
Index Set                   184            184                2589512   0
IS L to G Mapping             2              2                 301720   0
Vector                      729            729              133662888   0
Vector Scatter               29             29                  30508   0
Application Order             2              2                9335968   0
SNES                          4              4                   5088   0
Krylov Solver                10             10               32264320   0
Preconditioner               10             10                   9088   0
Viewer                        1              0                      0   0
========================================================================================================================
Average time to get PetscTime(): 1.19209e-07
Average time for MPI_Barrier(): 1.20163e-05
Average time for zero size MPI_Send(): 2.49594e-06
......................................
-----------------------------------------