[petsc-users] Poor weak scaling when solving successive linear systems
Michael Becker
Michael.Becker at physik.uni-giessen.de
Tue May 29 06:18:42 CDT 2018
Hello again,
here are the updated log_view files for 125 and 1000 processors. I ran
both problems twice, the first time with all processors per node
allocated ("-1.txt"), the second time with only half the processors per
node on twice the number of nodes ("-2.txt").
>> On May 24, 2018, at 12:24 AM, Michael Becker <Michael.Becker at physik.uni-giessen.de> wrote:
>>
>> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES).
> Hmm, it is certainly not intended that vectors be created and destroyed within each KSPSolve(). Could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors in KSPSetUp() and destroy them in KSPReset(), not during the solve. Not that this would be a measurable difference.
I mean this, right in the log_view output:
> Memory usage is given in bytes:
>
> Object Type Creations Destructions Memory Descendants' Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
> ...
>
> --- Event Stage 1: First Solve
>
> ...
>
> --- Event Stage 2: Remaining Solves
>
> Vector 23904 23904 1295501184 0.
I logged the exact number of KSP iterations over the 999 timesteps, and
it's exactly 23904/6 = 3984.
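(For completeness, the iteration count was tallied with KSPGetIterationNumber() after each solve; a minimal sketch of the bookkeeping, with hypothetical variable names:

  PetscErrorCode ierr;
  PetscInt       its, total = 0;
  for (PetscInt step = 0; step < 999; ++step) {
    ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);
    ierr = KSPGetIterationNumber(ksp, &its); CHKERRQ(ierr);
    total += its;  /* sums to 3984 over the 999 steps here */
  }

so the 23904 Vector creations are exactly 6 per CG iteration.)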
Michael
On 24.05.2018 at 19:50, Smith, Barry F. wrote:
> Please send the log file for 1000 with cg as the solver.
>
> You should make a bar chart of each event for the two cases to see which ones are taking more time and which are taking less (we cannot tell from the two logs you sent us since they are for different solvers).
>
>
>
>> On May 24, 2018, at 12:24 AM, Michael Becker <Michael.Becker at physik.uni-giessen.de> wrote:
>>
>> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES).
> Hmm, it is certainly not intended that vectors be created and destroyed within each KSPSolve(). Could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors in KSPSetUp() and destroy them in KSPReset(), not during the solve. Not that this would be a measurable difference.
>
>
>
>> This seems kind of wasteful; is it supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here).
>>
>>
>> Thanks in advance.
>>
>> Michael
>>
>>
>>
>> <log_view_125procs.txt><log_view_1000procs.txt>
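P.S.: To rule out that my own code triggers the setup path every step: the KSP is created and set up once and then reused for all timesteps, roughly like this sketch (error checking omitted, variable names hypothetical):

  KSP ksp;
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);
  KSPSetFromOptions(ksp);
  KSPSetUp(ksp);                 /* work vectors are created here ...        */
  for (PetscInt step = 0; step < nsteps; ++step) {
    /* assemble the new right-hand side b for this timestep ... */
    KSPSolve(ksp, b, x);         /* ... and should be reused by every solve  */
  }
  KSPDestroy(&ksp);              /* ... and destroyed here (or in KSPReset()) */

So, as far as I can tell, the Vector creations in the "Remaining Solves" stage happen inside the solver, not in my code.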
-------------- next part --------------
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node1-022 with 125 processors, by beckerm Fri May 25 09:33:10 2018
Using Petsc Development GIT revision: v3.9.2-503-g9e88a8b GIT Date: 2018-05-24 08:01:24 -0500
Max Max/Min Avg Total
Time (sec): 2.916e+02 1.00000 2.916e+02
Objects: 2.438e+04 1.00004 2.438e+04
Flop: 2.125e+10 1.27708 1.963e+10 2.454e+12
Flop/sec: 7.287e+07 1.27708 6.733e+07 8.416e+09
MPI Messages: 1.042e+06 3.36140 7.129e+05 8.911e+07
MPI Message Lengths: 1.344e+09 2.32209 1.439e+03 1.282e+11
MPI Reductions: 2.250e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 4.3792e+01 15.0% 0.0000e+00 0.0% 3.000e+03 0.0% 3.178e+03 0.0% 1.700e+01 0.1%
1: First Solve: 2.5655e+00 0.9% 3.6885e+09 0.2% 3.549e+05 0.4% 3.736e+03 1.0% 5.500e+02 2.4%
2: Remaining Solves: 2.4525e+02 84.1% 2.4504e+12 99.8% 8.875e+07 99.6% 1.430e+03 99.0% 2.192e+04 97.4%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecSet 3 1.0 4.0317e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
--- Event Stage 1: First Solve
BuildTwoSided 12 1.0 9.7456e-03 1.5 0.00e+00 0.0 8.8e+03 4.0e+00 0.0e+00 0 0 0 0 0 0 0 2 0 0 0
BuildTwoSidedF 30 1.0 2.9124e-01 3.7 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 6 0 2 5 0 0
KSPSetUp 9 1.0 3.9537e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0
KSPSolve 1 1.0 2.5694e+00 1.0 3.26e+07 1.4 3.5e+05 3.7e+03 5.5e+02 1 0 0 1 2 100100100100100 1436
VecTDot 8 1.0 1.0836e-02 6.5 4.32e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 1 0 0 1 4983
VecNorm 6 1.0 2.1179e-03 3.5 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 19123
VecScale 24 1.0 2.5225e-04 4.4 5.43e+04 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 20408
VecCopy 1 1.0 1.3018e-04 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 115 1.0 7.8964e-04 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 8 1.0 1.0571e-03 1.8 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 51081
VecAYPX 28 1.0 1.4100e-03 2.2 3.58e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 31104
VecAssemblyBegin 2 1.0 2.1458e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 2 1.0 1.9073e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 103 1.0 6.7844e-03 3.4 0.00e+00 0.0 8.9e+04 1.4e+03 0.0e+00 0 0 0 0 0 0 0 25 9 0 0
VecScatterEnd 103 1.0 5.8765e-02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatMult 29 1.0 4.4128e-02 1.7 6.14e+06 1.2 3.0e+04 2.1e+03 0.0e+00 0 0 0 0 0 1 19 8 5 0 16244
MatMultAdd 24 1.0 1.6727e-02 2.7 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 0 4 5 1 0 9033
MatMultTranspose 24 1.0 1.5692e-02 2.4 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 0 4 5 1 0 9628
MatSolve 4 0.0 2.2888e-05 0.0 2.64e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12
MatSOR 48 1.0 7.2616e-02 1.7 1.09e+07 1.3 2.7e+04 1.5e+03 8.0e+00 0 0 0 0 0 3 34 8 3 1 17266
MatLUFactorSym 1 1.0 6.6996e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 1.5020e-05 5.2 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 9
MatResidual 24 1.0 3.6082e-02 2.1 4.55e+06 1.3 2.7e+04 1.5e+03 0.0e+00 0 0 0 0 0 1 14 8 3 0 14385
MatAssemblyBegin 94 1.0 2.9352e-01 3.4 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 6 0 2 5 0 0
MatAssemblyEnd 94 1.0 8.8632e-02 1.1 0.00e+00 0.0 6.3e+04 2.1e+02 2.3e+02 0 0 0 0 1 3 0 18 1 42 0
MatGetRow 3102093 1.3 4.2884e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 15 0 0 0 0 0
MatGetRowIJ 1 0.0 8.8215e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCreateSubMats 6 1.0 4.7427e-01 2.5 0.00e+00 0.0 5.5e+04 1.7e+04 1.2e+01 0 0 0 1 0 13 0 15 71 2 0
MatCreateSubMat 4 1.0 8.0028e-03 1.0 0.00e+00 0.0 2.9e+03 2.7e+02 6.4e+01 0 0 0 0 0 0 0 1 0 12 0
MatGetOrdering 1 0.0 1.3018e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatIncreaseOvrlp 6 1.0 5.7495e-02 1.2 0.00e+00 0.0 2.7e+04 1.0e+03 1.2e+01 0 0 0 0 0 2 0 8 2 2 0
MatCoarsen 6 1.0 2.0511e-02 1.1 0.00e+00 0.0 5.3e+04 5.8e+02 3.3e+01 0 0 0 0 0 1 0 15 2 6 0
MatZeroEntries 6 1.0 3.5179e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatPtAP 6 1.0 2.6506e-01 1.0 1.13e+07 1.6 6.3e+04 2.6e+03 9.2e+01 0 0 0 0 0 10 33 18 13 17 4615
MatPtAPSymbolic 6 1.0 1.5077e-01 1.0 0.00e+00 0.0 3.4e+04 2.7e+03 4.2e+01 0 0 0 0 0 6 0 10 7 8 0
MatPtAPNumeric 6 1.0 1.1295e-01 1.0 1.13e+07 1.6 2.9e+04 2.6e+03 4.8e+01 0 0 0 0 0 4 33 8 6 9 10831
MatGetLocalMat 6 1.0 4.4863e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 6 1.0 1.0457e-02 1.7 0.00e+00 0.0 2.0e+04 3.5e+03 0.0e+00 0 0 0 0 0 0 0 6 5 0 0
SFSetGraph 12 1.0 2.1935e-05 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFSetUp 12 1.0 1.6796e-02 1.1 0.00e+00 0.0 2.6e+04 6.2e+02 0.0e+00 0 0 0 0 0 1 0 7 1 0 0
SFBcastBegin 45 1.0 2.0542e-03 2.5 0.00e+00 0.0 5.4e+04 6.9e+02 0.0e+00 0 0 0 0 0 0 0 15 3 0 0
SFBcastEnd 45 1.0 8.7860e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
GAMG: createProl 6 1.0 2.0872e+00 1.0 0.00e+00 0.0 2.0e+05 5.2e+03 2.8e+02 1 0 0 1 1 81 0 56 78 52 0
GAMG: partLevel 6 1.0 2.7715e-01 1.0 1.13e+07 1.6 6.6e+04 2.5e+03 1.9e+02 0 0 0 0 1 11 33 19 13 35 4414
repartition 2 1.0 8.5306e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0
Invert-Sort 2 1.0 1.1656e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0
Move A 2 1.0 4.7252e-03 1.1 0.00e+00 0.0 1.4e+03 5.3e+02 3.4e+01 0 0 0 0 0 0 0 0 0 6 0
Move P 2 1.0 4.5433e-03 1.1 0.00e+00 0.0 1.4e+03 1.3e+01 3.4e+01 0 0 0 0 0 0 0 0 0 6 0
PCSetUp 2 1.0 2.3749e+00 1.0 1.13e+07 1.6 2.7e+05 4.5e+03 5.1e+02 1 0 0 1 2 93 33 75 90 93 515
PCSetUpOnBlocks 4 1.0 2.7108e-04 1.4 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 4 1.0 1.1422e-01 1.1 1.82e+07 1.3 8.6e+04 1.2e+03 8.0e+00 0 0 0 0 0 4 56 24 8 1 18166
--- Event Stage 2: Remaining Solves
KSPSolve 999 1.0 1.2777e+02 1.1 2.12e+10 1.3 8.8e+07 1.4e+03 2.2e+04 42100 99 97 97 50100 99 98100 19178
VecTDot 7968 1.0 1.1053e+01 6.1 4.30e+08 1.0 0.0e+00 0.0e+00 8.0e+03 1 2 0 0 35 1 2 0 0 36 4866
VecNorm 5982 1.0 4.1078e+00 6.9 3.23e+08 1.0 0.0e+00 0.0e+00 6.0e+03 1 2 0 0 27 1 2 0 0 27 9830
VecScale 23904 1.0 1.1072e-01 2.3 5.40e+07 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 46310
VecCopy 999 1.0 1.2563e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 83664 1.0 6.9843e-01 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 7968 1.0 1.0304e+00 1.8 4.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 52196
VecAYPX 27888 1.0 1.3915e+00 2.3 3.56e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 31384
VecScatterBegin 100599 1.0 6.4764e+00 3.5 0.00e+00 0.0 8.8e+07 1.4e+03 0.0e+00 1 0 99 97 0 2 0 99 98 0 0
VecScatterEnd 100599 1.0 5.6109e+01 4.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 9 0 0 0 0 0
MatMult 28887 1.0 4.4493e+01 1.8 6.12e+09 1.2 3.0e+07 2.1e+03 0.0e+00 10 29 33 49 0 12 29 33 49 0 16049
MatMultAdd 23904 1.0 1.4431e+01 2.5 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 3 6 18 8 0 4 6 18 8 0 10428
MatMultTranspose 23904 1.0 1.5629e+01 2.4 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 3 6 18 8 0 4 6 18 8 0 9629
MatSolve 3984 0.0 1.9469e-02 0.0 2.63e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14
MatSOR 47808 1.0 6.8757e+01 1.7 1.08e+10 1.3 2.7e+07 1.5e+03 8.0e+03 22 51 30 32 35 26 51 30 32 36 18089
MatResidual 23904 1.0 3.1760e+01 1.9 4.54e+09 1.3 2.7e+07 1.5e+03 0.0e+00 7 21 30 32 0 8 21 30 32 0 16276
PCSetUpOnBlocks 3984 1.0 5.4686e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 3984 1.0 1.0766e+02 1.1 1.81e+10 1.3 8.5e+07 1.2e+03 8.0e+03 36 84 96 80 35 43 84 96 81 36 19149
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Krylov Solver 1 9 11424 0.
DMKSP interface 1 0 0 0.
Vector 5 52 2371496 0.
Matrix 0 72 14138216 0.
Distributed Mesh 1 0 0 0.
Index Set 2 12 133768 0.
IS L to G Mapping 1 0 0 0.
Star Forest Graph 2 0 0 0.
Discrete System 1 0 0 0.
Vec Scatter 1 13 16432 0.
Preconditioner 1 9 9676 0.
Viewer 1 0 0 0.
--- Event Stage 1: First Solve
Krylov Solver 8 0 0 0.
Vector 140 92 2204792 0.
Matrix 140 68 21738552 0.
Matrix Coarsen 6 6 3816 0.
Index Set 110 100 543240 0.
Star Forest Graph 12 12 10368 0.
Vec Scatter 31 18 22752 0.
Preconditioner 8 0 0 0.
--- Event Stage 2: Remaining Solves
Vector 23904 23904 1295501184 0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 1.71661e-05
Average time for zero size MPI_Send(): 1.54705e-05
#PETSc Option Table entries:
-gamg_est_ksp_type cg
-ksp_norm_type unpreconditioned
-ksp_type cg
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_norm_type none
-mg_levels_ksp_type richardson
-mg_levels_pc_sor_its 1
-mg_levels_pc_type sor
-pc_gamg_type classical
-pc_type gamg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4
-----------------------------------------
Libraries compiled on 2018-05-25 07:05:14 on node1-001
Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon
Using PETSc directory: /home/ritsat/beckerm/petsc
Using PETSc arch: arch-linux-amd-opt
-----------------------------------------
Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3
-----------------------------------------
Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include
-----------------------------------------
Using C linker: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc
Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl
-----------------------------------------
-------------- next part --------------
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node1-028 with 125 processors, by beckerm Fri May 25 10:11:49 2018
Using Petsc Development GIT revision: v3.9.2-503-g9e88a8b GIT Date: 2018-05-24 08:01:24 -0500
Max Max/Min Avg Total
Time (sec): 2.488e+02 1.00000 2.488e+02
Objects: 2.438e+04 1.00004 2.438e+04
Flop: 2.125e+10 1.27708 1.963e+10 2.454e+12
Flop/sec: 8.539e+07 1.27708 7.890e+07 9.862e+09
MPI Messages: 1.042e+06 3.36140 7.129e+05 8.911e+07
MPI Message Lengths: 1.344e+09 2.32209 1.439e+03 1.282e+11
MPI Reductions: 2.250e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 6.9069e+00 2.8% 0.0000e+00 0.0% 3.000e+03 0.0% 3.178e+03 0.0% 1.700e+01 0.1%
1: First Solve: 2.5499e+00 1.0% 3.6885e+09 0.2% 3.549e+05 0.4% 3.736e+03 1.0% 5.500e+02 2.4%
2: Remaining Solves: 2.3939e+02 96.2% 2.4504e+12 99.8% 8.875e+07 99.6% 1.430e+03 99.0% 2.192e+04 97.4%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecSet 3 1.0 5.2118e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
--- Event Stage 1: First Solve
BuildTwoSided 12 1.0 6.8238e-03 1.8 0.00e+00 0.0 8.8e+03 4.0e+00 0.0e+00 0 0 0 0 0 0 0 2 0 0 0
BuildTwoSidedF 30 1.0 3.0505e-01 4.1 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 7 0 2 5 0 0
KSPSetUp 9 1.0 3.2511e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0
KSPSolve 1 1.0 2.5530e+00 1.0 3.26e+07 1.4 3.5e+05 3.7e+03 5.5e+02 1 0 0 1 2 100100100100100 1445
VecTDot 8 1.0 6.3581e-03 3.8 4.32e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 1 0 0 1 8493
VecNorm 6 1.0 1.4081e-03 2.7 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 28762
VecScale 24 1.0 1.2040e-04 2.1 5.43e+04 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 42756
VecCopy 1 1.0 1.5712e-04 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 115 1.0 8.0633e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 8 1.0 1.1771e-03 1.4 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 45877
VecAYPX 28 1.0 1.3962e-03 1.7 3.58e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 31412
VecAssemblyBegin 2 1.0 2.3842e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 2 1.0 2.3842e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 103 1.0 6.2523e-03 3.1 0.00e+00 0.0 8.9e+04 1.4e+03 0.0e+00 0 0 0 0 0 0 0 25 9 0 0
VecScatterEnd 103 1.0 3.7810e-02 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatMult 29 1.0 3.4570e-02 1.5 6.14e+06 1.2 3.0e+04 2.1e+03 0.0e+00 0 0 0 0 0 1 19 8 5 0 20735
MatMultAdd 24 1.0 1.3932e-02 2.5 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 0 4 5 1 0 10845
MatMultTranspose 24 1.0 1.3560e-02 2.5 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 0 4 5 1 0 11143
MatSolve 4 0.0 2.1935e-05 0.0 2.64e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12
MatSOR 48 1.0 7.0858e-02 1.3 1.09e+07 1.3 2.7e+04 1.5e+03 8.0e+00 0 0 0 0 0 3 34 8 3 1 17694
MatLUFactorSym 1 1.0 4.9114e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 5.6028e-0519.6 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2
MatResidual 24 1.0 2.6907e-02 1.7 4.55e+06 1.3 2.7e+04 1.5e+03 0.0e+00 0 0 0 0 0 1 14 8 3 0 19289
MatAssemblyBegin 94 1.0 3.0747e-01 3.6 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 7 0 2 5 0 0
MatAssemblyEnd 94 1.0 8.1496e-02 1.1 0.00e+00 0.0 6.3e+04 2.1e+02 2.3e+02 0 0 0 0 1 3 0 18 1 42 0
MatGetRow 3102093 1.3 4.3272e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 15 0 0 0 0 0
MatGetRowIJ 1 0.0 7.1526e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCreateSubMats 6 1.0 4.7091e-01 2.6 0.00e+00 0.0 5.5e+04 1.7e+04 1.2e+01 0 0 0 1 0 12 0 15 71 2 0
MatCreateSubMat 4 1.0 6.9880e-03 1.0 0.00e+00 0.0 2.9e+03 2.7e+02 6.4e+01 0 0 0 0 0 0 0 1 0 12 0
MatGetOrdering 1 0.0 1.2994e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatIncreaseOvrlp 6 1.0 5.7326e-02 1.2 0.00e+00 0.0 2.7e+04 1.0e+03 1.2e+01 0 0 0 0 0 2 0 8 2 2 0
MatCoarsen 6 1.0 1.6099e-02 1.0 0.00e+00 0.0 5.3e+04 5.8e+02 3.3e+01 0 0 0 0 0 1 0 15 2 6 0
MatZeroEntries 6 1.0 3.4292e-03 4.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatPtAP 6 1.0 2.6140e-01 1.0 1.13e+07 1.6 6.3e+04 2.6e+03 9.2e+01 0 0 0 0 0 10 33 18 13 17 4680
MatPtAPSymbolic 6 1.0 1.4820e-01 1.0 0.00e+00 0.0 3.4e+04 2.7e+03 4.2e+01 0 0 0 0 0 6 0 10 7 8 0
MatPtAPNumeric 6 1.0 1.0990e-01 1.0 1.13e+07 1.6 2.9e+04 2.6e+03 4.8e+01 0 0 0 0 0 4 33 8 6 9 11131
MatGetLocalMat 6 1.0 4.5252e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 6 1.0 8.3039e-03 1.6 0.00e+00 0.0 2.0e+04 3.5e+03 0.0e+00 0 0 0 0 0 0 0 6 5 0 0
SFSetGraph 12 1.0 1.4544e-05 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFSetUp 12 1.0 1.2054e-02 1.1 0.00e+00 0.0 2.6e+04 6.2e+02 0.0e+00 0 0 0 0 0 0 0 7 1 0 0
SFBcastBegin 45 1.0 2.0356e-03 2.3 0.00e+00 0.0 5.4e+04 6.9e+02 0.0e+00 0 0 0 0 0 0 0 15 3 0 0
SFBcastEnd 45 1.0 5.1184e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
GAMG: createProl 6 1.0 2.0881e+00 1.0 0.00e+00 0.0 2.0e+05 5.2e+03 2.8e+02 1 0 0 1 1 82 0 56 78 52 0
GAMG: partLevel 6 1.0 2.7127e-01 1.0 1.13e+07 1.6 6.6e+04 2.5e+03 1.9e+02 0 0 0 0 1 11 33 19 13 35 4510
repartition 2 1.0 6.8378e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0
Invert-Sort 2 1.0 6.4492e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0
Move A 2 1.0 4.3352e-03 1.1 0.00e+00 0.0 1.4e+03 5.3e+02 3.4e+01 0 0 0 0 0 0 0 0 0 6 0
Move P 2 1.0 3.6781e-03 1.1 0.00e+00 0.0 1.4e+03 1.3e+01 3.4e+01 0 0 0 0 0 0 0 0 0 6 0
PCSetUp 2 1.0 2.3659e+00 1.0 1.13e+07 1.6 2.7e+05 4.5e+03 5.1e+02 1 0 0 1 2 93 33 75 90 93 517
PCSetUpOnBlocks 4 1.0 3.0303e-04 1.7 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 4 1.0 1.0916e-01 1.0 1.82e+07 1.3 8.6e+04 1.2e+03 8.0e+00 0 0 0 0 0 4 56 24 8 1 19009
--- Event Stage 2: Remaining Solves
KSPSolve 999 1.0 1.2028e+02 1.0 2.12e+10 1.3 8.8e+07 1.4e+03 2.2e+04 47100 99 97 97 49100 99 98100 20373
VecTDot 7968 1.0 6.3756e+00 3.7 4.30e+08 1.0 0.0e+00 0.0e+00 8.0e+03 1 2 0 0 35 1 2 0 0 36 8436
VecNorm 5982 1.0 4.5791e+00 7.1 3.23e+08 1.0 0.0e+00 0.0e+00 6.0e+03 1 2 0 0 27 1 2 0 0 27 8818
VecScale 23904 1.0 1.0700e-01 2.1 5.40e+07 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 47920
VecCopy 999 1.0 1.1231e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 83664 1.0 7.0631e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 7968 1.0 1.1656e+00 1.4 4.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 46141
VecAYPX 27888 1.0 1.3165e+00 1.6 3.56e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 33173
VecScatterBegin 100599 1.0 6.1421e+00 3.2 0.00e+00 0.0 8.8e+07 1.4e+03 0.0e+00 2 0 99 97 0 2 0 99 98 0 0
VecScatterEnd 100599 1.0 3.6060e+01 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 8 0 0 0 0 0
MatMult 28887 1.0 3.5612e+01 1.6 6.12e+09 1.2 3.0e+07 2.1e+03 0.0e+00 11 29 33 49 0 11 29 33 49 0 20052
MatMultAdd 23904 1.0 1.1237e+01 2.0 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 3 6 18 8 0 4 6 18 8 0 13392
MatMultTranspose 23904 1.0 1.3723e+01 2.5 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 4 6 18 8 0 4 6 18 8 0 10966
MatSolve 3984 0.0 1.9485e-02 0.0 2.63e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 13
MatSOR 47808 1.0 6.6101e+01 1.3 1.08e+10 1.3 2.7e+07 1.5e+03 8.0e+03 25 51 30 32 35 26 51 30 32 36 18816
MatResidual 23904 1.0 2.6469e+01 1.7 4.54e+09 1.3 2.7e+07 1.5e+03 0.0e+00 8 21 30 32 0 8 21 30 32 0 19530
PCSetUpOnBlocks 3984 1.0 5.2657e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 3984 1.0 1.0306e+02 1.0 1.81e+10 1.3 8.5e+07 1.2e+03 8.0e+03 41 84 96 80 35 43 84 96 81 36 20004
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Krylov Solver 1 9 11424 0.
DMKSP interface 1 0 0 0.
Vector 5 52 2371496 0.
Matrix 0 72 14138216 0.
Distributed Mesh 1 0 0 0.
Index Set 2 12 133768 0.
IS L to G Mapping 1 0 0 0.
Star Forest Graph 2 0 0 0.
Discrete System 1 0 0 0.
Vec Scatter 1 13 16432 0.
Preconditioner 1 9 9676 0.
Viewer 1 0 0 0.
--- Event Stage 1: First Solve
Krylov Solver 8 0 0 0.
Vector 140 92 2204792 0.
Matrix 140 68 21738552 0.
Matrix Coarsen 6 6 3816 0.
Index Set 110 100 543240 0.
Star Forest Graph 12 12 10368 0.
Vec Scatter 31 18 22752 0.
Preconditioner 8 0 0 0.
--- Event Stage 2: Remaining Solves
Vector 23904 23904 1295501184 0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 2.13623e-05
Average time for zero size MPI_Send(): 1.46084e-05
#PETSc Option Table entries:
-gamg_est_ksp_type cg
-ksp_norm_type unpreconditioned
-ksp_type cg
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_norm_type none
-mg_levels_ksp_type richardson
-mg_levels_pc_sor_its 1
-mg_levels_pc_type sor
-pc_gamg_type classical
-pc_type gamg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4
-----------------------------------------
Libraries compiled on 2018-05-25 07:05:14 on node1-001
Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon
Using PETSc directory: /home/ritsat/beckerm/petsc
Using PETSc arch: arch-linux-amd-opt
-----------------------------------------
Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3
-----------------------------------------
Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include
-----------------------------------------
Using C linker: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc
Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl
-----------------------------------------
-------------- next part --------------
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node1-010 with 1000 processors, by beckerm Tue May 29 11:18:21 2018
Using Petsc Development GIT revision: v3.9.2-503-g9e88a8b GIT Date: 2018-05-24 08:01:24 -0500
Max Max/Min Avg Total
Time (sec): 3.316e+02 1.00000 3.316e+02
Objects: 2.440e+04 1.00004 2.440e+04
Flop: 2.124e+10 1.27708 2.041e+10 2.041e+13
Flop/sec: 6.405e+07 1.27708 6.156e+07 6.156e+10
MPI Messages: 1.238e+06 3.99536 8.489e+05 8.489e+08
MPI Message Lengths: 1.343e+09 2.32238 1.393e+03 1.183e+12
MPI Reductions: 2.256e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.5695e+01 7.7% 0.0000e+00 0.0% 2.700e+04 0.0% 3.178e+03 0.0% 1.700e+01 0.1%
1: First Solve: 3.1540e+00 1.0% 3.0885e+10 0.2% 3.675e+06 0.4% 3.508e+03 1.1% 6.220e+02 2.8%
2: Remaining Solves: 3.0274e+02 91.3% 2.0380e+13 99.8% 8.452e+08 99.6% 1.384e+03 98.9% 2.191e+04 97.1%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecSet 3 1.0 5.2404e-04 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
--- Event Stage 1: First Solve
BuildTwoSided 12 1.0 2.2128e-02 1.4 0.00e+00 0.0 8.9e+04 4.0e+00 0.0e+00 0 0 0 0 0 1 0 2 0 0 0
BuildTwoSidedF 30 1.0 3.9611e-01 2.4 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 7 0 2 5 0 0
KSPSetUp 9 1.0 6.6152e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0
KSPSolve 1 1.0 3.1572e+00 1.0 3.25e+07 1.4 3.7e+06 3.5e+03 6.2e+02 1 0 0 1 3 100100100100100 9782
VecTDot 8 1.0 1.7718e-02 2.9 4.32e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 1 0 0 1 24382
VecNorm 6 1.0 1.7011e-03 2.3 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 190463
VecScale 24 1.0 1.6880e-04 2.5 5.43e+04 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 282104
VecCopy 1 1.0 1.3494e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 124 1.0 8.7070e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 8 1.0 1.2469e-03 1.6 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 346451
VecAYPX 28 1.0 1.6160e-03 2.2 3.58e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 219183
VecAssemblyBegin 3 1.0 3.0994e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 3 1.0 4.0531e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 108 1.0 7.2830e-03 2.8 0.00e+00 0.0 8.4e+05 1.4e+03 0.0e+00 0 0 0 0 0 0 0 23 9 0 0
VecScatterEnd 108 1.0 5.8424e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatMult 29 1.0 3.7680e-02 1.4 6.14e+06 1.2 2.8e+05 2.0e+03 0.0e+00 0 0 0 0 0 1 19 8 4 0 157481
MatMultAdd 24 1.0 3.1618e-02 4.1 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 1 4 4 1 0 40763
MatMultTranspose 24 1.0 1.7325e-02 3.0 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 0 4 4 1 0 74394
MatSolve 4 0.0 4.8161e-05 0.0 1.10e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 228
MatSOR 48 1.0 8.3678e-02 1.4 1.09e+07 1.3 2.6e+05 1.5e+03 8.0e+00 0 0 0 0 0 2 34 7 3 1 124983
MatLUFactorSym 1 1.0 1.0300e-04 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 7.1049e-0537.2 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 463
MatResidual 24 1.0 2.9594e-02 1.6 4.55e+06 1.3 2.6e+05 1.5e+03 0.0e+00 0 0 0 0 0 1 14 7 3 0 146971
MatAssemblyBegin 102 1.0 3.9857e-01 2.3 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 8 0 2 5 0 0
MatAssemblyEnd 102 1.0 1.3652e-01 1.0 0.00e+00 0.0 6.2e+05 2.0e+02 2.5e+02 0 0 0 0 1 4 0 17 1 40 0
MatGetRow 3102093 1.3 4.5841e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 0
MatGetRowIJ 1 0.0 1.6928e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCreateSubMats 6 1.0 5.0106e-01 2.2 0.00e+00 0.0 5.7e+05 1.6e+04 1.2e+01 0 0 0 1 0 11 0 15 72 2 0
MatCreateSubMat 6 1.0 2.8865e-02 1.0 0.00e+00 0.0 2.2e+04 3.3e+02 9.4e+01 0 0 0 0 0 1 0 1 0 15 0
MatGetOrdering 1 0.0 1.3614e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatIncreaseOvrlp 6 1.0 1.1707e-01 1.1 0.00e+00 0.0 2.6e+05 9.9e+02 1.2e+01 0 0 0 0 0 3 0 7 2 2 0
MatCoarsen 6 1.0 5.5459e-02 1.0 0.00e+00 0.0 7.1e+05 4.4e+02 5.6e+01 0 0 0 0 0 2 0 19 2 9 0
MatZeroEntries 6 1.0 3.5138e-03 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatPtAP 6 1.0 3.8423e-01 1.0 1.11e+07 1.6 6.3e+05 2.5e+03 9.2e+01 0 0 0 0 0 12 34 17 12 15 26996
MatPtAPSymbolic 6 1.0 2.1874e-01 1.0 0.00e+00 0.0 3.2e+05 2.7e+03 4.2e+01 0 0 0 0 0 7 0 9 7 7 0
MatPtAPNumeric 6 1.0 1.5509e-01 1.0 1.11e+07 1.6 3.0e+05 2.3e+03 4.8e+01 0 0 0 0 0 5 34 8 6 8 66883
MatGetLocalMat 6 1.0 4.7982e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 6 1.0 1.4448e-02 2.0 0.00e+00 0.0 1.9e+05 3.4e+03 0.0e+00 0 0 0 0 0 0 0 5 5 0 0
SFSetGraph 12 1.0 1.8835e-05 9.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFSetUp 12 1.0 3.1634e-02 1.2 0.00e+00 0.0 2.7e+05 5.8e+02 0.0e+00 0 0 0 0 0 1 0 7 1 0 0
SFBcastBegin 68 1.0 2.7318e-03 2.8 0.00e+00 0.0 7.2e+05 5.1e+02 0.0e+00 0 0 0 0 0 0 0 20 3 0 0
SFBcastEnd 68 1.0 3.0540e-02 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
GAMG: createProl 6 1.0 2.4582e+00 1.0 0.00e+00 0.0 2.2e+06 4.7e+03 3.1e+02 1 0 0 1 1 78 0 59 79 50 0
GAMG: partLevel 6 1.0 4.2463e-01 1.0 1.11e+07 1.6 6.5e+05 2.4e+03 2.4e+02 0 0 0 0 1 13 34 18 12 39 24427
repartition 3 1.0 3.3462e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0
Invert-Sort 3 1.0 3.3751e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0
Move A 3 1.0 1.7274e-02 1.1 0.00e+00 0.0 9.5e+03 7.4e+02 5.0e+01 0 0 0 0 0 1 0 0 0 8 0
Move P 3 1.0 1.4379e-02 1.1 0.00e+00 0.0 1.3e+04 1.3e+01 5.0e+01 0 0 0 0 0 0 0 0 0 8 0
PCSetUp 2 1.0 2.9061e+00 1.0 1.11e+07 1.6 2.8e+06 4.2e+03 5.8e+02 1 0 0 1 3 92 34 77 91 94 3569
PCSetUpOnBlocks 4 1.0 4.0293e-04 3.0 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 82
PCApply 4 1.0 1.3660e-01 1.0 1.82e+07 1.3 8.2e+05 1.2e+03 8.0e+00 0 0 0 0 0 4 56 22 7 1 127272
--- Event Stage 2: Remaining Solves
KSPSolve 999 1.0 1.7180e+02 1.1 2.12e+10 1.3 8.4e+08 1.4e+03 2.2e+04 51100 99 97 97 56100 99 98100 118630
VecTDot 7964 1.0 1.1651e+01 2.4 4.30e+08 1.0 0.0e+00 0.0e+00 8.0e+03 2 2 0 0 35 3 2 0 0 36 36911
VecNorm 5980 1.0 1.2358e+01 3.1 3.23e+08 1.0 0.0e+00 0.0e+00 6.0e+03 2 2 0 0 27 3 2 0 0 27 26129
VecScale 23892 1.0 1.3560e-01 2.5 5.40e+07 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 349607
VecCopy 999 1.0 1.4402e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 83622 1.0 8.0502e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 7964 1.0 1.2322e+00 1.6 4.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 349007
VecAYPX 27874 1.0 1.6865e+00 2.1 3.56e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 209016
VecScatterBegin 100549 1.0 7.0089e+00 2.9 0.00e+00 0.0 8.4e+08 1.4e+03 0.0e+00 2 0 99 97 0 2 0 99 98 0 0
VecScatterEnd 100549 1.0 6.5406e+01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 13 0 0 0 0 14 0 0 0 0 0
MatMult 28873 1.0 3.9569e+01 1.5 6.11e+09 1.2 2.8e+08 2.0e+03 0.0e+00 10 29 33 48 0 11 29 34 49 0 149320
MatMultAdd 23892 1.0 3.2684e+01 4.0 1.37e+09 1.6 1.5e+08 6.5e+02 0.0e+00 8 6 18 8 0 9 6 18 8 0 39256
MatMultTranspose 23892 1.0 2.0947e+01 3.1 1.37e+09 1.6 1.5e+08 6.5e+02 0.0e+00 3 6 18 8 0 4 6 18 8 0 61253
MatSolve 3982 0.0 4.6725e-02 0.0 1.09e+07 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 234
MatSOR 47784 1.0 8.3053e+01 1.3 1.08e+10 1.3 2.6e+08 1.5e+03 8.0e+03 23 51 30 32 35 25 51 30 32 36 124862
MatResidual 23892 1.0 3.1663e+01 1.7 4.53e+09 1.3 2.6e+08 1.5e+03 0.0e+00 7 21 30 32 0 8 21 30 32 0 136752
PCSetUpOnBlocks 3982 1.0 5.4314e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 3982 1.0 1.4286e+02 1.0 1.81e+10 1.3 8.1e+08 1.2e+03 8.0e+03 43 85 96 81 35 47 85 96 82 36 120863
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Krylov Solver 1 9 11424 0.
DMKSP interface 1 0 0 0.
Vector 5 52 2382208 0.
Matrix 0 65 14780672 0.
Distributed Mesh 1 0 0 0.
Index Set 2 18 171852 0.
IS L to G Mapping 1 0 0 0.
Star Forest Graph 2 0 0 0.
Discrete System 1 0 0 0.
Vec Scatter 1 13 16432 0.
Preconditioner 1 9 9676 0.
Viewer 1 0 0 0.
--- Event Stage 1: First Solve
Krylov Solver 8 0 0 0.
Vector 152 104 2238504 0.
Matrix 148 83 22951356 0.
Matrix Coarsen 6 6 3816 0.
Index Set 128 112 590828 0.
Star Forest Graph 12 12 10368 0.
Vec Scatter 34 21 26544 0.
Preconditioner 8 0 0 0.
--- Event Stage 2: Remaining Solves
Vector 23892 23892 1302241424 0.
========================================================================================================================
Average time to get PetscTime(): 1.19209e-07
Average time for MPI_Barrier(): 3.35693e-05
Average time for zero size MPI_Send(): 1.84231e-05
#PETSc Option Table entries:
-gamg_est_ksp_type cg
-ksp_norm_type unpreconditioned
-ksp_type cg
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_norm_type none
-mg_levels_ksp_type richardson
-mg_levels_pc_sor_its 1
-mg_levels_pc_type sor
-pc_gamg_type classical
-pc_type gamg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4
-----------------------------------------
Libraries compiled on 2018-05-25 07:05:14 on node1-001
Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon
Using PETSc directory: /home/ritsat/beckerm/petsc
Using PETSc arch: arch-linux-amd-opt
-----------------------------------------
Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3
-----------------------------------------
Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include
-----------------------------------------
Using C linker: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc
Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl
-----------------------------------------
-------------- next part --------------
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node1-010 with 1000 processors, by beckerm Tue May 29 11:37:28 2018
Using Petsc Development GIT revision: v3.9.2-503-g9e88a8b GIT Date: 2018-05-24 08:01:24 -0500
Max Max/Min Avg Total
Time (sec): 2.914e+02 1.00000 2.914e+02
Objects: 2.440e+04 1.00004 2.440e+04
Flop: 2.124e+10 1.27708 2.041e+10 2.041e+13
Flop/sec: 7.289e+07 1.27708 7.005e+07 7.005e+10
MPI Messages: 1.238e+06 3.99536 8.489e+05 8.489e+08
MPI Message Lengths: 1.343e+09 2.32238 1.393e+03 1.183e+12
MPI Reductions: 2.256e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.5285e+01 8.7% 0.0000e+00 0.0% 2.700e+04 0.0% 3.178e+03 0.0% 1.700e+01 0.1%
1: First Solve: 3.1432e+00 1.1% 3.0885e+10 0.2% 3.675e+06 0.4% 3.508e+03 1.1% 6.220e+02 2.8%
2: Remaining Solves: 2.6295e+02 90.2% 2.0380e+13 99.8% 8.452e+08 99.6% 1.384e+03 98.9% 2.191e+04 97.1%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecSet 3 1.0 5.2595e-04 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
--- Event Stage 1: First Solve
BuildTwoSided 12 1.0 1.2117e-02 1.5 0.00e+00 0.0 8.9e+04 4.0e+00 0.0e+00 0 0 0 0 0 0 0 2 0 0 0
BuildTwoSidedF 30 1.0 4.6420e-01 3.2 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 9 0 2 5 0 0
KSPSetUp 9 1.0 5.0461e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0
KSPSolve 1 1.0 3.1475e+00 1.0 3.25e+07 1.4 3.7e+06 3.5e+03 6.2e+02 1 0 0 1 3 100100100100100 9812
VecTDot 8 1.0 6.8884e-03 3.7 4.32e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 1 0 0 1 62713
VecNorm 6 1.0 1.6949e-03 2.6 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 191160
VecScale 24 1.0 1.4758e-04 2.4 5.43e+04 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 322665
VecCopy 1 1.0 1.4782e-04 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 124 1.0 8.3828e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 8 1.0 1.2336e-03 1.5 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 350201
VecAYPX 28 1.0 1.5278e-03 2.0 3.58e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 231838
VecAssemblyBegin 3 1.0 8.8215e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 3 1.0 5.0068e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 108 1.0 6.5110e-03 2.9 0.00e+00 0.0 8.4e+05 1.4e+03 0.0e+00 0 0 0 0 0 0 0 23 9 0 0
VecScatterEnd 108 1.0 4.8535e-02 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatMult 29 1.0 3.4050e-02 1.4 6.14e+06 1.2 2.8e+05 2.0e+03 0.0e+00 0 0 0 0 0 1 19 8 4 0 174274
MatMultAdd 24 1.0 2.1619e-02 3.0 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 1 4 4 1 0 59618
MatMultTranspose 24 1.0 1.6357e-02 2.4 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 0 4 4 1 0 78797
MatSolve 4 0.0 5.1498e-05 0.0 1.10e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 213
MatSOR 48 1.0 7.6160e-02 1.3 1.09e+07 1.3 2.6e+05 1.5e+03 8.0e+00 0 0 0 0 0 2 34 7 3 1 137320
MatLUFactorSym 1 1.0 1.0586e-04 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 7.1049e-0537.2 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 463
MatResidual 24 1.0 2.6069e-02 1.5 4.55e+06 1.3 2.6e+05 1.5e+03 0.0e+00 0 0 0 0 0 1 14 7 3 0 166848
MatAssemblyBegin 102 1.0 4.6661e-01 3.0 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 10 0 2 5 0 0
MatAssemblyEnd 102 1.0 1.0982e-01 1.1 0.00e+00 0.0 6.2e+05 2.0e+02 2.5e+02 0 0 0 0 1 3 0 17 1 40 0
MatGetRow 3102093 1.3 5.3594e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 0
MatGetRowIJ 1 0.0 1.6928e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCreateSubMats 6 1.0 4.6788e-01 2.3 0.00e+00 0.0 5.7e+05 1.6e+04 1.2e+01 0 0 0 1 0 10 0 15 72 2 0
MatCreateSubMat 6 1.0 1.8935e-02 1.0 0.00e+00 0.0 2.2e+04 3.3e+02 9.4e+01 0 0 0 0 0 1 0 1 0 15 0
MatGetOrdering 1 0.0 1.4997e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatIncreaseOvrlp 6 1.0 1.1188e-01 1.1 0.00e+00 0.0 2.6e+05 9.9e+02 1.2e+01 0 0 0 0 0 3 0 7 2 2 0
MatCoarsen 6 1.0 3.2188e-02 1.0 0.00e+00 0.0 7.1e+05 4.4e+02 5.6e+01 0 0 0 0 0 1 0 19 2 9 0
MatZeroEntries 6 1.0 3.6952e-03 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatPtAP 6 1.0 3.6765e-01 1.0 1.11e+07 1.6 6.3e+05 2.5e+03 9.2e+01 0 0 0 0 0 12 34 17 12 15 28214
MatPtAPSymbolic 6 1.0 2.1124e-01 1.0 0.00e+00 0.0 3.2e+05 2.7e+03 4.2e+01 0 0 0 0 0 7 0 9 7 7 0
MatPtAPNumeric 6 1.0 1.4190e-01 1.0 1.11e+07 1.6 3.0e+05 2.3e+03 4.8e+01 0 0 0 0 0 5 34 8 6 8 73100
MatGetLocalMat 6 1.0 4.7829e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 6 1.0 1.2071e-02 2.1 0.00e+00 0.0 1.9e+05 3.4e+03 0.0e+00 0 0 0 0 0 0 0 5 5 0 0
SFSetGraph 12 1.0 2.4796e-05 8.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFSetUp 12 1.0 1.8736e-02 1.1 0.00e+00 0.0 2.7e+05 5.8e+02 0.0e+00 0 0 0 0 0 1 0 7 1 0 0
SFBcastBegin 68 1.0 2.7175e-03 2.8 0.00e+00 0.0 7.2e+05 5.1e+02 0.0e+00 0 0 0 0 0 0 0 20 3 0 0
SFBcastEnd 68 1.0 1.5780e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
GAMG: createProl 6 1.0 2.5062e+00 1.0 0.00e+00 0.0 2.2e+06 4.7e+03 3.1e+02 1 0 0 1 1 79 0 59 79 50 0
GAMG: partLevel 6 1.0 3.9494e-01 1.0 1.11e+07 1.6 6.5e+05 2.4e+03 2.4e+02 0 0 0 0 1 12 34 18 12 39 26264
repartition 3 1.0 2.5189e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0
Invert-Sort 3 1.0 2.0480e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0
Move A 3 1.0 1.1837e-02 1.1 0.00e+00 0.0 9.5e+03 7.4e+02 5.0e+01 0 0 0 0 0 0 0 0 0 8 0
Move P 3 1.0 9.3570e-03 1.1 0.00e+00 0.0 1.3e+04 1.3e+01 5.0e+01 0 0 0 0 0 0 0 0 0 8 0
PCSetUp 2 1.0 2.9206e+00 1.0 1.11e+07 1.6 2.8e+06 4.2e+03 5.8e+02 1 0 0 1 3 93 34 77 91 94 3552
PCSetUpOnBlocks 4 1.0 4.2915e-04 2.6 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 77
PCApply 4 1.0 1.2181e-01 1.0 1.82e+07 1.3 8.2e+05 1.2e+03 8.0e+00 0 0 0 0 0 4 56 22 7 1 142729
--- Event Stage 2: Remaining Solves
KSPSolve 999 1.0 1.3992e+02 1.1 2.12e+10 1.3 8.4e+08 1.4e+03 2.2e+04 46100 99 97 97 51100 99 98100 145661
VecTDot 7964 1.0 7.8852e+00 2.9 4.30e+08 1.0 0.0e+00 0.0e+00 8.0e+03 1 2 0 0 35 2 2 0 0 36 54539
VecNorm 5980 1.0 8.5339e+00 6.9 3.23e+08 1.0 0.0e+00 0.0e+00 6.0e+03 1 2 0 0 27 2 2 0 0 27 37840
VecScale 23892 1.0 1.3013e-01 2.5 5.40e+07 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 364277
VecCopy 999 1.0 1.3009e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 83622 1.0 7.4316e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 7964 1.0 1.2206e+00 1.5 4.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 352338
VecAYPX 27874 1.0 1.5360e+00 1.9 3.56e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 229492
VecScatterBegin 100549 1.0 6.4482e+00 3.0 0.00e+00 0.0 8.4e+08 1.4e+03 0.0e+00 2 0 99 97 0 2 0 99 98 0 0
VecScatterEnd 100549 1.0 5.2444e+01 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 10 0 0 0 0 0
MatMult 28873 1.0 3.4346e+01 1.4 6.11e+09 1.2 2.8e+08 2.0e+03 0.0e+00 9 29 33 48 0 10 29 34 49 0 172028
MatMultAdd 23892 1.0 2.0934e+01 2.9 1.37e+09 1.6 1.5e+08 6.5e+02 0.0e+00 6 6 18 8 0 7 6 18 8 0 61290
MatMultTranspose 23892 1.0 1.4770e+01 2.2 1.37e+09 1.6 1.5e+08 6.5e+02 0.0e+00 3 6 18 8 0 3 6 18 8 0 86867
MatSolve 3982 0.0 4.7890e-02 0.0 1.09e+07 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 228
MatSOR 47784 1.0 7.1105e+01 1.3 1.08e+10 1.3 2.6e+08 1.5e+03 8.0e+03 23 51 30 32 35 26 51 30 32 36 145842
MatResidual 23892 1.0 2.4455e+01 1.4 4.53e+09 1.3 2.6e+08 1.5e+03 0.0e+00 7 21 30 32 0 7 21 30 32 0 177060
PCSetUpOnBlocks 3982 1.0 5.4274e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 3982 1.0 1.1804e+02 1.0 1.81e+10 1.3 8.1e+08 1.2e+03 8.0e+03 40 85 96 81 35 44 85 96 82 36 146277
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Krylov Solver 1 9 11424 0.
DMKSP interface 1 0 0 0.
Vector 5 52 2382208 0.
Matrix 0 65 14780672 0.
Distributed Mesh 1 0 0 0.
Index Set 2 18 171852 0.
IS L to G Mapping 1 0 0 0.
Star Forest Graph 2 0 0 0.
Discrete System 1 0 0 0.
Vec Scatter 1 13 16432 0.
Preconditioner 1 9 9676 0.
Viewer 1 0 0 0.
--- Event Stage 1: First Solve
Krylov Solver 8 0 0 0.
Vector 152 104 2238504 0.
Matrix 148 83 22951356 0.
Matrix Coarsen 6 6 3816 0.
Index Set 128 112 590828 0.
Star Forest Graph 12 12 10368 0.
Vec Scatter 34 21 26544 0.
Preconditioner 8 0 0 0.
--- Event Stage 2: Remaining Solves
Vector 23892 23892 1302241424 0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 3.93867e-05
Average time for zero size MPI_Send(): 1.59838e-05
#PETSc Option Table entries:
-gamg_est_ksp_type cg
-ksp_norm_type unpreconditioned
-ksp_type cg
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_norm_type none
-mg_levels_ksp_type richardson
-mg_levels_pc_sor_its 1
-mg_levels_pc_type sor
-pc_gamg_type classical
-pc_type gamg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4
-----------------------------------------
Libraries compiled on 2018-05-25 07:05:14 on node1-001
Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon
Using PETSc directory: /home/ritsat/beckerm/petsc
Using PETSc arch: arch-linux-amd-opt
-----------------------------------------
Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3
-----------------------------------------
Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include
-----------------------------------------
Using C linker: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc
Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl
-----------------------------------------