[petsc-dev] Seeming performance regression with GAMG
Tobin Isaac
tisaac at ices.utexas.edu
Mon Apr 27 12:58:58 CDT 2015
On Mon, Apr 27, 2015 at 04:06:30PM +0100, Lawrence Mitchell wrote:
> Dear all,
>
> We recently noticed a slowdown when using GAMG that I'm trying to
> track down in a little more detail. I'm solving an Hdiv-L2
> "Helmholtz" pressure correction using a Schur complement. I
> precondition the Schur complement with 'selfp', which morally looks
> like a normal Helmholtz operator (except in the DG space). The
> domain is very anisotropic (a thin atmospheric shell), so trying
> Toby's column-based coarsening plugin is on the horizon, but I
> haven't done it yet.
>
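[A minimal sketch of how such a 'selfp' Schur-complement solve might be
configured from the PETSc command line, assuming a two-field split with
GAMG on the Schur block; the split indices and solver choices are
illustrative assumptions, not the options actually used here:

  # Hypothetical options for a saddle-point (Hdiv-L2) system:
  # approximate the Schur complement with selfp, i.e.
  # Sp = A11 - A10 inv(diag(A00)) A01, and precondition it with GAMG.
  -pc_type fieldsplit
  -pc_fieldsplit_type schur
  -pc_fieldsplit_schur_fact_type full
  -pc_fieldsplit_schur_precondition selfp
  -fieldsplit_1_ksp_type cg
  -fieldsplit_1_pc_type gamg

The logs further down test only that Schur-block solve in isolation via
KSP ex6.]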
> I don't have a good feel for exactly when things got worse, but here
> are two data points:
>
> A recent-ish master (e4b003c) and master from 26th Feb (30ab49e4). I
> notice that in the former MatPtAP takes significantly longer (full
> logs below); different coarsening, maybe? As a point of comparison,
> the PCSetUp for Hypre takes ballpark half a second on the same operator.
>
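[The Hypre comparison mentioned above could be reproduced with
something like the following, assuming BoomerAMG; the particular Hypre
preconditioner is not stated in the thread:

  # Same matrix and Krylov settings as the GAMG run, but with
  # Hypre/BoomerAMG, so -log_summary reports a comparable PCSetUp time.
  $ ./ex6-e4b003c -f helmholtz-sphere.dat -ksp_type cg \
      -ksp_convergence_test skip -ksp_max_it 2 -ksp_monitor \
      -pc_type hypre -pc_hypre_type boomeramg -log_summary
]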
> I test with KSP ex6 (with a constant RHS); the command lines and
> full logs are below.
>
> Any ideas?
While there may be other changes that have affected your performance,
I see two things in your logs:
- The coarse matrix is much smaller (3 vs. 592). The default coarse
  equations limit was recently changed from 800 to 50. You can
  recover the old behavior with `-pc_gamg_coarse_eq_limit 800` (see
  the sketch after this list).
- GAMG now squares the adjacency graph only on the finest level.
  This means that matrices on the coarser levels will be larger and
  have more entries, which probably explains the extra PtAP time.
  Maybe Mark can explain the decision to make this change.
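[A concrete way to check the first point is to rerun the command from
the logs below with the old limit restored; the note on graph squaring
is an assumption about which options this particular master exposes:

  # Restore the pre-change coarse grid size limit and compare the
  # MatPtAP and PCSetUp times in -log_summary against the logs below.
  $ ./ex6-e4b003c -f helmholtz-sphere.dat -ksp_type cg \
      -ksp_convergence_test skip -ksp_max_it 2 -ksp_monitor \
      -pc_type gamg -pc_gamg_coarse_eq_limit 800 \
      -log_summary -ksp_view
  # For the second point, -pc_gamg_square_graph may control how much
  # graph squaring is done, but whether this build accepts it (and
  # with what argument) should be checked against -help output.
]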
Cheers,
Toby
>
> Cheers,
>
> Lawrence
>
> $ ./ex6-e4b003c -f helmholtz-sphere.dat -ksp_type cg
> -ksp_convergence_test skip -ksp_max_it 2 -ksp_monitor -table
> -pc_type gamg -log_summary -ksp_view
> 0 KSP Residual norm 3.676132751311e-11
> 1 KSP Residual norm 1.764616084171e-14
> 2 KSP Residual norm 9.253867842133e-14
> KSP Object: 1 MPI processes
> type: cg
> maximum iterations=2, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: gamg
> MG: type is MULTIPLICATIVE, levels=5 cycles=v
> Cycles per PCApply=1
> Using Galerkin computed coarse grid matrices
> GAMG specific options
> Threshold for dropping small values from graph 0
> AGG specific options
> Symmetric graph false
> Coarse grid solver -- level -------------------------------
> KSP Object: (mg_coarse_) 1 MPI processes
> type: gmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=1, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (mg_coarse_) 1 MPI processes
> type: bjacobi
> block Jacobi: number of blocks = 1
> Local solve is same for all blocks, in the following KSP and
> PC objects:
> KSP Object: (mg_coarse_sub_) 1 MPI processes
> type: preonly
> maximum iterations=1, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (mg_coarse_sub_) 1 MPI processes
> type: lu
> LU: out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> using diagonal shift on blocks to prevent zero pivot
> [INBLOCKS]
> matrix ordering: nd
> factor fill ratio given 5, needed 1
> Factored matrix follows:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=3, cols=3
> package used to perform factorization: petsc
> total: nonzeros=9, allocated nonzeros=9
> total number of mallocs used during MatSetValues
> calls =0
> using I-node routines: found 1 nodes, limit used is 5
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=3, cols=3
> total: nonzeros=9, allocated nonzeros=9
> total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 1 nodes, limit used is 5
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=3, cols=3
> total: nonzeros=9, allocated nonzeros=9
> total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 1 nodes, limit used is 5
> Down solver (pre-smoother) on level 1 -------------------------------
> KSP Object: (mg_levels_1_) 1 MPI processes
> type: chebyshev
> Chebyshev: eigenvalue estimates: min = 0.0999929, max = 1.09992
> Chebyshev: eigenvalues estimated using gmres with translations
> [0 0.1; 0 1.1]
> KSP Object: (mg_levels_1_esteig_) 1 MPI processes
> type: gmres
> GMRES: restart=30, using Classical (unmodified)
> Gram-Schmidt Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=10
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_1_) 1 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=93, cols=93
> total: nonzeros=8649, allocated nonzeros=8649
> total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 19 nodes, limit used is 5
> Up solver (post-smoother) same as down solver (pre-smoother)
> Down solver (pre-smoother) on level 2 -------------------------------
> KSP Object: (mg_levels_2_) 1 MPI processes
> type: chebyshev
> Chebyshev: eigenvalue estimates: min = 0.0998389, max = 1.09823
> Chebyshev: eigenvalues estimated using gmres with translations
> [0 0.1; 0 1.1]
> KSP Object: (mg_levels_2_esteig_) 1 MPI processes
> type: gmres
> GMRES: restart=30, using Classical (unmodified)
> Gram-Schmidt Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=10
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_2_) 1 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=2991, cols=2991
> total: nonzeros=8.94608e+06, allocated nonzeros=8.94608e+06
> total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 599 nodes, limit used is 5
> Up solver (post-smoother) same as down solver (pre-smoother)
> Down solver (pre-smoother) on level 3 -------------------------------
> KSP Object: (mg_levels_3_) 1 MPI processes
> type: chebyshev
> Chebyshev: eigenvalue estimates: min = 0.0998975, max = 1.09887
> Chebyshev: eigenvalues estimated using gmres with translations
> [0 0.1; 0 1.1]
> KSP Object: (mg_levels_3_esteig_) 1 MPI processes
> type: gmres
> GMRES: restart=30, using Classical (unmodified)
> Gram-Schmidt Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=10
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_3_) 1 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=35419, cols=35419
> total: nonzeros=1.55936e+07, allocated nonzeros=1.55936e+07
> total number of mallocs used during MatSetValues calls =1
> not using I-node routines
> Up solver (post-smoother) same as down solver (pre-smoother)
> Down solver (pre-smoother) on level 4 -------------------------------
> KSP Object: (mg_levels_4_) 1 MPI processes
> type: chebyshev
> Chebyshev: eigenvalue estimates: min = 0.1, max = 1.1
> Chebyshev: eigenvalues estimated using gmres with translations
> [0 0.1; 0 1.1]
> KSP Object: (mg_levels_4_esteig_) 1 MPI processes
> type: gmres
> GMRES: restart=30, using Classical (unmodified)
> Gram-Schmidt Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
> maximum iterations=10
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_4_) 1 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=327680, cols=327680
> total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> Up solver (post-smoother) same as down solver (pre-smoother)
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=327680, cols=327680
> total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> helmholt 2 9e+03 gamg
> ************************************************************************************************************************
> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r
> -fCourier9' to print this document ***
> ************************************************************************************************************************
>
> ---------------------------------------------- PETSc Performance
> Summary: ----------------------------------------------
>
> ./ex6-master on a arch-linux2-c-opt named yam.doc.ic.ac.uk with 1
> processor, by lmitche1 Mon Apr 27 16:03:36 2015
> Using Petsc Development GIT revision: v3.5.3-2602-ga9b180a GIT Date:
> 2015-04-07 20:34:49 -0500
>
> Max Max/Min Avg Total
> Time (sec): 1.072e+02 1.00000 1.072e+02
> Objects: 2.620e+02 1.00000 2.620e+02
> Flops: 4.582e+10 1.00000 4.582e+10 4.582e+10
> Flops/sec: 4.275e+08 1.00000 4.275e+08 4.275e+08
> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
> MPI Reductions: 0.000e+00 0.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
> e.g., VecAXPY() for real vectors of length
> N --> 2N flops
> and VecAXPY() for complex vectors of
> length N --> 8N flops
>
> Summary of Stages: ----- Time ------ ----- Flops ----- ---
> Messages --- -- Message Lengths -- -- Reductions --
> Avg %Total Avg %Total counts
> %Total Avg %Total counts %Total
> 0: Main Stage: 1.0898e-01 0.1% 7.4996e+06 0.0% 0.000e+00
> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
> 1: mystage 1: 1.0466e+02 97.6% 4.2348e+10 92.4% 0.000e+00
> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
> 2: mystage 2: 2.4395e+00 2.3% 3.4689e+09 7.6% 0.000e+00
> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>
> -
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
> Count: number of times phase was executed
> Time and Flops: Max - maximum over all processors
> Ratio - ratio of maximum to minimum over all processors
> Mess: number of messages sent
> Avg. len: average message length (bytes)
> Reduct: number of global reductions
> Global: entire computation
> Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
> %T - percent time in this phase %F - percent flops in
> this phase
> %M - percent messages in this phase %L - percent message
> lengths in this phase
> %R - percent reductions in this phase
> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> over all processors)
> -
> ------------------------------------------------------------------------------------------------------------------------
> Event Count Time (sec) Flops
> --- Global --- --- Stage --- Total
> Max Ratio Max Ratio Max Ratio Mess Avg
> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
> -
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> ThreadCommRunKer 2 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatMult 1 1.0 4.4990e-03 1.0 6.19e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 4 83 0 0 0 1376
> MatAssemblyBegin 1 1.0 1.1921e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatAssemblyEnd 1 1.0 1.0468e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 10 0 0 0 0 0
> MatLoad 1 1.0 9.3672e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 86 0 0 0 0 0
> VecNorm 1 1.0 9.1791e-05 1.0 6.55e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 9 0 0 0 7140
> VecSet 5 1.0 3.1860e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 3 0 0 0 0 0
> VecAXPY 1 1.0 3.9697e-04 1.0 6.55e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 9 0 0 0 1651
>
> --- Event Stage 1: mystage 1
>
> MatMult 40 1.0 3.5990e-01 1.0 5.52e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1535
> MatConvert 4 1.0 1.2038e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatScale 12 1.0 8.2839e-02 1.0 7.21e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 870
> MatAssemblyBegin 31 1.0 1.4067e-05 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatAssemblyEnd 31 1.0 1.0995e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetRow 1464732 1.0 8.5680e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatCoarsen 4 1.0 2.4768e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatAXPY 4 1.0 8.1362e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatMatMult 4 1.0 6.0399e-01 1.0 6.39e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 106
> MatMatMultSym 4 1.0 4.4536e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatMatMultNum 4 1.0 1.5859e-01 1.0 6.39e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 403
> MatPtAP 4 1.0 1.0255e+02 1.0 4.15e+10 1.0 0.0e+00
> 0.0e+00 0.0e+00 96 91 0 0 0 98 98 0 0 0 405
> MatPtAPSymbolic 4 1.0 6.1707e+01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 58 0 0 0 0 59 0 0 0 0 0
> MatPtAPNumeric 4 1.0 4.0840e+01 1.0 4.15e+10 1.0 0.0e+00
> 0.0e+00 0.0e+00 38 91 0 0 0 39 98 0 0 0 1017
> MatTrnMatMult 1 1.0 1.9648e-01 1.0 2.34e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 119
> MatTrnMatMultSym 1 1.0 1.3640e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatTrnMatMultNum 1 1.0 6.0079e-02 1.0 2.34e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 389
> MatGetSymTrans 5 1.0 8.4374e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecMDot 40 1.0 1.5124e-02 1.0 4.03e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2663
> VecNorm 44 1.0 1.1821e-03 1.0 8.06e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6815
> VecScale 44 1.0 1.4737e-03 1.0 4.03e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2733
> VecCopy 4 1.0 3.2711e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecSet 143 1.0 2.7236e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAXPY 4 1.0 3.9601e-04 1.0 7.32e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1849
> VecMAXPY 44 1.0 1.9676e-02 1.0 4.76e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2419
> VecAssemblyBegin 4 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAssemblyEnd 4 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecPointwiseMult 44 1.0 7.9026e-03 1.0 4.03e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 510
> VecSetRandom 4 1.0 3.6092e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecNormalize 44 1.0 2.6934e-03 1.0 1.21e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4486
> KSPGMRESOrthog 40 1.0 3.1765e-02 1.0 8.06e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2536
> KSPSetUp 10 1.0 1.1930e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> PCGAMGGraph_AGG 4 1.0 5.7353e-01 1.0 5.56e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 97
> PCGAMGCoarse_AGG 4 1.0 2.5846e-01 1.0 2.34e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 90
> PCGAMGProl_AGG 4 1.0 4.9806e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> PCGAMGPOpt_AGG 4 1.0 1.2128e+00 1.0 7.38e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 608
> GAMG: createProl 4 1.0 2.0973e+00 1.0 8.17e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 2 2 0 0 0 2 2 0 0 0 389
> Graph 8 1.0 5.7211e-01 1.0 5.56e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 97
> MIS/Agg 4 1.0 2.4861e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> SA: col data 4 1.0 7.6509e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> SA: frmProl0 4 1.0 4.6539e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> SA: smooth 4 1.0 1.2128e+00 1.0 7.38e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 608
> GAMG: partLevel 4 1.0 1.0255e+02 1.0 4.15e+10 1.0 0.0e+00
> 0.0e+00 0.0e+00 96 91 0 0 0 98 98 0 0 0 405
> PCSetUp 1 1.0 1.0465e+02 1.0 4.23e+10 1.0 0.0e+00
> 0.0e+00 0.0e+00 98 92 0 0 0 100100 0 0 0 405
>
> --- Event Stage 2: mystage 2
>
> MatMult 121 1.0 1.0144e+00 1.0 1.61e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 1 4 0 0 0 42 47 0 0 0 1592
> MatMultAdd 12 1.0 3.1757e-02 1.0 4.95e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 1558
> MatMultTranspose 12 1.0 3.7137e-02 1.0 4.95e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 2 1 0 0 0 1332
> MatSolve 6 1.0 9.2983e-06 1.0 9.00e+01 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 10
> MatSOR 116 1.0 1.2805e+00 1.0 1.61e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 1 4 0 0 0 52 47 0 0 0 1260
> MatLUFactorSym 1 1.0 1.5020e-05 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatLUFactorNum 1 1.0 5.9605e-06 1.0 1.60e+01 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3
> MatResidual 12 1.0 1.0432e-01 1.0 1.67e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 4 5 0 0 0 1599
> MatGetRowIJ 1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetOrdering 1 1.0 4.2915e-05 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatView 8 1.0 6.7115e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecMDot 43 1.0 1.3515e-02 1.0 4.03e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 2980
> VecTDot 4 1.0 1.1048e-03 1.0 2.62e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2373
> VecNorm 53 1.0 1.4720e-03 1.0 1.00e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6809
> VecScale 50 1.0 1.4396e-03 1.0 4.03e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2798
> VecCopy 21 1.0 3.6387e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecSet 108 1.0 1.0684e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAXPY 15 1.0 2.1303e-03 1.0 4.09e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1918
> VecAYPX 97 1.0 1.1820e-02 1.0 1.16e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 985
> VecAXPBYCZ 48 1.0 8.2519e-03 1.0 2.20e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 2663
> VecMAXPY 50 1.0 1.6957e-02 1.0 4.76e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 2807
> VecNormalize 50 1.0 2.6865e-03 1.0 1.21e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4498
> KSPGMRESOrthog 43 1.0 2.7892e-02 1.0 8.06e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 2888
> KSPSetUp 5 1.0 2.9690e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> KSPSolve 1 1.0 2.4374e+00 1.0 3.47e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 2 8 0 0 0 100100 0 0 0 1423
> PCSetUp 1 1.0 9.3937e-05 1.0 1.60e+01 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> PCSetUpOnBlocks 3 1.0 9.6798e-05 1.0 1.60e+01 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> PCApply 3 1.0 2.4240e+00 1.0 3.45e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 2 8 0 0 0 99 99 0 0 0 1423
> -
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type Creations Destructions Memory Descendants'
> Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
> Viewer 2 2 1520 0
> Matrix 1 10 54702412 0
> Vector 3 97 72154440 0
> Krylov Solver 0 11 146840 0
> Preconditioner 0 7 7332 0
> Index Set 0 3 2400 0
>
> --- Event Stage 1: mystage 1
>
> Viewer 1 0 0 0
> Matrix 22 14 691989068 0
> Matrix Coarsen 4 4 2576 0
> Vector 125 91 67210200 0
> Krylov Solver 15 4 120864 0
> Preconditioner 15 8 7520 0
> Index Set 4 4 3168 0
> PetscRandom 4 4 2560 0
>
> --- Event Stage 2: mystage 2
>
> Matrix 1 0 0 0
> Vector 60 0 0 0
> Index Set 5 2 1592 0
> ========================================================================================================================
> Average time to get PetscTime(): 0
> #PETSc Option Table entries:
> -f helmholtz-sphere.dat
> -ksp_convergence_test skip
> -ksp_max_it 2
> -ksp_monitor
> -ksp_type cg
> -ksp_view
> -log_summary
> -matload_block_size 1
> -pc_type gamg
> -table
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure options: --download-chaco=1 --download-ctetgen=1
> --download-exodusii=1 --download-hdf5=1 --download-hypre=1
> --download-metis=1 --download-ml=1 --download-mumps=1
> --download-netcdf=1 --download-parmetis=1 --download-ptscotch=1
> --download-scalapack=1 --download-superlu=1 --download-superlu_dist=1
> --download-triangle=1 --with-c2html=0 --with-debugging=0
> --with-make-np=32 --with-openmp=0 --with-pthreadclasses=0
> --with-shared-libraries=1 --with-threadcomm=0 PETSC_ARCH=arch-linux2-c-opt
> -----------------------------------------
> Libraries compiled on Wed Apr 8 10:00:43 2015 on yam.doc.ic.ac.uk
> Machine characteristics:
> Linux-3.13.0-45-generic-x86_64-with-Ubuntu-14.04-trusty
> Using PETSc directory: /data/lmitche1/src/deps/petsc
> Using PETSc arch: arch-linux2-c-opt
> -----------------------------------------
>
> Using C compiler: mpicc -fPIC -Wall -Wwrite-strings
> -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS}
> Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable
> -ffree-line-length-0 -Wno-unused-dummy-argument -O ${FOPTFLAGS}
> ${FFLAGS}
> -----------------------------------------
>
> Using include paths:
> -I/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/include
> -I/data/lmitche1/src/deps/petsc/include
> -I/data/lmitche1/src/deps/petsc/include
> -I/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/include
> -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi
> -----------------------------------------
>
> Using C linker: mpicc
> Using Fortran linker: mpif90
> Using libraries:
> -Wl,-rpath,/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib
> -L/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib -lpetsc
> -Wl,-rpath,/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib
> -L/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib -lcmumps
> -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lsuperlu_4.3
> -lsuperlu_dist_4.0 -lHYPRE -Wl,-rpath,/usr/lib/openmpi/lib
> -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_cxx
> -lstdc++ -lscalapack -lml -lmpi_cxx -lstdc++ -lexoIIv2for -lexodus
> -llapack -lblas -lparmetis -ltriangle -lnetcdf -lmetis -lchaco
> -lctetgen -lX11 -lptesmumps -lptscotch -lptscotcherr -lscotch
> -lscotcherr -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lssl
> -lcrypto -lm -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm
> -lquadmath -lm -lmpi_cxx -lstdc++ -lrt -lm -lz
> -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib
> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl
> -lmpi -lhwloc -lgcc_s -lpthread -ldl
> -----------------------------------------
>
>
>
> $ ./ex6-30ab49e4
> 0 KSP Residual norm 3.679528502747e-11
> 1 KSP Residual norm 1.410011347346e-14
> 2 KSP Residual norm 2.871653636831e-14
> KSP Object: 1 MPI processes
> type: cg
> maximum iterations=2, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: gamg
> MG: type is MULTIPLICATIVE, levels=3 cycles=v
> Cycles per PCApply=1
> Using Galerkin computed coarse grid matrices
> Coarse grid solver -- level -------------------------------
> KSP Object: (mg_coarse_) 1 MPI processes
> type: preonly
> maximum iterations=1, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (mg_coarse_) 1 MPI processes
> type: bjacobi
> block Jacobi: number of blocks = 1
> Local solve is same for all blocks, in the following KSP and
> PC objects:
> KSP Object: (mg_coarse_sub_) 1 MPI processes
> type: preonly
> maximum iterations=1, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using NONE norm type for convergence test
> PC Object: (mg_coarse_sub_) 1 MPI processes
> type: lu
> LU: out-of-place factorization
> tolerance for zero pivot 2.22045e-14
> using diagonal shift on blocks to prevent zero pivot
> [INBLOCKS]
> matrix ordering: nd
> factor fill ratio given 5, needed 1
> Factored matrix follows:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=592, cols=592
> package used to perform factorization: petsc
> total: nonzeros=350464, allocated nonzeros=350464
> total number of mallocs used during MatSetValues
> calls =0
> using I-node routines: found 119 nodes, limit used
> is 5
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=592, cols=592
> total: nonzeros=350464, allocated nonzeros=350464
> total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 119 nodes, limit used is 5
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=592, cols=592
> total: nonzeros=350464, allocated nonzeros=350464
> total number of mallocs used during MatSetValues calls =0
> using I-node routines: found 119 nodes, limit used is 5
> Down solver (pre-smoother) on level 1 -------------------------------
> KSP Object: (mg_levels_1_) 1 MPI processes
> type: chebyshev
> Chebyshev: eigenvalue estimates: min = 0.0871826, max = 1.83084
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_1_) 1 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=35419, cols=35419
> total: nonzeros=1.55936e+07, allocated nonzeros=1.55936e+07
> total number of mallocs used during MatSetValues calls =1
> not using I-node routines
> Up solver (post-smoother) same as down solver (pre-smoother)
> Down solver (pre-smoother) on level 2 -------------------------------
> KSP Object: (mg_levels_2_) 1 MPI processes
> type: chebyshev
> Chebyshev: eigenvalue estimates: min = 0.099472, max = 2.08891
> maximum iterations=2
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> using nonzero initial guess
> using NONE norm type for convergence test
> PC Object: (mg_levels_2_) 1 MPI processes
> type: sor
> SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=327680, cols=327680
> total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> Up solver (post-smoother) same as down solver (pre-smoother)
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: seqaij
> rows=327680, cols=327680
> total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> Number of iterations = 2
> Residual norm = 8368.22
> ************************************************************************************************************************
> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r
> -fCourier9' to print this document ***
> ************************************************************************************************************************
>
> ---------------------------------------------- PETSc Performance
> Summary: ----------------------------------------------
>
> ./ex6-1ddf9fe on a test named yam.doc.ic.ac.uk with 1 processor, by
> lmitche1 Mon Apr 27 16:02:36 2015
> Using Petsc Release Version 3.5.2, unknown
>
> Max Max/Min Avg Total
> Time (sec): 2.828e+01 1.00000 2.828e+01
> Objects: 1.150e+02 1.00000 1.150e+02
> Flops: 1.006e+10 1.00000 1.006e+10 1.006e+10
> Flops/sec: 3.559e+08 1.00000 3.559e+08 3.559e+08
> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
> MPI Reductions: 0.000e+00 0.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
> e.g., VecAXPY() for real vectors of length
> N --> 2N flops
> and VecAXPY() for complex vectors of
> length N --> 8N flops
>
> Summary of Stages: ----- Time ------ ----- Flops ----- ---
> Messages --- -- Message Lengths -- -- Reductions --
> Avg %Total Avg %Total counts
> %Total Avg %Total counts %Total
> 0: Main Stage: 9.9010e-02 0.4% 7.4996e+06 0.1% 0.000e+00
> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
> 1: mystage 1: 2.6509e+01 93.7% 8.4700e+09 84.2% 0.000e+00
> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
> 2: mystage 2: 1.6704e+00 5.9% 1.5861e+09 15.8% 0.000e+00
> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>
> -
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
> Count: number of times phase was executed
> Time and Flops: Max - maximum over all processors
> Ratio - ratio of maximum to minimum over all processors
> Mess: number of messages sent
> Avg. len: average message length (bytes)
> Reduct: number of global reductions
> Global: entire computation
> Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
> %T - percent time in this phase %F - percent flops in
> this phase
> %M - percent messages in this phase %L - percent message
> lengths in this phase
> %R - percent reductions in this phase
> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> over all processors)
> -
> ------------------------------------------------------------------------------------------------------------------------
> Event Count Time (sec) Flops
> --- Global --- --- Stage --- Total
> Max Ratio Max Ratio Max Ratio Mess Avg
> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
> -
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> ThreadCommRunKer 2 1.0 3.0994e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatMult 1 1.0 7.2370e-03 1.0 6.19e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 7 83 0 0 0 855
> MatAssemblyBegin 1 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatAssemblyEnd 1 1.0 1.0748e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 11 0 0 0 0 0
> MatLoad 1 1.0 8.5824e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 87 0 0 0 0 0
> VecNorm 1 1.0 9.2983e-05 1.0 6.55e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 9 0 0 0 7048
> VecSet 5 1.0 2.9252e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 3 0 0 0 0 0
> VecAXPY 1 1.0 4.6611e-04 1.0 6.55e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 9 0 0 0 1406
>
> --- Event Stage 1: mystage 1
>
> MatMult 20 1.0 2.8699e-01 1.0 3.73e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 1301
> MatConvert 2 1.0 8.3888e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatScale 6 1.0 7.1905e-02 1.0 4.85e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 675
> MatAssemblyBegin 20 1.0 1.9789e-05 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatAssemblyEnd 20 1.0 1.1295e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetRow 1452396 1.0 9.5270e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatCoarsen 2 1.0 3.0676e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatAXPY 2 1.0 8.2162e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatMatMult 2 1.0 4.2625e-01 1.0 4.31e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 2 0 0 0 0 2 1 0 0 0 101
> MatMatMultSym 2 1.0 3.0257e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> MatMatMultNum 2 1.0 1.2364e-01 1.0 4.31e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 349
> MatPtAP 2 1.0 2.3871e+01 1.0 7.82e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 84 78 0 0 0 90 92 0 0 0 328
> MatPtAPSymbolic 2 1.0 1.4329e+01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 51 0 0 0 0 54 0 0 0 0 0
> MatPtAPNumeric 2 1.0 9.5422e+00 1.0 7.82e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 34 78 0 0 0 36 92 0 0 0 819
> MatTrnMatMult 2 1.0 9.7712e-01 1.0 8.24e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 3 1 0 0 0 4 1 0 0 0 84
> MatTrnMatMultSym 2 1.0 5.0258e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
> MatTrnMatMultNum 2 1.0 4.7454e-01 1.0 8.24e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 174
> MatGetSymTrans 4 1.0 6.2370e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecMDot 20 1.0 1.7304e-02 1.0 3.99e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2308
> VecNorm 22 1.0 1.3692e-03 1.0 7.99e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5834
> VecScale 22 1.0 1.8549e-03 1.0 3.99e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2153
> VecCopy 2 1.0 5.0211e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecSet 77 1.0 3.6245e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAXPY 2 1.0 3.7718e-04 1.0 7.26e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1925
> VecMAXPY 22 1.0 2.2252e-02 1.0 4.72e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 2121
> VecAssemblyBegin 2 1.0 9.5367e-07 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAssemblyEnd 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecPointwiseMult 22 1.0 9.2957e-03 1.0 3.99e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 430
> VecSetRandom 2 1.0 8.8599e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecNormalize 22 1.0 3.2570e-03 1.0 1.20e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3679
> KSPGMRESOrthog 20 1.0 3.6396e-02 1.0 7.99e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2195
> KSPSetUp 6 1.0 1.4364e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> PCGAMGgraph_AGG 2 1.0 4.8670e-01 1.0 3.77e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 77
> PCGAMGcoarse_AGG 2 1.0 1.0664e+00 1.0 8.24e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 4 1 0 0 0 4 1 0 0 0 77
> PCGAMGProl_AGG 2 1.0 6.4827e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> PCGAMGPOpt_AGG 2 1.0 9.9913e-01 1.0 5.31e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 4 5 0 0 0 4 6 0 0 0 532
> PCSetUp 1 1.0 2.6505e+01 1.0 8.47e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 94 84 0 0 0 100100 0 0 0 320
>
> --- Event Stage 2: mystage 2
>
> MatMult 38 1.0 5.6846e-01 1.0 6.85e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 2 7 0 0 0 34 43 0 0 0 1204
> MatMultAdd 6 1.0 2.7303e-02 1.0 3.25e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 2 2 0 0 0 1189
> MatMultTranspose 6 1.0 3.2745e-02 1.0 3.25e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 2 2 0 0 0 991
> MatSolve 3 1.0 1.6339e-03 1.0 2.10e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1286
> MatSOR 36 1.0 9.4071e-01 1.0 6.79e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 3 7 0 0 0 56 43 0 0 0 722
> MatLUFactorSym 1 1.0 5.9440e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatLUFactorNum 1 1.0 4.0792e-02 1.0 1.15e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 1 0 0 0 2 7 0 0 0 2820
> MatResidual 6 1.0 9.2385e-02 1.0 1.13e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 1 0 0 0 6 7 0 0 0 1224
> MatGetRowIJ 1 1.0 1.4091e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetOrdering 1 1.0 2.3508e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatView 6 1.0 5.8508e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecTDot 4 1.0 1.4160e-03 1.0 2.62e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1851
> VecNorm 3 1.0 3.5286e-04 1.0 1.97e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5572
> VecCopy 8 1.0 1.1313e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> VecSet 22 1.0 4.3163e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAXPY 4 1.0 1.6727e-03 1.0 2.62e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1567
> VecAYPX 49 1.0 1.7958e-02 1.0 1.15e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 643
> VecAXPBYCZ 24 1.0 1.3117e-02 1.0 2.18e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 1661
> KSPSetUp 2 1.0 9.2983e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> KSPSolve 1 1.0 1.6690e+00 1.0 1.59e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 6 16 0 0 0 100100 0 0 0 950
> PCSetUp 1 1.0 4.7009e-02 1.0 1.15e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 1 0 0 0 3 7 0 0 0 2447
> PCSetUpOnBlocks 3 1.0 4.7014e-02 1.0 1.15e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00 0 1 0 0 0 3 7 0 0 0 2447
> PCApply 3 1.0 1.6409e+00 1.0 1.57e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 6 16 0 0 0 98 99 0 0 0 954
> -
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type Creations Destructions Memory Descendants'
> Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
> Viewer 1 1 760 0
> Matrix 1 6 58874852 0
> Vector 3 20 27954224 0
> Krylov Solver 0 5 23360 0
> Preconditioner 0 5 5332 0
> Index Set 0 3 7112 0
>
> --- Event Stage 1: mystage 1
>
> Viewer 1 0 0 0
> Matrix 14 10 477163156 0
> Matrix Coarsen 2 2 1288 0
> Vector 69 52 66638640 0
> Krylov Solver 7 2 60432 0
> Preconditioner 7 2 2096 0
> Index Set 2 2 1584 0
> PetscRandom 2 2 1280 0
>
> --- Event Stage 2: mystage 2
>
> Matrix 1 0 0 0
> Index Set 5 2 2536 0
> ========================================================================================================================
> Average time to get PetscTime(): 0
> #PETSc Option Table entries:
> -f helmholtz-sphere.dat
> -ksp_convergence_test skip
> -ksp_max_it 2
> -ksp_monitor
> -ksp_type cg
> -ksp_view
> -log_summary
> -matload_block_size 1
> -pc_type gamg
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure options: PETSC_ARCH=test --with-debugging=0
> -----------------------------------------
> Libraries compiled on Mon Apr 27 10:49:24 2015 on yam.doc.ic.ac.uk
> Machine characteristics:
> Linux-3.13.0-45-generic-x86_64-with-Ubuntu-14.04-trusty
> Using PETSc directory: /data/lmitche1/src/deps/petsc
> Using PETSc arch: test
> -----------------------------------------
>
> Using C compiler: mpicc -fPIC -Wall -Wwrite-strings
> -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS}
> Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable
> -ffree-line-length-0 -Wno-unused-dummy-argument -O ${FOPTFLAGS}
> ${FFLAGS}
> -----------------------------------------
>
> Using include paths: -I/data/lmitche1/src/deps/petsc/test/include
> -I/data/lmitche1/src/deps/petsc/include
> -I/data/lmitche1/src/deps/petsc/include
> -I/data/lmitche1/src/deps/petsc/test/include
> -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi
> -----------------------------------------
>
> Using C linker: mpicc
> Using Fortran linker: mpif90
> Using libraries: -Wl,-rpath,/data/lmitche1/src/deps/petsc/test/lib
> -L/data/lmitche1/src/deps/petsc/test/lib -lpetsc -llapack -lblas -lX11
> -lssl -lcrypto -lpthread -lm -Wl,-rpath,/usr/lib/openmpi/lib
> -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_f90
> -lmpi_f77 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx
> -lstdc++ -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib
> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl
> -lmpi -lhwloc -lgcc_s -lpthread -ldl
> -----------------------------------------
>