[petsc-dev] Seeming performance regression with GAMG
Barry Smith
bsmith at mcs.anl.gov
Mon Apr 27 13:36:40 CDT 2015
Lawrence,
The git bisect command is a useful way to see exactly what commit caused a regression such as this. In this case I think Toby pretty much found it for you.
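For reference, the workflow is roughly (using your two commits as the
endpoints):
  $ git bisect start
  $ git bisect bad e4b003c      # the slow commit
  $ git bisect good 30ab49e4    # the fast commit
  (rebuild and rerun the test at each step, then mark it with git
  bisect good or git bisect bad until git reports the first bad commit)
  $ git bisect reset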
Barry
> On Apr 27, 2015, at 12:58 PM, Tobin Isaac <tisaac at ices.utexas.edu> wrote:
>
> On Mon, Apr 27, 2015 at 04:06:30PM +0100, Lawrence Mitchell wrote:
>> Dear all,
>>
>> we recently noticed a slowdown when using GAMG that I'm trying to
>> track down in a little more detail. I'm solving an Hdiv-L2
>> "Helmholtz" pressure correction using a Schur complement. I
>> precondition the Schur complement with 'selfp', which morally looks
>> like a normal Helmholtz operator (except in the DG space). The domain
>> is very anisotropic (a thin atmospheric shell), so trying Toby's
>> column-based coarsening plugin is on the horizon, but I haven't got
>> round to it yet.
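>>
>> Concretely, the sort of options I mean look roughly like the
>> following (just a sketch: the fieldsplit prefixes depend on how the
>> fields are named in my setup):
>>
>>   -pc_type fieldsplit -pc_fieldsplit_type schur
>>   -pc_fieldsplit_schur_precondition selfp
>>   -fieldsplit_1_ksp_type cg -fieldsplit_1_pc_type gamg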
>>
>> I don't have a good feel for exactly when things got worse, but here
>> are two data points:
>>
>> A recentish master (e4b003c) and master from 26th Feb (30ab49e4). I
>> notice in the former that MatPtAP takes significantly longer (full
>> logs below); perhaps the coarsening is different? As a point of
>> comparison, PCSetUp for Hypre takes ballpark half a second on the
>> same operator.
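>>
>> (For reference, the Hypre comparison is a run along the lines of:
>>
>>   ./ex6-e4b003c -f helmholtz-sphere.dat -ksp_type cg -ksp_max_it 2 \
>>     -ksp_monitor -pc_type hypre -log_summary
>>
>> i.e. BoomerAMG with default options.)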
>>
>> I test with KSP ex6 (with a constant RHS); the exact command lines
>> are shown with the logs below.
>>
>> Any ideas?
>
> While there may be other changes that have affected your performance,
> I see two things in your logs:
>
> - The coarse matrix is much smaller (3 rows vs. 592). The default
> coarse equation limit was recently changed from 800 to 50. You can
> recover the old behavior with `-pc_gamg_coarse_eq_limit 800` (see
> the example invocation below).
> - GAMG now uses the square of the adjacency graph only on the finest
> level. This means that matrices on the coarser levels will be
> larger and have more entries, which probably explains the extra PtAP
> time. Maybe Mark can explain the decision to make this change.
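>
> For example, appending that option to your first run would look
> something like:
>
>   ./ex6-e4b003c -f helmholtz-sphere.dat -ksp_type cg \
>     -ksp_convergence_test skip -ksp_max_it 2 -ksp_monitor -table \
>     -pc_type gamg -pc_gamg_coarse_eq_limit 800 -log_summary -ksp_view
>
> (If your build also has an integer -pc_gamg_square_graph option, that
> should let you choose how many levels use the squared graph, but
> check the -help output to be sure.)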
>
> Cheers,
> Toby
>
>>
>> Cheers,
>>
>> Lawrence
>>
>> $ ./ex6-e4b003c -f helmholtz-sphere.dat -ksp_type cg
>> -ksp_convergence_test skip -ksp_max_it 2 -ksp_monitor -table
>> -pc_type gamg -log_summary -ksp_view
>> 0 KSP Residual norm 3.676132751311e-11
>> 1 KSP Residual norm 1.764616084171e-14
>> 2 KSP Residual norm 9.253867842133e-14
>> KSP Object: 1 MPI processes
>> type: cg
>> maximum iterations=2, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using PRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI processes
>> type: gamg
>> MG: type is MULTIPLICATIVE, levels=5 cycles=v
>> Cycles per PCApply=1
>> Using Galerkin computed coarse grid matrices
>> GAMG specific options
>> Threshold for dropping small values from graph 0
>> AGG specific options
>> Symmetric graph false
>> Coarse grid solver -- level -------------------------------
>> KSP Object: (mg_coarse_) 1 MPI processes
>> type: gmres
>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>> GMRES: happy breakdown tolerance 1e-30
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using NONE norm type for convergence test
>> PC Object: (mg_coarse_) 1 MPI processes
>> type: bjacobi
>> block Jacobi: number of blocks = 1
>> Local solve is same for all blocks, in the following KSP and
>> PC objects:
>> KSP Object: (mg_coarse_sub_) 1 MPI processes
>> type: preonly
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using NONE norm type for convergence test
>> PC Object: (mg_coarse_sub_) 1 MPI processes
>> type: lu
>> LU: out-of-place factorization
>> tolerance for zero pivot 2.22045e-14
>> using diagonal shift on blocks to prevent zero pivot
>> [INBLOCKS]
>> matrix ordering: nd
>> factor fill ratio given 5, needed 1
>> Factored matrix follows:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=3, cols=3
>> package used to perform factorization: petsc
>> total: nonzeros=9, allocated nonzeros=9
>> total number of mallocs used during MatSetValues
>> calls =0
>> using I-node routines: found 1 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=3, cols=3
>> total: nonzeros=9, allocated nonzeros=9
>> total number of mallocs used during MatSetValues calls =0
>> using I-node routines: found 1 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=3, cols=3
>> total: nonzeros=9, allocated nonzeros=9
>> total number of mallocs used during MatSetValues calls =0
>> using I-node routines: found 1 nodes, limit used is 5
>> Down solver (pre-smoother) on level 1 -------------------------------
>> KSP Object: (mg_levels_1_) 1 MPI processes
>> type: chebyshev
>> Chebyshev: eigenvalue estimates: min = 0.0999929, max = 1.09992
>> Chebyshev: eigenvalues estimated using gmres with translations
>> [0 0.1; 0 1.1]
>> KSP Object: (mg_levels_1_esteig_) 1 MPI processes
>> type: gmres
>> GMRES: restart=30, using Classical (unmodified)
>> Gram-Schmidt Orthogonalization with no iterative refinement
>> GMRES: happy breakdown tolerance 1e-30
>> maximum iterations=10
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> maximum iterations=2
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> PC Object: (mg_levels_1_) 1 MPI processes
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, local iterations
>> = 1, omega = 1
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=93, cols=93
>> total: nonzeros=8649, allocated nonzeros=8649
>> total number of mallocs used during MatSetValues calls =0
>> using I-node routines: found 19 nodes, limit used is 5
>> Up solver (post-smoother) same as down solver (pre-smoother)
>> Down solver (pre-smoother) on level 2 -------------------------------
>> KSP Object: (mg_levels_2_) 1 MPI processes
>> type: chebyshev
>> Chebyshev: eigenvalue estimates: min = 0.0998389, max = 1.09823
>> Chebyshev: eigenvalues estimated using gmres with translations
>> [0 0.1; 0 1.1]
>> KSP Object: (mg_levels_2_esteig_) 1 MPI processes
>> type: gmres
>> GMRES: restart=30, using Classical (unmodified)
>> Gram-Schmidt Orthogonalization with no iterative refinement
>> GMRES: happy breakdown tolerance 1e-30
>> maximum iterations=10
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> maximum iterations=2
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> PC Object: (mg_levels_2_) 1 MPI processes
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, local iterations
>> = 1, omega = 1
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=2991, cols=2991
>> total: nonzeros=8.94608e+06, allocated nonzeros=8.94608e+06
>> total number of mallocs used during MatSetValues calls =0
>> using I-node routines: found 599 nodes, limit used is 5
>> Up solver (post-smoother) same as down solver (pre-smoother)
>> Down solver (pre-smoother) on level 3 -------------------------------
>> KSP Object: (mg_levels_3_) 1 MPI processes
>> type: chebyshev
>> Chebyshev: eigenvalue estimates: min = 0.0998975, max = 1.09887
>> Chebyshev: eigenvalues estimated using gmres with translations
>> [0 0.1; 0 1.1]
>> KSP Object: (mg_levels_3_esteig_) 1 MPI processes
>> type: gmres
>> GMRES: restart=30, using Classical (unmodified)
>> Gram-Schmidt Orthogonalization with no iterative refinement
>> GMRES: happy breakdown tolerance 1e-30
>> maximum iterations=10
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> maximum iterations=2
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> PC Object: (mg_levels_3_) 1 MPI processes
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, local iterations
>> = 1, omega = 1
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=35419, cols=35419
>> total: nonzeros=1.55936e+07, allocated nonzeros=1.55936e+07
>> total number of mallocs used during MatSetValues calls =1
>> not using I-node routines
>> Up solver (post-smoother) same as down solver (pre-smoother)
>> Down solver (pre-smoother) on level 4 -------------------------------
>> KSP Object: (mg_levels_4_) 1 MPI processes
>> type: chebyshev
>> Chebyshev: eigenvalue estimates: min = 0.1, max = 1.1
>> Chebyshev: eigenvalues estimated using gmres with translations
>> [0 0.1; 0 1.1]
>> KSP Object: (mg_levels_4_esteig_) 1 MPI processes
>> type: gmres
>> GMRES: restart=30, using Classical (unmodified)
>> Gram-Schmidt Orthogonalization with no iterative refinement
>> GMRES: happy breakdown tolerance 1e-30
>> maximum iterations=10
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> maximum iterations=2
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> PC Object: (mg_levels_4_) 1 MPI processes
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, local iterations
>> = 1, omega = 1
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=327680, cols=327680
>> total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
>> total number of mallocs used during MatSetValues calls =0
>> not using I-node routines
>> Up solver (post-smoother) same as down solver (pre-smoother)
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=327680, cols=327680
>> total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
>> total number of mallocs used during MatSetValues calls =0
>> not using I-node routines
>> helmholt 2 9e+03 gamg
>> ************************************************************************************************************************
>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r
>> -fCourier9' to print this document ***
>> ************************************************************************************************************************
>>
>> ---------------------------------------------- PETSc Performance
>> Summary: ----------------------------------------------
>>
>> ./ex6-master on a arch-linux2-c-opt named yam.doc.ic.ac.uk with 1
>> processor, by lmitche1 Mon Apr 27 16:03:36 2015
>> Using Petsc Development GIT revision: v3.5.3-2602-ga9b180a GIT Date:
>> 2015-04-07 20:34:49 -0500
>>
>> Max Max/Min Avg Total
>> Time (sec): 1.072e+02 1.00000 1.072e+02
>> Objects: 2.620e+02 1.00000 2.620e+02
>> Flops: 4.582e+10 1.00000 4.582e+10 4.582e+10
>> Flops/sec: 4.275e+08 1.00000 4.275e+08 4.275e+08
>> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
>> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
>> MPI Reductions: 0.000e+00 0.00000
>>
>> Flop counting convention: 1 flop = 1 real number operation of type
>> (multiply/divide/add/subtract)
>> e.g., VecAXPY() for real vectors of length
>> N --> 2N flops
>> and VecAXPY() for complex vectors of
>> length N --> 8N flops
>>
>> Summary of Stages: ----- Time ------ ----- Flops ----- ---
>> Messages --- -- Message Lengths -- -- Reductions --
>> Avg %Total Avg %Total counts
>> %Total Avg %Total counts %Total
>> 0: Main Stage: 1.0898e-01 0.1% 7.4996e+06 0.0% 0.000e+00
>> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>> 1: mystage 1: 1.0466e+02 97.6% 4.2348e+10 92.4% 0.000e+00
>> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>> 2: mystage 2: 2.4395e+00 2.3% 3.4689e+09 7.6% 0.000e+00
>> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>>
>> -
>> ------------------------------------------------------------------------------------------------------------------------
>> See the 'Profiling' chapter of the users' manual for details on
>> interpreting output.
>> Phase summary info:
>> Count: number of times phase was executed
>> Time and Flops: Max - maximum over all processors
>> Ratio - ratio of maximum to minimum over all processors
>> Mess: number of messages sent
>> Avg. len: average message length (bytes)
>> Reduct: number of global reductions
>> Global: entire computation
>> Stage: stages of a computation. Set stages with PetscLogStagePush()
>> and PetscLogStagePop().
>> %T - percent time in this phase %F - percent flops in
>> this phase
>> %M - percent messages in this phase %L - percent message
>> lengths in this phase
>> %R - percent reductions in this phase
>> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
>> over all processors)
>> -
>> ------------------------------------------------------------------------------------------------------------------------
>> Event Count Time (sec) Flops
>> --- Global --- --- Stage --- Total
>> Max Ratio Max Ratio Max Ratio Mess Avg
>> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
>> -
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> --- Event Stage 0: Main Stage
>>
>> ThreadCommRunKer 2 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatMult 1 1.0 4.4990e-03 1.0 6.19e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 4 83 0 0 0 1376
>> MatAssemblyBegin 1 1.0 1.1921e-06 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatAssemblyEnd 1 1.0 1.0468e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 10 0 0 0 0 0
>> MatLoad 1 1.0 9.3672e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 86 0 0 0 0 0
>> VecNorm 1 1.0 9.1791e-05 1.0 6.55e+05 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 9 0 0 0 7140
>> VecSet 5 1.0 3.1860e-03 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 3 0 0 0 0 0
>> VecAXPY 1 1.0 3.9697e-04 1.0 6.55e+05 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 9 0 0 0 1651
>>
>> --- Event Stage 1: mystage 1
>>
>> MatMult 40 1.0 3.5990e-01 1.0 5.52e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1535
>> MatConvert 4 1.0 1.2038e-01 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatScale 12 1.0 8.2839e-02 1.0 7.21e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 870
>> MatAssemblyBegin 31 1.0 1.4067e-05 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatAssemblyEnd 31 1.0 1.0995e-01 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatGetRow 1464732 1.0 8.5680e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatCoarsen 4 1.0 2.4768e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatAXPY 4 1.0 8.1362e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatMatMult 4 1.0 6.0399e-01 1.0 6.39e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 106
>> MatMatMultSym 4 1.0 4.4536e-01 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatMatMultNum 4 1.0 1.5859e-01 1.0 6.39e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 403
>> MatPtAP 4 1.0 1.0255e+02 1.0 4.15e+10 1.0 0.0e+00
>> 0.0e+00 0.0e+00 96 91 0 0 0 98 98 0 0 0 405
>> MatPtAPSymbolic 4 1.0 6.1707e+01 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 58 0 0 0 0 59 0 0 0 0 0
>> MatPtAPNumeric 4 1.0 4.0840e+01 1.0 4.15e+10 1.0 0.0e+00
>> 0.0e+00 0.0e+00 38 91 0 0 0 39 98 0 0 0 1017
>> MatTrnMatMult 1 1.0 1.9648e-01 1.0 2.34e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 119
>> MatTrnMatMultSym 1 1.0 1.3640e-01 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatTrnMatMultNum 1 1.0 6.0079e-02 1.0 2.34e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 389
>> MatGetSymTrans 5 1.0 8.4374e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecMDot 40 1.0 1.5124e-02 1.0 4.03e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2663
>> VecNorm 44 1.0 1.1821e-03 1.0 8.06e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6815
>> VecScale 44 1.0 1.4737e-03 1.0 4.03e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2733
>> VecCopy 4 1.0 3.2711e-04 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecSet 143 1.0 2.7236e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecAXPY 4 1.0 3.9601e-04 1.0 7.32e+05 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1849
>> VecMAXPY 44 1.0 1.9676e-02 1.0 4.76e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2419
>> VecAssemblyBegin 4 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecAssemblyEnd 4 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecPointwiseMult 44 1.0 7.9026e-03 1.0 4.03e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 510
>> VecSetRandom 4 1.0 3.6092e-03 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecNormalize 44 1.0 2.6934e-03 1.0 1.21e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4486
>> KSPGMRESOrthog 40 1.0 3.1765e-02 1.0 8.06e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2536
>> KSPSetUp 10 1.0 1.1930e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> PCGAMGGraph_AGG 4 1.0 5.7353e-01 1.0 5.56e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 97
>> PCGAMGCoarse_AGG 4 1.0 2.5846e-01 1.0 2.34e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 90
>> PCGAMGProl_AGG 4 1.0 4.9806e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> PCGAMGPOpt_AGG 4 1.0 1.2128e+00 1.0 7.38e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 608
>> GAMG: createProl 4 1.0 2.0973e+00 1.0 8.17e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 2 2 0 0 0 2 2 0 0 0 389
>> Graph 8 1.0 5.7211e-01 1.0 5.56e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 97
>> MIS/Agg 4 1.0 2.4861e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> SA: col data 4 1.0 7.6509e-04 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> SA: frmProl0 4 1.0 4.6539e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> SA: smooth 4 1.0 1.2128e+00 1.0 7.38e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 608
>> GAMG: partLevel 4 1.0 1.0255e+02 1.0 4.15e+10 1.0 0.0e+00
>> 0.0e+00 0.0e+00 96 91 0 0 0 98 98 0 0 0 405
>> PCSetUp 1 1.0 1.0465e+02 1.0 4.23e+10 1.0 0.0e+00
>> 0.0e+00 0.0e+00 98 92 0 0 0 100100 0 0 0 405
>>
>> --- Event Stage 2: mystage 2
>>
>> MatMult 121 1.0 1.0144e+00 1.0 1.61e+09 1.0 0.0e+00
>> 0.0e+00 0.0e+00 1 4 0 0 0 42 47 0 0 0 1592
>> MatMultAdd 12 1.0 3.1757e-02 1.0 4.95e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 1558
>> MatMultTranspose 12 1.0 3.7137e-02 1.0 4.95e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 2 1 0 0 0 1332
>> MatSolve 6 1.0 9.2983e-06 1.0 9.00e+01 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 10
>> MatSOR 116 1.0 1.2805e+00 1.0 1.61e+09 1.0 0.0e+00
>> 0.0e+00 0.0e+00 1 4 0 0 0 52 47 0 0 0 1260
>> MatLUFactorSym 1 1.0 1.5020e-05 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatLUFactorNum 1 1.0 5.9605e-06 1.0 1.60e+01 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3
>> MatResidual 12 1.0 1.0432e-01 1.0 1.67e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 4 5 0 0 0 1599
>> MatGetRowIJ 1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatGetOrdering 1 1.0 4.2915e-05 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatView 8 1.0 6.7115e-04 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecMDot 43 1.0 1.3515e-02 1.0 4.03e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 2980
>> VecTDot 4 1.0 1.1048e-03 1.0 2.62e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2373
>> VecNorm 53 1.0 1.4720e-03 1.0 1.00e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6809
>> VecScale 50 1.0 1.4396e-03 1.0 4.03e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2798
>> VecCopy 21 1.0 3.6387e-03 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecSet 108 1.0 1.0684e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecAXPY 15 1.0 2.1303e-03 1.0 4.09e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1918
>> VecAYPX 97 1.0 1.1820e-02 1.0 1.16e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 985
>> VecAXPBYCZ 48 1.0 8.2519e-03 1.0 2.20e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 2663
>> VecMAXPY 50 1.0 1.6957e-02 1.0 4.76e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 2807
>> VecNormalize 50 1.0 2.6865e-03 1.0 1.21e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4498
>> KSPGMRESOrthog 43 1.0 2.7892e-02 1.0 8.06e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 2888
>> KSPSetUp 5 1.0 2.9690e-03 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> KSPSolve 1 1.0 2.4374e+00 1.0 3.47e+09 1.0 0.0e+00
>> 0.0e+00 0.0e+00 2 8 0 0 0 100100 0 0 0 1423
>> PCSetUp 1 1.0 9.3937e-05 1.0 1.60e+01 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> PCSetUpOnBlocks 3 1.0 9.6798e-05 1.0 1.60e+01 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> PCApply 3 1.0 2.4240e+00 1.0 3.45e+09 1.0 0.0e+00
>> 0.0e+00 0.0e+00 2 8 0 0 0 99 99 0 0 0 1423
>> -
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> Memory usage is given in bytes:
>>
>> Object Type Creations Destructions Memory Descendants'
>> Mem.
>> Reports information only for process 0.
>>
>> --- Event Stage 0: Main Stage
>>
>> Viewer 2 2 1520 0
>> Matrix 1 10 54702412 0
>> Vector 3 97 72154440 0
>> Krylov Solver 0 11 146840 0
>> Preconditioner 0 7 7332 0
>> Index Set 0 3 2400 0
>>
>> --- Event Stage 1: mystage 1
>>
>> Viewer 1 0 0 0
>> Matrix 22 14 691989068 0
>> Matrix Coarsen 4 4 2576 0
>> Vector 125 91 67210200 0
>> Krylov Solver 15 4 120864 0
>> Preconditioner 15 8 7520 0
>> Index Set 4 4 3168 0
>> PetscRandom 4 4 2560 0
>>
>> --- Event Stage 2: mystage 2
>>
>> Matrix 1 0 0 0
>> Vector 60 0 0 0
>> Index Set 5 2 1592 0
>> ========================================================================================================================
>> Average time to get PetscTime(): 0
>> #PETSc Option Table entries:
>> -f helmholtz-sphere.dat
>> -ksp_convergence_test skip
>> -ksp_max_it 2
>> -ksp_monitor
>> -ksp_type cg
>> -ksp_view
>> -log_summary
>> -matload_block_size 1
>> -pc_type gamg
>> -table
>> #End of PETSc Option Table entries
>> Compiled without FORTRAN kernels
>> Compiled with full precision matrices (default)
>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
>> Configure options: --download-chaco=1 --download-ctetgen=1
>> --download-exodusii=1 --download-hdf5=1 --download-hypre=1
>> --download-metis=1 --download-ml=1 --download-mumps=1
>> --download-netcdf=1 --download-parmetis=1 --download-ptscotch=1
>> --download-scalapack=1 --download-superlu=1 --download-superlu_dist=1
>> --download-triangle=1 --with-c2html=0 --with-debugging=0
>> --with-make-np=32 --with-openmp=0 --with-pthreadclasses=0
>> --with-shared-libraries=1 --with-threadcomm=0 PETSC_ARCH=arch-linux2-c-opt
>> -----------------------------------------
>> Libraries compiled on Wed Apr 8 10:00:43 2015 on yam.doc.ic.ac.uk
>> Machine characteristics:
>> Linux-3.13.0-45-generic-x86_64-with-Ubuntu-14.04-trusty
>> Using PETSc directory: /data/lmitche1/src/deps/petsc
>> Using PETSc arch: arch-linux2-c-opt
>> -----------------------------------------
>>
>> Using C compiler: mpicc -fPIC -Wall -Wwrite-strings
>> -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS}
>> Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable
>> -ffree-line-length-0 -Wno-unused-dummy-argument -O ${FOPTFLAGS}
>> ${FFLAGS}
>> -----------------------------------------
>>
>> Using include paths:
>> -I/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/include
>> -I/data/lmitche1/src/deps/petsc/include
>> -I/data/lmitche1/src/deps/petsc/include
>> -I/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/include
>> -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi
>> -----------------------------------------
>>
>> Using C linker: mpicc
>> Using Fortran linker: mpif90
>> Using libraries:
>> -Wl,-rpath,/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib
>> -L/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib -lpetsc
>> -Wl,-rpath,/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib
>> -L/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib -lcmumps
>> -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lsuperlu_4.3
>> -lsuperlu_dist_4.0 -lHYPRE -Wl,-rpath,/usr/lib/openmpi/lib
>> -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
>> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
>> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
>> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_cxx
>> -lstdc++ -lscalapack -lml -lmpi_cxx -lstdc++ -lexoIIv2for -lexodus
>> -llapack -lblas -lparmetis -ltriangle -lnetcdf -lmetis -lchaco
>> -lctetgen -lX11 -lptesmumps -lptscotch -lptscotcherr -lscotch
>> -lscotcherr -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lssl
>> -lcrypto -lm -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm
>> -lquadmath -lm -lmpi_cxx -lstdc++ -lrt -lm -lz
>> -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib
>> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
>> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
>> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
>> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu
>> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl
>> -lmpi -lhwloc -lgcc_s -lpthread -ldl
>> -----------------------------------------
>>
>>
>>
>> $ ./ex6-30ab49e4
>> 0 KSP Residual norm 3.679528502747e-11
>> 1 KSP Residual norm 1.410011347346e-14
>> 2 KSP Residual norm 2.871653636831e-14
>> KSP Object: 1 MPI processes
>> type: cg
>> maximum iterations=2, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using PRECONDITIONED norm type for convergence test
>> PC Object: 1 MPI processes
>> type: gamg
>> MG: type is MULTIPLICATIVE, levels=3 cycles=v
>> Cycles per PCApply=1
>> Using Galerkin computed coarse grid matrices
>> Coarse grid solver -- level -------------------------------
>> KSP Object: (mg_coarse_) 1 MPI processes
>> type: preonly
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using NONE norm type for convergence test
>> PC Object: (mg_coarse_) 1 MPI processes
>> type: bjacobi
>> block Jacobi: number of blocks = 1
>> Local solve is same for all blocks, in the following KSP and
>> PC objects:
>> KSP Object: (mg_coarse_sub_) 1 MPI processes
>> type: preonly
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using NONE norm type for convergence test
>> PC Object: (mg_coarse_sub_) 1 MPI processes
>> type: lu
>> LU: out-of-place factorization
>> tolerance for zero pivot 2.22045e-14
>> using diagonal shift on blocks to prevent zero pivot
>> [INBLOCKS]
>> matrix ordering: nd
>> factor fill ratio given 5, needed 1
>> Factored matrix follows:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=592, cols=592
>> package used to perform factorization: petsc
>> total: nonzeros=350464, allocated nonzeros=350464
>> total number of mallocs used during MatSetValues
>> calls =0
>> using I-node routines: found 119 nodes, limit used
>> is 5
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=592, cols=592
>> total: nonzeros=350464, allocated nonzeros=350464
>> total number of mallocs used during MatSetValues calls =0
>> using I-node routines: found 119 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=592, cols=592
>> total: nonzeros=350464, allocated nonzeros=350464
>> total number of mallocs used during MatSetValues calls =0
>> using I-node routines: found 119 nodes, limit used is 5
>> Down solver (pre-smoother) on level 1 -------------------------------
>> KSP Object: (mg_levels_1_) 1 MPI processes
>> type: chebyshev
>> Chebyshev: eigenvalue estimates: min = 0.0871826, max = 1.83084
>> maximum iterations=2
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> PC Object: (mg_levels_1_) 1 MPI processes
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, local iterations
>> = 1, omega = 1
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=35419, cols=35419
>> total: nonzeros=1.55936e+07, allocated nonzeros=1.55936e+07
>> total number of mallocs used during MatSetValues calls =1
>> not using I-node routines
>> Up solver (post-smoother) same as down solver (pre-smoother)
>> Down solver (pre-smoother) on level 2 -------------------------------
>> KSP Object: (mg_levels_2_) 1 MPI processes
>> type: chebyshev
>> Chebyshev: eigenvalue estimates: min = 0.099472, max = 2.08891
>> maximum iterations=2
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> using nonzero initial guess
>> using NONE norm type for convergence test
>> PC Object: (mg_levels_2_) 1 MPI processes
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, local iterations
>> = 1, omega = 1
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=327680, cols=327680
>> total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
>> total number of mallocs used during MatSetValues calls =0
>> not using I-node routines
>> Up solver (post-smoother) same as down solver (pre-smoother)
>> linear system matrix = precond matrix:
>> Mat Object: 1 MPI processes
>> type: seqaij
>> rows=327680, cols=327680
>> total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
>> total number of mallocs used during MatSetValues calls =0
>> not using I-node routines
>> Number of iterations = 2
>> Residual norm = 8368.22
>> ************************************************************************************************************************
>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r
>> -fCourier9' to print this document ***
>> ************************************************************************************************************************
>>
>> ---------------------------------------------- PETSc Performance
>> Summary: ----------------------------------------------
>>
>> ./ex6-1ddf9fe on a test named yam.doc.ic.ac.uk with 1 processor, by
>> lmitche1 Mon Apr 27 16:02:36 2015
>> Using Petsc Release Version 3.5.2, unknown
>>
>> Max Max/Min Avg Total
>> Time (sec): 2.828e+01 1.00000 2.828e+01
>> Objects: 1.150e+02 1.00000 1.150e+02
>> Flops: 1.006e+10 1.00000 1.006e+10 1.006e+10
>> Flops/sec: 3.559e+08 1.00000 3.559e+08 3.559e+08
>> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
>> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
>> MPI Reductions: 0.000e+00 0.00000
>>
>> Flop counting convention: 1 flop = 1 real number operation of type
>> (multiply/divide/add/subtract)
>> e.g., VecAXPY() for real vectors of length
>> N --> 2N flops
>> and VecAXPY() for complex vectors of
>> length N --> 8N flops
>>
>> Summary of Stages: ----- Time ------ ----- Flops ----- ---
>> Messages --- -- Message Lengths -- -- Reductions --
>> Avg %Total Avg %Total counts
>> %Total Avg %Total counts %Total
>> 0: Main Stage: 9.9010e-02 0.4% 7.4996e+06 0.1% 0.000e+00
>> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>> 1: mystage 1: 2.6509e+01 93.7% 8.4700e+09 84.2% 0.000e+00
>> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>> 2: mystage 2: 1.6704e+00 5.9% 1.5861e+09 15.8% 0.000e+00
>> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
>>
>> -
>> ------------------------------------------------------------------------------------------------------------------------
>> See the 'Profiling' chapter of the users' manual for details on
>> interpreting output.
>> Phase summary info:
>> Count: number of times phase was executed
>> Time and Flops: Max - maximum over all processors
>> Ratio - ratio of maximum to minimum over all processors
>> Mess: number of messages sent
>> Avg. len: average message length (bytes)
>> Reduct: number of global reductions
>> Global: entire computation
>> Stage: stages of a computation. Set stages with PetscLogStagePush()
>> and PetscLogStagePop().
>> %T - percent time in this phase %F - percent flops in
>> this phase
>> %M - percent messages in this phase %L - percent message
>> lengths in this phase
>> %R - percent reductions in this phase
>> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
>> over all processors)
>> -
>> ------------------------------------------------------------------------------------------------------------------------
>> Event Count Time (sec) Flops
>> --- Global --- --- Stage --- Total
>> Max Ratio Max Ratio Max Ratio Mess Avg
>> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
>> -
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> --- Event Stage 0: Main Stage
>>
>> ThreadCommRunKer 2 1.0 3.0994e-06 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatMult 1 1.0 7.2370e-03 1.0 6.19e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 7 83 0 0 0 855
>> MatAssemblyBegin 1 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatAssemblyEnd 1 1.0 1.0748e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 11 0 0 0 0 0
>> MatLoad 1 1.0 8.5824e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 87 0 0 0 0 0
>> VecNorm 1 1.0 9.2983e-05 1.0 6.55e+05 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 9 0 0 0 7048
>> VecSet 5 1.0 2.9252e-03 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 3 0 0 0 0 0
>> VecAXPY 1 1.0 4.6611e-04 1.0 6.55e+05 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 9 0 0 0 1406
>>
>> --- Event Stage 1: mystage 1
>>
>> MatMult 20 1.0 2.8699e-01 1.0 3.73e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 1301
>> MatConvert 2 1.0 8.3888e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatScale 6 1.0 7.1905e-02 1.0 4.85e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 675
>> MatAssemblyBegin 20 1.0 1.9789e-05 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatAssemblyEnd 20 1.0 1.1295e-01 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatGetRow 1452396 1.0 9.5270e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatCoarsen 2 1.0 3.0676e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatAXPY 2 1.0 8.2162e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatMatMult 2 1.0 4.2625e-01 1.0 4.31e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 2 0 0 0 0 2 1 0 0 0 101
>> MatMatMultSym 2 1.0 3.0257e-01 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
>> MatMatMultNum 2 1.0 1.2364e-01 1.0 4.31e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 349
>> MatPtAP 2 1.0 2.3871e+01 1.0 7.82e+09 1.0 0.0e+00
>> 0.0e+00 0.0e+00 84 78 0 0 0 90 92 0 0 0 328
>> MatPtAPSymbolic 2 1.0 1.4329e+01 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 51 0 0 0 0 54 0 0 0 0 0
>> MatPtAPNumeric 2 1.0 9.5422e+00 1.0 7.82e+09 1.0 0.0e+00
>> 0.0e+00 0.0e+00 34 78 0 0 0 36 92 0 0 0 819
>> MatTrnMatMult 2 1.0 9.7712e-01 1.0 8.24e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 3 1 0 0 0 4 1 0 0 0 84
>> MatTrnMatMultSym 2 1.0 5.0258e-01 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
>> MatTrnMatMultNum 2 1.0 4.7454e-01 1.0 8.24e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 174
>> MatGetSymTrans 4 1.0 6.2370e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecMDot 20 1.0 1.7304e-02 1.0 3.99e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2308
>> VecNorm 22 1.0 1.3692e-03 1.0 7.99e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5834
>> VecScale 22 1.0 1.8549e-03 1.0 3.99e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2153
>> VecCopy 2 1.0 5.0211e-04 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecSet 77 1.0 3.6245e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecAXPY 2 1.0 3.7718e-04 1.0 7.26e+05 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1925
>> VecMAXPY 22 1.0 2.2252e-02 1.0 4.72e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 2121
>> VecAssemblyBegin 2 1.0 9.5367e-07 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecAssemblyEnd 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecPointwiseMult 22 1.0 9.2957e-03 1.0 3.99e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 430
>> VecSetRandom 2 1.0 8.8599e-03 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecNormalize 22 1.0 3.2570e-03 1.0 1.20e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3679
>> KSPGMRESOrthog 20 1.0 3.6396e-02 1.0 7.99e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2195
>> KSPSetUp 6 1.0 1.4364e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> PCGAMGgraph_AGG 2 1.0 4.8670e-01 1.0 3.77e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 77
>> PCGAMGcoarse_AGG 2 1.0 1.0664e+00 1.0 8.24e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 4 1 0 0 0 4 1 0 0 0 77
>> PCGAMGProl_AGG 2 1.0 6.4827e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> PCGAMGPOpt_AGG 2 1.0 9.9913e-01 1.0 5.31e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 4 5 0 0 0 4 6 0 0 0 532
>> PCSetUp 1 1.0 2.6505e+01 1.0 8.47e+09 1.0 0.0e+00
>> 0.0e+00 0.0e+00 94 84 0 0 0 100100 0 0 0 320
>>
>> --- Event Stage 2: mystage 2
>>
>> MatMult 38 1.0 5.6846e-01 1.0 6.85e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 2 7 0 0 0 34 43 0 0 0 1204
>> MatMultAdd 6 1.0 2.7303e-02 1.0 3.25e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 2 2 0 0 0 1189
>> MatMultTranspose 6 1.0 3.2745e-02 1.0 3.25e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 2 2 0 0 0 991
>> MatSolve 3 1.0 1.6339e-03 1.0 2.10e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1286
>> MatSOR 36 1.0 9.4071e-01 1.0 6.79e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 3 7 0 0 0 56 43 0 0 0 722
>> MatLUFactorSym 1 1.0 5.9440e-03 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatLUFactorNum 1 1.0 4.0792e-02 1.0 1.15e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 1 0 0 0 2 7 0 0 0 2820
>> MatResidual 6 1.0 9.2385e-02 1.0 1.13e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 1 0 0 0 6 7 0 0 0 1224
>> MatGetRowIJ 1 1.0 1.4091e-04 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatGetOrdering 1 1.0 2.3508e-04 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> MatView 6 1.0 5.8508e-04 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecTDot 4 1.0 1.4160e-03 1.0 2.62e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1851
>> VecNorm 3 1.0 3.5286e-04 1.0 1.97e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5572
>> VecCopy 8 1.0 1.1313e-02 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
>> VecSet 22 1.0 4.3163e-03 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> VecAXPY 4 1.0 1.6727e-03 1.0 2.62e+06 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1567
>> VecAYPX 49 1.0 1.7958e-02 1.0 1.15e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 643
>> VecAXPBYCZ 24 1.0 1.3117e-02 1.0 2.18e+07 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 1661
>> KSPSetUp 2 1.0 9.2983e-06 1.0 0.00e+00 0.0 0.0e+00
>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>> KSPSolve 1 1.0 1.6690e+00 1.0 1.59e+09 1.0 0.0e+00
>> 0.0e+00 0.0e+00 6 16 0 0 0 100100 0 0 0 950
>> PCSetUp 1 1.0 4.7009e-02 1.0 1.15e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 1 0 0 0 3 7 0 0 0 2447
>> PCSetUpOnBlocks 3 1.0 4.7014e-02 1.0 1.15e+08 1.0 0.0e+00
>> 0.0e+00 0.0e+00 0 1 0 0 0 3 7 0 0 0 2447
>> PCApply 3 1.0 1.6409e+00 1.0 1.57e+09 1.0 0.0e+00
>> 0.0e+00 0.0e+00 6 16 0 0 0 98 99 0 0 0 954
>> -
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> Memory usage is given in bytes:
>>
>> Object Type Creations Destructions Memory Descendants'
>> Mem.
>> Reports information only for process 0.
>>
>> --- Event Stage 0: Main Stage
>>
>> Viewer 1 1 760 0
>> Matrix 1 6 58874852 0
>> Vector 3 20 27954224 0
>> Krylov Solver 0 5 23360 0
>> Preconditioner 0 5 5332 0
>> Index Set 0 3 7112 0
>>
>> --- Event Stage 1: mystage 1
>>
>> Viewer 1 0 0 0
>> Matrix 14 10 477163156 0
>> Matrix Coarsen 2 2 1288 0
>> Vector 69 52 66638640 0
>> Krylov Solver 7 2 60432 0
>> Preconditioner 7 2 2096 0
>> Index Set 2 2 1584 0
>> PetscRandom 2 2 1280 0
>>
>> --- Event Stage 2: mystage 2
>>
>> Matrix 1 0 0 0
>> Index Set 5 2 2536 0
>> ========================================================================================================================
>> Average time to get PetscTime(): 0
>> #PETSc Option Table entries:
>> -f helmholtz-sphere.dat
>> -ksp_convergence_test skip
>> -ksp_max_it 2
>> -ksp_monitor
>> -ksp_type cg
>> -ksp_view
>> -log_summary
>> -matload_block_size 1
>> -pc_type gamg
>> #End of PETSc Option Table entries
>> Compiled without FORTRAN kernels
>> Compiled with full precision matrices (default)
>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
>> Configure options: PETSC_ARCH=test --with-debugging=0
>> -----------------------------------------
>> Libraries compiled on Mon Apr 27 10:49:24 2015 on yam.doc.ic.ac.uk
>> Machine characteristics:
>> Linux-3.13.0-45-generic-x86_64-with-Ubuntu-14.04-trusty
>> Using PETSc directory: /data/lmitche1/src/deps/petsc
>> Using PETSc arch: test
>> -----------------------------------------
>>
>> Using C compiler: mpicc -fPIC -Wall -Wwrite-strings
>> -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS}
>> Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable
>> -ffree-line-length-0 -Wno-unused-dummy-argument -O ${FOPTFLAGS}
>> ${FFLAGS}
>> -----------------------------------------
>>
>> Using include paths: -I/data/lmitche1/src/deps/petsc/test/include
>> -I/data/lmitche1/src/deps/petsc/include
>> -I/data/lmitche1/src/deps/petsc/include
>> -I/data/lmitche1/src/deps/petsc/test/include
>> -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi
>> -----------------------------------------
>>
>> Using C linker: mpicc
>> Using Fortran linker: mpif90
>> Using libraries: -Wl,-rpath,/data/lmitche1/src/deps/petsc/test/lib
>> -L/data/lmitche1/src/deps/petsc/test/lib -lpetsc -llapack -lblas -lX11
>> -lssl -lcrypto -lpthread -lm -Wl,-rpath,/usr/lib/openmpi/lib
>> -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
>> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
>> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
>> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_f90
>> -lmpi_f77 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx
>> -lstdc++ -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib
>> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
>> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
>> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
>> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu
>> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl
>> -lmpi -lhwloc -lgcc_s -lpthread -ldl
>> -----------------------------------------
>>