[petsc-dev] Seeming performance regression with GAMG

Tobin Isaac tisaac at ices.utexas.edu
Mon Apr 27 12:58:58 CDT 2015


On Mon, Apr 27, 2015 at 04:06:30PM +0100, Lawrence Mitchell wrote:
> Dear all,
> 
> we recently noticed a slowdown when using GAMG that I'm trying to
> track down in a little more detail.  I'm solving an Hdiv-L2
> "Helmholtz" pressure correction using a Schur complement.  I
> precondition the Schur complement with 'selfp', which morally looks
> like a normal Helmholtz operator (except in the DG space).  The domain
> is very anisotropic (a thin atmospheric shell), so getting round to
> trying Toby's column-based coarsening plugin is on the horizon, but I
> haven't done it yet.
> 
> I don't have a good feel for exactly when things got worse, but here
> are two data points:
> 
> A recentish master (e4b003c) and master from 26th Feb (30ab49e4).  I
> notice that in the former MatPtAP takes significantly longer (full
> logs below); different coarsening, maybe?  As a point of comparison,
> the PCSetup for Hypre takes ballpark half a second on the same operator.
> 
> I test with KSP ex6 (with a constant RHS); the command lines are shown
> with the logs below.
> 
> Any ideas?

While there may be other changes that have affected your performance,
I see two things in your logs:

- The coarse grid is much smaller (3 equations vs. 592).  The default
  coarse equation limit was recently changed from 800 to 50.  You can
  recover the old behavior with `-pc_gamg_coarse_eq_limit 800`; see the
  sketch after this list.
- GAMG now uses the square of the adjacency graph only on the finest
  level.  This means that matrices on the coarser levels will be
  larger and have more entries, which probably explains the extra PtAP
  time.  Maybe Mark can explain the decision to make this change.
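
As a concrete starting point, here is a sketch of your ex6 run with both
knobs adjusted.  The coarse equation limit option is the one above; I'm
assuming from memory that the squaring behavior is now controlled by
-pc_gamg_square_graph, whose argument is the number of levels on which
the graph is squared (so a value at least as large as the number of
levels should recover the old squared-everywhere behavior) -- please
check the option name against your build:

  $ ./ex6-e4b003c -f helmholtz-sphere.dat -ksp_type cg \
      -ksp_convergence_test skip -ksp_max_it 2 -ksp_monitor \
      -pc_type gamg -pc_gamg_coarse_eq_limit 800 \
      -pc_gamg_square_graph 10 -log_summary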

Cheers,
  Toby

> 
> Cheers,
> 
> Lawrence
> 
> $ ./ex6-e4b003c -f helmholtz-sphere.dat  -ksp_type cg
> -ksp_convergence_test skip -ksp_max_it 2  -ksp_monitor -table
> -pc_type gamg  -log_summary -ksp_view
>   0 KSP Residual norm 3.676132751311e-11
>   1 KSP Residual norm 1.764616084171e-14
>   2 KSP Residual norm 9.253867842133e-14
> KSP Object: 1 MPI processes
>   type: cg
>   maximum iterations=2, initial guess is zero
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: gamg
>     MG: type is MULTIPLICATIVE, levels=5 cycles=v
>       Cycles per PCApply=1
>       Using Galerkin computed coarse grid matrices
>       GAMG specific options
>         Threshold for dropping small values from graph 0
>         AGG specific options
>           Symmetric graph false
>   Coarse grid solver -- level -------------------------------
>     KSP Object:    (mg_coarse_)     1 MPI processes
>       type: gmres
>         GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
>         GMRES: happy breakdown tolerance 1e-30
>       maximum iterations=1, initial guess is zero
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using NONE norm type for convergence test
>     PC Object:    (mg_coarse_)     1 MPI processes
>       type: bjacobi
>         block Jacobi: number of blocks = 1
>         Local solve is same for all blocks, in the following KSP and
> PC objects:
>         KSP Object:        (mg_coarse_sub_)         1 MPI processes
>           type: preonly
>           maximum iterations=1, initial guess is zero
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using NONE norm type for convergence test
>         PC Object:        (mg_coarse_sub_)         1 MPI processes
>           type: lu
>             LU: out-of-place factorization
>             tolerance for zero pivot 2.22045e-14
>             using diagonal shift on blocks to prevent zero pivot
> [INBLOCKS]
>             matrix ordering: nd
>             factor fill ratio given 5, needed 1
>               Factored matrix follows:
>                 Mat Object:                 1 MPI processes
>                   type: seqaij
>                   rows=3, cols=3
>                   package used to perform factorization: petsc
>                   total: nonzeros=9, allocated nonzeros=9
>                   total number of mallocs used during MatSetValues
> calls =0
>                     using I-node routines: found 1 nodes, limit used is 5
>           linear system matrix = precond matrix:
>           Mat Object:           1 MPI processes
>             type: seqaij
>             rows=3, cols=3
>             total: nonzeros=9, allocated nonzeros=9
>             total number of mallocs used during MatSetValues calls =0
>               using I-node routines: found 1 nodes, limit used is 5
>       linear system matrix = precond matrix:
>       Mat Object:       1 MPI processes
>         type: seqaij
>         rows=3, cols=3
>         total: nonzeros=9, allocated nonzeros=9
>         total number of mallocs used during MatSetValues calls =0
>           using I-node routines: found 1 nodes, limit used is 5
>   Down solver (pre-smoother) on level 1 -------------------------------
>     KSP Object:    (mg_levels_1_)     1 MPI processes
>       type: chebyshev
>         Chebyshev: eigenvalue estimates:  min = 0.0999929, max = 1.09992
>         Chebyshev: eigenvalues estimated using gmres with translations
>  [0 0.1; 0 1.1]
>         KSP Object:        (mg_levels_1_esteig_)         1 MPI processes
>           type: gmres
>             GMRES: restart=30, using Classical (unmodified)
> Gram-Schmidt Orthogonalization with no iterative refinement
>             GMRES: happy breakdown tolerance 1e-30
>           maximum iterations=10
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using nonzero initial guess
>           using NONE norm type for convergence test
>       maximum iterations=2
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using nonzero initial guess
>       using NONE norm type for convergence test
>     PC Object:    (mg_levels_1_)     1 MPI processes
>       type: sor
>         SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
>       linear system matrix = precond matrix:
>       Mat Object:       1 MPI processes
>         type: seqaij
>         rows=93, cols=93
>         total: nonzeros=8649, allocated nonzeros=8649
>         total number of mallocs used during MatSetValues calls =0
>           using I-node routines: found 19 nodes, limit used is 5
>   Up solver (post-smoother) same as down solver (pre-smoother)
>   Down solver (pre-smoother) on level 2 -------------------------------
>     KSP Object:    (mg_levels_2_)     1 MPI processes
>       type: chebyshev
>         Chebyshev: eigenvalue estimates:  min = 0.0998389, max = 1.09823
>         Chebyshev: eigenvalues estimated using gmres with translations
>  [0 0.1; 0 1.1]
>         KSP Object:        (mg_levels_2_esteig_)         1 MPI processes
>           type: gmres
>             GMRES: restart=30, using Classical (unmodified)
> Gram-Schmidt Orthogonalization with no iterative refinement
>             GMRES: happy breakdown tolerance 1e-30
>           maximum iterations=10
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using nonzero initial guess
>           using NONE norm type for convergence test
>       maximum iterations=2
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using nonzero initial guess
>       using NONE norm type for convergence test
>     PC Object:    (mg_levels_2_)     1 MPI processes
>       type: sor
>         SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
>       linear system matrix = precond matrix:
>       Mat Object:       1 MPI processes
>         type: seqaij
>         rows=2991, cols=2991
>         total: nonzeros=8.94608e+06, allocated nonzeros=8.94608e+06
>         total number of mallocs used during MatSetValues calls =0
>           using I-node routines: found 599 nodes, limit used is 5
>   Up solver (post-smoother) same as down solver (pre-smoother)
>   Down solver (pre-smoother) on level 3 -------------------------------
>     KSP Object:    (mg_levels_3_)     1 MPI processes
>       type: chebyshev
>         Chebyshev: eigenvalue estimates:  min = 0.0998975, max = 1.09887
>         Chebyshev: eigenvalues estimated using gmres with translations
>  [0 0.1; 0 1.1]
>         KSP Object:        (mg_levels_3_esteig_)         1 MPI processes
>           type: gmres
>             GMRES: restart=30, using Classical (unmodified)
> Gram-Schmidt Orthogonalization with no iterative refinement
>             GMRES: happy breakdown tolerance 1e-30
>           maximum iterations=10
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using nonzero initial guess
>           using NONE norm type for convergence test
>       maximum iterations=2
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using nonzero initial guess
>       using NONE norm type for convergence test
>     PC Object:    (mg_levels_3_)     1 MPI processes
>       type: sor
>         SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
>       linear system matrix = precond matrix:
>       Mat Object:       1 MPI processes
>         type: seqaij
>         rows=35419, cols=35419
>         total: nonzeros=1.55936e+07, allocated nonzeros=1.55936e+07
>         total number of mallocs used during MatSetValues calls =1
>           not using I-node routines
>   Up solver (post-smoother) same as down solver (pre-smoother)
>   Down solver (pre-smoother) on level 4 -------------------------------
>     KSP Object:    (mg_levels_4_)     1 MPI processes
>       type: chebyshev
>         Chebyshev: eigenvalue estimates:  min = 0.1, max = 1.1
>         Chebyshev: eigenvalues estimated using gmres with translations
>  [0 0.1; 0 1.1]
>         KSP Object:        (mg_levels_4_esteig_)         1 MPI processes
>           type: gmres
>             GMRES: restart=30, using Classical (unmodified)
> Gram-Schmidt Orthogonalization with no iterative refinement
>             GMRES: happy breakdown tolerance 1e-30
>           maximum iterations=10
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using nonzero initial guess
>           using NONE norm type for convergence test
>       maximum iterations=2
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using nonzero initial guess
>       using NONE norm type for convergence test
>     PC Object:    (mg_levels_4_)     1 MPI processes
>       type: sor
>         SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
>       linear system matrix = precond matrix:
>       Mat Object:       1 MPI processes
>         type: seqaij
>         rows=327680, cols=327680
>         total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
>         total number of mallocs used during MatSetValues calls =0
>           not using I-node routines
>   Up solver (post-smoother) same as down solver (pre-smoother)
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
>     type: seqaij
>     rows=327680, cols=327680
>     total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
>     total number of mallocs used during MatSetValues calls =0
>       not using I-node routines
> helmholt   2 9e+03  gamg
> ************************************************************************************************************************
> ***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r
> -fCourier9' to print this document            ***
> ************************************************************************************************************************
> 
> ---------------------------------------------- PETSc Performance
> Summary: ----------------------------------------------
> 
> ./ex6-master on a arch-linux2-c-opt named yam.doc.ic.ac.uk with 1
> processor, by lmitche1 Mon Apr 27 16:03:36 2015
> Using Petsc Development GIT revision: v3.5.3-2602-ga9b180a  GIT Date:
> 2015-04-07 20:34:49 -0500
> 
>                          Max       Max/Min        Avg      Total
> Time (sec):           1.072e+02      1.00000   1.072e+02
> Objects:              2.620e+02      1.00000   2.620e+02
> Flops:                4.582e+10      1.00000   4.582e+10  4.582e+10
> Flops/sec:            4.275e+08      1.00000   4.275e+08  4.275e+08
> MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Reductions:       0.000e+00      0.00000
> 
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length
> N --> 2N flops
>                             and VecAXPY() for complex vectors of
> length N --> 8N flops
> 
> Summary of Stages:   ----- Time ------  ----- Flops -----  ---
> Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts
> %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 1.0898e-01   0.1%  7.4996e+06   0.0%  0.000e+00
>  0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>  1:       mystage 1: 1.0466e+02  97.6%  4.2348e+10  92.4%  0.000e+00
>  0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>  2:       mystage 2: 2.4395e+00   2.3%  3.4689e+09   7.6%  0.000e+00
>  0.0%  0.000e+00        0.0%  0.000e+00   0.0%
> 
> -
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops: Max - maximum over all processors
>                    Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length (bytes)
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
>       %T - percent time in this phase         %F - percent flops in
> this phase
>       %M - percent messages in this phase     %L - percent message
> lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> over all processors)
> -
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops
>            --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
> len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> -
> ------------------------------------------------------------------------------------------------------------------------
> 
> --- Event Stage 0: Main Stage
> 
> ThreadCommRunKer       2 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMult                1 1.0 4.4990e-03 1.0 6.19e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   4 83  0  0  0  1376
> MatAssemblyBegin       1 1.0 1.1921e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd         1 1.0 1.0468e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0  10  0  0  0  0     0
> MatLoad                1 1.0 9.3672e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0  86  0  0  0  0     0
> VecNorm                1 1.0 9.1791e-05 1.0 6.55e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  9  0  0  0  7140
> VecSet                 5 1.0 3.1860e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   3  0  0  0  0     0
> VecAXPY                1 1.0 3.9697e-04 1.0 6.55e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  9  0  0  0  1651
> 
> --- Event Stage 1: mystage 1
> 
> MatMult               40 1.0 3.5990e-01 1.0 5.52e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  1535
> MatConvert             4 1.0 1.2038e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatScale              12 1.0 8.2839e-02 1.0 7.21e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   870
> MatAssemblyBegin      31 1.0 1.4067e-05 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd        31 1.0 1.0995e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetRow        1464732 1.0 8.5680e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatCoarsen             4 1.0 2.4768e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAXPY                4 1.0 8.1362e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMatMult             4 1.0 6.0399e-01 1.0 6.39e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0   106
> MatMatMultSym          4 1.0 4.4536e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMatMultNum          4 1.0 1.5859e-01 1.0 6.39e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   403
> MatPtAP                4 1.0 1.0255e+02 1.0 4.15e+10 1.0 0.0e+00
> 0.0e+00 0.0e+00 96 91  0  0  0  98 98  0  0  0   405
> MatPtAPSymbolic        4 1.0 6.1707e+01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 58  0  0  0  0  59  0  0  0  0     0
> MatPtAPNumeric         4 1.0 4.0840e+01 1.0 4.15e+10 1.0 0.0e+00
> 0.0e+00 0.0e+00 38 91  0  0  0  39 98  0  0  0  1017
> MatTrnMatMult          1 1.0 1.9648e-01 1.0 2.34e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   119
> MatTrnMatMultSym       1 1.0 1.3640e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatTrnMatMultNum       1 1.0 6.0079e-02 1.0 2.34e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   389
> MatGetSymTrans         5 1.0 8.4374e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecMDot               40 1.0 1.5124e-02 1.0 4.03e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2663
> VecNorm               44 1.0 1.1821e-03 1.0 8.06e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  6815
> VecScale              44 1.0 1.4737e-03 1.0 4.03e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2733
> VecCopy                4 1.0 3.2711e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet               143 1.0 2.7236e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY                4 1.0 3.9601e-04 1.0 7.32e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1849
> VecMAXPY              44 1.0 1.9676e-02 1.0 4.76e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2419
> VecAssemblyBegin       4 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAssemblyEnd         4 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecPointwiseMult      44 1.0 7.9026e-03 1.0 4.03e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   510
> VecSetRandom           4 1.0 3.6092e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecNormalize          44 1.0 2.6934e-03 1.0 1.21e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4486
> KSPGMRESOrthog        40 1.0 3.1765e-02 1.0 8.06e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2536
> KSPSetUp              10 1.0 1.1930e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCGAMGGraph_AGG        4 1.0 5.7353e-01 1.0 5.56e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0    97
> PCGAMGCoarse_AGG       4 1.0 2.5846e-01 1.0 2.34e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    90
> PCGAMGProl_AGG         4 1.0 4.9806e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCGAMGPOpt_AGG         4 1.0 1.2128e+00 1.0 7.38e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  1  2  0  0  0   1  2  0  0  0   608
> GAMG: createProl       4 1.0 2.0973e+00 1.0 8.17e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  2  2  0  0  0   2  2  0  0  0   389
>   Graph                8 1.0 5.7211e-01 1.0 5.56e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0    97
>   MIS/Agg              4 1.0 2.4861e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>   SA: col data         4 1.0 7.6509e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>   SA: frmProl0         4 1.0 4.6539e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>   SA: smooth           4 1.0 1.2128e+00 1.0 7.38e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  1  2  0  0  0   1  2  0  0  0   608
> GAMG: partLevel        4 1.0 1.0255e+02 1.0 4.15e+10 1.0 0.0e+00
> 0.0e+00 0.0e+00 96 91  0  0  0  98 98  0  0  0   405
> PCSetUp                1 1.0 1.0465e+02 1.0 4.23e+10 1.0 0.0e+00
> 0.0e+00 0.0e+00 98 92  0  0  0 100100  0  0  0   405
> 
> --- Event Stage 2: mystage 2
> 
> MatMult              121 1.0 1.0144e+00 1.0 1.61e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00  1  4  0  0  0  42 47  0  0  0  1592
> MatMultAdd            12 1.0 3.1757e-02 1.0 4.95e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   1  1  0  0  0  1558
> MatMultTranspose      12 1.0 3.7137e-02 1.0 4.95e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   2  1  0  0  0  1332
> MatSolve               6 1.0 9.2983e-06 1.0 9.00e+01 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    10
> MatSOR               116 1.0 1.2805e+00 1.0 1.61e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00  1  4  0  0  0  52 47  0  0  0  1260
> MatLUFactorSym         1 1.0 1.5020e-05 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatLUFactorNum         1 1.0 5.9605e-06 1.0 1.60e+01 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     3
> MatResidual           12 1.0 1.0432e-01 1.0 1.67e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   4  5  0  0  0  1599
> MatGetRowIJ            1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetOrdering         1 1.0 4.2915e-05 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatView                8 1.0 6.7115e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecMDot               43 1.0 1.3515e-02 1.0 4.03e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   1  1  0  0  0  2980
> VecTDot                4 1.0 1.1048e-03 1.0 2.62e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2373
> VecNorm               53 1.0 1.4720e-03 1.0 1.00e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  6809
> VecScale              50 1.0 1.4396e-03 1.0 4.03e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2798
> VecCopy               21 1.0 3.6387e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet               108 1.0 1.0684e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY               15 1.0 2.1303e-03 1.0 4.09e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1918
> VecAYPX               97 1.0 1.1820e-02 1.0 1.16e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   985
> VecAXPBYCZ            48 1.0 8.2519e-03 1.0 2.20e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0  2663
> VecMAXPY              50 1.0 1.6957e-02 1.0 4.76e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   1  1  0  0  0  2807
> VecNormalize          50 1.0 2.6865e-03 1.0 1.21e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4498
> KSPGMRESOrthog        43 1.0 2.7892e-02 1.0 8.06e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   1  2  0  0  0  2888
> KSPSetUp               5 1.0 2.9690e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve               1 1.0 2.4374e+00 1.0 3.47e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00  2  8  0  0  0 100100  0  0  0  1423
> PCSetUp                1 1.0 9.3937e-05 1.0 1.60e+01 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCSetUpOnBlocks        3 1.0 9.6798e-05 1.0 1.60e+01 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCApply                3 1.0 2.4240e+00 1.0 3.45e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00  2  8  0  0  0  99 99  0  0  0  1423
> -
> ------------------------------------------------------------------------------------------------------------------------
> 
> Memory usage is given in bytes:
> 
> Object Type          Creations   Destructions     Memory  Descendants'
> Mem.
> Reports information only for process 0.
> 
> --- Event Stage 0: Main Stage
> 
>               Viewer     2              2         1520     0
>               Matrix     1             10     54702412     0
>               Vector     3             97     72154440     0
>        Krylov Solver     0             11       146840     0
>       Preconditioner     0              7         7332     0
>            Index Set     0              3         2400     0
> 
> --- Event Stage 1: mystage 1
> 
>               Viewer     1              0            0     0
>               Matrix    22             14    691989068     0
>       Matrix Coarsen     4              4         2576     0
>               Vector   125             91     67210200     0
>        Krylov Solver    15              4       120864     0
>       Preconditioner    15              8         7520     0
>            Index Set     4              4         3168     0
>          PetscRandom     4              4         2560     0
> 
> --- Event Stage 2: mystage 2
> 
>               Matrix     1              0            0     0
>               Vector    60              0            0     0
>            Index Set     5              2         1592     0
> ========================================================================================================================
> Average time to get PetscTime(): 0
> #PETSc Option Table entries:
> -f helmholtz-sphere.dat
> -ksp_convergence_test skip
> -ksp_max_it 2
> -ksp_monitor
> -ksp_type cg
> -ksp_view
> -log_summary
> -matload_block_size 1
> -pc_type gamg
> -table
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure options: --download-chaco=1 --download-ctetgen=1
> --download-exodusii=1 --download-hdf5=1 --download-hypre=1
> --download-metis=1 --download-ml=1 --download-mumps=1
> --download-netcdf=1 --download-parmetis=1 --download-ptscotch=1
> --download-scalapack=1 --download-superlu=1 --download-superlu_dist=1
> --download-triangle=1 --with-c2html=0 --with-debugging=0
> --with-make-np=32 --with-openmp=0 --with-pthreadclasses=0
> --with-shared-libraries=1 --with-threadcomm=0 PETSC_ARCH=arch-linux2-c-opt
> -----------------------------------------
> Libraries compiled on Wed Apr  8 10:00:43 2015 on yam.doc.ic.ac.uk
> Machine characteristics:
> Linux-3.13.0-45-generic-x86_64-with-Ubuntu-14.04-trusty
> Using PETSc directory: /data/lmitche1/src/deps/petsc
> Using PETSc arch: arch-linux2-c-opt
> -----------------------------------------
> 
> Using C compiler: mpicc  -fPIC -Wall -Wwrite-strings
> -Wno-strict-aliasing -Wno-unknown-pragmas -O  ${COPTFLAGS} ${CFLAGS}
> Using Fortran compiler: mpif90  -fPIC -Wall -Wno-unused-variable
> -ffree-line-length-0 -Wno-unused-dummy-argument -O   ${FOPTFLAGS}
> ${FFLAGS}
> -----------------------------------------
> 
> Using include paths:
> -I/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/include
> -I/data/lmitche1/src/deps/petsc/include
> -I/data/lmitche1/src/deps/petsc/include
> -I/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/include
> -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi
> -----------------------------------------
> 
> Using C linker: mpicc
> Using Fortran linker: mpif90
> Using libraries:
> -Wl,-rpath,/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib
> -L/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib -lpetsc
> -Wl,-rpath,/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib
> -L/data/lmitche1/src/deps/petsc/arch-linux2-c-opt/lib -lcmumps
> -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lsuperlu_4.3
> -lsuperlu_dist_4.0 -lHYPRE -Wl,-rpath,/usr/lib/openmpi/lib
> -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_cxx
> -lstdc++ -lscalapack -lml -lmpi_cxx -lstdc++ -lexoIIv2for -lexodus
> -llapack -lblas -lparmetis -ltriangle -lnetcdf -lmetis -lchaco
> -lctetgen -lX11 -lptesmumps -lptscotch -lptscotcherr -lscotch
> -lscotcherr -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lssl
> -lcrypto -lm -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran -lm
> -lquadmath -lm -lmpi_cxx -lstdc++ -lrt -lm -lz
> -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib
> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl
> -lmpi -lhwloc -lgcc_s -lpthread -ldl
> -----------------------------------------
> 
> 
> 
> $ ./ex6-30ab49e4
>   0 KSP Residual norm 3.679528502747e-11
>   1 KSP Residual norm 1.410011347346e-14
>   2 KSP Residual norm 2.871653636831e-14
> KSP Object: 1 MPI processes
>   type: cg
>   maximum iterations=2, initial guess is zero
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: gamg
>     MG: type is MULTIPLICATIVE, levels=3 cycles=v
>       Cycles per PCApply=1
>       Using Galerkin computed coarse grid matrices
>   Coarse grid solver -- level -------------------------------
>     KSP Object:    (mg_coarse_)     1 MPI processes
>       type: preonly
>       maximum iterations=1, initial guess is zero
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using NONE norm type for convergence test
>     PC Object:    (mg_coarse_)     1 MPI processes
>       type: bjacobi
>         block Jacobi: number of blocks = 1
>         Local solve is same for all blocks, in the following KSP and
> PC objects:
>         KSP Object:        (mg_coarse_sub_)         1 MPI processes
>           type: preonly
>           maximum iterations=1, initial guess is zero
>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>           left preconditioning
>           using NONE norm type for convergence test
>         PC Object:        (mg_coarse_sub_)         1 MPI processes
>           type: lu
>             LU: out-of-place factorization
>             tolerance for zero pivot 2.22045e-14
>             using diagonal shift on blocks to prevent zero pivot
> [INBLOCKS]
>             matrix ordering: nd
>             factor fill ratio given 5, needed 1
>               Factored matrix follows:
>                 Mat Object:                 1 MPI processes
>                   type: seqaij
>                   rows=592, cols=592
>                   package used to perform factorization: petsc
>                   total: nonzeros=350464, allocated nonzeros=350464
>                   total number of mallocs used during MatSetValues
> calls =0
>                     using I-node routines: found 119 nodes, limit used
> is 5
>           linear system matrix = precond matrix:
>           Mat Object:           1 MPI processes
>             type: seqaij
>             rows=592, cols=592
>             total: nonzeros=350464, allocated nonzeros=350464
>             total number of mallocs used during MatSetValues calls =0
>               using I-node routines: found 119 nodes, limit used is 5
>       linear system matrix = precond matrix:
>       Mat Object:       1 MPI processes
>         type: seqaij
>         rows=592, cols=592
>         total: nonzeros=350464, allocated nonzeros=350464
>         total number of mallocs used during MatSetValues calls =0
>           using I-node routines: found 119 nodes, limit used is 5
>   Down solver (pre-smoother) on level 1 -------------------------------
>     KSP Object:    (mg_levels_1_)     1 MPI processes
>       type: chebyshev
>         Chebyshev: eigenvalue estimates:  min = 0.0871826, max = 1.83084
>       maximum iterations=2
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using nonzero initial guess
>       using NONE norm type for convergence test
>     PC Object:    (mg_levels_1_)     1 MPI processes
>       type: sor
>         SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
>       linear system matrix = precond matrix:
>       Mat Object:       1 MPI processes
>         type: seqaij
>         rows=35419, cols=35419
>         total: nonzeros=1.55936e+07, allocated nonzeros=1.55936e+07
>         total number of mallocs used during MatSetValues calls =1
>           not using I-node routines
>   Up solver (post-smoother) same as down solver (pre-smoother)
>   Down solver (pre-smoother) on level 2 -------------------------------
>     KSP Object:    (mg_levels_2_)     1 MPI processes
>       type: chebyshev
>         Chebyshev: eigenvalue estimates:  min = 0.099472, max = 2.08891
>       maximum iterations=2
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>       left preconditioning
>       using nonzero initial guess
>       using NONE norm type for convergence test
>     PC Object:    (mg_levels_2_)     1 MPI processes
>       type: sor
>         SOR: type = local_symmetric, iterations = 1, local iterations
> = 1, omega = 1
>       linear system matrix = precond matrix:
>       Mat Object:       1 MPI processes
>         type: seqaij
>         rows=327680, cols=327680
>         total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
>         total number of mallocs used during MatSetValues calls =0
>           not using I-node routines
>   Up solver (post-smoother) same as down solver (pre-smoother)
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
>     type: seqaij
>     rows=327680, cols=327680
>     total: nonzeros=3.25828e+06, allocated nonzeros=3.25828e+06
>     total number of mallocs used during MatSetValues calls =0
>       not using I-node routines
> Number of iterations =   2
> Residual norm = 8368.22
> ************************************************************************************************************************
> ***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r
> -fCourier9' to print this document            ***
> ************************************************************************************************************************
> 
> ---------------------------------------------- PETSc Performance
> Summary: ----------------------------------------------
> 
> ./ex6-1ddf9fe on a test named yam.doc.ic.ac.uk with 1 processor, by
> lmitche1 Mon Apr 27 16:02:36 2015
> Using Petsc Release Version 3.5.2, unknown
> 
>                          Max       Max/Min        Avg      Total
> Time (sec):           2.828e+01      1.00000   2.828e+01
> Objects:              1.150e+02      1.00000   1.150e+02
> Flops:                1.006e+10      1.00000   1.006e+10  1.006e+10
> Flops/sec:            3.559e+08      1.00000   3.559e+08  3.559e+08
> MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Reductions:       0.000e+00      0.00000
> 
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length
> N --> 2N flops
>                             and VecAXPY() for complex vectors of
> length N --> 8N flops
> 
> Summary of Stages:   ----- Time ------  ----- Flops -----  ---
> Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts
> %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 9.9010e-02   0.4%  7.4996e+06   0.1%  0.000e+00
>  0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>  1:       mystage 1: 2.6509e+01  93.7%  8.4700e+09  84.2%  0.000e+00
>  0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>  2:       mystage 2: 1.6704e+00   5.9%  1.5861e+09  15.8%  0.000e+00
>  0.0%  0.000e+00        0.0%  0.000e+00   0.0%
> 
> -
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops: Max - maximum over all processors
>                    Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length (bytes)
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
>       %T - percent time in this phase         %F - percent flops in
> this phase
>       %M - percent messages in this phase     %L - percent message
> lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> over all processors)
> -
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops
>            --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
> len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> -
> ------------------------------------------------------------------------------------------------------------------------
> 
> --- Event Stage 0: Main Stage
> 
> ThreadCommRunKer       2 1.0 3.0994e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMult                1 1.0 7.2370e-03 1.0 6.19e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   7 83  0  0  0   855
> MatAssemblyBegin       1 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd         1 1.0 1.0748e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0  11  0  0  0  0     0
> MatLoad                1 1.0 8.5824e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0  87  0  0  0  0     0
> VecNorm                1 1.0 9.2983e-05 1.0 6.55e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  9  0  0  0  7048
> VecSet                 5 1.0 2.9252e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   3  0  0  0  0     0
> VecAXPY                1 1.0 4.6611e-04 1.0 6.55e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  9  0  0  0  1406
> 
> --- Event Stage 1: mystage 1
> 
> MatMult               20 1.0 2.8699e-01 1.0 3.73e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  1301
> MatConvert             2 1.0 8.3888e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatScale               6 1.0 7.1905e-02 1.0 4.85e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0   675
> MatAssemblyBegin      20 1.0 1.9789e-05 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd        20 1.0 1.1295e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetRow        1452396 1.0 9.5270e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatCoarsen             2 1.0 3.0676e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAXPY                2 1.0 8.2162e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMatMult             2 1.0 4.2625e-01 1.0 4.31e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  2  0  0  0  0   2  1  0  0  0   101
> MatMatMultSym          2 1.0 3.0257e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> MatMatMultNum          2 1.0 1.2364e-01 1.0 4.31e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0   349
> MatPtAP                2 1.0 2.3871e+01 1.0 7.82e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 84 78  0  0  0  90 92  0  0  0   328
> MatPtAPSymbolic        2 1.0 1.4329e+01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00 51  0  0  0  0  54  0  0  0  0     0
> MatPtAPNumeric         2 1.0 9.5422e+00 1.0 7.82e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 34 78  0  0  0  36 92  0  0  0   819
> MatTrnMatMult          2 1.0 9.7712e-01 1.0 8.24e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  3  1  0  0  0   4  1  0  0  0    84
> MatTrnMatMultSym       2 1.0 5.0258e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
> MatTrnMatMultNum       2 1.0 4.7454e-01 1.0 8.24e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  2  1  0  0  0   2  1  0  0  0   174
> MatGetSymTrans         4 1.0 6.2370e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecMDot               20 1.0 1.7304e-02 1.0 3.99e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2308
> VecNorm               22 1.0 1.3692e-03 1.0 7.99e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  5834
> VecScale              22 1.0 1.8549e-03 1.0 3.99e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2153
> VecCopy                2 1.0 5.0211e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet                77 1.0 3.6245e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY                2 1.0 3.7718e-04 1.0 7.26e+05 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1925
> VecMAXPY              22 1.0 2.2252e-02 1.0 4.72e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0  2121
> VecAssemblyBegin       2 1.0 9.5367e-07 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAssemblyEnd         2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecPointwiseMult      22 1.0 9.2957e-03 1.0 3.99e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   430
> VecSetRandom           2 1.0 8.8599e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecNormalize          22 1.0 3.2570e-03 1.0 1.20e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  3679
> KSPGMRESOrthog        20 1.0 3.6396e-02 1.0 7.99e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  2195
> KSPSetUp               6 1.0 1.4364e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCGAMGgraph_AGG        2 1.0 4.8670e-01 1.0 3.77e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0    77
> PCGAMGcoarse_AGG       2 1.0 1.0664e+00 1.0 8.24e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  4  1  0  0  0   4  1  0  0  0    77
> PCGAMGProl_AGG         2 1.0 6.4827e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCGAMGPOpt_AGG         2 1.0 9.9913e-01 1.0 5.31e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  4  5  0  0  0   4  6  0  0  0   532
> PCSetUp                1 1.0 2.6505e+01 1.0 8.47e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00 94 84  0  0  0 100100  0  0  0   320
> 
> --- Event Stage 2: mystage 2
> 
> MatMult               38 1.0 5.6846e-01 1.0 6.85e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  2  7  0  0  0  34 43  0  0  0  1204
> MatMultAdd             6 1.0 2.7303e-02 1.0 3.25e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   2  2  0  0  0  1189
> MatMultTranspose       6 1.0 3.2745e-02 1.0 3.25e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   2  2  0  0  0   991
> MatSolve               3 1.0 1.6339e-03 1.0 2.10e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1286
> MatSOR                36 1.0 9.4071e-01 1.0 6.79e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  3  7  0  0  0  56 43  0  0  0   722
> MatLUFactorSym         1 1.0 5.9440e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatLUFactorNum         1 1.0 4.0792e-02 1.0 1.15e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  1  0  0  0   2  7  0  0  0  2820
> MatResidual            6 1.0 9.2385e-02 1.0 1.13e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  1  0  0  0   6  7  0  0  0  1224
> MatGetRowIJ            1 1.0 1.4091e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetOrdering         1 1.0 2.3508e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatView                6 1.0 5.8508e-04 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecTDot                4 1.0 1.4160e-03 1.0 2.62e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1851
> VecNorm                3 1.0 3.5286e-04 1.0 1.97e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  5572
> VecCopy                8 1.0 1.1313e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> VecSet                22 1.0 4.3163e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY                4 1.0 1.6727e-03 1.0 2.62e+06 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1567
> VecAYPX               49 1.0 1.7958e-02 1.0 1.15e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   1  1  0  0  0   643
> VecAXPBYCZ            24 1.0 1.3117e-02 1.0 2.18e+07 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   1  1  0  0  0  1661
> KSPSetUp               2 1.0 9.2983e-06 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve               1 1.0 1.6690e+00 1.0 1.59e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00  6 16  0  0  0 100100  0  0  0   950
> PCSetUp                1 1.0 4.7009e-02 1.0 1.15e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  1  0  0  0   3  7  0  0  0  2447
> PCSetUpOnBlocks        3 1.0 4.7014e-02 1.0 1.15e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  1  0  0  0   3  7  0  0  0  2447
> PCApply                3 1.0 1.6409e+00 1.0 1.57e+09 1.0 0.0e+00
> 0.0e+00 0.0e+00  6 16  0  0  0  98 99  0  0  0   954
> -
> ------------------------------------------------------------------------------------------------------------------------
> 
> Memory usage is given in bytes:
> 
> Object Type          Creations   Destructions     Memory  Descendants'
> Mem.
> Reports information only for process 0.
> 
> --- Event Stage 0: Main Stage
> 
>               Viewer     1              1          760     0
>               Matrix     1              6     58874852     0
>               Vector     3             20     27954224     0
>        Krylov Solver     0              5        23360     0
>       Preconditioner     0              5         5332     0
>            Index Set     0              3         7112     0
> 
> --- Event Stage 1: mystage 1
> 
>               Viewer     1              0            0     0
>               Matrix    14             10    477163156     0
>       Matrix Coarsen     2              2         1288     0
>               Vector    69             52     66638640     0
>        Krylov Solver     7              2        60432     0
>       Preconditioner     7              2         2096     0
>            Index Set     2              2         1584     0
>          PetscRandom     2              2         1280     0
> 
> --- Event Stage 2: mystage 2
> 
>               Matrix     1              0            0     0
>            Index Set     5              2         2536     0
> ========================================================================================================================
> Average time to get PetscTime(): 0
> #PETSc Option Table entries:
> -f helmholtz-sphere.dat
> -ksp_convergence_test skip
> -ksp_max_it 2
> -ksp_monitor
> -ksp_type cg
> -ksp_view
> -log_summary
> -matload_block_size 1
> -pc_type gamg
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure options: PETSC_ARCH=test --with-debugging=0
> -----------------------------------------
> Libraries compiled on Mon Apr 27 10:49:24 2015 on yam.doc.ic.ac.uk
> Machine characteristics:
> Linux-3.13.0-45-generic-x86_64-with-Ubuntu-14.04-trusty
> Using PETSc directory: /data/lmitche1/src/deps/petsc
> Using PETSc arch: test
> -----------------------------------------
> 
> Using C compiler: mpicc  -fPIC -Wall -Wwrite-strings
> -Wno-strict-aliasing -Wno-unknown-pragmas -O  ${COPTFLAGS} ${CFLAGS}
> Using Fortran compiler: mpif90  -fPIC -Wall -Wno-unused-variable
> -ffree-line-length-0 -Wno-unused-dummy-argument -O   ${FOPTFLAGS}
> ${FFLAGS}
> -----------------------------------------
> 
> Using include paths: -I/data/lmitche1/src/deps/petsc/test/include
> -I/data/lmitche1/src/deps/petsc/include
> -I/data/lmitche1/src/deps/petsc/include
> -I/data/lmitche1/src/deps/petsc/test/include
> -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi
> -----------------------------------------
> 
> Using C linker: mpicc
> Using Fortran linker: mpif90
> Using libraries: -Wl,-rpath,/data/lmitche1/src/deps/petsc/test/lib
> -L/data/lmitche1/src/deps/petsc/test/lib -lpetsc -llapack -lblas -lX11
> -lssl -lcrypto -lpthread -lm -Wl,-rpath,/usr/lib/openmpi/lib
> -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_f90
> -lmpi_f77 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx
> -lstdc++ -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib
> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8
> -L/usr/lib/gcc/x86_64-linux-gnu/4.8
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl
> -lmpi -lhwloc -lgcc_s -lpthread -ldl
> -----------------------------------------
> 