[petsc-users] PETSc (3.9.0) GAMG weak scaling test issue

"Alberto F. Martín" amartin at cimne.upc.edu
Wed Nov 7 10:02:15 CST 2018


Dear All,

we are performing a weak scaling test of the PETSc (v3.9.0) GAMG
preconditioner applied to the linear system arising from a *conforming
unfitted FE discretization* (using Q1 Lagrangian FEs) of a 3D Poisson
problem, where the boundary of the domain (a popcorn flake) is described
as a zero level set embedded within a uniform background (Cartesian-like)
hexahedral mesh. Details of the FEM formulation can be provided on demand
if you believe they might be helpful; let me just point out that it is
designed to address the well-known ill-conditioning issues that unfitted
FE discretizations suffer from due to the small cut cell problem.

The weak scaling test is set up as follows. We start from a single-cube
background mesh and refine it uniformly several times, until we have
approximately 10**3 (load1), 20**3 (load2), or 40**3 (load3) hexahedra per
MPI task when distributing it over 4 MPI tasks. The benchmark is scaled
such that the next larger problem is obtained by uniformly refining the
mesh of the previous scale and running it on 8x the number of MPI tasks
used at the previous scale. As a result, we obtain one weak scaling curve
for each of the three fixed loads per MPI task, on the following total
numbers of MPI tasks: 4, 32, 262, 2097, 16777. The underlying mesh is not
partitioned among MPI tasks with ParMETIS (unstructured multilevel graph
partitioning) nor optimally by hand, but along the so-called z-shaped
space-filling curves provided by an underlying octree-like mesh handler
(i.e., the p4est library).
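
For reference, a throwaway C snippet (purely illustrative, not part of our
code) that prints the idealized schedule implied by the above, i.e., the
total number of background hexahedra at each scale for each fixed load per
task; the task counts we actually ran on are close to, but not exactly, the
ideal 4*8^k values:

#include <stdio.h>

int main(void)
{
  /* fixed loads per MPI task (background hexahedra), as described above */
  const long  loads[3] = {10L*10*10, 20L*20*20, 40L*40*40};
  const char *names[3] = {"load1", "load2", "load3"};

  for (int l = 0; l < 3; l++) {
    long tasks = 4;
    for (int step = 0; step < 5; step++) {
      printf("%s: %6ld MPI tasks, ~%10ld background hexahedra in total\n",
             names[l], tasks, tasks * loads[l]);
      tasks *= 8;  /* one uniform refinement step: 2x cells per direction = 8x cells */
    }
  }
  return 0;
}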

I configured the preconditioned linear solver as follows:

-ksp_type cg
-ksp_monitor
-ksp_rtol 1.0e-6
-ksp_converged_reason
-ksp_max_it 500
-ksp_norm_type unpreconditioned
-ksp_view
-log_view

-pc_type gamg
-pc_gamg_type agg
-mg_levels_esteig_ksp_type cg
-mg_coarse_sub_pc_type cholesky
-mg_coarse_sub_pc_factor_mat_ordering_type nd
-pc_gamg_process_eq_limit 50
-pc_gamg_square_graph 0
-pc_gamg_agg_nsmooths 1
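
For context, these options are consumed by a driver that essentially boils
down to the following minimal, self-contained C sketch (with a stand-in 1D
Laplacian; in our actual code the matrix and vectors come from the FEMPAR
unfitted FE assembly, so everything except KSPSetFromOptions/KSPSolve is a
placeholder):

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      b, x;
  KSP      ksp;
  PetscInt i, n = 100, Istart, Iend;

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Stand-in operator: a 1D Laplacian; the real A comes from the unfitted FE assembly. */
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetFromOptions(A);
  MatSetUp(A);
  MatGetOwnershipRange(A, &Istart, &Iend);
  for (i = Istart; i < Iend; i++) {
    if (i > 0)     MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);
    if (i < n - 1) MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);
    MatSetValue(A, i, i, 2.0, INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  MatCreateVecs(A, &x, &b);
  VecSet(b, 1.0);

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);
  KSPSetFromOptions(ksp);  /* picks up -ksp_type cg, -pc_type gamg, ... from the command line */
  KSPSolve(ksp, b, x);     /* the GAMG setup (PCSetUp) is triggered inside the first solve */

  KSPDestroy(&ksp); MatDestroy(&A); VecDestroy(&x); VecDestroy(&b);
  PetscFinalize();
  return 0;
}

(Error checking omitted for brevity.)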

Raw timings (in seconds) of the preconditioner setup and the PCG iterative
solution stages, together with the number of iterations, are as follows:

**preconditioner setup**
(load1): [0.02542160451, 0.05169247743, 0.09266782179, 0.2426272957, 13.64161944]
(load2): [0.1239175797, 0.1885528499, 0.2719282564, 0.4783878336, 13.37947339]
(load3): [0.6565349903, 0.9435049873, 1.299908397, 1.916243652, 16.02904088]

**PCG stage**
(load1): [0.003287350759, 0.008163803257, 0.03565631993, 0.08343045413, 0.6937994603]
(load2): [0.0205939794, 0.03594723623, 0.07593298424, 0.1212046621, 0.6780373845]
(load3): [0.1310882876, 0.3214917686, 0.5532023879, 0.766881627, 1.485446003]

**number of PCG iterations**
(load1): [5, 8, 11, 13, 13]
(load2): [7, 10, 12, 13, 13]
(load3): [8, 10, 12, 13, 13]

It can be observed that both the number of linear solver iterations and the
PCG stage timings weak-scale remarkably well, but *there is a significant
time increase when scaling the problem from 2097 to 16777 MPI tasks for the
preconditioner setup stage* (e.g., 1.916243652 vs. 16.02904088 sec. with
40**3 cells per MPI task).
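
To make that jump stand out, here is a throwaway C snippet (not part of our
code) that prints the step-to-step growth factors of the load3 timings
listed above; a perfectly flat weak-scaling curve would give factors close
to 1.0:

#include <stdio.h>

int main(void)
{
  /* load3 (40**3 hexahedra/MPI task) raw timings reported above */
  const int    tasks[5] = {4, 32, 262, 2097, 16777};
  const double setup[5] = {0.6565349903, 0.9435049873, 1.299908397, 1.916243652, 16.02904088};
  const double pcg[5]   = {0.1310882876, 0.3214917686, 0.5532023879, 0.766881627, 1.485446003};

  for (int i = 1; i < 5; i++)
    printf("%5d -> %5d tasks: setup x%.2f, PCG x%.2f\n",
           tasks[i - 1], tasks[i], setup[i] / setup[i - 1], pcg[i] / pcg[i - 1]);
  return 0;
}

With these numbers, the last step gives roughly x8.4 for the preconditioner
setup versus x1.9 for the PCG stage.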
I gathered the combined output of -ksp_view and -log_view (only) for all the
points of the load3 weak scaling test (attached to this message). Please
note that, within each run, I execute these two stages up to three times,
and this influences the absolute timings given in -log_view.
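
Since the two stages are executed repeatedly within each run, one way to
keep their timings separate in -log_view would be to wrap them in
user-defined logging stages. A minimal sketch (PETSc C API, hypothetical
helper name, not our actual FEMPAR driver):

#include <petscksp.h>

/* Sketch: time the GAMG setup and the PCG iterations in separate -log_view stages.
   ksp, b and x are assumed to be created and assembled elsewhere, as in the real driver. */
static PetscErrorCode solve_with_stages(KSP ksp, Vec b, Vec x)
{
  PetscLogStage setup_stage, solve_stage;

  PetscLogStageRegister("GAMG setup", &setup_stage);
  PetscLogStageRegister("PCG solve", &solve_stage);

  PetscLogStagePush(setup_stage);
  KSPSetUp(ksp);            /* forces PCSetUp, i.e., the GAMG hierarchy construction */
  PetscLogStagePop();

  PetscLogStagePush(solve_stage);
  KSPSolve(ksp, b, x);      /* setup already done, so only the iterations are timed here */
  PetscLogStagePop();
  return 0;
}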

Looking at the output of -log_view, it seems very strange to me, e.g., that
the stage labelled "Graph" does not scale properly, since it is just a call
to MatDuplicate if the block size of the matrix is 1 (our case), and I guess
it is a purely local operation that does not require any communication.
What am I missing here? The load does not seem to be unbalanced, judging
from the "Ratio" column.

I wonder whether the observed behaviour is to be expected, or whether it is
caused by a misconfiguration of the solver on our side. I played (quite a
lot) with several parameter-value combinations, and the configuration above
is the one that led to the fastest execution among the ones tested (which
might be an incomplete set; I can provide further details if helpful). Any
feedback drawn from your experience that helps us find the cause(s) of this
issue and a way to mitigate it would be highly appreciated.

Thanks very much in advance!
Best regards,
  Alberto.

-- 
Alberto F. Martín-Huertas
Senior Researcher, PhD. Computational Science
Centre Internacional de Mètodes Numèrics a l'Enginyeria (CIMNE)
Parc Mediterrani de la Tecnologia, UPC
Esteve Terradas 5, Building C3, Office 215,
08860 Castelldefels (Barcelona, Spain)
Tel.: (+34) 9341 34223
e-mail: amartin at cimne.upc.edu

FEMPAR project co-founder
web: http://www.fempar.org


-------------- next part --------------
KSP Object: 4 MPI processes
  type: cg
  maximum iterations=500, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=4 cycles=v
      Cycles per PCApply=1
      Using externally compute Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level =   0.   0.  
        Threshold scaling factor for each level not specified = 1.
        AGG specific options
          Symmetric graph false
          Number of levels to square graph 0
          Number smoothing steps 1
  Coarse grid solver -- level -------------------------------
        KSP Object: (mg_coarse_) 4 MPI processes
          type: preonly
          maximum iterations=10000, initial guess is zero
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
          left preconditioning
          using NONE norm type for convergence test
        PC Object: (mg_coarse_) 4 MPI processes
          type: bjacobi
            number of blocks = 4
            Local solve is same for all blocks, in the following KSP and PC objects:
          KSP Object: (mg_coarse_sub_) 1 MPI processes
            type: preonly
            maximum iterations=1, initial guess is zero
            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
            left preconditioning
            using NONE norm type for convergence test
          PC Object: (mg_coarse_sub_) 1 MPI processes
            type: cholesky
              out-of-place factorization
              tolerance for zero pivot 2.22045e-14
              matrix ordering: nd
              factor fill ratio given 5., needed 1.
                Factored matrix follows:
                  Mat Object: 1 MPI processes
                    type: seqsbaij
                    rows=6, cols=6
                    package used to perform factorization: petsc
                    total: nonzeros=21, allocated nonzeros=21
                    total number of mallocs used during MatSetValues calls =0
                        block size is 1
            linear system matrix = precond matrix:
            Mat Object: 1 MPI processes
              type: seqaij
              rows=6, cols=6
              total: nonzeros=36, allocated nonzeros=36
              total number of mallocs used during MatSetValues calls =0
                using I-node routines: found 2 nodes, limit used is 5
          linear system matrix = precond matrix:
          Mat Object: 4 MPI processes
            type: mpiaij
            rows=6, cols=6
            total: nonzeros=36, allocated nonzeros=36
            total number of mallocs used during MatSetValues calls =0
              using I-node (on process 0) routines: found 2 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
      KSP Object: (mg_levels_1_) 4 MPI processes
        type: chebyshev
          eigenvalue estimates used:  min = 0.129951, max = 1.42946
          eigenvalues estimate via cg min 0.51315, max 1.29951
          eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
        KSP Object: (mg_levels_1_esteig_) 4 MPI processes
          type: cg
          maximum iterations=10, initial guess is zero
          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
          left preconditioning
          using PRECONDITIONED norm type for convergence test
          estimating eigenvalues using noisy right hand side
        maximum iterations=2, nonzero initial guess
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_levels_1_) 4 MPI processes
        type: sor
          type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
        linear system matrix = precond matrix:
        Mat Object: 4 MPI processes
          type: mpiaij
          rows=201, cols=201
          total: nonzeros=24313, allocated nonzeros=24313
          total number of mallocs used during MatSetValues calls =0
            using nonscalable MatPtAP() implementation
            not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object: (mg_levels_2_) 4 MPI processes
      type: chebyshev
        eigenvalue estimates used:  min = 0.132036, max = 1.4524
        eigenvalues estimate via cg min 0.0839922, max 1.32036
        eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
      KSP Object: (mg_levels_2_esteig_) 4 MPI processes
        type: cg
        maximum iterations=10, initial guess is zero
        tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
        left preconditioning
        using PRECONDITIONED norm type for convergence test
        estimating eigenvalues using noisy right hand side
      maximum iterations=2, nonzero initial guess
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_2_) 4 MPI processes
      type: sor
        type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
      linear system matrix = precond matrix:
      Mat Object: 4 MPI processes
        type: mpiaij
        rows=4621, cols=4621
        total: nonzeros=369149, allocated nonzeros=369149
        total number of mallocs used during MatSetValues calls =0
          using nonscalable MatPtAP() implementation
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
  KSP Object: (mg_levels_3_) 4 MPI processes
    type: chebyshev
      eigenvalue estimates used:  min = 0.167146, max = 1.83861
      eigenvalues estimate via cg min 0.0634859, max 1.67146
      eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
    KSP Object: (mg_levels_3_esteig_) 4 MPI processes
      type: cg
      maximum iterations=10, initial guess is zero
      tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
      left preconditioning
      using PRECONDITIONED norm type for convergence test
      estimating eigenvalues using noisy right hand side
    maximum iterations=2, nonzero initial guess
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (mg_levels_3_) 4 MPI processes
    type: sor
      type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
    linear system matrix = precond matrix:
    Mat Object: 4 MPI processes
      type: mpiaij
      rows=63511, cols=63511
      total: nonzeros=2301395, allocated nonzeros=38106600
      total number of mallocs used during MatSetValues calls =0
        not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Mat Object: 4 MPI processes
    type: mpiaij
    rows=63511, cols=63511
    total: nonzeros=2301395, allocated nonzeros=38106600
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/gpfs/scratch/upc26/upc26229/build_rel_fempar_cell_agg_ompi/FEMPAR/bin/par_test_h_adaptive_poisson_unfitted on a arch-linux2-c-opt named s14r2b46 with 4 processors, by upc26229 Wed Nov  7 01:07:35 2018
Using Petsc Release Version 3.9.0, Apr, 07, 2018 

                         Max       Max/Min        Avg      Total 
Time (sec):           1.076e+02      1.00000   1.076e+02
Objects:              9.890e+02      1.00304   9.868e+02
Flop:                 6.620e+08      1.09228   6.334e+08  2.533e+09
Flop/sec:            6.150e+06      1.09228   5.884e+06  2.353e+07
MPI Messages:         3.141e+03      1.04997   3.054e+03  1.222e+04
MPI Message Lengths:  1.331e+07      1.02147   4.319e+03  5.277e+07
MPI Reductions:       1.427e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 1.0765e+02 100.0%  2.5334e+09 100.0%  1.222e+04 100.0%  4.319e+03      100.0%  1.414e+03  99.1% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSided          9 1.0 1.0124e-03 2.9 0.00e+00 0.0 5.4e+01 8.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
BuildTwoSidedF        75 1.0 1.8647e-0111.1 0.00e+00 0.0 3.2e+02 5.0e+04 0.0e+00  0  0  3 30  0   0  0  3 30  0     0
VecMDot               90 1.0 4.3525e-03 1.9 5.94e+06 1.1 0.0e+00 0.0e+00 9.0e+01  0  1  0  0  6   0  1  0  0  6  5180
VecTDot              237 1.0 1.6390e-02 3.5 3.88e+06 1.1 0.0e+00 0.0e+00 2.4e+02  0  1  0  0 17   0  1  0  0 17   897
VecNorm              225 1.0 7.4959e-03 2.6 3.28e+06 1.1 0.0e+00 0.0e+00 2.2e+02  0  0  0  0 16   0  0  0  0 16  1661
VecScale              99 1.0 2.6664e-04 1.2 5.94e+05 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  8457
VecCopy              105 1.0 5.5974e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               402 1.0 5.1273e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              237 1.0 1.1864e-03 1.1 3.88e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 12396
VecAYPX              678 1.0 3.3991e-03 1.2 6.00e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  6695
VecAXPBYCZ           288 1.0 2.0727e-03 1.1 8.64e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 15825
VecMAXPY              99 1.0 1.9828e-03 1.3 7.02e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 13441
VecAssemblyBegin      24 1.0 3.5199e-03 1.1 0.00e+00 0.0 6.0e+01 3.6e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd        24 1.0 1.2138e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult      99 1.0 6.6968e-04 1.1 5.94e+05 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  3367
VecScatterBegin      810 1.0 5.6504e-03 1.0 0.00e+00 0.0 9.3e+03 2.7e+03 0.0e+00  0  0 76 47  0   0  0 76 47  0     0
VecScatterEnd        810 1.0 1.4432e-02 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSetRandom           9 1.0 1.5864e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize          99 1.0 1.6803e-03 1.7 1.78e+06 1.1 0.0e+00 0.0e+00 9.9e+01  0  0  0  0  7   0  0  0  0  7  4026
MatMult              636 1.0 1.9035e-01 1.0 3.12e+08 1.1 7.6e+03 3.0e+03 0.0e+00  0 47 62 44  0   0 47 62 44  0  6275
MatMultAdd            72 1.0 1.0663e-02 1.1 6.23e+06 1.1 6.5e+02 5.0e+02 0.0e+00  0  1  5  1  0   0  1  5  1  0  2248
MatMultTranspose      72 1.0 1.3565e-02 1.3 6.23e+06 1.1 6.5e+02 5.0e+02 0.0e+00  0  1  5  1  0   0  1  5  1  0  1767
MatSolve              24 0.0 4.1796e-05 0.0 1.58e+03 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    38
MatSOR               531 1.0 2.2511e-01 1.1 2.33e+08 1.1 0.0e+00 0.0e+00 0.0e+00  0 35  0  0  0   0 35  0  0  0  3950
MatCholFctrSym         3 1.0 4.2330e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCholFctrNum         3 1.0 4.0101e-05 1.8 1.80e+01 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert             9 1.0 1.6046e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale              27 1.0 5.5906e-03 1.1 5.00e+06 1.1 1.1e+02 2.9e+03 0.0e+00  0  1  1  1  0   0  1  1  1  0  3428
MatResidual           72 1.0 2.0435e-02 1.1 3.38e+07 1.1 8.6e+02 2.9e+03 0.0e+00  0  5  7  5  0   0  5  7  5  0  6330
MatAssemblyBegin     168 1.0 2.3268e-01 1.4 0.00e+00 0.0 2.6e+02 6.1e+04 0.0e+00  0  0  2 30  0   0  0  2 30  0     0
MatAssemblyEnd       168 1.0 1.3291e-01 1.1 0.00e+00 0.0 7.8e+02 8.4e+02 3.6e+02  0  0  6  1 25   0  0  6  1 25     0
MatGetRow         162018 1.1 1.9420e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            3 0.0 4.2650e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCreateSubMat        6 1.0 1.6496e-03 1.0 0.00e+00 0.0 4.8e+01 5.1e+01 9.6e+01  0  0  0  0  7   0  0  0  0  7     0
MatGetOrdering         3 0.0 2.7078e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCoarsen             9 1.0 7.8692e-03 1.1 0.00e+00 0.0 7.0e+02 2.2e+03 2.7e+01  0  0  6  3  2   0  0  6  3  2     0
MatZeroEntries         9 1.0 1.0505e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView               21 1.4 3.8399e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+01  0  0  0  0  1   0  0  0  0  1     0
MatAXPY                9 1.0 1.6539e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult             9 1.0 7.2559e-02 1.0 4.22e+06 1.1 6.3e+02 2.1e+03 1.1e+02  0  1  5  2  8   0  1  5  2  8   223
MatMatMultSym          9 1.0 5.8217e-02 1.0 0.00e+00 0.0 5.2e+02 1.9e+03 1.1e+02  0  0  4  2  8   0  0  4  2  8     0
MatMatMultNum          9 1.0 1.4338e-02 1.0 4.22e+06 1.1 1.1e+02 2.9e+03 0.0e+00  0  1  1  1  0   0  1  1  1  0  1128
MatPtAP                9 1.0 3.5600e-01 1.0 5.53e+07 1.1 9.5e+02 1.7e+04 1.4e+02  0  9  8 30  9   0  9  8 30 10   605
MatPtAPSymbolic        9 1.0 2.5158e-01 1.0 0.00e+00 0.0 6.2e+02 1.4e+04 6.3e+01  0  0  5 16  4   0  0  5 16  4     0
MatPtAPNumeric         9 1.0 1.0436e-01 1.0 5.53e+07 1.1 3.3e+02 2.2e+04 7.2e+01  0  9  3 14  5   0  9  3 14  5  2064
MatGetLocalMat        27 1.0 1.0135e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol         27 1.0 7.4055e-03 1.2 0.00e+00 0.0 7.6e+02 9.9e+03 0.0e+00  0  0  6 14  0   0  0  6 14  0     0
KSPGMRESOrthog        90 1.0 5.7355e-03 1.4 1.19e+07 1.1 0.0e+00 0.0e+00 9.0e+01  0  2  0  0  6   0  2  0  0  6  7863
KSPSetUp              36 1.0 3.4705e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01  0  0  0  0  2   0  0  0  0  2     0
KSPSolve               3 1.0 4.3135e-01 1.0 5.40e+08 1.1 7.9e+03 2.6e+03 3.6e+02  0 81 64 39 25   0 81 64 39 26  4786
PCGAMGGraph_AGG        9 1.0 2.2263e-01 1.0 4.22e+06 1.1 3.2e+02 1.9e+03 1.1e+02  0  1  3  1  8   0  1  3  1  8    73
PCGAMGCoarse_AGG       9 1.0 1.1966e-02 1.0 0.00e+00 0.0 7.0e+02 2.2e+03 2.7e+01  0  0  6  3  2   0  0  6  3  2     0
PCGAMGProl_AGG         9 1.0 4.8133e-02 1.0 0.00e+00 0.0 3.8e+02 1.7e+03 1.4e+02  0  0  3  1 10   0  0  3  1 10     0
PCGAMGPOpt_AGG         9 1.0 1.4798e-01 1.0 6.22e+07 1.1 1.7e+03 2.6e+03 3.7e+02  0  9 14  8 26   0  9 14  8 26  1605
GAMG: createProl       9 1.0 4.3246e-01 1.0 6.64e+07 1.1 3.1e+03 2.3e+03 6.5e+02  0 10 26 14 45   0 10 26 14 46   586
  Graph               18 1.0 2.2127e-01 1.0 4.22e+06 1.1 3.2e+02 1.9e+03 1.1e+02  0  1  3  1  8   0  1  3  1  8    73
  MIS/Agg              9 1.0 7.9907e-03 1.1 0.00e+00 0.0 7.0e+02 2.2e+03 2.7e+01  0  0  6  3  2   0  0  6  3  2     0
  SA: col data         9 1.0 5.8624e-03 1.1 0.00e+00 0.0 2.2e+02 2.7e+03 3.6e+01  0  0  2  1  3   0  0  2  1  3     0
  SA: frmProl0         9 1.0 4.0594e-02 1.0 0.00e+00 0.0 1.7e+02 4.9e+02 7.2e+01  0  0  1  0  5   0  0  1  0  5     0
  SA: smooth           9 1.0 9.3119e-02 1.0 5.00e+06 1.1 6.3e+02 2.1e+03 1.3e+02  0  1  5  2  9   0  1  5  2  9   206
GAMG: partLevel        9 1.0 3.5860e-01 1.0 5.53e+07 1.1 1.0e+03 1.6e+04 2.9e+02  0  9  8 30 20   0  9  8 30 20   601
  repartition          3 1.0 1.7322e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01  0  0  0  0  1   0  0  0  0  1     0
  Invert-Sort          3 1.0 1.5528e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  1   0  0  0  0  1     0
  Move A               3 1.0 1.1300e-03 1.0 0.00e+00 0.0 3.0e+01 6.6e+01 5.1e+01  0  0  0  0  4   0  0  0  0  4     0
  Move P               3 1.0 7.7750e-04 1.0 0.00e+00 0.0 1.8e+01 2.6e+01 5.1e+01  0  0  0  0  4   0  0  0  0  4     0
PCSetUp                6 1.0 7.9483e-01 1.0 1.22e+08 1.1 4.1e+03 5.7e+03 9.9e+02  1 19 34 44 69   1 19 34 44 70   590
PCSetUpOnBlocks       24 1.0 5.2736e-04 1.1 1.80e+01 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply               24 1.0 4.0983e-01 1.0 5.07e+08 1.1 7.6e+03 2.5e+03 2.9e+02  0 76 62 36 20   0 76 62 36 20  4727
SFSetGraph             9 1.0 2.1600e-06 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetUp                9 1.0 1.6055e-03 1.7 0.00e+00 0.0 1.6e+02 1.8e+03 0.0e+00  0  0  1  1  0   0  0  1  1  0     0
SFBcastBegin          45 1.0 5.1013e-04 1.1 0.00e+00 0.0 5.4e+02 2.4e+03 0.0e+00  0  0  4  2  0   0  0  4  2  0     0
SFBcastEnd            45 1.0 7.3911e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector   429            429     16885896     0.
              Matrix   252            252    628801440     0.
      Matrix Coarsen     9              9         6156     0.
           Index Set   147            147       299856     0.
         Vec Scatter    57             57        79848     0.
       Krylov Solver    36             36       314928     0.
      Preconditioner    27             27        29544     0.
              Viewer     5              4         3584     0.
         PetscRandom    18             18        12492     0.
   Star Forest Graph     9              9         8496     0.
========================================================================================================================
Average time to get PetscTime(): 4.1601e-08
Average time for MPI_Barrier(): 1.76301e-06
Average time for zero size MPI_Send(): 1.59626e-06
#PETSc Option Table entries:
--prefix run_a0b0c0d0e0f0g0h0i0_n4_l3
-aggrmeth alla_serial
-beta 10.0
-betaest .true.
-check .false.
-datadt data_distribution_fully_assembled
-dm 3
-dom -1.0
-in_space .true.
-ksp_converged_reason
-ksp_max_it 500
-ksp_monitor
-ksp_norm_type unpreconditioned
-ksp_rtol 1.0e-6
-ksp_type cg
-ksp_view
-l 1
-levelset popcorn
-levelsettol 1.0e-6
-log_view
-lsdom 0.0
-maxl 6
-mg_coarse_sub_pc_factor_mat_ordering_type nd
-mg_coarse_sub_pc_type cholesky
-mg_levels_esteig_ksp_type cg
-no_signal_handler
-nruns 3
-order 1
-pc_gamg_agg_nsmooths 1
-pc_gamg_process_eq_limit 50
-pc_gamg_square_graph 0
-pc_gamg_type agg
-pc_type gamg
-petscrc /gpfs/scratch/upc26/upc26229/par_cell_aggr_poisson/paper/weak_scal_ompi/2nd-w-scal/petscrc-0
-tt 1
-uagg .true.
-wratio 10
-wsolution .false.
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 -with-blaslapack-dir=/apps/INTEL/2017.4/mkl --with-debugging=0 --with-x=0 --with-shared-libraries=1 --with-mpi=1 --with-64-bit-indices
-----------------------------------------
Libraries compiled on 2018-06-04 18:55:32 on login1 
Machine characteristics: Linux-4.4.103-92.56-default-x86_64-with-SuSE-12-x86_64
Using PETSc directory: /gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: mpicc  -fPIC  -wd1572 -g -O3  
Using Fortran compiler: mpif90  -fPIC -g -O3    
-----------------------------------------

Using include paths: -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/include -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -L/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/apps/INTEL/2017.4/mkl/lib/intel64 -L/apps/INTEL/2017.4/mkl/lib/intel64 -Wl,-rpath,/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -L/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -Wl,-rpath,/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -L/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.8 -L/usr/lib64/gcc/x86_64-suse-linux/4.8 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s -lstdc++ -ldl
-----------------------------------------






-------------- next part --------------
Linear solve converged due to CONVERGED_RTOL iterations 9
KSP Object: 32 MPI processes
  type: cg
  maximum iterations=500, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 32 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=4 cycles=v
      Cycles per PCApply=1
      Using externally compute Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level =   0.   0.  
        Threshold scaling factor for each level not specified = 1.
        AGG specific options
          Symmetric graph false
          Number of levels to square graph 0
          Number smoothing steps 1
  Coarse grid solver -- level -------------------------------
        KSP Object: (mg_coarse_) 32 MPI processes
          type: preonly
          maximum iterations=10000, initial guess is zero
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
          left preconditioning
          using NONE norm type for convergence test
        PC Object: (mg_coarse_) 32 MPI processes
          type: bjacobi
            number of blocks = 32
            Local solve is same for all blocks, in the following KSP and PC objects:
          KSP Object: (mg_coarse_sub_) 1 MPI processes
            type: preonly
            maximum iterations=1, initial guess is zero
            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
            left preconditioning
            using NONE norm type for convergence test
          PC Object: (mg_coarse_sub_) 1 MPI processes
            type: cholesky
              out-of-place factorization
              tolerance for zero pivot 2.22045e-14
              matrix ordering: nd
              factor fill ratio given 5., needed 1.
                Factored matrix follows:
                  Mat Object: 1 MPI processes
                    type: seqsbaij
                    rows=35, cols=35
                    package used to perform factorization: petsc
                    total: nonzeros=630, allocated nonzeros=630
                    total number of mallocs used during MatSetValues calls =0
                        block size is 1
            linear system matrix = precond matrix:
            Mat Object: 1 MPI processes
              type: seqaij
              rows=35, cols=35
              total: nonzeros=1225, allocated nonzeros=1225
              total number of mallocs used during MatSetValues calls =0
                using I-node routines: found 7 nodes, limit used is 5
          linear system matrix = precond matrix:
          Mat Object: 32 MPI processes
            type: mpiaij
            rows=35, cols=35
            total: nonzeros=1225, allocated nonzeros=1225
            total number of mallocs used during MatSetValues calls =0
              using I-node (on process 0) routines: found 7 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
      KSP Object: (mg_levels_1_) 32 MPI processes
        type: chebyshev
          eigenvalue estimates used:  min = 0.140301, max = 1.54331
          eigenvalues estimate via cg min 0.150843, max 1.40301
          eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
        KSP Object: (mg_levels_1_esteig_) 32 MPI processes
          type: cg
          maximum iterations=10, initial guess is zero
          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
          left preconditioning
          using PRECONDITIONED norm type for convergence test
          estimating eigenvalues using noisy right hand side
        maximum iterations=2, nonzero initial guess
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_levels_1_) 32 MPI processes
        type: sor
          type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
        linear system matrix = precond matrix:
        Mat Object: 32 MPI processes
          type: mpiaij
          rows=1654, cols=1654
          total: nonzeros=302008, allocated nonzeros=302008
          total number of mallocs used during MatSetValues calls =0
            using nonscalable MatPtAP() implementation
            using I-node (on process 0) routines: found 15 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object: (mg_levels_2_) 32 MPI processes
      type: chebyshev
        eigenvalue estimates used:  min = 0.135428, max = 1.48971
        eigenvalues estimate via cg min 0.0330649, max 1.35428
        eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
      KSP Object: (mg_levels_2_esteig_) 32 MPI processes
        type: cg
        maximum iterations=10, initial guess is zero
        tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
        left preconditioning
        using PRECONDITIONED norm type for convergence test
        estimating eigenvalues using noisy right hand side
      maximum iterations=2, nonzero initial guess
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_2_) 32 MPI processes
      type: sor
        type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
      linear system matrix = precond matrix:
      Mat Object: 32 MPI processes
        type: mpiaij
        rows=38899, cols=38899
        total: nonzeros=3088735, allocated nonzeros=3088735
        total number of mallocs used during MatSetValues calls =0
          using nonscalable MatPtAP() implementation
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
  KSP Object: (mg_levels_3_) 32 MPI processes
    type: chebyshev
      eigenvalue estimates used:  min = 0.196606, max = 2.16267
      eigenvalues estimate via cg min 0.0475838, max 1.96606
      eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
    KSP Object: (mg_levels_3_esteig_) 32 MPI processes
      type: cg
      maximum iterations=10, initial guess is zero
      tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
      left preconditioning
      using PRECONDITIONED norm type for convergence test
      estimating eigenvalues using noisy right hand side
    maximum iterations=2, nonzero initial guess
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (mg_levels_3_) 32 MPI processes
    type: sor
      type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
    linear system matrix = precond matrix:
    Mat Object: 32 MPI processes
      type: mpiaij
      rows=508459, cols=508459
      total: nonzeros=16204885, allocated nonzeros=305075400
      total number of mallocs used during MatSetValues calls =0
        not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Mat Object: 32 MPI processes
    type: mpiaij
    rows=508459, cols=508459
    total: nonzeros=16204885, allocated nonzeros=305075400
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/gpfs/scratch/upc26/upc26229/build_rel_fempar_cell_agg_ompi/FEMPAR/bin/par_test_h_adaptive_poisson_unfitted on a arch-linux2-c-opt named s15r1b25 with 32 processors, by upc26229 Wed Nov  7 01:09:23 2018
Using Petsc Release Version 3.9.0, Apr, 07, 2018 

                         Max       Max/Min        Avg      Total 
Time (sec):           1.621e+02      1.00000   1.621e+02
Objects:              9.890e+02      1.00304   9.861e+02
Flop:                 7.802e+08      2.25680   6.170e+08  1.974e+10
Flop/sec:            4.812e+06      2.25680   3.806e+06  1.218e+08
MPI Messages:         2.457e+04      2.04836   1.917e+04  6.134e+05
MPI Message Lengths:  3.844e+07      2.16684   1.506e+03  9.236e+08
MPI Reductions:       1.469e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 1.6212e+02 100.0%  1.9744e+10 100.0%  6.134e+05 100.0%  1.506e+03      100.0%  1.456e+03  99.1% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSided          9 1.0 7.2859e-03 6.2 0.00e+00 0.0 2.5e+03 8.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
BuildTwoSidedF        75 1.0 7.6260e+0063.2 0.00e+00 0.0 1.2e+04 2.1e+04 0.0e+00  2  0  2 27  0   2  0  2 27  0     0
VecMDot               90 1.0 3.3601e-02 7.0 7.73e+06 3.0 0.0e+00 0.0e+00 9.0e+01  0  1  0  0  6   0  1  0  0  6  5391
VecTDot              243 1.0 1.1469e-0117.2 5.28e+06 3.0 0.0e+00 0.0e+00 2.4e+02  0  1  0  0 17   0  1  0  0 17  1082
VecNorm              228 1.0 6.2426e-02 8.2 4.39e+06 3.0 0.0e+00 0.0e+00 2.3e+02  0  1  0  0 16   0  1  0  0 16  1650
VecScale              99 1.0 1.0887e-03 6.2 7.73e+05 3.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 16641
VecCopy              114 1.0 1.3520e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               438 1.0 1.0085e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              243 1.0 6.7886e-03 3.4 5.28e+06 3.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 18279
VecAYPX              753 1.0 9.3539e-03 3.1 8.63e+06 3.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 21626
VecAXPBYCZ           324 1.0 5.0166e-03 2.0 1.26e+07 3.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 59097
VecMAXPY              99 1.0 4.4716e-03 4.9 9.14e+06 3.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 47884
VecAssemblyBegin      24 1.0 1.1001e-02 2.2 0.00e+00 0.0 1.7e+03 2.0e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd        24 1.0 3.4029e-04 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult      99 1.0 1.1331e-03 2.3 7.73e+05 3.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 15989
VecScatterBegin      885 1.0 3.6779e-02 2.2 0.00e+00 0.0 4.6e+05 9.8e+02 0.0e+00  0  0 75 49  0   0  0 75 49  0     0
VecScatterEnd        885 1.0 3.2675e-0112.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSetRandom           9 1.0 2.0615e-03 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize          99 1.0 8.0360e-03 3.0 2.32e+06 3.0 0.0e+00 0.0e+00 9.9e+01  0  0  0  0  7   0  0  0  0  7  6764
MatMult              693 1.0 4.7642e-01 1.3 3.73e+08 2.2 3.9e+05 1.1e+03 0.0e+00  0 48 64 46  0   0 48 64 46  0 19814
MatMultAdd            81 1.0 2.5011e-02 1.8 9.13e+06 2.9 2.7e+04 2.4e+02 0.0e+00  0  1  4  1  0   0  1  4  1  0  8668
MatMultTranspose      81 1.0 4.0727e-02 2.3 9.13e+06 2.9 2.7e+04 2.4e+02 0.0e+00  0  1  4  1  0   0  1  4  1  0  5323
MatSolve              27 0.0 2.1556e-04 0.0 6.52e+04 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   302
MatSOR               585 1.0 4.9140e-01 1.9 2.65e+08 2.2 0.0e+00 0.0e+00 0.0e+00  0 34  0  0  0   0 34  0  0  0 13575
MatCholFctrSym         3 1.0 6.5050e-03313.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCholFctrNum         3 1.0 3.2232e-03996.2 1.05e+02 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert             9 1.0 2.4005e-02 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale              27 1.0 1.1253e-02 2.1 5.57e+06 2.3 5.1e+03 1.0e+03 0.0e+00  0  1  1  1  0   0  1  1  1  0 12589
MatResidual           81 1.0 7.8485e-02 2.2 4.17e+07 2.2 4.6e+04 1.0e+03 0.0e+00  0  5  7  5  0   0  5  7  5  0 13482
MatAssemblyBegin     168 1.0 7.7038e+0036.5 0.00e+00 0.0 9.9e+03 2.5e+04 0.0e+00  2  0  2 26  0   2  0  2 26  0     0
MatAssemblyEnd       168 1.0 2.8208e-01 1.5 0.00e+00 0.0 3.7e+04 3.1e+02 3.6e+02  0  0  6  1 25   0  0  6  1 25     0
MatGetRow         210816 3.0 2.5672e-02 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            3 0.0 2.7154e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCreateSubMat        6 1.0 1.1316e-02 1.0 0.00e+00 0.0 5.9e+02 1.4e+02 9.6e+01  0  0  0  0  7   0  0  0  0  7     0
MatGetOrdering         3 0.0 7.1425e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCoarsen             9 1.0 2.3828e-02 1.2 0.00e+00 0.0 5.4e+04 7.5e+02 6.0e+01  0  0  9  4  4   0  0  9  4  4     0
MatZeroEntries         9 1.0 1.7683e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView               21 1.4 5.7901e-03 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+01  0  0  0  0  1   0  0  0  0  1     0
MatAXPY                9 1.0 2.3933e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult             9 1.0 1.0349e-01 1.0 4.64e+06 2.2 3.0e+04 7.4e+02 1.1e+02  0  1  5  2  7   0  1  5  2  7  1136
MatMatMultSym          9 1.0 8.4956e-02 1.0 0.00e+00 0.0 2.4e+04 6.8e+02 1.1e+02  0  0  4  2  7   0  0  4  2  7     0
MatMatMultNum          9 1.0 1.9249e-02 1.1 4.64e+06 2.2 5.1e+03 1.0e+03 0.0e+00  0  1  1  1  0   0  1  1  1  0  6108
MatPtAP                9 1.0 5.3590e-01 1.0 6.57e+07 2.3 4.7e+04 5.9e+03 1.4e+02  0  8  8 30  9   0  8  8 30  9  3097
MatPtAPSymbolic        9 1.0 3.5159e-01 1.0 0.00e+00 0.0 2.9e+04 5.2e+03 6.3e+01  0  0  5 16  4   0  0  5 16  4     0
MatPtAPNumeric         9 1.0 1.8422e-01 1.0 6.57e+07 2.3 1.8e+04 6.9e+03 7.2e+01  0  8  3 14  5   0  8  3 14  5  9008
MatGetLocalMat        27 1.0 1.3379e-02 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol         27 1.0 2.5772e-02 1.3 0.00e+00 0.0 3.6e+04 3.6e+03 0.0e+00  0  0  6 14  0   0  0  6 14  0     0
KSPGMRESOrthog        90 1.0 3.4506e-02 4.2 1.55e+07 3.0 0.0e+00 0.0e+00 9.0e+01  0  2  0  0  6   0  2  0  0  6 10501
KSPSetUp              36 1.0 7.0966e-03 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01  0  0  0  0  2   0  0  0  0  2     0
KSPSolve               3 1.0 9.5439e-01 1.0 6.41e+08 2.2 3.9e+05 9.8e+02 3.7e+02  1 82 64 42 25   1 82 64 42 26 16969
PCGAMGGraph_AGG        9 1.0 2.9486e-01 1.0 4.64e+06 2.2 1.5e+04 6.9e+02 1.1e+02  0  1  2  1  7   0  1  2  1  7   399
PCGAMGCoarse_AGG       9 1.0 2.6490e-02 1.1 0.00e+00 0.0 5.4e+04 7.5e+02 6.0e+01  0  0  9  4  4   0  0  9  4  4     0
PCGAMGProl_AGG         9 1.0 8.0748e-02 1.0 0.00e+00 0.0 1.5e+04 8.1e+02 1.4e+02  0  0  2  1 10   0  0  2  1 10     0
PCGAMGPOpt_AGG         9 1.0 2.4086e-01 1.0 6.97e+07 2.3 8.1e+04 9.3e+02 3.7e+02  0  9 13  8 25   0  9 13  8 25  7357
GAMG: createProl       9 1.0 6.4623e-01 1.0 7.44e+07 2.3 1.7e+05 8.4e+02 6.8e+02  0 10 27 15 46   0 10 27 15 47  2924
  Graph               18 1.0 2.8964e-01 1.0 4.64e+06 2.2 1.5e+04 6.9e+02 1.1e+02  0  1  2  1  7   0  1  2  1  7   406
  MIS/Agg              9 1.0 2.3962e-02 1.2 0.00e+00 0.0 5.4e+04 7.5e+02 6.0e+01  0  0  9  4  4   0  0  9  4  4     0
  SA: col data         9 1.0 9.0052e-03 1.1 0.00e+00 0.0 1.0e+04 1.0e+03 3.6e+01  0  0  2  1  2   0  0  2  1  2     0
  SA: frmProl0         9 1.0 6.9267e-02 1.0 0.00e+00 0.0 4.7e+03 3.3e+02 7.2e+01  0  0  1  0  5   0  0  1  0  5     0
  SA: smooth           9 1.0 1.3371e-01 1.0 5.57e+06 2.3 3.0e+04 7.4e+02 1.3e+02  0  1  5  2  9   0  1  5  2  9  1059
GAMG: partLevel        9 1.0 5.5452e-01 1.0 6.57e+07 2.3 4.8e+04 5.8e+03 2.9e+02  0  8  8 30 20   0  8  8 30 20  2993
  repartition          3 1.0 1.7596e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01  0  0  0  0  1   0  0  0  0  1     0
  Invert-Sort          3 1.0 6.5844e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  1   0  0  0  0  1     0
  Move A               3 1.0 6.4088e-03 1.1 0.00e+00 0.0 4.0e+02 1.6e+02 5.1e+01  0  0  0  0  3   0  0  0  0  4     0
  Move P               3 1.0 5.6291e-03 1.1 0.00e+00 0.0 1.9e+02 9.1e+01 5.1e+01  0  0  0  0  3   0  0  0  0  4     0
PCSetUp                6 1.0 1.2264e+00 1.0 1.40e+08 2.3 2.1e+05 2.0e+03 1.0e+03  1 18 35 45 70   1 18 35 45 70  2894
PCSetUpOnBlocks       27 1.0 7.7373e-03 1.6 1.05e+02 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply               27 1.0 9.0805e-01 1.1 6.00e+08 2.2 3.8e+05 9.3e+02 2.9e+02  1 77 62 38 20   1 77 62 38 20 16705
SFSetGraph             9 1.0 2.6450e-06 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetUp                9 1.0 8.5383e-03 2.5 0.00e+00 0.0 7.6e+03 6.9e+02 0.0e+00  0  0  1  1  0   0  0  1  1  0     0
SFBcastBegin          78 1.0 4.1889e-03 2.7 0.00e+00 0.0 4.7e+04 7.6e+02 0.0e+00  0  0  8  4  0   0  0  8  4  0     0
SFBcastEnd            78 1.0 5.7714e-03 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector   429            429     10070544     0.
              Matrix   252            252    362421120     0.
      Matrix Coarsen     9              9         6156     0.
           Index Set   147            147       365304     0.
         Vec Scatter    57             57        79800     0.
       Krylov Solver    36             36       314928     0.
      Preconditioner    27             27        29544     0.
              Viewer     5              4         3584     0.
         PetscRandom    18             18        12492     0.
   Star Forest Graph     9              9         8496     0.
========================================================================================================================
Average time to get PetscTime(): 4.1537e-08
Average time for MPI_Barrier(): 3.98215e-06
Average time for zero size MPI_Send(): 1.49463e-06
#PETSc Option Table entries:
--prefix run_a0b0c0d0e0f0g0h0i0_n5_l3
-aggrmeth alla_serial
-beta 10.0
-betaest .true.
-check .false.
-datadt data_distribution_fully_assembled
-dm 3
-dom -1.0
-in_space .true.
-ksp_converged_reason
-ksp_max_it 500
-ksp_monitor
-ksp_norm_type unpreconditioned
-ksp_rtol 1.0e-6
-ksp_type cg
-ksp_view
-l 1
-levelset popcorn
-levelsettol 1.0e-6
-log_view
-lsdom 0.0
-maxl 7
-mg_coarse_sub_pc_factor_mat_ordering_type nd
-mg_coarse_sub_pc_type cholesky
-mg_levels_esteig_ksp_type cg
-no_signal_handler
-nruns 3
-order 1
-pc_gamg_agg_nsmooths 1
-pc_gamg_process_eq_limit 50
-pc_gamg_square_graph 0
-pc_gamg_type agg
-pc_type gamg
-petscrc /gpfs/scratch/upc26/upc26229/par_cell_aggr_poisson/paper/weak_scal_ompi/2nd-w-scal/petscrc-0
-tt 1
-uagg .true.
-wratio 10
-wsolution .false.
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 -with-blaslapack-dir=/apps/INTEL/2017.4/mkl --with-debugging=0 --with-x=0 --with-shared-libraries=1 --with-mpi=1 --with-64-bit-indices
-----------------------------------------
Libraries compiled on 2018-06-04 18:55:32 on login1 
Machine characteristics: Linux-4.4.103-92.56-default-x86_64-with-SuSE-12-x86_64
Using PETSc directory: /gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: mpicc  -fPIC  -wd1572 -g -O3  
Using Fortran compiler: mpif90  -fPIC -g -O3    
-----------------------------------------

Using include paths: -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/include -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -L/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/apps/INTEL/2017.4/mkl/lib/intel64 -L/apps/INTEL/2017.4/mkl/lib/intel64 -Wl,-rpath,/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -L/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -Wl,-rpath,/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -L/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.8 -L/usr/lib64/gcc/x86_64-suse-linux/4.8 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s -lstdc++ -ldl
-----------------------------------------

Ending run at Wed Nov  7 01:09:23 CET 2018
Ending script at Wed Nov  7 01:09:23 CET 2018



-------------- next part --------------
KSP Object: 262 MPI processes
  type: cg
  maximum iterations=500, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 262 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=5 cycles=v
      Cycles per PCApply=1
      Using externally compute Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level =   0.   0.   0.  
        Threshold scaling factor for each level not specified = 1.
        AGG specific options
          Symmetric graph false
          Number of levels to square graph 0
          Number smoothing steps 1
  Coarse grid solver -- level -------------------------------
          KSP Object: (mg_coarse_) 262 MPI processes
            type: preonly
            maximum iterations=10000, initial guess is zero
            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
            left preconditioning
            using NONE norm type for convergence test
          PC Object: (mg_coarse_) 262 MPI processes
            type: bjacobi
              number of blocks = 262
              Local solve is same for all blocks, in the following KSP and PC objects:
            KSP Object: (mg_coarse_sub_) 1 MPI processes
              type: preonly
              maximum iterations=1, initial guess is zero
              tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
              left preconditioning
              using NONE norm type for convergence test
            PC Object: (mg_coarse_sub_) 1 MPI processes
              type: cholesky
                out-of-place factorization
                tolerance for zero pivot 2.22045e-14
                matrix ordering: nd
                factor fill ratio given 5., needed 1.
                  Factored matrix follows:
                    Mat Object: 1 MPI processes
                      type: seqsbaij
                      rows=4, cols=4
                      package used to perform factorization: petsc
                      total: nonzeros=10, allocated nonzeros=10
                      total number of mallocs used during MatSetValues calls =0
                          block size is 1
              linear system matrix = precond matrix:
              Mat Object: 1 MPI processes
                type: seqaij
                rows=4, cols=4
                total: nonzeros=16, allocated nonzeros=16
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 1 nodes, limit used is 5
            linear system matrix = precond matrix:
            Mat Object: 262 MPI processes
              type: mpiaij
              rows=4, cols=4
              total: nonzeros=16, allocated nonzeros=16
              total number of mallocs used during MatSetValues calls =0
                using I-node (on process 0) routines: found 1 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
        KSP Object: (mg_levels_1_) 262 MPI processes
          type: chebyshev
            eigenvalue estimates used:  min = 0.129006, max = 1.41907
            eigenvalues estimate via cg min 0.482341, max 1.29006
            eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
          KSP Object: (mg_levels_1_esteig_) 262 MPI processes
            type: cg
            maximum iterations=10, initial guess is zero
            tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
            left preconditioning
            using PRECONDITIONED norm type for convergence test
            estimating eigenvalues using noisy right hand side
          maximum iterations=2, nonzero initial guess
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
          left preconditioning
          using NONE norm type for convergence test
        PC Object: (mg_levels_1_) 262 MPI processes
          type: sor
            type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
          linear system matrix = precond matrix:
          Mat Object: 262 MPI processes
            type: mpiaij
            rows=284, cols=284
            total: nonzeros=47942, allocated nonzeros=47942
            total number of mallocs used during MatSetValues calls =0
              not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
      KSP Object: (mg_levels_2_) 262 MPI processes
        type: chebyshev
          eigenvalue estimates used:  min = 0.160435, max = 1.76479
          eigenvalues estimate via cg min 0.0880722, max 1.60435
          eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
        KSP Object: (mg_levels_2_esteig_) 262 MPI processes
          type: cg
          maximum iterations=10, initial guess is zero
          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
          left preconditioning
          using PRECONDITIONED norm type for convergence test
          estimating eigenvalues using noisy right hand side
        maximum iterations=2, nonzero initial guess
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_levels_2_) 262 MPI processes
        type: sor
          type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
        linear system matrix = precond matrix:
        Mat Object: 262 MPI processes
          type: mpiaij
          rows=13842, cols=13842
          total: nonzeros=2801068, allocated nonzeros=2801068
          total number of mallocs used during MatSetValues calls =0
            using nonscalable MatPtAP() implementation
            not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
    KSP Object: (mg_levels_3_) 262 MPI processes
      type: chebyshev
        eigenvalue estimates used:  min = 0.135811, max = 1.49392
        eigenvalues estimate via cg min 0.036202, max 1.35811
        eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
      KSP Object: (mg_levels_3_esteig_) 262 MPI processes
        type: cg
        maximum iterations=10, initial guess is zero
        tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
        left preconditioning
        using PRECONDITIONED norm type for convergence test
        estimating eigenvalues using noisy right hand side
      maximum iterations=2, nonzero initial guess
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_3_) 262 MPI processes
      type: sor
        type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
      linear system matrix = precond matrix:
      Mat Object: 262 MPI processes
        type: mpiaij
        rows=319856, cols=319856
        total: nonzeros=25116236, allocated nonzeros=25116236
        total number of mallocs used during MatSetValues calls =0
          using scalable MatPtAP() implementation
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
  KSP Object: (mg_levels_4_) 262 MPI processes
    type: chebyshev
      eigenvalue estimates used:  min = 0.298538, max = 3.28392
      eigenvalues estimate via cg min 0.0506704, max 2.98538
      eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
    KSP Object: (mg_levels_4_esteig_) 262 MPI processes
      type: cg
      maximum iterations=10, initial guess is zero
      tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
      left preconditioning
      using PRECONDITIONED norm type for convergence test
      estimating eigenvalues using noisy right hand side
    maximum iterations=2, nonzero initial guess
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (mg_levels_4_) 262 MPI processes
    type: sor
      type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
    linear system matrix = precond matrix:
    Mat Object: 262 MPI processes
      type: mpiaij
      rows=4068981, cols=4068981
      total: nonzeros=120055495, allocated nonzeros=2441388600
      total number of mallocs used during MatSetValues calls =0
        not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Mat Object: 262 MPI processes
    type: mpiaij
    rows=4068981, cols=4068981
    total: nonzeros=120055495, allocated nonzeros=2441388600
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines



---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/gpfs/scratch/upc26/upc26229/build_rel_fempar_cell_agg_ompi/FEMPAR/bin/par_test_h_adaptive_poisson_unfitted on a arch-linux2-c-opt named s07r1b55 with 262 processors, by upc26229 Wed Nov  7 01:14:20 2018
Using Petsc Release Version 3.9.0, Apr, 07, 2018 

                         Max       Max/Min        Avg      Total 
Time (sec):           2.359e+02      1.00000   2.359e+02
Objects:              1.355e+03      1.00222   1.352e+03
Flop:                 8.832e+08      0.00000   6.569e+08  1.721e+11
Flop/sec:            3.743e+06      0.00000   2.784e+06  7.295e+08
MPI Messages:         5.577e+04   5069.63636   3.334e+04  8.736e+06
MPI Message Lengths:  4.973e+07   1130198.90909   1.019e+03  8.904e+09
MPI Reductions:       2.072e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 2.3593e+02 100.0%  1.7210e+11 100.0%  8.736e+06 100.0%  1.019e+03      100.0%  2.059e+03  99.4% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSided         12 1.0 1.6995e-02 7.1 0.00e+00 0.0 3.0e+04 8.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
BuildTwoSidedF       102 1.0 1.0505e+0133.9 0.00e+00 0.0 1.3e+05 1.6e+04 0.0e+00  3  0  1 24  0   3  0  1 24  0     0
VecMDot              120 1.0 1.0321e-01 9.5 7.43e+06 0.0 0.0e+00 0.0e+00 1.2e+02  0  1  0  0  6   0  1  0  0  6 14077
VecTDot              318 1.0 1.3620e+0061.0 5.57e+06 0.0 0.0e+00 0.0e+00 3.2e+02  0  1  0  0 15   0  1  0  0 15   802
VecNorm              300 1.0 2.2448e-0110.2 4.46e+06 0.0 0.0e+00 0.0e+00 3.0e+02  0  1  0  0 14   0  1  0  0 15  3894
VecScale             132 1.0 4.2758e-0329.9 7.43e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 33981
VecCopy              174 1.0 1.9950e-0355.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               651 1.0 1.4187e-0314.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              318 1.0 2.0724e-02148.6 5.57e+06 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 52686
VecAYPX             1194 1.0 1.5941e-02106.7 9.89e+06 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 121379
VecAXPBYCZ           528 1.0 9.4921e-03156.6 1.49e+07 0.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 306144
VecMAXPY             132 1.0 7.9654e-0356.3 8.78e+06 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 215576
VecAssemblyBegin      33 1.0 1.9882e-02 3.0 0.00e+00 0.0 1.7e+04 1.7e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd        33 1.0 4.4571e-03149.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult     132 1.0 1.8402e-0362.2 7.43e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 78959
VecScatterBegin     1371 1.0 6.3832e-02215.6 0.00e+00 0.0 6.4e+06 7.3e+02 0.0e+00  0  0 73 52  0   0  0 73 52  0     0
VecScatterEnd       1371 1.0 1.2430e+006808.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSetRandom          12 1.0 2.0955e-03958.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize         132 1.0 2.2506e-02 4.3 2.23e+06 0.0 0.0e+00 0.0e+00 1.3e+02  0  0  0  0  6   0  0  0  0  6 19368
MatMult             1065 1.0 1.1689e+001737.8 4.26e+08 0.0 5.4e+06 8.1e+02 0.0e+00  0 48 62 49  0   0 48 62 49  0 71034
MatMultAdd           132 1.0 8.6626e-021090.7 1.08e+07 0.0 3.7e+05 1.9e+02 0.0e+00  0  1  4  1  0   0  1  4  1  0 24436
MatMultTranspose     132 1.0 5.6369e-013353.5 1.08e+07 0.0 3.7e+05 1.9e+02 0.0e+00  0  1  4  1  0   0  1  4  1  0  3755
MatSolve              33 0.0 5.7641e-05 0.0 9.24e+02 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    16
MatSOR               924 1.0 7.8933e-015774.2 3.08e+08 0.0 0.0e+00 0.0e+00 0.0e+00  0 34  0  0  0   0 34  0  0  0 74391
MatCholFctrSym         3 1.0 1.0960e-02545.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCholFctrNum         3 1.0 6.4946e-032309.9 1.20e+01 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert            12 1.0 3.4631e-0230.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale              36 1.0 2.6028e-02184.6 5.48e+06 0.0 6.1e+04 7.7e+02 0.0e+00  0  1  1  1  0   0  1  1  1  0 41516
MatResidual          132 1.0 1.8136e-011415.7 5.00e+07 0.0 6.7e+05 7.7e+02 0.0e+00  0  6  8  6  0   0  6  8  6  0 53866
MatAssemblyBegin     231 1.0 1.0487e+0118.5 0.00e+00 0.0 1.1e+05 1.9e+04 0.0e+00  3  0  1 23  0   3  0  1 23  0     0
MatAssemblyEnd       231 1.0 4.6673e-01 1.6 0.00e+00 0.0 5.3e+05 2.0e+02 5.0e+02  0  0  6  1 24   0  0  6  1 24     0
MatGetRow         202635 0.0 2.5265e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            3 0.0 1.2548e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCreateSubMat       12 1.0 5.2781e-02 1.0 0.00e+00 0.0 5.4e+03 4.8e+02 1.9e+02  0  0  0  0  9   0  0  0  0  9     0
MatGetOrdering         3 0.0 1.7776e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCoarsen            12 1.0 4.3461e-02 1.3 0.00e+00 0.0 1.1e+06 4.1e+02 1.2e+02  0  0 13  5  6   0  0 13  5  6     0
MatZeroEntries        12 1.0 1.7174e-03325.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView               24 1.3 6.3856e-02 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01  0  0  0  0  1   0  0  0  0  1     0
MatAXPY               12 1.0 2.7936e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult            12 1.0 3.3930e-01 1.1 4.54e+06 0.0 3.5e+05 5.5e+02 1.5e+02  0  1  4  2  7   0  1  4  2  7  2618
MatMatMultSym         12 1.0 2.5782e-01 1.0 0.00e+00 0.0 2.9e+05 5.1e+02 1.4e+02  0  0  3  2  7   0  0  3  2  7     0
MatMatMultNum         12 1.0 5.1971e-02 1.1 4.54e+06 0.0 6.1e+04 7.7e+02 0.0e+00  0  1  1  1  0   0  1  1  1  0 17089
MatPtAP               12 1.0 1.0058e+00 1.0 6.57e+07 0.0 6.6e+05 3.9e+03 1.8e+02  0  7  8 29  9   0  7  8 29  9 12715
MatPtAPSymbolic       12 1.0 5.2416e-01 1.0 0.00e+00 0.0 3.5e+05 4.0e+03 8.4e+01  0  0  4 16  4   0  0  4 16  4     0
MatPtAPNumeric        12 1.0 4.7255e-01 1.0 6.57e+07 0.0 3.2e+05 3.8e+03 9.6e+01  0  7  4 13  5   0  7  4 13  5 27062
MatGetLocalMat        36 1.0 1.3884e-0237.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol         36 1.0 5.1478e-0293.5 0.00e+00 0.0 4.3e+05 2.7e+03 0.0e+00  0  0  5 13  0   0  0  5 13  0     0
KSPGMRESOrthog       120 1.0 1.0809e-01 7.9 1.49e+07 0.0 0.0e+00 0.0e+00 1.2e+02  0  2  0  0  6   0  2  0  0  6 26883
KSPSetUp              45 1.0 1.8355e-02 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+01  0  0  0  0  1   0  0  0  0  1     0
KSPSolve               3 1.0 1.5817e+00 1.0 7.46e+08 0.0 5.6e+06 7.3e+02 4.9e+02  1 84 64 45 23   1 84 64 45 24 91555
PCGAMGGraph_AGG       12 1.0 3.4556e-01 1.0 4.54e+06 0.0 1.8e+05 5.2e+02 1.4e+02  0  1  2  1  7   0  1  2  1  7  2570
PCGAMGCoarse_AGG      12 1.0 4.4724e-02 1.2 0.00e+00 0.0 1.1e+06 4.1e+02 1.2e+02  0  0 13  5  6   0  0 13  5  6     0
PCGAMGProl_AGG        12 1.0 3.5267e-01 1.0 0.00e+00 0.0 1.7e+05 6.4e+02 1.9e+02  0  0  2  1  9   0  0  2  1  9     0
PCGAMGPOpt_AGG        12 1.0 5.1901e-01 1.0 6.89e+07 0.0 9.6e+05 6.9e+02 5.0e+02  0  8 11  7 24   0  8 11  7 24 26218
GAMG: createProl      12 1.0 1.2634e+00 1.0 7.34e+07 0.0 2.4e+06 5.4e+02 9.5e+02  1  8 28 15 46   1  8 28 15 46 11473
  Graph               24 1.0 3.3748e-01 1.0 4.54e+06 0.0 1.8e+05 5.2e+02 1.4e+02  0  1  2  1  7   0  1  2  1  7  2632
  MIS/Agg             12 1.0 4.3569e-02 1.3 0.00e+00 0.0 1.1e+06 4.1e+02 1.2e+02  0  0 13  5  6   0  0 13  5  6     0
  SA: col data        12 1.0 1.2331e-02 1.1 0.00e+00 0.0 1.2e+05 7.7e+02 4.8e+01  0  0  1  1  2   0  0  1  1  2     0
  SA: frmProl0        12 1.0 3.3684e-01 1.0 0.00e+00 0.0 4.7e+04 2.9e+02 9.6e+01  0  0  1  0  5   0  0  1  0  5     0
  SA: smooth          12 1.0 3.7167e-01 1.1 5.48e+06 0.0 3.5e+05 5.5e+02 1.7e+02  0  1  4  2  8   0  1  4  2  8  2907
GAMG: partLevel       12 1.0 1.0910e+00 1.0 6.57e+07 0.0 6.7e+05 3.9e+03 4.9e+02  0  7  8 29 24   0  7  8 29 24 11721
  repartition          6 1.0 8.1465e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+01  0  0  0  0  2   0  0  0  0  2     0
  Invert-Sort          6 1.0 3.8278e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01  0  0  0  0  1   0  0  0  0  1     0
  Move A               6 1.0 4.3095e-02 1.0 0.00e+00 0.0 2.9e+03 8.1e+02 1.0e+02  0  0  0  0  5   0  0  0  0  5     0
  Move P               6 1.0 1.6401e-02 1.1 0.00e+00 0.0 2.5e+03 9.0e+01 1.0e+02  0  0  0  0  5   0  0  0  0  5     0
PCSetUp                6 1.0 2.4101e+00 1.0 1.38e+08 0.0 3.1e+06 1.3e+03 1.5e+03  1 16 36 44 73   1 16 36 44 73 11321
PCSetUpOnBlocks       33 1.0 1.9296e-02 5.9 1.20e+01 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply               33 1.0 1.4915e+00 5.2 6.98e+08 0.0 5.4e+06 6.9e+02 3.8e+02  1 79 62 42 19   1 79 62 42 19 90795
SFSetGraph            12 1.0 3.8510e-06 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetUp               12 1.0 1.7290e-02 3.4 0.00e+00 0.0 9.1e+04 5.2e+02 0.0e+00  0  0  1  1  0   0  0  1  1  0     0
SFBcastBegin         144 1.0 8.3527e-0354.1 0.00e+00 0.0 1.0e+06 4.0e+02 0.0e+00  0  0 12  5  0   0  0 12  5  0     0
SFBcastEnd           144 1.0 1.6454e-02500.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector   573            573      1139304     0.
              Matrix   348            348      6730200     0.
      Matrix Coarsen    12             12         8208     0.
           Index Set   222            222       228240     0.
         Vec Scatter    81             81       113448     0.
       Krylov Solver    45             45       415944     0.
      Preconditioner    33             33        35448     0.
              Viewer     5              4         3584     0.
         PetscRandom    24             24        16656     0.
   Star Forest Graph    12             12        11328     0.
========================================================================================================================
Average time to get PetscTime(): 4.40516e-08
Average time for MPI_Barrier(): 1.38046e-05
Average time for zero size MPI_Send(): 1.46144e-06
#PETSc Option Table entries:
--prefix run_a0b0c0d0e0f0g0h0i0_n6_l3
-aggrmeth alla_serial
-beta 10.0
-betaest .true.
-check .false.
-datadt data_distribution_fully_assembled
-dm 3
-dom -1.0
-in_space .true.
-ksp_converged_reason
-ksp_max_it 500
-ksp_monitor
-ksp_norm_type unpreconditioned
-ksp_rtol 1.0e-6
-ksp_type cg
-ksp_view
-l 1
-levelset popcorn
-levelsettol 1.0e-6
-log_view
-lsdom 0.0
-maxl 8
-mg_coarse_sub_pc_factor_mat_ordering_type nd
-mg_coarse_sub_pc_type cholesky
-mg_levels_esteig_ksp_type cg
-no_signal_handler
-nruns 3
-order 1
-pc_gamg_agg_nsmooths 1
-pc_gamg_process_eq_limit 50
-pc_gamg_square_graph 0
-pc_gamg_type agg
-pc_type gamg
-petscrc /gpfs/scratch/upc26/upc26229/par_cell_aggr_poisson/paper/weak_scal_ompi/2nd-w-scal/petscrc-0
-tt 1
-uagg .true.
-wratio 10
-wsolution .false.
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 -with-blaslapack-dir=/apps/INTEL/2017.4/mkl --with-debugging=0 --with-x=0 --with-shared-libraries=1 --with-mpi=1 --with-64-bit-indices
-----------------------------------------
Libraries compiled on 2018-06-04 18:55:32 on login1 
Machine characteristics: Linux-4.4.103-92.56-default-x86_64-with-SuSE-12-x86_64
Using PETSc directory: /gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: mpicc  -fPIC  -wd1572 -g -O3  
Using Fortran compiler: mpif90  -fPIC -g -O3    
-----------------------------------------

Using include paths: -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/include -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -L/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/apps/INTEL/2017.4/mkl/lib/intel64 -L/apps/INTEL/2017.4/mkl/lib/intel64 -Wl,-rpath,/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -L/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -Wl,-rpath,/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -L/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.8 -L/usr/lib64/gcc/x86_64-suse-linux/4.8 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s -lstdc++ -ldl
-----------------------------------------

Ending run at Wed Nov  7 01:14:21 CET 2018
Ending script at Wed Nov  7 01:14:21 CET 2018
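As an aside, the per-event timings in these -log_view summaries can also be pulled out programmatically rather than by hand. The following is a minimal Python sketch (not part of the runs above) that reads the "Max" time column of a chosen -log_view event, assuming the three load3 attachments have been saved as the plain-text files named below (the file names and the event choice are placeholders):

    # Minimal sketch: extract the "Max" time column of one -log_view event.
    # File names below are placeholders for wherever the attachments are saved.
    EVENT = "PCSetUp"
    logs = ["load3_262.log", "load3_2097.log", "load3_16777.log"]

    for fname in logs:
        with open(fname) as f:
            for line in f:
                # Event lines look like "PCSetUp   6 1.0 2.4101e+00 1.0 ...";
                # after the event name come Count/Max, Count/Ratio, Time/Max, ...
                if line.startswith(EVENT + " "):
                    fields = line.split()
                    print(fname, EVENT, "max time =", fields[3], "s")
                    break

The same pattern should work for any event whose name has no embedded spaces (e.g., KSPSolve or MatPtAP); events such as "GAMG: createProl" would need a slightly different split.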
-------------- next part (-ksp_view / -log_view output for the 2097 MPI task run) --------------
KSP Object: 2097 MPI processes
  type: cg
  maximum iterations=500, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 2097 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=5 cycles=v
      Cycles per PCApply=1
      Using externally computed Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level =   0.   0.   0.  
        Threshold scaling factor for each level not specified = 1.
        AGG specific options
          Symmetric graph false
          Number of levels to square graph 0
          Number smoothing steps 1
  Coarse grid solver -- level -------------------------------
          KSP Object: (mg_coarse_) 2097 MPI processes
            type: preonly
            maximum iterations=10000, initial guess is zero
            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
            left preconditioning
            using NONE norm type for convergence test
          PC Object: (mg_coarse_) 2097 MPI processes
            type: bjacobi
              number of blocks = 2097
              Local solve is same for all blocks, in the following KSP and PC objects:
            KSP Object: (mg_coarse_sub_) 1 MPI processes
              type: preonly
              maximum iterations=1, initial guess is zero
              tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
              left preconditioning
              using NONE norm type for convergence test
            PC Object: (mg_coarse_sub_) 1 MPI processes
              type: cholesky
                out-of-place factorization
                tolerance for zero pivot 2.22045e-14
                matrix ordering: nd
                factor fill ratio given 5., needed 1.
                  Factored matrix follows:
                    Mat Object: 1 MPI processes
                      type: seqsbaij
                      rows=36, cols=36
                      package used to perform factorization: petsc
                      total: nonzeros=666, allocated nonzeros=666
                      total number of mallocs used during MatSetValues calls =0
                          block size is 1
              linear system matrix = precond matrix:
              Mat Object: 1 MPI processes
                type: seqaij
                rows=36, cols=36
                total: nonzeros=1296, allocated nonzeros=1296
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 8 nodes, limit used is 5
            linear system matrix = precond matrix:
            Mat Object: 2097 MPI processes
              type: mpiaij
              rows=36, cols=36
              total: nonzeros=1296, allocated nonzeros=1296
              total number of mallocs used during MatSetValues calls =0
                using I-node (on process 0) routines: found 8 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
        KSP Object: (mg_levels_1_) 2097 MPI processes
          type: chebyshev
            eigenvalue estimates used:  min = 0.167617, max = 1.84379
            eigenvalues estimate via cg min 0.106378, max 1.67617
            eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
          KSP Object: (mg_levels_1_esteig_) 2097 MPI processes
            type: cg
            maximum iterations=10, initial guess is zero
            tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
            left preconditioning
            using PRECONDITIONED norm type for convergence test
            estimating eigenvalues using noisy right hand side
          maximum iterations=2, nonzero initial guess
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
          left preconditioning
          using NONE norm type for convergence test
        PC Object: (mg_levels_1_) 2097 MPI processes
          type: sor
            type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
          linear system matrix = precond matrix:
          Mat Object: 2097 MPI processes
            type: mpiaij
            rows=2304, cols=2304
            total: nonzeros=598220, allocated nonzeros=598220
            total number of mallocs used during MatSetValues calls =0
              not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
      KSP Object: (mg_levels_2_) 2097 MPI processes
        type: chebyshev
          eigenvalue estimates used:  min = 0.14326, max = 1.57586
          eigenvalues estimate via cg min 0.0397147, max 1.4326
          eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
        KSP Object: (mg_levels_2_esteig_) 2097 MPI processes
          type: cg
          maximum iterations=10, initial guess is zero
          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
          left preconditioning
          using PRECONDITIONED norm type for convergence test
          estimating eigenvalues using noisy right hand side
        maximum iterations=2, nonzero initial guess
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_levels_2_) 2097 MPI processes
        type: sor
          type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
        linear system matrix = precond matrix:
        Mat Object: 2097 MPI processes
          type: mpiaij
          rows=112580, cols=112580
          total: nonzeros=23420598, allocated nonzeros=23420598
          total number of mallocs used during MatSetValues calls =0
            using scalable MatPtAP() implementation
            not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
    KSP Object: (mg_levels_3_) 2097 MPI processes
      type: chebyshev
        eigenvalue estimates used:  min = 0.136096, max = 1.49705
        eigenvalues estimate via cg min 0.0338371, max 1.36096
        eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
      KSP Object: (mg_levels_3_esteig_) 2097 MPI processes
        type: cg
        maximum iterations=10, initial guess is zero
        tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
        left preconditioning
        using PRECONDITIONED norm type for convergence test
        estimating eigenvalues using noisy right hand side
      maximum iterations=2, nonzero initial guess
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_3_) 2097 MPI processes
      type: sor
        type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
      linear system matrix = precond matrix:
      Mat Object: 2097 MPI processes
        type: mpiaij
        rows=2597459, cols=2597459
        total: nonzeros=202116477, allocated nonzeros=202116477
        total number of mallocs used during MatSetValues calls =0
          using scalable MatPtAP() implementation
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
  KSP Object: (mg_levels_4_) 2097 MPI processes
    type: chebyshev
      eigenvalue estimates used:  min = 0.335857, max = 3.69443
      eigenvalues estimate via cg min 0.0542715, max 3.35857
      eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
    KSP Object: (mg_levels_4_esteig_) 2097 MPI processes
      type: cg
      maximum iterations=10, initial guess is zero
      tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
      left preconditioning
      using PRECONDITIONED norm type for convergence test
      estimating eigenvalues using noisy right hand side
    maximum iterations=2, nonzero initial guess
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (mg_levels_4_) 2097 MPI processes
    type: sor
      type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
    linear system matrix = precond matrix:
    Mat Object: 2097 MPI processes
      type: mpiaij
      rows=32552439, cols=32552439
      total: nonzeros=920267663, allocated nonzeros=19531463400
      total number of mallocs used during MatSetValues calls =0
        not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Mat Object: 2097 MPI processes
    type: mpiaij
    rows=32552439, cols=32552439
    total: nonzeros=920267663, allocated nonzeros=19531463400
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines


---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/gpfs/scratch/upc26/upc26229/build_rel_fempar_cell_agg_ompi/FEMPAR/bin/par_test_h_adaptive_poisson_unfitted on a arch-linux2-c-opt named s07r2b02 with 2097 processors, by upc26229 Wed Nov  7 01:15:12 2018
Using Petsc Release Version 3.9.0, Apr, 07, 2018 

                         Max       Max/Min        Avg      Total 
Time (sec):           2.458e+02      1.00000   2.458e+02
Objects:              1.355e+03      1.00222   1.352e+03
Flop:                 9.818e+08      0.00000   6.789e+08  1.424e+12
Flop/sec:            3.994e+06      0.00000   2.762e+06  5.791e+09
MPI Messages:         1.103e+05   10027.81818   4.452e+04  9.336e+07
MPI Message Lengths:  5.692e+07   1293601.40909   8.203e+02  7.658e+10
MPI Reductions:       2.216e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 2.4583e+02 100.0%  1.4237e+12 100.0%  9.336e+07 100.0%  8.203e+02      100.0%  2.203e+03  99.4% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSided         12 1.0 4.6126e-02 2.1 0.00e+00 0.0 2.9e+05 8.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
BuildTwoSidedF       102 1.0 1.1057e+0119.1 0.00e+00 0.0 1.1e+06 1.5e+04 0.0e+00  3  0  1 22  0   3  0  1 22  0     0
VecMDot              120 1.0 1.3730e-01 4.0 7.37e+06 0.0 0.0e+00 0.0e+00 1.2e+02  0  1  0  0  5   0  1  0  0  5 84752
VecTDot              324 1.0 1.8522e+0019.8 5.77e+06 0.0 0.0e+00 0.0e+00 3.2e+02  0  1  0  0 15   0  1  0  0 15  4929
VecNorm              303 1.0 3.3777e-01 3.6 4.55e+06 0.0 0.0e+00 0.0e+00 3.0e+02  0  1  0  0 14   0  1  0  0 14 21298
VecScale             132 1.0 6.1834e-0351.4 7.37e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 188203
VecCopy              186 1.0 2.5507e-03101.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               696 1.0 4.9682e-0364.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              324 1.0 1.8679e-02143.7 5.77e+06 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 488834
VecAYPX             1293 1.0 1.8702e-02166.9 1.06e+07 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 895518
VecAXPBYCZ           576 1.0 1.2857e-02233.9 1.61e+07 0.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 1974899
VecMAXPY             132 1.0 8.0905e-03119.3 8.71e+06 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 1699935
VecAssemblyBegin      33 1.0 2.7789e-02 2.2 0.00e+00 0.0 1.3e+05 1.8e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd        33 1.0 5.7797e-03196.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult     132 1.0 3.0804e-03165.9 7.37e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 377790
VecScatterBegin     1470 1.0 9.9138e-02609.8 0.00e+00 0.0 6.5e+07 6.3e+02 0.0e+00  0  0 69 53  0   0  0 69 53  0     0
VecScatterEnd       1470 1.0 1.6594e+0012314.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSetRandom          12 1.0 2.2327e-031311.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize         132 1.0 5.1942e-02 1.7 2.21e+06 0.0 0.0e+00 0.0e+00 1.3e+02  0  0  0  0  6   0  0  0  0  6 67213
MatMult             1140 1.0 1.5379e+002703.7 4.83e+08 0.0 5.4e+07 7.0e+02 0.0e+00  0 48 58 50  0   0 48 58 50  0 447478
MatMultAdd           144 1.0 3.0789e-014466.1 1.18e+07 0.0 4.2e+06 1.5e+02 0.0e+00  0  1  5  1  0   0  1  5  1  0 59866
MatMultTranspose     144 1.0 7.5095e-015720.6 1.18e+07 0.0 4.2e+06 1.5e+02 0.0e+00  0  1  5  1  0   0  1  5  1  0 24545
MatSolve              36 0.0 1.6145e-04 0.0 9.20e+04 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   570
MatSOR               996 1.0 9.8081e-018037.0 3.40e+08 0.0 0.0e+00 0.0e+00 0.0e+00  0 34  0  0  0   0 34  0  0  0 497713
MatCholFctrSym         3 1.0 1.0764e-02556.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCholFctrNum         3 1.0 7.9658e-032816.4 1.08e+02 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert            12 1.0 3.8779e-0251.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale              36 1.0 3.0256e-02564.4 5.74e+06 0.0 5.7e+05 6.7e+02 0.0e+00  0  1  1  1  0   0  1  1  1  0 278106
MatResidual          144 1.0 2.4749e-012126.4 5.77e+07 0.0 6.9e+06 6.7e+02 0.0e+00  0  6  7  6  0   0  6  7  6  0 333518
MatAssemblyBegin     231 1.0 1.1035e+0115.1 0.00e+00 0.0 9.7e+05 1.7e+04 0.0e+00  3  0  1 22  0   3  0  1 22  0     0
MatAssemblyEnd       231 1.0 5.6362e-01 1.5 0.00e+00 0.0 5.5e+06 1.6e+02 5.0e+02  0  0  6  1 23   0  0  6  1 23     0
MatGetRow         201060 0.0 2.9343e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            3 0.0 4.4128e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCreateSubMat       12 1.0 1.0294e-01 1.1 0.00e+00 0.0 2.6e+05 1.3e+02 1.9e+02  0  0  0  0  9   0  0  0  0  9     0
MatGetOrdering         3 0.0 1.7719e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCoarsen            12 1.0 1.3539e-01 1.2 0.00e+00 0.0 1.7e+07 3.0e+02 2.5e+02  0  0 18  7 11   0  0 18  7 11     0
MatZeroEntries        12 1.0 1.8496e-03427.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView               24 1.3 6.3806e-02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01  0  0  0  0  1   0  0  0  0  1     0
MatAXPY               12 1.0 3.7202e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult            12 1.0 7.9738e-01 1.5 4.80e+06 0.0 3.3e+06 4.9e+02 1.5e+02  0  0  3  2  7   0  0  3  2  7  8626
MatMatMultSym         12 1.0 4.7848e-01 1.0 0.00e+00 0.0 2.7e+06 4.5e+02 1.4e+02  0  0  3  2  6   0  0  3  2  7     0
MatMatMultNum         12 1.0 7.1849e-02 1.1 4.80e+06 0.0 5.7e+05 6.7e+02 0.0e+00  0  0  1  1  0   0  0  1  1  0 95734
MatPtAP               12 1.0 1.3269e+00 1.0 6.99e+07 0.0 6.5e+06 3.3e+03 1.9e+02  1  7  7 28  8   1  7  7 28  8 75292
MatPtAPSymbolic       12 1.0 7.0335e-01 1.0 0.00e+00 0.0 3.2e+06 3.6e+03 8.4e+01  0  0  3 15  4   0  0  3 15  4     0
MatPtAPNumeric        12 1.0 6.0478e-01 1.0 6.99e+07 0.0 3.3e+06 3.0e+03 9.6e+01  0  7  4 13  4   0  7  4 13  4 165193
MatGetLocalMat        36 1.0 1.9234e-0255.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol         36 1.0 1.0496e-01201.1 0.00e+00 0.0 4.0e+06 2.4e+03 0.0e+00  0  0  4 13  0   0  0  4 13  0     0
KSPGMRESOrthog       120 1.0 1.4184e-01 3.7 1.47e+07 0.0 0.0e+00 0.0e+00 1.2e+02  0  2  0  0  5   0  2  0  0  5 164085
KSPSetUp              45 1.0 7.5013e-02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+01  0  0  0  0  1   0  0  0  0  1     0
KSPSolve               3 1.0 2.1762e+00 1.0 8.36e+08 0.0 5.7e+07 6.3e+02 5.0e+02  1 85 61 47 22   1 85 61 47 22 556264
PCGAMGGraph_AGG       12 1.0 4.0398e-01 1.0 4.80e+06 0.0 1.7e+06 4.5e+02 1.4e+02  0  0  2  1  6   0  0  2  1  7 17026
PCGAMGCoarse_AGG      12 1.0 1.3612e-01 1.1 0.00e+00 0.0 1.7e+07 3.0e+02 2.5e+02  0  0 18  7 11   0  0 18  7 11     0
PCGAMGProl_AGG        12 1.0 4.2615e-01 1.0 0.00e+00 0.0 1.5e+06 5.7e+02 1.9e+02  0  0  2  1  9   0  0  2  1  9     0
PCGAMGPOpt_AGG        12 1.0 1.0610e+00 1.0 7.12e+07 0.0 9.0e+06 6.1e+02 5.0e+02  0  7 10  7 22   0  7 10  7 23 100280
GAMG: createProl      12 1.0 2.0248e+00 1.0 7.60e+07 0.0 2.9e+07 4.2e+02 1.1e+03  1  8 31 16 49   1  8 31 16 49 55945
  Graph               24 1.0 3.9569e-01 1.0 4.80e+06 0.0 1.7e+06 4.5e+02 1.4e+02  0  0  2  1  6   0  0  2  1  7 17383
  MIS/Agg             12 1.0 1.3554e-01 1.2 0.00e+00 0.0 1.7e+07 3.0e+02 2.5e+02  0  0 18  7 11   0  0 18  7 11     0
  SA: col data        12 1.0 2.5961e-02 1.1 0.00e+00 0.0 1.1e+06 6.7e+02 4.8e+01  0  0  1  1  2   0  0  1  1  2     0
  SA: frmProl0        12 1.0 3.9272e-01 1.0 0.00e+00 0.0 3.9e+05 2.7e+02 9.6e+01  0  0  0  0  4   0  0  0  0  4     0
  SA: smooth          12 1.0 8.4174e-01 1.4 5.74e+06 0.0 3.3e+06 4.9e+02 1.7e+02  0  1  3  2  8   0  1  3  2  8  9996
GAMG: partLevel       12 1.0 1.5243e+00 1.0 6.99e+07 0.0 6.8e+06 3.2e+03 4.9e+02  1  7  7 28 22   1  7  7 28 22 65544
  repartition          6 1.0 4.4372e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+01  0  0  0  0  2   0  0  0  0  2     0
  Invert-Sort          6 1.0 2.8916e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01  0  0  0  0  1   0  0  0  0  1     0
  Move A               6 1.0 6.3890e-02 1.1 0.00e+00 0.0 1.2e+05 2.7e+02 1.0e+02  0  0  0  0  5   0  0  0  0  5     0
  Move P               6 1.0 4.9374e-02 1.2 0.00e+00 0.0 1.5e+05 1.6e+01 1.0e+02  0  0  0  0  5   0  0  0  0  5     0
PCSetUp                6 1.0 3.6222e+00 1.0 1.46e+08 0.0 3.6e+07 9.5e+02 1.6e+03  1 15 38 44 74   1 15 38 44 75 58854
PCSetUpOnBlocks       36 1.0 2.1539e-02117.9 1.08e+02 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply               36 1.0 1.9656e+00 5.3 7.80e+08 0.0 5.5e+07 5.9e+02 3.8e+02  1 79 59 43 17   1 79 59 43 17 575586
SFSetGraph            12 1.0 5.8748e-06 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetUp               12 1.0 5.2335e-02 2.0 0.00e+00 0.0 8.6e+05 4.5e+02 0.0e+00  0  0  1  1  0   0  0  1  1  0     0
SFBcastBegin         273 1.0 2.4409e-02186.6 0.00e+00 0.0 1.6e+07 2.9e+02 0.0e+00  0  0 17  6  0   0  0 17  6  0     0
SFBcastEnd           273 1.0 4.6278e-02920.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector   573            573      1019832     0.
              Matrix   348            348      1351536     0.
      Matrix Coarsen    12             12         8208     0.
           Index Set   222            222       418080     0.
         Vec Scatter    81             81       113400     0.
       Krylov Solver    45             45       415944     0.
      Preconditioner    33             33        35448     0.
              Viewer     5              4         3584     0.
         PetscRandom    24             24        16656     0.
   Star Forest Graph    12             12        11328     0.
========================================================================================================================
Average time to get PetscTime(): 4.2282e-08
Average time for MPI_Barrier(): 1.84676e-05
Average time for zero size MPI_Send(): 1.59141e-06
#PETSc Option Table entries:
--prefix run_a0b0c0d0e0f0g0h0i0_n7_l3
-aggrmeth alla_serial
-beta 10.0
-betaest .true.
-check .false.
-datadt data_distribution_fully_assembled
-dm 3
-dom -1.0
-in_space .true.
-ksp_converged_reason
-ksp_max_it 500
-ksp_monitor
-ksp_norm_type unpreconditioned
-ksp_rtol 1.0e-6
-ksp_type cg
-ksp_view
-l 1
-levelset popcorn
-levelsettol 1.0e-6
-log_view
-lsdom 0.0
-maxl 9
-mg_coarse_sub_pc_factor_mat_ordering_type nd
-mg_coarse_sub_pc_type cholesky
-mg_levels_esteig_ksp_type cg
-no_signal_handler
-nruns 3
-order 1
-pc_gamg_agg_nsmooths 1
-pc_gamg_process_eq_limit 50
-pc_gamg_square_graph 0
-pc_gamg_type agg
-pc_type gamg
-petscrc /gpfs/scratch/upc26/upc26229/par_cell_aggr_poisson/paper/weak_scal_ompi/2nd-w-scal/petscrc-0
-tt 1
-uagg .true.
-wratio 10
-wsolution .false.
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 -with-blaslapack-dir=/apps/INTEL/2017.4/mkl --with-debugging=0 --with-x=0 --with-shared-libraries=1 --with-mpi=1 --with-64-bit-indices
-----------------------------------------
Libraries compiled on 2018-06-04 18:55:32 on login1 
Machine characteristics: Linux-4.4.103-92.56-default-x86_64-with-SuSE-12-x86_64
Using PETSc directory: /gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: mpicc  -fPIC  -wd1572 -g -O3  
Using Fortran compiler: mpif90  -fPIC -g -O3    
-----------------------------------------

Using include paths: -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/include -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -L/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/apps/INTEL/2017.4/mkl/lib/intel64 -L/apps/INTEL/2017.4/mkl/lib/intel64 -Wl,-rpath,/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -L/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -Wl,-rpath,/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -L/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.8 -L/usr/lib64/gcc/x86_64-suse-linux/4.8 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s -lstdc++ -ldl
-----------------------------------------

Ending run at Wed Nov  7 01:15:13 CET 2018
Ending script at Wed Nov  7 01:15:13 CET 2018
-------------- next part (-ksp_view / -log_view output for the 16777 MPI task run) --------------
Linear solve converged due to CONVERGED_RTOL iterations 12
KSP Object: 16777 MPI processes
  type: cg
  maximum iterations=500, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 16777 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=6 cycles=v
      Cycles per PCApply=1
      Using externally computed Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level =   0.   0.   0.   0.  
        Threshold scaling factor for each level not specified = 1.
        AGG specific options
          Symmetric graph false
          Number of levels to square graph 0
          Number smoothing steps 1
  Coarse grid solver -- level -------------------------------
            KSP Object: (mg_coarse_) 16777 MPI processes
              type: preonly
              maximum iterations=10000, initial guess is zero
              tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
              left preconditioning
              using NONE norm type for convergence test
            PC Object: (mg_coarse_) 16777 MPI processes
              type: bjacobi
                number of blocks = 16777
                Local solve is same for all blocks, in the following KSP and PC objects:
              KSP Object: (mg_coarse_sub_) 1 MPI processes
                type: preonly
                maximum iterations=1, initial guess is zero
                tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
                left preconditioning
                using NONE norm type for convergence test
              PC Object: (mg_coarse_sub_) 1 MPI processes
                type: cholesky
                  out-of-place factorization
                  tolerance for zero pivot 2.22045e-14
                  matrix ordering: nd
                  factor fill ratio given 5., needed 1.
                    Factored matrix follows:
                      Mat Object: 1 MPI processes
                        type: seqsbaij
                        rows=4, cols=4
                        package used to perform factorization: petsc
                        total: nonzeros=10, allocated nonzeros=10
                        total number of mallocs used during MatSetValues calls =0
                            block size is 1
                linear system matrix = precond matrix:
                Mat Object: 1 MPI processes
                  type: seqaij
                  rows=4, cols=4
                  total: nonzeros=16, allocated nonzeros=16
                  total number of mallocs used during MatSetValues calls =0
                    using I-node routines: found 1 nodes, limit used is 5
              linear system matrix = precond matrix:
              Mat Object: 16777 MPI processes
                type: mpiaij
                rows=4, cols=4
                total: nonzeros=16, allocated nonzeros=16
                total number of mallocs used during MatSetValues calls =0
                  using I-node (on process 0) routines: found 1 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
          KSP Object: (mg_levels_1_) 16777 MPI processes
            type: chebyshev
              eigenvalue estimates used:  min = 0.0999843, max = 1.09983
              eigenvalues estimate via cg min 0.575611, max 0.999843
              eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
            KSP Object: (mg_levels_1_esteig_) 16777 MPI processes
              type: cg
              maximum iterations=10, initial guess is zero
              tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
              left preconditioning
              using PRECONDITIONED norm type for convergence test
              estimating eigenvalues using noisy right hand side
            maximum iterations=2, nonzero initial guess
            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
            left preconditioning
            using NONE norm type for convergence test
          PC Object: (mg_levels_1_) 16777 MPI processes
            type: sor
              type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
            linear system matrix = precond matrix:
            Mat Object: 16777 MPI processes
              type: mpiaij
              rows=269, cols=269
              total: nonzeros=46217, allocated nonzeros=46217
              total number of mallocs used during MatSetValues calls =0
                not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
        KSP Object: (mg_levels_2_) 16777 MPI processes
          type: chebyshev
            eigenvalue estimates used:  min = 0.18053, max = 1.98584
            eigenvalues estimate via cg min 0.0637775, max 1.8053
            eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
          KSP Object: (mg_levels_2_esteig_) 16777 MPI processes
            type: cg
            maximum iterations=10, initial guess is zero
            tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
            left preconditioning
            using PRECONDITIONED norm type for convergence test
            estimating eigenvalues using noisy right hand side
          maximum iterations=2, nonzero initial guess
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
          left preconditioning
          using NONE norm type for convergence test
        PC Object: (mg_levels_2_) 16777 MPI processes
          type: sor
            type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
          linear system matrix = precond matrix:
          Mat Object: 16777 MPI processes
            type: mpiaij
            rows=18451, cols=18451
            total: nonzeros=5470355, allocated nonzeros=5470355
            total number of mallocs used during MatSetValues calls =0
              not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
      KSP Object: (mg_levels_3_) 16777 MPI processes
        type: chebyshev
          eigenvalue estimates used:  min = 0.156694, max = 1.72364
          eigenvalues estimate via cg min 0.0434381, max 1.56694
          eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
        KSP Object: (mg_levels_3_esteig_) 16777 MPI processes
          type: cg
          maximum iterations=10, initial guess is zero
          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
          left preconditioning
          using PRECONDITIONED norm type for convergence test
          estimating eigenvalues using noisy right hand side
        maximum iterations=2, nonzero initial guess
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_levels_3_) 16777 MPI processes
        type: sor
          type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
        linear system matrix = precond matrix:
        Mat Object: 16777 MPI processes
          type: mpiaij
          rows=908791, cols=908791
          total: nonzeros=191134331, allocated nonzeros=191134331
          total number of mallocs used during MatSetValues calls =0
            using scalable MatPtAP() implementation
            not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
    KSP Object: (mg_levels_4_) 16777 MPI processes
      type: chebyshev
        eigenvalue estimates used:  min = 0.13616, max = 1.49776
        eigenvalues estimate via cg min 0.0335059, max 1.3616
        eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
      KSP Object: (mg_levels_4_esteig_) 16777 MPI processes
        type: cg
        maximum iterations=10, initial guess is zero
        tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
        left preconditioning
        using PRECONDITIONED norm type for convergence test
        estimating eigenvalues using noisy right hand side
      maximum iterations=2, nonzero initial guess
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_4_) 16777 MPI processes
      type: sor
        type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
      linear system matrix = precond matrix:
      Mat Object: 16777 MPI processes
        type: mpiaij
        rows=20910556, cols=20910556
        total: nonzeros=1618051660, allocated nonzeros=1618051660
        total number of mallocs used during MatSetValues calls =0
          using scalable MatPtAP() implementation
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 5 -------------------------------
  KSP Object: (mg_levels_5_) 16777 MPI processes
    type: chebyshev
      eigenvalue estimates used:  min = 0.336389, max = 3.70028
      eigenvalues estimate via cg min 0.0534122, max 3.36389
      eigenvalues estimated using cg with translations  [0. 0.1; 0. 1.1]
    KSP Object: (mg_levels_5_esteig_) 16777 MPI processes
      type: cg
      maximum iterations=10, initial guess is zero
      tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
      left preconditioning
      using PRECONDITIONED norm type for convergence test
      estimating eigenvalues using noisy right hand side
    maximum iterations=2, nonzero initial guess
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (mg_levels_5_) 16777 MPI processes
    type: sor
      type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
    linear system matrix = precond matrix:
    Mat Object: 16777 MPI processes
      type: mpiaij
      rows=260421387, cols=260421387
      total: nonzeros=7197955643, allocated nonzeros=156252832200
      total number of mallocs used during MatSetValues calls =0
        not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Mat Object: 16777 MPI processes
    type: mpiaij
    rows=260421387, cols=260421387
    total: nonzeros=7197955643, allocated nonzeros=156252832200
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/gpfs/scratch/upc26/upc26229/build_rel_fempar_cell_agg_ompi/FEMPAR/bin/par_test_h_adaptive_poisson_unfitted on a arch-linux2-c-opt named s02r2b25 with 16777 processors, by upc26229 Wed Nov  7 01:31:35 2018
Using Petsc Release Version 3.9.0, Apr, 07, 2018 

                         Max       Max/Min        Avg      Total 
Time (sec):           3.137e+02      1.00000   3.137e+02
Objects:              1.745e+03      1.00172   1.742e+03
Flop:                 9.833e+08      0.00000   6.678e+08  1.120e+13
Flop/sec:            3.134e+06      0.00000   2.129e+06  3.571e+10
MPI Messages:         2.180e+05   19813.90909   5.157e+04  8.652e+08
MPI Message Lengths:  6.226e+07   1414906.81818   7.366e+02  6.374e+11
MPI Reductions:       3.011e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 3.1371e+02 100.0%  1.1204e+13 100.0%  8.652e+08 100.0%  7.366e+02      100.0%  2.998e+03  99.6% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSided         15 1.0 1.2635e-01 2.4 0.00e+00 0.0 2.3e+06 8.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
BuildTwoSidedF       123 1.0 1.4200e+01 7.9 0.00e+00 0.0 9.0e+06 1.5e+04 0.0e+00  4  0  1 21  0   4  0  1 21  0     0
VecMDot              150 1.0 4.6170e-01 1.7 7.48e+06 0.0 0.0e+00 0.0e+00 1.5e+02  0  1  0  0  5   0  1  0  0  5 201728
VecTDot              384 1.0 3.1098e+00 4.0 5.80e+06 0.0 0.0e+00 0.0e+00 3.8e+02  0  1  0  0 13   0  1  0  0 13 23494
VecNorm              369 1.0 1.1910e+00 1.5 4.59e+06 0.0 0.0e+00 0.0e+00 3.7e+02  0  1  0  0 12   0  1  0  0 12 48340
VecScale             165 1.0 1.2617e-02212.0 7.49e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 738275
VecCopy              231 1.0 1.0534e-02327.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               864 1.0 1.9701e-02189.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              387 1.0 4.0720e-02309.4 5.80e+06 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 1794326
VecAYPX             1608 1.0 2.3452e-02168.1 1.07e+07 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 5715622
VecAXPBYCZ           720 1.0 1.8540e-02276.4 1.63e+07 0.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 10961725
VecMAXPY             165 1.0 1.5714e-02398.8 8.85e+06 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 7005508
VecAssemblyBegin      42 1.0 4.3601e-01 1.5 0.00e+00 0.0 1.0e+06 1.8e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd        42 1.0 5.7755e-03134.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult     165 1.0 2.8011e-03138.9 7.49e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 3325353
VecScatterBegin     1842 1.0 1.0872e-01510.7 0.00e+00 0.0 5.1e+08 6.4e+02 0.0e+00  0  0 59 51  0   0  0 59 51  0     0
VecScatterEnd       1842 1.0 2.2453e+0013155.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSetRandom          15 1.0 3.1115e-031343.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize         165 1.0 5.2679e-01 1.3 2.25e+06 0.0 0.0e+00 0.0e+00 1.6e+02  0  0  0  0  5   0  0  0  0  6 53045
MatMult             1416 1.0 1.9601e+002744.6 4.82e+08 0.0 4.3e+08 7.2e+02 0.0e+00  0 48 50 48  0   0 48 50 48  0 2757975
MatMultAdd           180 1.0 7.8791e-019183.2 1.33e+07 0.0 3.3e+07 1.6e+02 0.0e+00  0  1  4  1  0   0  1  4  1  0 186808
MatMultTranspose     180 1.0 1.0899e+006594.6 1.33e+07 0.0 3.3e+07 1.6e+02 0.0e+00  0  1  4  1  0   0  1  4  1  0 135049
MatSolve              36 0.0 4.2692e-05 0.0 1.01e+03 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    24
MatSOR              1245 1.0 1.0514e+007119.8 3.46e+08 0.0 0.0e+00 0.0e+00 0.0e+00  0 34  0  0  0   0 34  0  0  0 3644239
MatCholFctrSym         3 1.0 1.4223e-02755.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCholFctrNum         3 1.0 8.4996e-033106.3 1.20e+01 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert            15 1.0 4.5698e-0228.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale              45 1.0 1.0334e-011406.4 5.72e+06 0.0 4.5e+06 6.9e+02 0.0e+00  0  1  1  0  0   0  1  1  0  0 641987
MatResidual          180 1.0 3.4803e-012420.0 5.74e+07 0.0 5.4e+07 6.9e+02 0.0e+00  0  6  6  6  0   0  6  6  6  0 1864527
MatAssemblyBegin     306 1.0 1.3789e+01 5.9 0.00e+00 0.0 8.0e+06 1.7e+04 0.0e+00  4  0  1 21  0   4  0  1 21  0     0
MatAssemblyEnd       306 1.0 1.3870e+01 1.0 0.00e+00 0.0 4.6e+07 1.5e+02 6.5e+02  4  0  5  1 22   4  0  5  1 22     0
MatGetRow         204147 0.0 3.1561e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            3 0.0 1.3432e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCreateSubMat       18 1.0 7.7521e+00 1.0 0.00e+00 0.0 1.4e+06 2.2e+02 2.8e+02  2  0  0  0  9   2  0  0  0  9     0
MatGetOrdering         3 0.0 1.4079e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCoarsen            15 1.0 1.0908e+00 1.1 0.00e+00 0.0 2.6e+08 2.4e+02 5.3e+02  0  0 30 10 18   0  0 30 10 18     0
MatZeroEntries        15 1.0 2.3229e-03397.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView               27 1.3 3.1142e-01 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+01  0  0  0  0  1   0  0  0  0  1     0
MatAXPY               15 1.0 2.4644e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult            15 1.0 6.6292e+00 1.0 4.79e+06 0.0 2.6e+07 4.9e+02 1.9e+02  2  0  3  2  6   2  0  3  2  6  8157
MatMatMultSym         15 1.0 6.2354e+00 1.0 0.00e+00 0.0 2.1e+07 4.5e+02 1.8e+02  2  0  2  2  6   2  0  2  2  6     0
MatMatMultNum         15 1.0 1.1974e-01 1.3 4.79e+06 0.0 4.5e+06 6.9e+02 0.0e+00  0  0  1  0  0   0  0  1  0  0 451599
MatPtAP               15 1.0 7.7822e+00 1.0 7.71e+07 0.0 5.5e+07 3.2e+03 2.3e+02  2  7  6 27  8   2  7  6 27  8 101415
MatPtAPSymbolic       15 1.0 4.5700e+00 1.0 0.00e+00 0.0 2.5e+07 3.7e+03 1.0e+02  1  0  3 15  3   1  0  3 15  4     0
MatPtAPNumeric        15 1.0 3.3127e+00 1.0 7.71e+07 0.0 2.9e+07 2.8e+03 1.2e+02  1  7  3 13  4   1  7  3 13  4 238246
MatGetLocalMat        45 1.0 1.9456e-0244.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol         45 1.0 1.0016e-01150.0 0.00e+00 0.0 3.2e+07 2.5e+03 0.0e+00  0  0  4 12  0   0  0  4 12  0     0
KSPGMRESOrthog       150 1.0 4.6669e-01 1.7 1.50e+07 0.0 0.0e+00 0.0e+00 1.5e+02  0  2  0  0  5   0  2  0  0  5 399162
KSPSetUp              54 1.0 1.4842e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+01  0  0  0  0  1   0  0  0  0  1     0
KSPSolve               3 1.0 3.9999e+00 1.1 8.39e+08 0.0 4.5e+08 6.4e+02 5.9e+02  1 85 52 45 20   1 85 52 45 20 2380078
PCGAMGGraph_AGG       15 1.0 6.2019e+00 1.0 4.79e+06 0.0 1.4e+07 4.6e+02 1.8e+02  2  0  2  1  6   2  0  2  1  6  8719
PCGAMGCoarse_AGG      15 1.0 1.0930e+00 1.1 0.00e+00 0.0 2.6e+08 2.4e+02 5.3e+02  0  0 30 10 18   0  0 30 10 18     0
PCGAMGProl_AGG        15 1.0 7.1075e+00 1.0 0.00e+00 0.0 1.2e+07 5.8e+02 2.4e+02  2  0  1  1  8   2  0  1  1  8     0
PCGAMGPOpt_AGG        15 1.0 1.1478e+01 1.0 7.11e+07 0.0 7.1e+07 6.2e+02 6.2e+02  4  8  8  7 21   4  8  8  7 21 73252
GAMG: createProl      15 1.0 2.5825e+01 1.0 7.59e+07 0.0 3.5e+08 3.4e+02 1.6e+03  8  8 41 19 52   8  8 41 19 53 34652
  Graph               30 1.0 6.1862e+00 1.0 4.79e+06 0.0 1.4e+07 4.6e+02 1.8e+02  2  0  2  1  6   2  0  2  1  6  8741
  MIS/Agg             15 1.0 1.0910e+00 1.1 0.00e+00 0.0 2.6e+08 2.4e+02 5.3e+02  0  0 30 10 18   0  0 30 10 18     0
  SA: col data        15 1.0 2.3531e+00 1.0 0.00e+00 0.0 9.0e+06 6.9e+02 6.0e+01  1  0  1  1  2   1  0  1  1  2     0
  SA: frmProl0        15 1.0 2.8294e+00 1.0 0.00e+00 0.0 3.2e+06 2.7e+02 1.2e+02  1  0  0  0  4   1  0  0  0  4     0
  SA: smooth          15 1.0 7.7540e+00 1.0 5.72e+06 0.0 2.6e+07 4.9e+02 2.2e+02  2  1  3  2  7   2  1  3  2  7  8556
GAMG: partLevel       15 1.0 1.9884e+01 1.0 7.71e+07 0.0 5.6e+07 3.1e+03 6.8e+02  6  7  6 28 23   6  7  6 28 23 39691
  repartition          9 1.0 1.2793e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.4e+01  0  0  0  0  2   0  0  0  0  2     0
  Invert-Sort          9 1.0 1.6471e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+01  1  0  0  0  1   1  0  0  0  1     0
  Move A               9 1.0 3.9110e+00 1.0 0.00e+00 0.0 5.1e+05 5.6e+02 1.5e+02  1  0  0  0  5   1  0  0  0  5     0
  Move P               9 1.0 3.9752e+00 1.0 0.00e+00 0.0 8.8e+05 2.1e+01 1.5e+02  1  0  0  0  5   1  0  0  0  5     0
PCSetUp                6 1.0 4.7888e+01 1.0 1.51e+08 0.0 4.1e+08 7.2e+02 2.3e+03 15 15 47 46 78  15 15 47 46 78 35168
PCSetUpOnBlocks       36 1.0 2.2114e-0248.6 1.20e+01 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply               36 1.0 3.2792e+00 2.8 7.84e+08 0.0 4.4e+08 6.1e+02 4.8e+02  1 79 50 41 16   1 79 50 41 16 2713702
SFSetGraph            15 1.0 1.1425e-05 9.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetUp               15 1.0 1.4317e-01 1.9 0.00e+00 0.0 6.8e+06 4.6e+02 0.0e+00  0  0  1  0  0   0  0  1  0  0     0
SFBcastBegin         564 1.0 3.8685e-02183.9 0.00e+00 0.0 2.5e+08 2.4e+02 0.0e+00  0  0 29  9  0   0  0 29  9  0     0
SFBcastEnd           564 1.0 3.0283e-013088.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector   735            735      1523448     0.
              Matrix   444            444     11007960     0.
      Matrix Coarsen    15             15        10260     0.
           Index Set   303            303      2069856     0.
         Vec Scatter   105            105       147192     0.
       Krylov Solver    54             54       516960     0.
      Preconditioner    39             39        41352     0.
              Viewer     5              4         3584     0.
         PetscRandom    30             30        20820     0.
   Star Forest Graph    15             15        14160     0.
========================================================================================================================
Average time to get PetscTime(): 4.25614e-08
Average time for MPI_Barrier(): 0.00116431
Average time for zero size MPI_Send(): 1.79813e-06
#PETSc Option Table entries:
--prefix run_a0b0c0d0e0f0g0h0i0_n8_l3
-aggrmeth alla_serial
-beta 10.0
-betaest .true.
-check .false.
-datadt data_distribution_fully_assembled
-dm 3
-dom -1.0
-in_space .true.
-ksp_converged_reason
-ksp_max_it 500
-ksp_monitor
-ksp_norm_type unpreconditioned
-ksp_rtol 1.0e-6
-ksp_type cg
-ksp_view
-l 1
-levelset popcorn
-levelsettol 1.0e-6
-log_view
-lsdom 0.0
-maxl 10
-mg_coarse_sub_pc_factor_mat_ordering_type nd
-mg_coarse_sub_pc_type cholesky
-mg_levels_esteig_ksp_type cg
-no_signal_handler
-nruns 3
-order 1
-pc_gamg_agg_nsmooths 1
-pc_gamg_process_eq_limit 50
-pc_gamg_square_graph 0
-pc_gamg_type agg
-pc_type gamg
-petscrc /gpfs/scratch/upc26/upc26229/par_cell_aggr_poisson/paper/weak_scal_ompi/2nd-w-scal/petscrc-0
-tt 1
-uagg .true.
-wratio 10
-wsolution .false.
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 -with-blaslapack-dir=/apps/INTEL/2017.4/mkl --with-debugging=0 --with-x=0 --with-shared-libraries=1 --with-mpi=1 --with-64-bit-indices
-----------------------------------------
Libraries compiled on 2018-06-04 18:55:32 on login1 
Machine characteristics: Linux-4.4.103-92.56-default-x86_64-with-SuSE-12-x86_64
Using PETSc directory: /gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: mpicc  -fPIC  -wd1572 -g -O3  
Using Fortran compiler: mpif90  -fPIC -g -O3    
-----------------------------------------

Using include paths: -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/include -I/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -L/gpfs/scratch/upc26/upc26229/petsc_cell_agg_openmpi/release/petsc-3.9.0/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/apps/INTEL/2017.4/mkl/lib/intel64 -L/apps/INTEL/2017.4/mkl/lib/intel64 -Wl,-rpath,/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -L/usr/mpi/intel/openmpi-1.10.4-hfi/lib64 -Wl,-rpath,/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -L/gpfs/apps/MN4/INTEL/2018.0.128/compilers_and_libraries_2018.0.128/linux/compiler/lib/intel64_lin -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.8 -L/usr/lib64/gcc/x86_64-suse-linux/4.8 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s -lstdc++ -ldl
-----------------------------------------

Ending run at Wed Nov  7 01:31:37 CET 2018
Ending script at Wed Nov  7 01:31:37 CET 2018
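For completeness, below is a minimal sketch (not part of the runs above) of how the same CG + GAMG(agg) configuration listed in the PETSc option table could be set up programmatically through the PETSc C API. It assumes the matrix A and the vectors b, x have already been assembled elsewhere; KSPSetFromOptions() is kept so that the remaining command-line/petscrc options (e.g. -mg_levels_esteig_ksp_type cg, -mg_coarse_sub_pc_type cholesky) still apply and take precedence.

/* Minimal sketch: programmatic equivalent of the CG + GAMG(agg) options above.
   Assumes A (Mat) and b, x (Vec) are already assembled. PETSc 3.9 API. */
#include <petscksp.h>

PetscErrorCode solve_with_gamg(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A); CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPCG); CHKERRQ(ierr);                          /* -ksp_type cg */
  ierr = KSPSetNormType(ksp, KSP_NORM_UNPRECONDITIONED); CHKERRQ(ierr);  /* -ksp_norm_type unpreconditioned */
  ierr = KSPSetTolerances(ksp, 1.0e-6, PETSC_DEFAULT,
                          PETSC_DEFAULT, 500); CHKERRQ(ierr);            /* -ksp_rtol 1.0e-6, -ksp_max_it 500 */

  ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
  ierr = PCSetType(pc, PCGAMG); CHKERRQ(ierr);                           /* -pc_type gamg */
  ierr = PCGAMGSetType(pc, PCGAMGAGG); CHKERRQ(ierr);                    /* -pc_gamg_type agg */
  ierr = PCGAMGSetNSmooths(pc, 1); CHKERRQ(ierr);                        /* -pc_gamg_agg_nsmooths 1 */
  ierr = PCGAMGSetProcEqLim(pc, 50); CHKERRQ(ierr);                      /* -pc_gamg_process_eq_limit 50 */
  ierr = PCGAMGSetSquareGraph(pc, 0); CHKERRQ(ierr);                     /* -pc_gamg_square_graph 0 */

  /* Pick up the remaining options from the command line / petscrc file. */
  ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);

  ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp); CHKERRQ(ierr);
  return 0;
}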

