[petsc-dev] Kokkos/Crusher performance

Mark Adams mfadams at lbl.gov
Tue Jan 25 11:29:06 CST 2022


> > VecPointwiseMult     201 1.0 1.0471e-02 1.1 3.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   1  1  0  0  0 235882   290088      0 0.00e+00    0 0.00e+00 100
> > VecScatterBegin      200 1.0 1.8458e-01 1.1 0.00e+00 0.0 1.1e+04 6.6e+04 1.0e+00  2  0 99 79  0  19  0100100  0     0       0      1 2.04e-04    0 0.00e+00  0
> > VecScatterEnd        200 1.0 1.9007e-02 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
>
> I'm curious how these change with problem size. (To what extent are we
> latency- vs. bandwidth-limited?)
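
A back-of-envelope split for that question, from the quoted VecPointwiseMult row (a sketch only; the 24-bytes-per-flop streaming model is an assumption, not something the log reports):

    # w[i] = x[i]*y[i]: 2 loads + 1 store of 8-byte doubles per flop,
    # i.e. an assumed 24 bytes of memory traffic per flop.
    bytes_per_flop = 3 * 8
    gpu_mflops = 290088                      # GPU Mflop/s column in the quoted row
    bw = gpu_mflops * 1e6 * bytes_per_flop   # bytes/s, aggregate over 8 ranks
    print(bw / 1e9, bw / 8 / 1e9)            # ~6960 GB/s total, ~870 GB/s per rank

That is a sizable fraction of the hardware's streaming bandwidth, so the pointwise kernel itself looks bandwidth-bound; the scatters, which pay for GPU-to-CPU copies in these runs, are where latency enters.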
-------------- next part --------------
DM Object: box 8 MPI processes
  type: plex
box in 3 dimensions:
  Number of 0-cells per rank: 35937 35937 35937 35937 35937 35937 35937 35937
  Number of 1-cells per rank: 104544 104544 104544 104544 104544 104544 104544 104544
  Number of 2-cells per rank: 101376 101376 101376 101376 101376 101376 101376 101376
  Number of 3-cells per rank: 32768 32768 32768 32768 32768 32768 32768 32768
Labels:
  celltype: 4 strata with value/size (0 (35937), 1 (104544), 4 (101376), 7 (32768))
  depth: 4 strata with value/size (0 (35937), 1 (104544), 2 (101376), 3 (32768))
  marker: 1 strata with value/size (1 (12474))
  Face Sets: 3 strata with value/size (1 (3969), 3 (3969), 6 (3969))
  Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
  type: cg
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
  type: jacobi
    type DIAGONAL
  linear system matrix = precond matrix:
  Mat Object: 8 MPI processes
    type: mpiaijkokkos
    rows=2048383, cols=2048383
    total: nonzeros=127263527, allocated nonzeros=127263527
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
  Linear solve did not converge due to DIVERGED_ITS iterations 200
  Linear solve did not converge due to DIVERGED_ITS iterations 200
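
As a consistency check on the sizes above (a sketch; the Q2-lattice count is inferred from the run options, not printed in the log):

    # -dm_plex_box_faces 2,2,2 with -dm_refine 5 gives 64 hex cells per
    # edge globally; Q2 elements put 2*64 + 1 = 129 nodes per edge, and
    # stripping the Dirichlet boundary layer leaves 127^3 interior dofs.
    cells_per_edge = 2 * 2**5
    nodes_per_edge = 2 * cells_per_edge + 1
    print((nodes_per_edge - 2) ** 3)   # 2048383, the rows= in the Mat view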

------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------



      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      # This code was compiled with GPU support and you've     #
      # created PETSc/GPU objects, but you intentionally used  #
      # -use_gpu_aware_mpi 0, such that PETSc had to copy data #
      # from GPU to CPU for communication. To get meaningful   #
      # timing results, please use GPU-aware MPI instead.      #
      ##########################################################


/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher002 with 8 processors, by adams Tue Jan 25 10:11:42 2022
Using Petsc Development GIT revision: v3.16.3-696-g46640c56cb  GIT Date: 2022-01-25 09:20:51 -0500

                         Max       Max/Min     Avg       Total
Time (sec):           6.781e+01     1.000   6.781e+01
Objects:              1.920e+03     1.028   1.877e+03
Flop:                 2.402e+10     1.054   2.340e+10  1.872e+11
Flop/sec:             3.543e+08     1.054   3.451e+08  2.761e+09
MPI Messages:         4.778e+03     1.063   4.552e+03  3.642e+04
MPI Message Lengths:  1.120e+08     1.030   2.416e+04  8.799e+08
MPI Reductions:       1.988e+03     1.000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop
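
One row of the table below makes the convention concrete (a rough cross-check; the Max column includes load imbalance, so it slightly overshoots the mean local size):

    # Main Stage VecAXPY: 2.10e+08 flop (Max over ranks) in 400 calls,
    # and VecAXPY costs 2N flop per call.
    N_local = 2.10e+08 / 400 / 2
    print(N_local)   # ~2.6e5, consistent with ~2.05e6 dofs over 8 ranks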

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
 0:      Main Stage: 6.7422e+01  99.4%  7.4725e+10  39.9%  1.402e+04  38.5%  2.884e+04       45.9%  7.630e+02  38.4%
 1:         PCSetUp: 1.5260e-02   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
 2:  KSP Solve only: 3.7025e-01   0.5%  1.1247e+11  60.1%  2.240e+04  61.5%  2.123e+04       54.1%  1.206e+03  60.7%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 1e-6 * (sum of flop over all processors)/(max time over all processors)
   GPU Mflop/s: 1e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
   CpuToGpu Count: total number of CPU to GPU copies per processor
   CpuToGpu Size (Mbytes): 1e-6 * (total size of CPU to GPU copies per processor)
   GpuToCpu Count: total number of GPU to CPU copies per processor
   GpuToCpu Size (Mbytes): 1e-6 * (total size of GPU to CPU copies per processor)
   GPU %F: percent flops on GPU in this event
------------------------------------------------------------------------------------------------------------------------
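
For instance, the Stage-2 MatMult row below reproduces the Total Mflop/s column from this formula (approximately, since the %F column is rounded to a whole percent):

    # Total Mflop/s = 1e-6 * (flop summed over ranks) / (max time).
    total_flop = 0.54 * 1.872e11          # MatMult's global %F times the Flop total
    max_time   = 2.1590e-01               # MatMult's Max time column
    print(total_flop / max_time * 1e-6)   # ~468000, vs the logged 471565
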
Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total   GPU    - CpuToGpu -   - GpuToCpu - GPU
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s Mflop/s Count   Size   Count   Size  %F
---------------------------------------------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

PetscBarrier           5 1.0 1.5071e-01 1.0 0.00e+00 0.0 7.8e+02 9.9e+02 1.8e+01  0  0  2  0  1   0  0  6  0  2     0       0      0 0.00e+00    0 0.00e+00  0
BuildTwoSided         40 1.0 2.6699e-0122.5 0.00e+00 0.0 7.1e+02 4.0e+00 4.0e+01  0  0  2  0  2   0  0  5  0  5     0       0      0 0.00e+00    0 0.00e+00  0
BuildTwoSidedF         6 1.0 2.4194e-0125.8 0.00e+00 0.0 1.5e+02 4.8e+05 6.0e+00  0  0  0  8  0   0  0  1 18  1     0       0      0 0.00e+00    0 0.00e+00  0
MatMult            12109 1.0 1.2425e-01 1.1 6.56e+09 1.1 1.1e+04 2.1e+04 2.0e+00  0 27 32 27  0   0 68 82 59  0 409706   663872    401 5.95e+01  400 5.94e+01 100
MatAssemblyBegin      43 1.0 2.5993e-01 2.6 0.00e+00 0.0 1.5e+02 4.8e+05 6.0e+00  0  0  0  8  0   0  0  1 18  1     0       0      0 0.00e+00    0 0.00e+00  0
MatAssemblyEnd        43 1.0 1.8228e-01 4.1 1.16e+06 0.0 0.0e+00 0.0e+00 9.0e+00  0  0  0  0  0   0  0  0  0  1    25       0      0 0.00e+00    0 0.00e+00  0
MatZeroEntries         3 1.0 4.3496e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
MatView                1 1.0 7.9864e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSetUp               1 1.0 1.3476e-03 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSolve               1 1.0 2.7099e-01 1.3 7.24e+09 1.1 1.1e+04 2.1e+04 6.0e+02  0 30 31 27 30   0 75 81 59 79 207524   452242    401 5.95e+01  400 5.94e+01 100
SNESSolve              1 1.0 2.9168e+01 1.0 8.41e+09 1.1 1.1e+04 2.4e+04 6.1e+02 43 35 31 31 31  43 88 82 68 80  2251   451845    405 6.17e+01  406 6.37e+01 86
SNESSetUp              1 1.0 6.0051e+00 1.0 0.00e+00 0.0 3.6e+02 2.3e+05 1.8e+01  9  0  1 10  1   9  0  3 21  2     0       0      0 0.00e+00    0 0.00e+00  0
SNESFunctionEval       2 1.0 1.9772e+00 1.1 7.96e+08 1.0 1.1e+02 1.5e+04 3.0e+00  3  3  0  0  0   3  9  1  0  0  3222   23096      6 4.32e+00    6 4.29e+00  0
SNESJacobianEval       2 1.0 5.8925e+01 1.0 1.52e+09 1.0 1.1e+02 6.5e+05 2.0e+00 87  6  0  8  0  87 16  1 18  0   206       0      0 0.00e+00    6 4.29e+00  0
DMCreateInterp         1 1.0 8.8719e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01  0  0  0  0  1   0  0  1  0  2   748       0      0 0.00e+00    0 0.00e+00  0
DMCreateMat            1 1.0 6.0045e+00 1.0 0.00e+00 0.0 3.6e+02 2.3e+05 1.8e+01  9  0  1 10  1   9  0  3 21  2     0       0      0 0.00e+00    0 0.00e+00  0
Mesh Partition         1 1.0 6.9055e-04 1.1 0.00e+00 0.0 3.5e+01 1.1e+02 8.0e+00  0  0  0  0  0   0  0  0  0  1     0       0      0 0.00e+00    0 0.00e+00  0
Mesh Migration         1 1.0 2.4871e-01 1.0 0.00e+00 0.0 2.0e+02 8.2e+01 2.9e+01  0  0  1  0  1   0  0  1  0  4     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartSelf         1 1.0 8.2599e-0512.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartLblInv       1 1.0 3.1138e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartLblSF        1 1.0 2.4519e-04 3.8 0.00e+00 0.0 1.4e+01 5.6e+01 1.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartStrtSF       1 1.0 9.7758e-05 1.2 0.00e+00 0.0 7.0e+00 2.2e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPointSF          1 1.0 2.1684e-04 1.1 0.00e+00 0.0 1.4e+01 2.7e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexInterp          19 1.0 5.8227e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistribute       1 1.0 2.4963e-01 1.0 0.00e+00 0.0 2.5e+02 9.7e+01 3.7e+01  0  0  1  0  2   0  0  2  0  5     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistCones        1 1.0 1.0398e-04 1.0 0.00e+00 0.0 4.2e+01 1.4e+02 2.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistLabels       1 1.0 2.2918e-04 1.0 0.00e+00 0.0 1.0e+02 6.6e+01 2.4e+01  0  0  0  0  1   0  0  1  0  3     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistField        1 1.0 2.4822e-01 1.0 0.00e+00 0.0 4.9e+01 5.9e+01 2.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexStratify        33 1.0 5.5400e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00  0  0  0  0  0   0  0  0  0  1     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexSymmetrize      33 1.0 1.2772e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPrealloc         1 1.0 5.9988e+00 1.0 0.00e+00 0.0 3.6e+02 2.3e+05 1.6e+01  9  0  1 10  1   9  0  3 21  2     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexResidualFE       2 1.0 1.5634e+00 1.0 7.87e+08 1.0 0.0e+00 0.0e+00 0.0e+00  2  3  0  0  0   2  8  0  0  0  4027       0      0 0.00e+00    0 0.00e+00  0
DMPlexJacobianFE       2 1.0 5.8811e+01 1.0 1.51e+09 1.0 7.6e+01 9.7e+05 2.0e+00 87  6  0  8  0  87 16  1 18  0   205       0      0 0.00e+00    0 0.00e+00  0
DMPlexInterpFE         1 1.0 8.6671e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01  0  0  0  0  1   0  0  1  0  2   766       0      0 0.00e+00    0 0.00e+00  0
SFSetGraph            43 1.0 7.8551e-04 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFSetUp               34 1.0 5.0999e-02 1.2 0.00e+00 0.0 1.3e+03 2.4e+04 3.4e+01  0  0  3  3  2   0  0  9  7  4     0       0      0 0.00e+00    0 0.00e+00  0
SFBcastBegin          65 1.0 1.6080e-0147.2 0.00e+00 0.0 9.8e+02 1.4e+04 0.0e+00  0  0  3  2  0   0  0  7  3  0     0       0      1 2.44e-02   11 8.58e+00  0
SFBcastEnd            65 1.0 2.6412e-0137.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFReduceBegin         16 1.0 1.3437e-0172.5 5.24e+05 1.0 2.9e+02 1.0e+05 0.0e+00  0  0  1  3  0   0  0  2  7  0    30       0      2 4.10e+00    0 0.00e+00 100
SFReduceEnd           16 1.0 1.9717e-0125.8 2.50e+04 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      4 1.95e-01    0 0.00e+00 100
SFFetchOpBegin         2 1.0 5.4462e-04134.6 0.00e+00 0.0 3.8e+01 2.5e+05 0.0e+00  0  0  0  1  0   0  0  0  2  0     0       0      0 0.00e+00    0 0.00e+00  0
SFFetchOpEnd           2 1.0 2.9438e-03 1.8 0.00e+00 0.0 3.8e+01 2.5e+05 0.0e+00  0  0  0  1  0   0  0  0  2  0     0       0      0 0.00e+00    0 0.00e+00  0
SFCreateEmbed          8 1.0 9.4267e-02141.4 0.00e+00 0.0 1.4e+02 8.5e+02 0.0e+00  0  0  0  0  0   0  0  1  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFDistSection          9 1.0 3.2934e-03 2.0 0.00e+00 0.0 3.1e+02 6.5e+03 1.1e+01  0  0  1  0  1   0  0  2  0  1     0       0      0 0.00e+00    0 0.00e+00  0
SFSectionSF           16 1.0 2.0571e-02 1.7 0.00e+00 0.0 4.8e+02 2.0e+04 1.6e+01  0  0  1  1  1   0  0  3  2  2     0       0      0 0.00e+00    0 0.00e+00  0
SFRemoteOff            7 1.0 9.5613e-0245.9 0.00e+00 0.0 4.2e+02 1.6e+03 4.0e+00  0  0  1  0  0   0  0  3  0  1     0       0      0 0.00e+00    0 0.00e+00  0
SFPack               290 1.0 1.5873e-0160.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      2 9.87e-02    0 0.00e+00  0
SFUnpack             292 1.0 1.3361e-0164.1 5.49e+05 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    31       0      0 0.00e+00    0 0.00e+00 100
VecTDot              401 1.0 3.4812e-02 1.4 2.10e+08 1.0 0.0e+00 0.0e+00 4.0e+02  0  1  0  0 20   0  2  0  0 53 47191   100573      0 0.00e+00    0 0.00e+00 100
VecNorm              201 1.0 8.0375e-02 5.6 1.05e+08 1.0 0.0e+00 0.0e+00 2.0e+02  0  0  0  0 10   0  1  0  0 26 10245   76979      0 0.00e+00    0 0.00e+00 100
VecCopy                2 1.0 1.0589e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecSet                54 1.0 1.5410e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecAXPY              400 1.0 1.1236e-02 1.1 2.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0 145846   203600      0 0.00e+00    0 0.00e+00 100
VecAYPX              199 1.0 5.2408e-03 1.1 1.04e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0 155561   226571      0 0.00e+00    0 0.00e+00 100
VecPointwiseMult     201 1.0 6.0364e-03 1.1 5.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0 68207   98699      0 0.00e+00    0 0.00e+00 100
VecScatterBegin      201 1.0 2.0069e-02 1.5 0.00e+00 0.0 1.1e+04 2.1e+04 2.0e+00  0  0 32 27  0   0  0 82 59  0     0       0      1 7.43e-02  400 5.94e+01  0
VecScatterEnd        201 1.0 1.1790e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    400 5.94e+01    0 0.00e+00  0
DualSpaceSetUp         2 1.0 2.6182e-03 1.0 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     6       0      0 0.00e+00    0 0.00e+00  0
FESetUp                2 1.0 9.8481e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
PCSetUp                1 1.0 4.3290e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
PCApply              201 1.0 2.7593e-02 1.0 5.27e+07 1.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  1  0  0  0 14921   40922      0 0.00e+00    0 0.00e+00 100

--- Event Stage 1: PCSetUp

PCSetUp                1 1.0 1.6107e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0 100  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0

--- Event Stage 2: KSP Solve only

MatMult              400 1.0 2.1590e-01 1.1 1.31e+10 1.1 2.2e+04 2.1e+04 0.0e+00  0 54 62 54  0  56 91100100  0 471565   716907    800 1.19e+02  800 1.19e+02 100
MatView                2 1.0 8.4994e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSolve               2 1.0 4.0167e-01 1.2 1.45e+10 1.1 2.2e+04 2.1e+04 1.2e+03  1 60 62 54 61 100100100100100 280015   509401    800 1.19e+02  800 1.19e+02 100
SFPack               400 1.0 1.2314e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFUnpack             400 1.0 7.6822e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecTDot              802 1.0 6.8937e-02 1.4 4.20e+08 1.0 0.0e+00 0.0e+00 8.0e+02  0  2  0  0 40  15  3  0  0 67 47661   96803      0 0.00e+00    0 0.00e+00 100
VecNorm              402 1.0 9.4944e-02 3.9 2.11e+08 1.0 0.0e+00 0.0e+00 4.0e+02  0  1  0  0 20  17  1  0  0 33 17346   98036      0 0.00e+00    0 0.00e+00 100
VecCopy                4 1.0 2.0016e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecSet                 4 1.0 1.6835e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecAXPY              800 1.0 2.0465e-02 1.1 4.19e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   5  3  0  0  0 160145   232422      0 0.00e+00    0 0.00e+00 100
VecAYPX              398 1.0 1.0591e-02 1.1 2.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   3  1  0  0  0 153947   223967      0 0.00e+00    0 0.00e+00 100
VecPointwiseMult     402 1.0 1.1385e-02 1.1 1.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   3  1  0  0  0 72327   105141      0 0.00e+00    0 0.00e+00 100
VecScatterBegin      400 1.0 3.7700e-02 1.6 0.00e+00 0.0 2.2e+04 2.1e+04 0.0e+00  0  0 62 54  0   8  0100100  0     0       0      0 0.00e+00  800 1.19e+02  0
VecScatterEnd        400 1.0 2.1792e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   5  0  0  0  0     0       0    800 1.19e+02    0 0.00e+00  0
PCApply              402 1.0 1.1477e-02 1.1 1.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   3  1  0  0  0 71747   105141      0 0.00e+00    0 0.00e+00 100
---------------------------------------------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Container    32             32        18432     0.
                SNES     1              1         1540     0.
              DMSNES     1              1          688     0.
       Krylov Solver     1              1         1664     0.
     DMKSP interface     1              1          656     0.
              Matrix    75             75    195551600     0.
    Distributed Mesh    70             70      7826872     0.
            DM Label   172            172       108704     0.
          Quadrature   148            148        87616     0.
      Mesh Transform     5              5         3780     0.
           Index Set   633            633      1440932     0.
   IS L to G Mapping     2              2      1100416     0.
             Section   249            249       177288     0.
   Star Forest Graph   173            173       188592     0.
     Discrete System   116            116       111364     0.
           Weak Form   117            117        72072     0.
    GraphPartitioner    33             33        22704     0.
              Vector    54             54     19589336     0.
        Linear Space     5              5         3416     0.
          Dual Space    26             26        24336     0.
            FE Space     2              2         1576     0.
              Viewer     2              1          840     0.
      Preconditioner     1              1          872     0.
       Field over DM     1              1          704     0.

--- Event Stage 1: PCSetUp


--- Event Stage 2: KSP Solve only

========================================================================================================================
Average time to get PetscTime(): 3.4e-08
Average time for MPI_Barrier(): 2.611e-06
Average time for zero size MPI_Send(): 1.07531e-05
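
Those averages give a hedged latency estimate for the Stage-2 scatters (assuming the Mess column counts messages across all 8 ranks and each message costs about one zero-size MPI_Send):

    msgs, t_msg = 2.2e4, 1.07531e-05   # Stage-2 VecScatterBegin Mess; send latency above
    print(msgs / 8 * t_msg)            # ~0.03 s of per-rank latency alone
    # VecScatterBegin+End total ~0.06 s in Stage 2, so roughly half the
    # scatter cost at this size is message latency/overhead, not data volume.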
#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 5
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --COPTFLAGS="-g -O" --CXXOPTFLAGS="-g -O" --FOPTFLAGS=-g --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 --download-p4est=1 --with-zlib-dir=/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4 PETSC_ARCH=arch-olcf-crusher
-----------------------------------------
Libraries compiled on 2022-01-25 14:29:13 on login2 
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
-----------------------------------------

Using C compiler: cc  -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O   
Using Fortran compiler: ftn  -fPIC -g     
-----------------------------------------

Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/include -I/opt/rocm-4.5.0/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -L/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lz -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lmodules -lfi -lcraymath -lf -lu -lcsup -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
-----------------------------------------



WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
-------------- next part --------------
DM Object: box 8 MPI processes
  type: plex
box in 3 dimensions:
  Number of 0-cells per rank: 729 729 729 729 729 729 729 729
  Number of 1-cells per rank: 1944 1944 1944 1944 1944 1944 1944 1944
  Number of 2-cells per rank: 1728 1728 1728 1728 1728 1728 1728 1728
  Number of 3-cells per rank: 512 512 512 512 512 512 512 512
Labels:
  celltype: 4 strata with value/size (0 (729), 1 (1944), 4 (1728), 7 (512))
  depth: 4 strata with value/size (0 (729), 1 (1944), 2 (1728), 3 (512))
  marker: 1 strata with value/size (1 (810))
  Face Sets: 3 strata with value/size (1 (225), 3 (225), 6 (225))
  Linear solve converged due to CONVERGED_RTOL iterations 60
KSP Object: 8 MPI processes
  type: cg
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
  type: jacobi
    type DIAGONAL
  linear system matrix = precond matrix:
  Mat Object: 8 MPI processes
    type: mpiaijkokkos
    rows=29791, cols=29791
    total: nonzeros=1685159, allocated nonzeros=1685159
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
  Linear solve converged due to CONVERGED_RTOL iterations 60
  Linear solve converged due to CONVERGED_RTOL iterations 60

------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------





/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher002 with 8 processors, by adams Tue Jan 25 10:10:23 2022
Using Petsc Development GIT revision: v3.16.3-696-g46640c56cb  GIT Date: 2022-01-25 09:20:51 -0500

                         Max       Max/Min     Avg       Total
Time (sec):           2.126e+00     1.000   2.126e+00
Objects:              1.780e+03     1.031   1.737e+03
Flop:                 1.350e+08     1.184   1.240e+08  9.923e+08
Flop/sec:             6.348e+07     1.184   5.835e+07  4.668e+08
MPI Messages:         1.782e+03     1.170   1.574e+03  1.260e+04
MPI Message Lengths:  3.177e+06     1.120   1.871e+03  2.357e+07
MPI Reductions:       7.190e+02     1.000

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
 0:      Main Stage: 2.0779e+00  97.8%  5.4136e+08  54.6%  5.875e+03  46.6%  2.457e+03       61.3%  3.360e+02  46.7%
 1:         PCSetUp: 1.6083e-04   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
 2:  KSP Solve only: 4.7610e-02   2.2%  4.5091e+08  45.4%  6.720e+03  53.4%  1.359e+03       38.7%  3.640e+02  50.6%

------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total   GPU    - CpuToGpu -   - GpuToCpu - GPU
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s Mflop/s Count   Size   Count   Size  %F
---------------------------------------------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

PetscBarrier           3 1.0 4.0715e-03 1.0 0.00e+00 0.0 4.8e+02 1.1e+02 1.2e+01  0  0  4  0  2   0  0  8  0  4     0       0      0 0.00e+00    0 0.00e+00  0
BuildTwoSided         36 1.0 1.7726e-01 6.7 0.00e+00 0.0 6.3e+02 4.0e+00 3.6e+01  4  0  5  0  5   5  0 11  0 11     0       0      0 0.00e+00    0 0.00e+00  0
BuildTwoSidedF         6 1.0 1.7505e-01 7.2 0.00e+00 0.0 1.5e+02 2.9e+04 6.0e+00  4  0  1 18  1   4  0  3 30  2     0       0      0 0.00e+00    0 0.00e+00  0
MatMult              737 1.0 1.6693e-02 1.1 2.86e+07 1.3 3.6e+03 1.3e+03 2.0e+00  1 20 29 20  0   1 37 62 32  1 12123   19363    121 1.15e+00  120 1.14e+00 100
MatAssemblyBegin      43 1.0 1.7518e-01 6.3 0.00e+00 0.0 1.5e+02 2.9e+04 6.0e+00  5  0  1 18  1   5  0  3 30  2     0       0      0 0.00e+00    0 0.00e+00  0
MatAssemblyEnd        43 1.0 4.8427e-02 1.3 6.81e+04 0.0 0.0e+00 0.0e+00 9.0e+00  2  0  0  0  1   2  0  0  0  3     5       0      0 0.00e+00    0 0.00e+00  0
MatZeroEntries         3 1.0 9.8439e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
MatView                1 1.0 1.1108e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSetUp               1 1.0 1.7866e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSolve               1 1.0 4.3944e-02 1.1 3.18e+07 1.3 3.5e+03 1.3e+03 1.8e+02  2 23 28 20 26   2 42 59 32 55  5131    8234    121 1.15e+00  120 1.14e+00 100
SNESSolve              1 1.0 8.3242e-01 1.0 5.16e+07 1.2 3.6e+03 1.9e+03 1.9e+02 39 39 29 29 26  40 71 61 47 56   461    8211    125 1.18e+00  126 1.21e+00 59
SNESSetUp              1 1.0 1.1244e-01 1.0 0.00e+00 0.0 3.6e+02 1.4e+04 1.8e+01  5  0  3 21  3   5  0  6 34  5     0       0      0 0.00e+00    0 0.00e+00  0
SNESFunctionEval       2 1.0 3.5627e-01 1.6 1.39e+07 1.0 1.1e+02 9.4e+02 3.0e+00 11 11  1  0  0  11 21  2  1  1   312     550      6 7.30e-02    6 7.15e-02  0
SNESJacobianEval       2 1.0 1.1017e+00 1.2 2.47e+07 1.0 1.1e+02 3.8e+04 2.0e+00 51 20  1 18  0  52 36  2 30  1   179       0      0 0.00e+00    6 7.15e-02  0
DMCreateInterp         1 1.0 2.6462e-02 1.0 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01  1  0  1  0  2   1  0  1  1  5    25       0      0 0.00e+00    0 0.00e+00  0
DMCreateMat            1 1.0 1.1233e-01 1.0 0.00e+00 0.0 3.6e+02 1.4e+04 1.8e+01  5  0  3 21  3   5  0  6 34  5     0       0      0 0.00e+00    0 0.00e+00  0
Mesh Partition         1 1.0 4.7895e-03 1.0 0.00e+00 0.0 3.5e+01 1.1e+02 8.0e+00  0  0  0  0  1   0  0  1  0  2     0       0      0 0.00e+00    0 0.00e+00  0
Mesh Migration         1 1.0 5.4815e-01 1.0 0.00e+00 0.0 2.0e+02 8.2e+01 2.9e+01 26  0  2  0  4  26  0  3  0  9     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartSelf         1 1.0 9.8660e-0513.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartLblInv       1 1.0 3.0135e-04 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartLblSF        1 1.0 1.1005e-03 1.6 0.00e+00 0.0 1.4e+01 5.6e+01 1.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartStrtSF       1 1.0 3.3112e-03 1.8 0.00e+00 0.0 7.0e+00 2.2e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPointSF          1 1.0 2.1855e-04 1.1 0.00e+00 0.0 1.4e+01 2.7e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexInterp          19 1.0 6.7416e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistribute       1 1.0 5.5320e-01 1.0 0.00e+00 0.0 2.5e+02 9.7e+01 3.7e+01 26  0  2  0  5  27  0  4  0 11     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistCones        1 1.0 1.5161e-04 1.1 0.00e+00 0.0 4.2e+01 1.4e+02 2.0e+00  0  0  0  0  0   0  0  1  0  1     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistLabels       1 1.0 2.9833e-04 1.0 0.00e+00 0.0 1.0e+02 6.6e+01 2.4e+01  0  0  1  0  3   0  0  2  0  7     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistField        1 1.0 5.4747e-01 1.0 0.00e+00 0.0 4.9e+01 5.9e+01 2.0e+00 26  0  0  0  0  26  0  1  0  1     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexStratify        31 1.0 1.3614e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  1   0  0  0  0  1     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexSymmetrize      31 1.0 2.2603e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPrealloc         1 1.0 1.1213e-01 1.0 0.00e+00 0.0 3.6e+02 1.4e+04 1.6e+01  5  0  3 21  2   5  0  6 34  5     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexResidualFE       2 1.0 2.4719e-02 1.0 1.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1 10  0  0  0   1 19  0  0  0  4098       0      0 0.00e+00    0 0.00e+00  0
DMPlexJacobianFE       2 1.0 1.0950e+00 1.2 2.35e+07 1.0 7.6e+01 5.6e+04 2.0e+00 48 19  1 18  0  49 35  1 29  1   171       0      0 0.00e+00    0 0.00e+00  0
DMPlexInterpFE         1 1.0 2.6414e-02 1.0 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01  1  0  1  0  2   1  0  1  1  5    25       0      0 0.00e+00    0 0.00e+00  0
SFSetGraph            37 1.0 4.4644e-05 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFSetUp               30 1.0 4.4996e-03 1.2 0.00e+00 0.0 1.1e+03 1.6e+03 3.0e+01  0  0  9  8  4   0  0 19 12  9     0       0      0 0.00e+00    0 0.00e+00  0
SFBcastBegin          59 1.0 1.9148e-0185.2 0.00e+00 0.0 8.6e+02 1.0e+03 0.0e+00  8  0  7  4  0   8  0 15  6  0     0       0      1 1.49e-03   11 1.43e-01  0
SFBcastEnd            59 1.0 1.9445e-0169.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   4  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFReduceBegin         14 1.0 1.3587e-01690.3 8.19e+03 1.2 2.5e+02 7.0e+03 0.0e+00  1  0  2  8  0   1  0  4 12  0     0       0      2 5.96e-02    0 0.00e+00 100
SFReduceEnd           14 1.0 1.2185e-0242.5 1.63e+03 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      4 1.19e-02    0 0.00e+00 100
SFFetchOpBegin         2 1.0 4.5447e-0515.3 0.00e+00 0.0 3.8e+01 1.5e+04 0.0e+00  0  0  0  2  0   0  0  1  4  0     0       0      0 0.00e+00    0 0.00e+00  0
SFFetchOpEnd           2 1.0 2.4357e-04 1.4 0.00e+00 0.0 3.8e+01 1.5e+04 0.0e+00  0  0  0  2  0   0  0  1  4  0     0       0      0 0.00e+00    0 0.00e+00  0
SFCreateEmbed          6 1.0 5.6716e-0347.7 0.00e+00 0.0 1.0e+02 8.0e+01 0.0e+00  0  0  1  0  0   0  0  2  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFDistSection          9 1.0 5.1380e-04 1.2 0.00e+00 0.0 3.1e+02 4.5e+02 1.1e+01  0  0  2  1  2   0  0  5  1  3     0       0      0 0.00e+00    0 0.00e+00  0
SFSectionSF           14 1.0 1.3564e-03 1.5 0.00e+00 0.0 4.0e+02 1.4e+03 1.4e+01  0  0  3  2  2   0  0  7  4  4     0       0      0 0.00e+00    0 0.00e+00  0
SFRemoteOff            5 1.0 5.8056e-0321.8 0.00e+00 0.0 2.7e+02 1.7e+02 2.0e+00  0  0  2  0  0   0  0  5  0  1     0       0      0 0.00e+00    0 0.00e+00  0
SFPack               142 1.0 1.8780e-01581.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  7  0  0  0  0   8  0  0  0  0     0       0      2 6.24e-03    0 0.00e+00  0
SFUnpack             144 1.0 1.3578e-011442.7 9.83e+03 1.5 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00 100
VecTDot              120 1.0 7.3964e-03 1.4 9.83e+05 1.2 0.0e+00 0.0e+00 1.2e+02  0  1  0  0 17   0  1  0  0 36   967    1955      0 0.00e+00    0 0.00e+00 100
VecNorm               61 1.0 8.1882e-03 2.4 5.00e+05 1.2 0.0e+00 0.0e+00 6.1e+01  0  0  0  0  8   0  1  0  0 18   444    1399      0 0.00e+00    0 0.00e+00 100
VecCopy                2 1.0 9.8118e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecSet                52 1.0 1.2121e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecAXPY              121 1.0 3.3206e-03 1.1 9.91e+05 1.2 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  2171    3088      0 0.00e+00    0 0.00e+00 100
VecAYPX               59 1.0 1.1340e-03 1.1 4.83e+05 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0  3100    5302      0 0.00e+00    0 0.00e+00 100
VecPointwiseMult      60 1.0 1.2228e-03 1.1 2.46e+05 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1462    2562      0 0.00e+00    0 0.00e+00 100
VecScatterBegin       61 1.0 3.5653e-03 1.6 0.00e+00 0.0 3.6e+03 1.3e+03 2.0e+00  0  0 29 20  0   0  0 62 32  1     0       0      1 4.76e-03  120 1.14e+00  0
VecScatterEnd         61 1.0 1.4893e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    120 1.14e+00    0 0.00e+00  0
DualSpaceSetUp         2 1.0 2.8444e-03 1.0 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     5       0      0 0.00e+00    0 0.00e+00  0
FESetUp                2 1.0 9.4143e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
PCSetUp                1 1.0 4.3280e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
PCApply               60 1.0 8.7004e-03 1.1 2.46e+05 1.2 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  1   205     224      0 0.00e+00    0 0.00e+00 100

--- Event Stage 1: PCSetUp

PCSetUp                1 1.0 1.7049e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  98  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0

--- Event Stage 2: KSP Solve only

MatMult              120 1.0 1.9656e-02 1.2 5.72e+07 1.3 6.7e+03 1.4e+03 0.0e+00  1 41 53 39  0  38 90100100  0 20576   43374    240 2.28e+00  240 2.28e+00 100
MatView                2 1.0 6.7079e-05 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  1     0       0      0 0.00e+00    0 0.00e+00  0
KSPSolve               2 1.0 4.9491e-02 1.1 6.36e+07 1.3 6.7e+03 1.4e+03 3.6e+02  2 45 53 39 50 100100100100 99  9111   17921    240 2.28e+00  240 2.28e+00 100
SFPack               120 1.0 3.4330e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFUnpack             120 1.0 1.4029e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecTDot              240 1.0 1.3717e-02 1.4 1.97e+06 1.2 0.0e+00 0.0e+00 2.4e+02  1  1  0  0 33  25  3  0  0 66  1042    2043      0 0.00e+00    0 0.00e+00 100
VecNorm              122 1.0 1.0250e-02 2.0 9.99e+05 1.2 0.0e+00 0.0e+00 1.2e+02  0  1  0  0 17  18  2  0  0 34   709    2009      0 0.00e+00    0 0.00e+00 100
VecCopy                4 1.0 1.4585e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecSet                 4 1.0 1.0522e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecAXPY              240 1.0 4.5384e-03 1.1 1.97e+06 1.2 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   9  3  0  0  0  3151    5457      0 0.00e+00    0 0.00e+00 100
VecAYPX              118 1.0 2.2194e-03 1.1 9.67e+05 1.2 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   5  2  0  0  0  3168    5373      0 0.00e+00    0 0.00e+00 100
VecPointwiseMult     120 1.0 2.3361e-03 1.1 4.92e+05 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   5  1  0  0  0  1530    2644      0 0.00e+00    0 0.00e+00 100
VecScatterBegin      120 1.0 5.8364e-03 1.6 0.00e+00 0.0 6.7e+03 1.4e+03 0.0e+00  0  0 53 39  0  11  0100100  0     0       0      0 0.00e+00  240 2.28e+00  0
VecScatterEnd        120 1.0 2.6910e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   4  0  0  0  0     0       0    240 2.28e+00    0 0.00e+00  0
PCApply              120 1.0 2.3607e-03 1.1 4.92e+05 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   5  1  0  0  0  1514    2644      0 0.00e+00    0 0.00e+00 100
---------------------------------------------------------------------------------------------------------------------------------------------------------------
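
Putting the two logs side by side speaks to the problem-size question from the top of the thread (a sketch from the Stage-2 MatMult rows of both runs):

    big   = 2.1590e-01 / 400   # s per MatMult at ~2.05e6 dofs (first log)
    small = 1.9656e-02 / 120   # s per MatMult at ~2.98e4 dofs (this log)
    print(big / small)         # ~3.3x the cost for ~69x the dofs
    # Nowhere near linear scaling: at 3e4 dofs the solve is dominated by
    # latency and launch overhead, while the larger run sits much closer
    # to the bandwidth-bound regime.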

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Container    30             30        17280     0.
                SNES     1              1         1540     0.
              DMSNES     1              1          688     0.
       Krylov Solver     1              1         1664     0.
     DMKSP interface     1              1          656     0.
              Matrix    73             73      2596672     0.
    Distributed Mesh    66             66       479912     0.
            DM Label   156            156        98592     0.
          Quadrature   148            148        87616     0.
      Mesh Transform     3              3         2268     0.
           Index Set   569            569       564740     0.
   IS L to G Mapping     2              2        21568     0.
             Section   235            235       167320     0.
   Star Forest Graph   161            161       175056     0.
     Discrete System   106            106       101764     0.
           Weak Form   107            107        65912     0.
    GraphPartitioner    31             31        21328     0.
              Vector    52             52       385592     0.
        Linear Space     5              5         3416     0.
          Dual Space    26             26        24336     0.
            FE Space     2              2         1576     0.
              Viewer     2              1          840     0.
      Preconditioner     1              1          872     0.
       Field over DM     1              1          704     0.

--- Event Stage 1: PCSetUp


--- Event Stage 2: KSP Solve only

========================================================================================================================
Average time to get PetscTime(): 3.51e-08
Average time for MPI_Barrier(): 2.597e-06
Average time for zero size MPI_Send(): 1.01545e-05
#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 3
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --COPTFLAGS="-g -O" --CXXOPTFLAGS="-g -O" --FOPTFLAGS=-g --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 --download-p4est=1 --with-zlib-dir=/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4 PETSC_ARCH=arch-olcf-crusher
-----------------------------------------
Libraries compiled on 2022-01-25 14:29:13 on login2 
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
-----------------------------------------

Using C compiler: cc  -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O   
Using Fortran compiler: ftn  -fPIC -g     
-----------------------------------------

Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/include -I/opt/rocm-4.5.0/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -L/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lz -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lmodules -lfi -lcraymath -lf -lu -lcsup -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
-----------------------------------------



      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      # This code was compiled with GPU support and you've     #
      # created PETSc/GPU objects, but you intentionally used  #
      # -use_gpu_aware_mpi 0, such that PETSc had to copy data #
      # from GPU to CPU for communication. To get meaningful   #
      # timing results, please use GPU-aware MPI instead.      #
      ##########################################################


#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 3
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
-------------- next part --------------
DM Object: box 8 MPI processes
  type: plex
box in 3 dimensions:
  Number of 0-cells per rank: 274625 274625 274625 274625 274625 274625 274625 274625
  Number of 1-cells per rank: 811200 811200 811200 811200 811200 811200 811200 811200
  Number of 2-cells per rank: 798720 798720 798720 798720 798720 798720 798720 798720
  Number of 3-cells per rank: 262144 262144 262144 262144 262144 262144 262144 262144
Labels:
  celltype: 4 strata with value/size (0 (274625), 1 (811200), 4 (798720), 7 (262144))
  depth: 4 strata with value/size (0 (274625), 1 (811200), 2 (798720), 3 (262144))
  marker: 1 strata with value/size (1 (49530))
  Face Sets: 3 strata with value/size (1 (16129), 3 (16129), 6 (16129))
  Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
  type: cg
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
  type: jacobi
    type DIAGONAL
  linear system matrix = precond matrix:
  Mat Object: 8 MPI processes
    type: mpiaijkokkos
    rows=16581375, cols=16581375
    total: nonzeros=1045678375, allocated nonzeros=1045678375
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
  Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
  type: cg
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
  type: jacobi
    type DIAGONAL
  linear system matrix = precond matrix:
  Mat Object: 8 MPI processes
    type: mpiaijkokkos
    rows=16581375, cols=16581375
    total: nonzeros=1045678375, allocated nonzeros=1045678375
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
  Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
  type: cg
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
  type: jacobi
    type DIAGONAL
  linear system matrix = precond matrix:
  Mat Object: 8 MPI processes
    type: mpiaijkokkos
    rows=16581375, cols=16581375
    total: nonzeros=1045678375, allocated nonzeros=1045678375
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
****************************************************************************************************************************************************************
***                                WIDEN YOUR WINDOW TO 160 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document                                 ***
****************************************************************************************************************************************************************

------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------



      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      # This code was compiled with GPU support and you've     #
      # created PETSc/GPU objects, but you intentionally used  #
      # -use_gpu_aware_mpi 0, such that PETSc had to copy data #
      # from GPU to CPU for communication. To get meaningful   #
      # timing results, please use GPU-aware MPI instead.      #
      ##########################################################


/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher002 with 8 processors, by adams Tue Jan 25 10:20:42 2022
Using Petsc Development GIT revision: v3.16.3-696-g46640c56cb  GIT Date: 2022-01-25 09:20:51 -0500

                         Max       Max/Min     Avg       Total
Time (sec):           5.394e+02     1.000   5.394e+02
Objects:              1.990e+03     1.027   1.947e+03
Flop:                 1.940e+11     1.027   1.915e+11  1.532e+12
Flop/sec:             3.596e+08     1.027   3.549e+08  2.839e+09
MPI Messages:         4.806e+03     1.066   4.571e+03  3.657e+04
MPI Message Lengths:  4.434e+08     1.015   9.611e+04  3.515e+09
MPI Reductions:       1.991e+03     1.000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop
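
As a quick sanity check of this convention against the event table below: VecAXPY is logged 400 times in the Main Stage on vectors with roughly 16581375/8 ~ 2.07e6 local rows per rank, so one expects about

  400 * 2N ~ 400 * 2 * 2.07e6 ~ 1.66e9 flop per rank,

consistent with the logged VecAXPY Max of 1.68e+09 (the small excess matches the overall 1.027 Max/Min flop imbalance reported above).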

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
 0:      Main Stage: 5.3766e+02  99.7%  6.0875e+11  39.7%  1.417e+04  38.7%  1.143e+05       46.1%  7.660e+02  38.5%
 1:         PCSetUp: 1.1813e-01   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
 2:  KSP Solve only: 1.6643e+00   0.3%  9.2287e+11  60.3%  2.240e+04  61.3%  8.459e+04       53.9%  1.206e+03  60.6%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
   GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
   CpuToGpu Count: total number of CPU to GPU copies per processor
   CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor)
   GpuToCpu Count: total number of GPU to CPU copies per processor
   GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor)
   GPU %F: percent flops on GPU in this event
------------------------------------------------------------------------------------------------------------------------
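
The "PCSetUp" and "KSP Solve only" stages reported here are user-defined log stages. As a minimal illustrative sketch of the PetscLogStagePush()/PetscLogStagePop() mechanism named in the legend (this is not the actual ex13/benchmark source; it uses a hypothetical 1-D Laplacian and assumes a PETSc recent enough to provide PetscCall()):

  #include <petscksp.h>

  int main(int argc, char **argv)
  {
    Mat           A;
    Vec           x, b;
    KSP           ksp;
    PetscLogStage stageSetUp, stageSolve;
    PetscInt      i, rstart, rend, n = 100;

    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

    /* Small 1-D Laplacian as a stand-in operator (hypothetical problem,
       not the Plex-assembled matrix from the runs above) */
    PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 1, NULL, &A));
    PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
    for (i = rstart; i < rend; i++) {
      if (i > 0)     PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
      if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
      PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
    }
    PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
    PetscCall(MatCreateVecs(A, &x, &b));
    PetscCall(VecSet(b, 1.0));

    PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
    PetscCall(KSPSetOperators(ksp, A, A));
    PetscCall(KSPSetFromOptions(ksp));

    /* Register named stages; everything executed between Push and Pop
       is charged to that stage in the -log_view tables */
    PetscCall(PetscLogStageRegister("PCSetUp", &stageSetUp));
    PetscCall(PetscLogStageRegister("KSP Solve only", &stageSolve));

    PetscCall(PetscLogStagePush(stageSetUp));
    PetscCall(KSPSetUp(ksp));                /* preconditioner setup */
    PetscCall(PetscLogStagePop());

    PetscCall(PetscLogStagePush(stageSolve));
    PetscCall(KSPSolve(ksp, b, x));          /* timed solve only */
    PetscCall(PetscLogStagePop());

    PetscCall(KSPDestroy(&ksp));
    PetscCall(VecDestroy(&x));
    PetscCall(VecDestroy(&b));
    PetscCall(MatDestroy(&A));
    PetscCall(PetscFinalize());
    return 0;
  }

Run with -log_view to get per-stage tables like the ones below.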
Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total   GPU    - CpuToGpu -   - GpuToCpu - GPU
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s Mflop/s Count   Size   Count   Size  %F
---------------------------------------------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

PetscBarrier           6 1.0 1.1691e+00 1.0 0.00e+00 0.0 9.3e+02 3.2e+03 2.1e+01  0  0  3  0  1   0  0  7  0  3     0       0      0 0.00e+00    0 0.00e+00  0
BuildTwoSided         42 1.0 3.1965e+0014.8 0.00e+00 0.0 7.5e+02 4.0e+00 4.2e+01  0  0  2  0  2   0  0  5  0  5     0       0      0 0.00e+00    0 0.00e+00  0
BuildTwoSidedF         6 1.0 3.1399e+0018.9 0.00e+00 0.0 1.5e+02 2.0e+06 6.0e+00  0  0  0  8  0   0  0  1 18  1     0       0      0 0.00e+00    0 0.00e+00  0
MatMult            48589 1.0 7.2097e-01 1.0 5.31e+10 1.0 1.1e+04 8.3e+04 2.0e+00  0 27 31 27  0   0 69 81 59  0 580168   780320    401 2.37e+02  400 2.37e+02 100
MatAssemblyBegin      43 1.0 3.2614e+0011.1 0.00e+00 0.0 1.5e+02 2.0e+06 6.0e+00  0  0  0  8  0   0  0  1 18  1     0       0      0 0.00e+00    0 0.00e+00  0
MatAssemblyEnd        43 1.0 7.7601e-01 3.4 4.67e+06 0.0 0.0e+00 0.0e+00 9.0e+00  0  0  0  0  0   0  0  0  0  1    24       0      0 0.00e+00    0 0.00e+00  0
MatZeroEntries         3 1.0 2.2877e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
MatView                1 1.0 9.9070e-05 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSetUp               1 1.0 4.5815e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSolve               1 1.0 1.2261e+00 1.3 5.85e+10 1.0 1.1e+04 8.4e+04 6.0e+02  0 30 31 27 30   0 76 80 59 79 376334   727165    401 2.37e+02  400 2.37e+02 100
SNESSolve              1 1.0 2.3154e+02 1.0 6.79e+10 1.0 1.1e+04 9.6e+04 6.1e+02 43 35 31 31 31  43 88 81 68 80  2317   726995    405 2.54e+02  406 2.71e+02 86
SNESSetUp              1 1.0 4.9846e+01 1.0 0.00e+00 0.0 3.6e+02 9.4e+05 1.8e+01  9  0  1 10  1   9  0  3 21  2     0       0      0 0.00e+00    0 0.00e+00  0
SNESFunctionEval       2 1.0 1.3484e+01 1.0 6.33e+09 1.0 1.1e+02 6.2e+04 3.0e+00  2  3  0  0  0   2  8  1  0  0  3756   123291      6 3.40e+01    6 3.39e+01  0
SNESJacobianEval       2 1.0 4.7172e+02 1.0 1.21e+10 1.0 1.1e+02 2.6e+06 2.0e+00 87  6  0  9  0  88 16  1 19  0   205       0      0 0.00e+00    6 3.39e+01  0
DMCreateInterp         1 1.0 9.5002e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01  0  0  0  0  1   0  0  1  0  2   698       0      0 0.00e+00    0 0.00e+00  0
DMCreateMat            1 1.0 4.9843e+01 1.0 0.00e+00 0.0 3.6e+02 9.4e+05 1.8e+01  9  0  1 10  1   9  0  3 21  2     0       0      0 0.00e+00    0 0.00e+00  0
Mesh Partition         1 1.0 6.8415e-04 1.1 0.00e+00 0.0 3.5e+01 1.1e+02 8.0e+00  0  0  0  0  0   0  0  0  0  1     0       0      0 0.00e+00    0 0.00e+00  0
Mesh Migration         1 1.0 2.5632e-01 1.0 0.00e+00 0.0 2.0e+02 8.2e+01 2.9e+01  0  0  1  0  1   0  0  1  0  4     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartSelf         1 1.0 8.3180e-0513.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartLblInv       1 1.0 3.1569e-04 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartLblSF        1 1.0 2.4072e-04 4.1 0.00e+00 0.0 1.4e+01 5.6e+01 1.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartStrtSF       1 1.0 1.0937e-04 1.1 0.00e+00 0.0 7.0e+00 2.2e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPointSF          1 1.0 2.2380e-04 1.1 0.00e+00 0.0 1.4e+01 2.7e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexInterp          19 1.0 5.7251e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistribute       1 1.0 2.5723e-01 1.0 0.00e+00 0.0 2.5e+02 9.7e+01 3.7e+01  0  0  1  0  2   0  0  2  0  5     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistCones        1 1.0 9.8228e-05 1.0 0.00e+00 0.0 4.2e+01 1.4e+02 2.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistLabels       1 1.0 2.5452e-04 1.0 0.00e+00 0.0 1.0e+02 6.6e+01 2.4e+01  0  0  0  0  1   0  0  1  0  3     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistField        1 1.0 2.5579e-01 1.0 0.00e+00 0.0 4.9e+01 5.9e+01 2.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexStratify        34 1.0 3.4337e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  1     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexSymmetrize      34 1.0 9.6014e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPrealloc         1 1.0 4.9804e+01 1.0 0.00e+00 0.0 3.6e+02 9.4e+05 1.6e+01  9  0  1 10  1   9  0  3 21  2     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexResidualFE       2 1.0 1.2648e+01 1.0 6.29e+09 1.0 0.0e+00 0.0e+00 0.0e+00  2  3  0  0  0   2  8  0  0  0  3980       0      0 0.00e+00    0 0.00e+00  0
DMPlexJacobianFE       2 1.0 4.7124e+02 1.0 1.21e+10 1.0 7.6e+01 3.9e+06 2.0e+00 87  6  0  8  0  88 16  1 18  0   205       0      0 0.00e+00    0 0.00e+00  0
DMPlexInterpFE         1 1.0 9.2564e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01  0  0  0  0  1   0  0  1  0  2   717       0      0 0.00e+00    0 0.00e+00  0
SFSetGraph            46 1.0 4.0703e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFSetUp               36 1.0 2.4184e-01 1.2 0.00e+00 0.0 1.3e+03 9.1e+04 3.6e+01  0  0  4  3  2   0  0  9  7  5     0       0      0 0.00e+00    0 0.00e+00  0
SFBcastBegin          68 1.0 1.7661e-0111.8 0.00e+00 0.0 1.0e+03 5.4e+04 0.0e+00  0  0  3  2  0   0  0  7  3  0     0       0      1 9.79e-02   11 6.79e+01  0
SFBcastEnd            68 1.0 5.8333e-0120.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFReduceBegin         17 1.0 1.4363e-0120.0 4.19e+06 1.0 3.1e+02 3.9e+05 0.0e+00  0  0  1  3  0   0  0  2  7  0   231       0      2 3.32e+01    0 0.00e+00 100
SFReduceEnd           17 1.0 1.0363e+00 6.1 9.91e+04 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      4 7.83e-01    0 0.00e+00 100
SFFetchOpBegin         2 1.0 2.0131e-03151.1 0.00e+00 0.0 3.8e+01 1.0e+06 0.0e+00  0  0  0  1  0   0  0  0  2  0     0       0      0 0.00e+00    0 0.00e+00  0
SFFetchOpEnd           2 1.0 1.2970e-02 1.7 0.00e+00 0.0 3.8e+01 1.0e+06 0.0e+00  0  0  0  1  0   0  0  0  2  0     0       0      0 0.00e+00    0 0.00e+00  0
SFCreateEmbed          9 1.0 3.6514e-01101.5 0.00e+00 0.0 1.6e+02 2.9e+03 0.0e+00  0  0  0  0  0   0  0  1  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFDistSection          9 1.0 1.5104e-02 3.0 0.00e+00 0.0 3.1e+02 2.6e+04 1.1e+01  0  0  1  0  1   0  0  2  0  1     0       0      0 0.00e+00    0 0.00e+00  0
SFSectionSF           17 1.0 7.6311e-02 2.0 0.00e+00 0.0 5.2e+02 7.6e+04 1.7e+01  0  0  1  1  1   0  0  4  2  2     0       0      0 0.00e+00    0 0.00e+00  0
SFRemoteOff            8 1.0 3.7036e-0138.1 0.00e+00 0.0 4.9e+02 5.3e+03 5.0e+00  0  0  1  0  0   0  0  3  0  1     0       0      0 0.00e+00    0 0.00e+00  0
SFPack               294 1.0 1.6329e-0116.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      2 3.94e-01    0 0.00e+00  0
SFUnpack             296 1.0 1.4535e-0111.0 4.29e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   231       0      0 0.00e+00    0 0.00e+00 100
VecTDot              401 1.0 7.2354e-02 2.0 1.68e+09 1.0 0.0e+00 0.0e+00 4.0e+02  0  1  0  0 20   0  2  0  0 52 183796   456574      0 0.00e+00    0 0.00e+00 100
VecNorm              201 1.0 2.9087e-0115.8 8.43e+08 1.0 0.0e+00 0.0e+00 2.0e+02  0  0  0  0 10   0  1  0  0 26 22917   448641      0 0.00e+00    0 0.00e+00 100
VecCopy                2 1.0 1.5781e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecSet                55 1.0 5.0356e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecAXPY              400 1.0 2.6926e-02 1.0 1.68e+09 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0 492649   564421      0 0.00e+00    0 0.00e+00 100
VecAYPX              199 1.0 1.3526e-02 1.1 8.35e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0 487921   563124      0 0.00e+00    0 0.00e+00 100
VecPointwiseMult     201 1.0 1.4574e-02 1.1 4.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0 228688   263803      0 0.00e+00    0 0.00e+00 100
VecScatterBegin      201 1.0 4.6771e-02 1.9 0.00e+00 0.0 1.1e+04 8.3e+04 2.0e+00  0  0 31 27  0   0  0 81 59  0     0       0      1 2.96e-01  400 2.37e+02  0
VecScatterEnd        201 1.0 3.5618e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    400 2.37e+02    0 0.00e+00  0
DualSpaceSetUp         2 1.0 2.5830e-03 1.0 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     6       0      0 0.00e+00    0 0.00e+00  0
FESetUp                2 1.0 1.0546e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
PCSetUp                1 1.0 4.4890e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
PCApply              201 1.0 1.4387e-01 1.0 4.22e+08 1.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  1  0  0  0 23166   164721      0 0.00e+00    0 0.00e+00 100

--- Event Stage 1: PCSetUp

PCSetUp                1 1.0 1.1929e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0 100  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0

--- Event Stage 2: KSP Solve only

MatMult              400 1.0 1.2937e+00 1.0 1.06e+11 1.0 2.2e+04 8.5e+04 0.0e+00  0 55 61 54  0  76 91100100  0 646634   788762    800 4.74e+02  800 4.74e+02 100
MatView                2 1.0 8.6066e-05 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSolve               2 1.0 1.7967e+00 1.2 1.17e+11 1.0 2.2e+04 8.5e+04 1.2e+03  0 60 61 54 60 100100100100100 513632   747377    800 4.74e+02  800 4.74e+02 100
SFPack               400 1.0 1.1277e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFUnpack             400 1.0 6.3893e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecTDot              802 1.0 1.2564e-01 1.6 3.36e+09 1.0 0.0e+00 0.0e+00 8.0e+02  0  2  0  0 40   6  3  0  0 67 211684   463965      0 0.00e+00    0 0.00e+00 100
VecNorm              402 1.0 3.2115e-01 8.5 1.69e+09 1.0 0.0e+00 0.0e+00 4.0e+02  0  1  0  0 20  11  1  0  0 33 41511   486763      0 0.00e+00    0 0.00e+00 100
VecCopy                4 1.0 2.8629e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecSet                 4 1.0 1.9327e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecAXPY              800 1.0 5.3435e-02 1.1 3.36e+09 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   3  3  0  0  0 496498   571546      0 0.00e+00    0 0.00e+00 100
VecAYPX              398 1.0 2.6610e-02 1.1 1.67e+09 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   2  1  0  0  0 496007   563024      0 0.00e+00    0 0.00e+00 100
VecPointwiseMult     402 1.0 2.7997e-02 1.1 8.43e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  1  0  0  0 238087   276194      0 0.00e+00    0 0.00e+00 100
VecScatterBegin      400 1.0 8.6966e-02 2.1 0.00e+00 0.0 2.2e+04 8.5e+04 0.0e+00  0  0 61 54  0   4  0100100  0     0       0      0 0.00e+00  800 4.74e+02  0
VecScatterEnd        400 1.0 6.0717e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   3  0  0  0  0     0       0    800 4.74e+02    0 0.00e+00  0
PCApply              402 1.0 2.8082e-02 1.1 8.43e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  1  0  0  0 237365   276194      0 0.00e+00    0 0.00e+00 100
---------------------------------------------------------------------------------------------------------------------------------------------------------------
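
To read the Mflop/s columns, take MatMult in the "KSP Solve only" stage above as a worked example:

  Total Mflop/s = 1e-6 * (sum of flop over ranks) / (max time)
                ~ 1e-6 * (8 * 1.05e11) / 1.2937 ~ 6.5e5,

matching the logged 646634. The GPU column (788762) is larger because, per the legend, it divides by the max GPU time only, which excludes host-side work within the same event.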

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Container    33             33        19008     0.
                SNES     1              1         1540     0.
              DMSNES     1              1          688     0.
       Krylov Solver     1              1         1664     0.
     DMKSP interface     1              1          656     0.
              Matrix    76             76   1627827176     0.
    Distributed Mesh    72             72     58958528     0.
            DM Label   180            180       113760     0.
          Quadrature   148            148        87616     0.
      Mesh Transform     6              6         4536     0.
           Index Set   665            665      4081364     0.
   IS L to G Mapping     2              2      8588672     0.
             Section   256            256       182272     0.
   Star Forest Graph   179            179       195360     0.
     Discrete System   121            121       116164     0.
           Weak Form   122            122        75152     0.
    GraphPartitioner    34             34        23392     0.
              Vector    55             55    157135208     0.
        Linear Space     5              5         3416     0.
          Dual Space    26             26        24336     0.
            FE Space     2              2         1576     0.
              Viewer     2              1          840     0.
      Preconditioner     1              1          872     0.
       Field over DM     1              1          704     0.

--- Event Stage 1: PCSetUp


--- Event Stage 2: KSP Solve only

========================================================================================================================
Average time to get PetscTime(): 3.51e-08
Average time for MPI_Barrier(): 2.5748e-06
Average time for zero size MPI_Send(): 1.00542e-05
#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 6
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --COPTFLAGS="-g -O" --CXXOPTFLAGS="-g -O" --FOPTFLAGS=-g --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 --download-p4est=1 --with-zlib-dir=/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4 PETSC_ARCH=arch-olcf-crusher
-----------------------------------------
Libraries compiled on 2022-01-25 14:29:13 on login2 
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
-----------------------------------------

Using C compiler: cc  -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O   
Using Fortran compiler: ftn  -fPIC -g     
-----------------------------------------

Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/include -I/opt/rocm-4.5.0/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -L/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lz -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lmodules -lfi -lcraymath -lf -lu -lcsup -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
-----------------------------------------



      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      # This code was compiled with GPU support and you've     #
      # created PETSc/GPU objects, but you intentionally used  #
      # -use_gpu_aware_mpi 0, such that PETSc had to copy data #
      # from GPU to CPU for communication. To get meaningful   #
      # timing results, please use GPU-aware MPI instead.      #
      ##########################################################


#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 6
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
-------------- next part --------------
DM Object: box 8 MPI processes
  type: plex
box in 3 dimensions:
  Number of 0-cells per rank: 4913 4913 4913 4913 4913 4913 4913 4913
  Number of 1-cells per rank: 13872 13872 13872 13872 13872 13872 13872 13872
  Number of 2-cells per rank: 13056 13056 13056 13056 13056 13056 13056 13056
  Number of 3-cells per rank: 4096 4096 4096 4096 4096 4096 4096 4096
Labels:
  celltype: 4 strata with value/size (0 (4913), 1 (13872), 4 (13056), 7 (4096))
  depth: 4 strata with value/size (0 (4913), 1 (13872), 2 (13056), 3 (4096))
  marker: 1 strata with value/size (1 (3162))
  Face Sets: 3 strata with value/size (1 (961), 3 (961), 6 (961))
  Linear solve converged due to CONVERGED_RTOL iterations 122
KSP Object: 8 MPI processes
  type: cg
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
  type: jacobi
    type DIAGONAL
  linear system matrix = precond matrix:
  Mat Object: 8 MPI processes
    type: mpiaijkokkos
    rows=250047, cols=250047
    total: nonzeros=15069223, allocated nonzeros=15069223
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
  Linear solve converged due to CONVERGED_RTOL iterations 122
KSP Object: 8 MPI processes
  type: cg
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
  type: jacobi
    type DIAGONAL
  linear system matrix = precond matrix:
  Mat Object: 8 MPI processes
    type: mpiaijkokkos
    rows=250047, cols=250047
    total: nonzeros=15069223, allocated nonzeros=15069223
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
  Linear solve converged due to CONVERGED_RTOL iterations 122
KSP Object: 8 MPI processes
  type: cg
  maximum iterations=200, initial guess is zero
  tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
  type: jacobi
    type DIAGONAL
  linear system matrix = precond matrix:
  Mat Object: 8 MPI processes
    type: mpiaijkokkos
    rows=250047, cols=250047
    total: nonzeros=15069223, allocated nonzeros=15069223
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
****************************************************************************************************************************************************************
***                                WIDEN YOUR WINDOW TO 160 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document                                 ***
****************************************************************************************************************************************************************

------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------



      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      # This code was compiled with GPU support and you've     #
      # created PETSc/GPU objects, but you intentionally used  #
      # -use_gpu_aware_mpi 0, such that PETSc had to copy data #
      # from GPU to CPU for communication. To get meaningful   #
      # timing results, please use GPU-aware MPI instead.      #
      ##########################################################


/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher002 with 8 processors, by adams Tue Jan 25 10:10:33 2022
Using Petsc Development GIT revision: v3.16.3-696-g46640c56cb  GIT Date: 2022-01-25 09:20:51 -0500

                         Max       Max/Min     Avg       Total
Time (sec):           9.250e+00     1.000   9.250e+00
Objects:              1.850e+03     1.029   1.807e+03
Flop:                 1.914e+09     1.105   1.821e+09  1.456e+10
Flop/sec:             2.069e+08     1.105   1.968e+08  1.575e+09
MPI Messages:         3.112e+03     1.096   2.895e+03  2.316e+04
MPI Message Lengths:  1.951e+07     1.060   6.497e+03  1.505e+08
MPI Reductions:       1.280e+03     1.000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
 0:      Main Stage: 9.1356e+00  98.8%  6.4176e+09  44.1%  9.499e+03  41.0%  8.147e+03       51.4%  5.250e+02  41.0%
 1:         PCSetUp: 1.2165e-03   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
 2:  KSP Solve only: 1.1320e-01   1.2%  8.1469e+09  55.9%  1.366e+04  59.0%  5.350e+03       48.6%  7.360e+02  57.5%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
   GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
   CpuToGpu Count: total number of CPU to GPU copies per processor
   CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor)
   GpuToCpu Count: total number of GPU to CPU copies per processor
   GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor)
   GPU %F: percent flops on GPU in this event
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total   GPU    - CpuToGpu -   - GpuToCpu - GPU
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s Mflop/s Count   Size   Count   Size  %F
---------------------------------------------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

PetscBarrier           4 1.0 2.0834e-02 1.0 0.00e+00 0.0 6.3e+02 3.2e+02 1.5e+01  0  0  3  0  1   0  0  7  0  3     0       0      0 0.00e+00    0 0.00e+00  0
BuildTwoSided         38 1.0 1.8371e-0131.9 0.00e+00 0.0 6.7e+02 4.0e+00 3.8e+01  1  0  3  0  3   1  0  7  0  7     0       0      0 0.00e+00    0 0.00e+00  0
BuildTwoSidedF         6 1.0 1.7678e-0138.7 0.00e+00 0.0 1.5e+02 1.2e+05 6.0e+00  1  0  1 12  0   1  0  2 23  1     0       0      0 0.00e+00    0 0.00e+00  0
MatMult             3007 1.0 3.3355e-02 1.1 4.88e+08 1.1 7.1e+03 5.2e+03 2.0e+00  0 25 31 24  0   0 57 75 47  0 110256   208584    245 9.16e+00  244 9.14e+00 100
MatAssemblyBegin      43 1.0 1.7686e-01 6.6 0.00e+00 0.0 1.5e+02 1.2e+05 6.0e+00  1  0  1 12  0   1  0  2 23  1     0       0      0 0.00e+00    0 0.00e+00  0
MatAssemblyEnd        43 1.0 4.7823e-02 3.0 2.84e+05 0.0 0.0e+00 0.0e+00 9.0e+00  0  0  0  0  1   0  0  0  0  2    23       0      0 0.00e+00    0 0.00e+00  0
MatZeroEntries         3 1.0 1.2971e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
MatView                1 1.0 7.6647e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSetUp               1 1.0 2.3969e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSolve               1 1.0 8.4992e-02 1.2 5.40e+08 1.1 6.9e+03 5.3e+03 3.7e+02  1 28 30 24 29   1 63 73 47 70 47928   96102    245 9.16e+00  244 9.14e+00 100
SNESSolve              1 1.0 4.0456e+00 1.0 6.89e+08 1.1 7.1e+03 6.5e+03 3.8e+02 44 36 31 31 29  44 82 75 59 71  1301   95872    249 9.44e+00  250 9.69e+00 77
SNESSetUp              1 1.0 8.1269e-01 1.0 0.00e+00 0.0 3.6e+02 5.7e+04 1.8e+01  9  0  2 14  1   9  0  4 27  3     0       0      0 0.00e+00    0 0.00e+00  0
SNESFunctionEval       2 1.0 5.1573e-01 1.4 1.01e+08 1.0 1.1e+02 3.8e+03 3.0e+00  4  6  0  0  0   4 13  1  1  1  1574    3698      6 5.55e-01    6 5.48e-01  0
SNESJacobianEval       2 1.0 7.5947e+00 1.0 1.91e+08 1.0 1.1e+02 1.6e+05 2.0e+00 82 10  0 12  0  83 24  1 23  0   201       0      0 0.00e+00    6 5.48e-01  0
DMCreateInterp         1 1.0 8.4876e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01  0  0  0  0  1   0  0  1  0  3   782       0      0 0.00e+00    0 0.00e+00  0
DMCreateMat            1 1.0 8.1254e-01 1.0 0.00e+00 0.0 3.6e+02 5.7e+04 1.8e+01  9  0  2 14  1   9  0  4 27  3     0       0      0 0.00e+00    0 0.00e+00  0
Mesh Partition         1 1.0 6.8623e-04 1.1 0.00e+00 0.0 3.5e+01 1.1e+02 8.0e+00  0  0  0  0  1   0  0  0  0  2     0       0      0 0.00e+00    0 0.00e+00  0
Mesh Migration         1 1.0 2.4168e-01 1.0 0.00e+00 0.0 2.0e+02 8.2e+01 2.9e+01  3  0  1  0  2   3  0  2  0  6     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartSelf         1 1.0 8.3200e-0513.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartLblInv       1 1.0 3.1282e-04 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartLblSF        1 1.0 2.3499e-04 3.8 0.00e+00 0.0 1.4e+01 5.6e+01 1.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPartStrtSF       1 1.0 1.0389e-04 1.1 0.00e+00 0.0 7.0e+00 2.2e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPointSF          1 1.0 2.1879e-04 1.1 0.00e+00 0.0 1.4e+01 2.7e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexInterp          19 1.0 5.7343e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistribute       1 1.0 2.4259e-01 1.0 0.00e+00 0.0 2.5e+02 9.7e+01 3.7e+01  3  0  1  0  3   3  0  3  0  7     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistCones        1 1.0 1.0470e-04 1.0 0.00e+00 0.0 4.2e+01 1.4e+02 2.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistLabels       1 1.0 2.8320e-04 1.0 0.00e+00 0.0 1.0e+02 6.6e+01 2.4e+01  0  0  0  0  2   0  0  1  0  5     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexDistField        1 1.0 2.4112e-01 1.0 0.00e+00 0.0 4.9e+01 5.9e+01 2.0e+00  3  0  0  0  0   3  0  1  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexStratify        32 1.0 1.7826e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   0  0  0  0  1     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexSymmetrize      32 1.0 1.5651e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexPrealloc         1 1.0 8.1109e-01 1.0 0.00e+00 0.0 3.6e+02 5.7e+04 1.6e+01  9  0  2 14  1   9  0  4 27  3     0       0      0 0.00e+00    0 0.00e+00  0
DMPlexResidualFE       2 1.0 1.9393e-01 1.0 9.87e+07 1.0 0.0e+00 0.0e+00 0.0e+00  2  5  0  0  0   2 12  0  0  0  4071       0      0 0.00e+00    0 0.00e+00  0
DMPlexJacobianFE       2 1.0 7.5685e+00 1.0 1.88e+08 1.0 7.6e+01 2.4e+05 2.0e+00 81 10  0 12  0  82 23  1 23  0   199       0      0 0.00e+00    0 0.00e+00  0
DMPlexInterpFE         1 1.0 8.1691e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01  0  0  0  0  1   0  0  1  0  3   812       0      0 0.00e+00    0 0.00e+00  0
SFSetGraph            40 1.0 1.6763e-04 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFSetUp               32 1.0 1.3187e-02 1.1 0.00e+00 0.0 1.2e+03 6.2e+03 3.2e+01  0  0  5  5  2   0  0 13 10  6     0       0      0 0.00e+00    0 0.00e+00  0
SFBcastBegin          62 1.0 1.5768e-01255.0 0.00e+00 0.0 9.2e+02 3.8e+03 0.0e+00  1  0  4  2  0   1  0 10  4  0     0       0      1 6.05e-03   11 1.10e+00  0
SFBcastEnd            62 1.0 1.8428e-0159.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFReduceBegin         15 1.0 1.3456e-01385.3 6.55e+04 1.1 2.7e+02 2.7e+04 0.0e+00  0  0  1  5  0   0  0  3  9  0     4       0      2 5.00e-01    0 0.00e+00 100
SFReduceEnd           15 1.0 5.2284e-0221.6 6.34e+03 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      4 4.84e-02    0 0.00e+00 100
SFFetchOpBegin         2 1.0 1.3656e-0443.1 0.00e+00 0.0 3.8e+01 6.1e+04 0.0e+00  0  0  0  2  0   0  0  0  3  0     0       0      0 0.00e+00    0 0.00e+00  0
SFFetchOpEnd           2 1.0 9.1371e-04 2.1 0.00e+00 0.0 3.8e+01 6.1e+04 0.0e+00  0  0  0  2  0   0  0  0  3  0     0       0      0 0.00e+00    0 0.00e+00  0
SFCreateEmbed          7 1.0 2.3430e-02116.3 0.00e+00 0.0 1.2e+02 2.5e+02 0.0e+00  0  0  1  0  0   0  0  1  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFDistSection          9 1.0 1.0085e-03 1.6 0.00e+00 0.0 3.1e+02 1.7e+03 1.1e+01  0  0  1  0  1   0  0  3  1  2     0       0      0 0.00e+00    0 0.00e+00  0
SFSectionSF           15 1.0 5.5941e-03 1.3 0.00e+00 0.0 4.4e+02 5.4e+03 1.5e+01  0  0  2  2  1   0  0  5  3  3     0       0      0 0.00e+00    0 0.00e+00  0
SFRemoteOff            6 1.0 2.3731e-0243.8 0.00e+00 0.0 3.4e+02 5.0e+02 3.0e+00  0  0  1  0  0   0  0  4  0  1     0       0      0 0.00e+00    0 0.00e+00  0
SFPack               208 1.0 1.5759e-01204.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0       0      2 2.48e-02    0 0.00e+00  0
SFUnpack             210 1.0 1.3462e-01282.9 7.19e+04 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     4       0      0 0.00e+00    0 0.00e+00 100
VecTDot              244 1.0 1.5553e-02 1.4 1.60e+07 1.1 0.0e+00 0.0e+00 2.4e+02  0  1  0  0 19   0  2  0  0 46  7846   15819      0 0.00e+00    0 0.00e+00 100
VecNorm              123 1.0 2.2600e-02 3.5 8.06e+06 1.1 0.0e+00 0.0e+00 1.2e+02  0  0  0  0 10   0  1  0  0 23  2722   13020      0 0.00e+00    0 0.00e+00 100
VecCopy                2 1.0 9.1104e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecSet                53 1.0 1.7366e-03 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecAXPY              245 1.0 5.8172e-03 1.1 1.61e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0 21062   31344      0 0.00e+00    0 0.00e+00 100
VecAYPX              121 1.0 2.4363e-03 1.1 7.93e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0 24837   41528      0 0.00e+00    0 0.00e+00 100
VecPointwiseMult     122 1.0 2.5573e-03 1.1 4.00e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 11929   20567      0 0.00e+00    0 0.00e+00 100
VecScatterBegin      123 1.0 7.9911e-03 1.3 0.00e+00 0.0 7.1e+03 5.2e+03 2.0e+00  0  0 31 24  0   0  0 75 47  0     0       0      1 1.87e-02  244 9.14e+00  0
VecScatterEnd        123 1.0 4.0082e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    244 9.14e+00    0 0.00e+00  0
DualSpaceSetUp         2 1.0 2.5765e-03 1.0 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     6       0      0 0.00e+00    0 0.00e+00  0
FESetUp                2 1.0 1.0391e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
PCSetUp                1 1.0 4.4890e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
PCApply              122 1.0 9.7869e-03 1.1 4.00e+06 1.1 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0  3117    4254      0 0.00e+00    0 0.00e+00 100

--- Event Stage 1: PCSetUp

PCSetUp                1 1.0 1.3192e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0 100  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0

--- Event Stage 2: KSP Solve only

MatMult              244 1.0 5.1175e-02 1.2 9.76e+08 1.1 1.4e+04 5.3e+03 0.0e+00  1 50 59 49  0  43 90100100  0 143700   290154    488 1.83e+01  488 1.83e+01 100
MatView                2 1.0 7.5625e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSolve               2 1.0 1.2088e-01 1.2 1.08e+09 1.1 1.4e+04 5.3e+03 7.3e+02  1 56 59 49 57 100100100100100 67396   134725    488 1.83e+01  488 1.83e+01 100
SFPack               244 1.0 8.5343e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFUnpack             244 1.0 3.2090e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecTDot              488 1.0 2.8601e-02 1.3 3.20e+07 1.1 0.0e+00 0.0e+00 4.9e+02  0  2  0  0 38  22  3  0  0 66  8533   15861      0 0.00e+00    0 0.00e+00 100
VecNorm              246 1.0 2.7710e-02 2.5 1.61e+07 1.1 0.0e+00 0.0e+00 2.5e+02  0  1  0  0 19  18  2  0  0 33  4440   15567      0 0.00e+00    0 0.00e+00 100
VecCopy                4 1.0 1.7787e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecSet                 4 1.0 1.3394e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecAXPY              488 1.0 9.7102e-03 1.1 3.20e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   8  3  0  0  0 25133   41488      0 0.00e+00    0 0.00e+00 100
VecAYPX              242 1.0 5.0139e-03 1.1 1.59e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   4  1  0  0  0 24138   41738      0 0.00e+00    0 0.00e+00 100
VecPointwiseMult     244 1.0 5.1417e-03 1.1 8.00e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   4  1  0  0  0 11866   21109      0 0.00e+00    0 0.00e+00 100
VecScatterBegin      244 1.0 1.4535e-02 1.3 0.00e+00 0.0 1.4e+04 5.3e+03 0.0e+00  0  0 59 49  0  11  0100100  0     0       0      0 0.00e+00  488 1.83e+01  0
VecScatterEnd        244 1.0 7.1686e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   5  0  0  0  0     0       0    488 1.83e+01    0 0.00e+00  0
PCApply              244 1.0 5.1900e-03 1.1 8.00e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   4  1  0  0  0 11756   21109      0 0.00e+00    0 0.00e+00 100
---------------------------------------------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Container    31             31        17856     0.
                SNES     1              1         1540     0.
              DMSNES     1              1          688     0.
       Krylov Solver     1              1         1664     0.
     DMKSP interface     1              1          656     0.
              Matrix    74             74     22701688     0.
    Distributed Mesh    68             68      1326768     0.
            DM Label   164            164       103648     0.
          Quadrature   148            148        87616     0.
      Mesh Transform     4              4         3024     0.
           Index Set   601            601       757748     0.
   IS L to G Mapping     2              2       145664     0.
             Section   242            242       172304     0.
   Star Forest Graph   167            167       181824     0.
     Discrete System   111            111       106564     0.
           Weak Form   112            112        68992     0.
    GraphPartitioner    32             32        22016     0.
              Vector    53             53      2498888     0.
        Linear Space     5              5         3416     0.
          Dual Space    26             26        24336     0.
            FE Space     2              2         1576     0.
              Viewer     2              1          840     0.
      Preconditioner     1              1          872     0.
       Field over DM     1              1          704     0.

--- Event Stage 1: PCSetUp


--- Event Stage 2: KSP Solve only

========================================================================================================================
Average time to get PetscTime(): 3.51e-08
Average time for MPI_Barrier(): 2.7974e-06
Average time for zero size MPI_Send(): 9.373e-06
#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 4
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --COPTFLAGS="-g -O" --CXXOPTFLAGS="-g -O" --FOPTFLAGS=-g --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 --download-p4est=1 --with-zlib-dir=/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4 PETSC_ARCH=arch-olcf-crusher
-----------------------------------------
Libraries compiled on 2022-01-25 14:29:13 on login2 
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
-----------------------------------------

Using C compiler: cc  -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O   
Using Fortran compiler: ftn  -fPIC -g     
-----------------------------------------

Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/include -I/opt/rocm-4.5.0/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -L/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lz -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lmodules -lfi -lcraymath -lf -lu -lcsup -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
-----------------------------------------



      ##########################################################
      #                                                        #
      #                       WARNING!!!                       #
      #                                                        #
      # This code was compiled with GPU support and you've     #
      # created PETSc/GPU objects, but you intentionally used  #
      # -use_gpu_aware_mpi 0, such that PETSc had to copy data #
      # from GPU to CPU for communication. To get meaningful   #
      # timing results, please use GPU-aware MPI instead.      #
      ##########################################################
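What the warning is describing: with -use_gpu_aware_mpi false, every message must be staged through host memory before MPI can touch it, and that extra copy is what distorts the timings. A rough HIP/MPI sketch of the two paths (send_vector, d_buf, and h_buf are illustrative names, not PETSc internals):

#include <mpi.h>
#include <hip/hip_runtime.h>

/* d_buf: device allocation of n doubles; h_buf: host staging buffer.
   Error checking omitted for brevity. */
void send_vector(double *d_buf, double *h_buf, int n, int dest)
{
  /* GPU-aware path: the MPI library reads device memory directly.
     MPI_Send(d_buf, n, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD); */

  /* Non-GPU-aware path (what -use_gpu_aware_mpi false forces): stage
     through the host, paying a device-to-host copy on every message. */
  hipMemcpy(h_buf, d_buf, (size_t)n * sizeof(double), hipMemcpyDeviceToHost);
  MPI_Send(h_buf, n, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
}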


WARNING! There are options you set that were not used!
WARNING! This could be a spelling mistake, etc.!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
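All 14 leftovers are GAMG and multigrid smoother settings; they go unused because -pc_type jacobi overrides the multigrid configuration, so the code paths that would query them never run. An application can also detect leftovers itself before finalizing; a minimal sketch using PetscOptionsAllUsed() and PetscOptionsLeft() (the wrapper name is illustrative):

#include <petscsys.h>

/* Sketch: count and print never-queried options (what -options_left does
   at PetscFinalize()). NULL selects the global options database. */
static PetscErrorCode ReportUnusedOptions(void)
{
  PetscInt       nleft;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscOptionsAllUsed(NULL, &nleft);CHKERRQ(ierr);
  if (nleft > 0) {
    ierr = PetscPrintf(PETSC_COMM_WORLD, "%d unused database options\n", (int)nleft);CHKERRQ(ierr);
    ierr = PetscOptionsLeft(NULL);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}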

