[petsc-dev] Kokkos/Crusher performance
Mark Adams
mfadams at lbl.gov
Sat Jan 22 06:52:42 CST 2022
On Fri, Jan 21, 2022 at 9:55 PM Barry Smith <bsmith at petsc.dev> wrote:
>
> Interesting. Is this with all native Kokkos kernels, or do some Kokkos
> kernels use ROCm?
>
Ah, good question. I often run with tpl=0, but I did not specify it here on
Crusher. Looking at the log files, I see
-I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/externalpackages/git.kokkos-kernels/src/impl/tpls
Here is a run with the TPLs turned off; those TPL includes are gone.
It looks pretty much the same. A little slower, but that could be noise.
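For reference, a minimal sketch of the two builds being compared. The
--with-kokkos-kernels-tpl=0 flag is taken verbatim from the "Configure
options:" line in the logs below; that the TPLs default to on for a HIP
build is my recollection of PETSc's kokkos-kernels package, so treat the
first line as an assumption:

    # TPLs (rocSPARSE/rocBLAS backends) presumably on by default
    # for a --with-hip build (assumption):
    ./configure ... --with-hip --download-kokkos --download-kokkos-kernels

    # TPLs explicitly off, as in the runs attached here (this flag
    # appears verbatim in the configure line of the logs below):
    ./configure ... --with-hip --download-kokkos --download-kokkos-kernels \
        --with-kokkos-kernels-tpl=0

The full set of runtime options for these runs is in the option table at
the end of each attached log.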
-------------- next part --------------
DM Object: box 64 MPI processes
type: plex
box in 3 dimensions:
Number of 0-cells per rank: 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937 35937
Number of 1-cells per rank: 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544 104544
Number of 2-cells per rank: 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376 101376
Number of 3-cells per rank: 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768 32768
Labels:
celltype: 4 strata with value/size (0 (35937), 1 (104544), 4 (101376), 7 (32768))
depth: 4 strata with value/size (0 (35937), 1 (104544), 2 (101376), 3 (32768))
marker: 1 strata with value/size (1 (12474))
Face Sets: 3 strata with value/size (1 (3969), 3 (3969), 6 (3969))
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 64 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 64 MPI processes
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 64 MPI processes
type: mpiaijkokkos
rows=16581375, cols=16581375
total: nonzeros=1045678375, allocated nonzeros=1045678375
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 64 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 64 MPI processes
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 64 MPI processes
type: mpiaijkokkos
rows=16581375, cols=16581375
total: nonzeros=1045678375, allocated nonzeros=1045678375
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 64 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 64 MPI processes
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 64 MPI processes
type: mpiaijkokkos
rows=16581375, cols=16581375
total: nonzeros=1045678375, allocated nonzeros=1045678375
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher001 with 64 processors, by adams Fri Jan 21 23:48:31 2022
Using Petsc Development GIT revision: v3.16.3-665-g1012189b9a GIT Date: 2022-01-21 16:28:20 +0000
Max Max/Min Avg Total
Time (sec): 7.919e+01 1.000 7.918e+01
Objects: 2.088e+03 1.164 1.852e+03
Flop: 2.448e+10 1.074 2.393e+10 1.532e+12
Flop/sec: 3.091e+08 1.074 3.023e+08 1.935e+10
MPI Messages: 1.651e+04 3.673 9.388e+03 6.009e+05
MPI Message Lengths: 2.278e+08 2.093 1.788e+04 1.074e+10
MPI Reductions: 1.988e+03 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 7.4289e+01 93.8% 6.0889e+11 39.8% 2.265e+05 37.7% 2.175e+04 45.8% 7.630e+02 38.4%
1: PCSetUp: 3.1604e-02 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
2: KSP Solve only: 4.8576e+00 6.1% 9.2287e+11 60.2% 3.744e+05 62.3% 1.554e+04 54.2% 1.206e+03 60.7%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
CpuToGpu Count: total number of CPU to GPU copies per processor
CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor)
GpuToCpu Count: total number of GPU to CPU copies per processor
GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor)
GPU %F: percent flops on GPU in this event
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F
---------------------------------------------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
PetscBarrier 5 1.0 2.0665e-01 1.1 0.00e+00 0.0 1.1e+04 8.0e+02 1.8e+01 0 0 2 0 1 0 0 5 0 2 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSided 40 1.0 2.6017e+0010.5 0.00e+00 0.0 9.9e+03 4.0e+00 4.0e+01 3 0 2 0 2 3 0 4 0 5 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSidedF 6 1.0 2.5318e+0010.9 0.00e+00 0.0 2.2e+03 4.0e+05 6.0e+00 3 0 0 8 0 3 0 1 18 1 0 0 0 0.00e+00 0 0.00e+00 0
MatMult 1210960.2 1.2055e+00 2.1 6.71e+09 1.1 1.9e+05 1.5e+04 2.0e+00 1 27 32 27 0 1 69 85 59 0 346972 0 1 1.14e-01 0 0.00e+00 100
MatAssemblyBegin 43 1.0 2.6856e+00 6.9 0.00e+00 0.0 2.2e+03 4.0e+05 6.0e+00 3 0 0 8 0 3 0 1 18 1 0 0 0 0.00e+00 0 0.00e+00 0
MatAssemblyEnd 43 1.0 4.6070e-01 2.5 1.18e+06 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 0 1 0 0 0 1 120 0 0 0.00e+00 0 0.00e+00 0
MatZeroEntries 3 1.0 5.4884e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
MatView 1 1.0 2.5364e-03 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSetUp 1 1.0 2.4612e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 1 1.0 3.1074e+00 1.0 7.39e+09 1.1 1.9e+05 1.5e+04 6.0e+02 4 30 31 27 30 4 76 83 59 79 148494 1022941 1 1.14e-01 0 0.00e+00 100
SNESSolve 1 1.0 2.7026e+01 1.0 8.56e+09 1.1 1.9e+05 1.8e+04 6.1e+02 34 35 32 31 31 36 88 84 69 80 19853 1022661 3 2.36e+00 2 3.62e+00 86
SNESSetUp 1 1.0 7.6240e+00 1.0 0.00e+00 0.0 5.3e+03 1.9e+05 1.8e+01 10 0 1 9 1 10 0 2 21 2 0 0 0 0.00e+00 0 0.00e+00 0
SNESFunctionEval 2 1.0 6.2213e+00 1.1 7.96e+08 1.0 1.7e+03 1.3e+04 3.0e+00 8 3 0 0 0 8 8 1 0 0 8149 21036 3 4.32e+00 2 3.62e+00 0
SNESJacobianEval 2 1.0 5.7439e+01 1.0 1.52e+09 1.0 1.7e+03 5.4e+05 2.0e+00 72 6 0 8 0 77 16 1 18 0 1683 0 0 0.00e+00 2 3.62e+00 0
DMCreateInterp 1 1.0 1.0837e-02 1.0 8.29e+04 1.0 1.1e+03 8.0e+02 1.6e+01 0 0 0 0 1 0 0 0 0 2 490 0 0 0.00e+00 0 0.00e+00 0
DMCreateMat 1 1.0 7.6222e+00 1.0 0.00e+00 0.0 5.3e+03 1.9e+05 1.8e+01 10 0 1 9 1 10 0 2 21 2 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Partition 1 1.0 2.5208e-02 1.0 0.00e+00 0.0 3.2e+02 1.1e+02 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Migration 1 1.0 9.2974e-03 1.0 0.00e+00 0.0 1.8e+03 8.3e+01 2.9e+01 0 0 0 0 1 0 0 1 0 4 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartSelf 1 1.0 8.4227e-0493.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblInv 1 1.0 1.0979e-03 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblSF 1 1.0 4.5747e-03 1.7 0.00e+00 0.0 1.3e+02 5.6e+01 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartStrtSF 1 1.0 1.8253e-02 1.7 0.00e+00 0.0 6.3e+01 2.2e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPointSF 1 1.0 1.9011e-03 1.1 0.00e+00 0.0 1.3e+02 2.7e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterp 19 1.0 1.0434e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistribute 1 1.0 3.6410e-02 1.0 0.00e+00 0.0 2.2e+03 9.7e+01 3.7e+01 0 0 0 0 2 0 0 1 0 5 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistCones 1 1.0 1.1016e-03 1.2 0.00e+00 0.0 3.8e+02 1.4e+02 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistLabels 1 1.0 1.5538e-03 1.0 0.00e+00 0.0 9.0e+02 6.6e+01 2.4e+01 0 0 0 0 1 0 0 0 0 3 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistField 1 1.0 6.3540e-03 1.0 0.00e+00 0.0 4.4e+02 5.9e+01 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexStratify 33 1.0 1.4687e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexSymmetrize 33 1.0 1.9498e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPrealloc 1 1.0 7.6108e+00 1.0 0.00e+00 0.0 5.3e+03 1.9e+05 1.6e+01 10 0 1 9 1 10 0 2 21 2 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexResidualFE 2 1.0 3.7908e+00 1.1 7.87e+08 1.0 0.0e+00 0.0e+00 0.0e+00 5 3 0 0 0 5 8 0 0 0 13285 0 0 0.00e+00 0 0.00e+00 0
DMPlexJacobianFE 2 1.0 5.7067e+01 1.0 1.51e+09 1.0 1.1e+03 8.0e+05 2.0e+00 72 6 0 8 0 77 16 0 18 0 1689 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterpFE 1 1.0 1.0649e-02 1.0 8.29e+04 1.0 1.1e+03 8.0e+02 1.6e+01 0 0 0 0 1 0 0 0 0 2 498 0 0 0.00e+00 0 0.00e+00 0
SFSetGraph 43 1.0 1.0816e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFSetUp 34 1.0 1.5032e-01 1.5 0.00e+00 0.0 1.8e+04 2.1e+04 3.4e+01 0 0 3 3 2 0 0 8 7 4 0 0 0 0.00e+00 0 0.00e+00 0
SFBcastBegin 65 1.0 2.2730e+00145.4 0.00e+00 0.0 1.3e+04 1.3e+04 0.0e+00 2 0 2 2 0 2 0 6 3 0 0 0 1 1.68e-01 4 7.24e+00 0
SFBcastEnd 65 1.0 1.7421e+0062.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 6.24e-03 0 0.00e+00 0
SFReduceBegin 16 1.0 1.9556e-0184.2 5.24e+05 1.0 4.2e+03 8.5e+04 0.0e+00 0 0 1 3 0 0 0 2 7 0 170 0 2 4.15e+00 0 0.00e+00 100
SFReduceEnd 16 1.0 9.7152e-0132.7 2.50e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1 0 0 0.00e+00 0 0.00e+00 100
SFFetchOpBegin 2 1.0 3.1814e-03104.2 0.00e+00 0.0 5.6e+02 2.0e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 0 0 0.00e+00 0 0.00e+00 0
SFFetchOpEnd 2 1.0 2.8296e-02 3.6 0.00e+00 0.0 5.6e+02 2.0e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 0 0 0.00e+00 0 0.00e+00 0
SFCreateEmbed 8 1.0 1.0733e-0172.8 0.00e+00 0.0 2.0e+03 7.0e+02 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFDistSection 9 1.0 1.0892e-02 2.3 0.00e+00 0.0 4.1e+03 5.9e+03 1.1e+01 0 0 1 0 1 0 0 2 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFSectionSF 16 1.0 5.2589e-02 2.2 0.00e+00 0.0 5.8e+03 2.0e+04 1.6e+01 0 0 1 1 1 0 0 3 2 2 0 0 0 0.00e+00 0 0.00e+00 0
SFRemoteOff 7 1.0 1.2178e-0124.0 0.00e+00 0.0 6.1e+03 1.3e+03 4.0e+00 0 0 1 0 0 0 0 3 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFPack 290 1.0 7.5146e-01155.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 2 1.51e-01 0 0.00e+00 0
SFUnpack 292 1.0 1.9789e-0158.9 5.49e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 174 0 0 6.24e-03 0 0.00e+00 100
VecTDot 401 1.0 1.4788e+00 1.8 2.10e+08 1.0 0.0e+00 0.0e+00 4.0e+02 2 1 0 0 20 2 2 0 0 53 8992 109803 0 0.00e+00 0 0.00e+00 100
VecNorm 201 1.0 7.4026e-01 2.4 1.05e+08 1.0 0.0e+00 0.0e+00 2.0e+02 0 0 0 0 10 0 1 0 0 26 9004 127483 0 0.00e+00 0 0.00e+00 100
VecCopy 2 1.0 1.4854e-0310.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 54 1.0 8.7686e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 400 1.0 3.9120e-0120.9 2.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 33909 73190 0 0.00e+00 0 0.00e+00 100
VecAYPX 199 1.0 1.3597e-01 6.9 1.04e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 48535 138139 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 201 1.0 1.4152e-0110.2 5.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 23550 69371 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 201 1.0 6.5846e-0117.0 0.00e+00 0.0 1.9e+05 1.5e+04 2.0e+00 0 0 32 27 0 0 0 85 59 0 0 0 1 1.14e-01 0 0.00e+00 0
VecScatterEnd 201 1.0 6.6968e-01 9.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DualSpaceSetUp 2 1.0 5.2698e-03 1.2 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 22 0 0 0.00e+00 0 0.00e+00 0
FESetUp 2 1.0 3.3009e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCSetUp 1 1.0 9.6290e-06 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCApply 201 1.0 1.9920e-01 2.9 5.27e+07 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 1 0 0 0 16731 47897 0 0.00e+00 0 0.00e+00 100
--- Event Stage 1: PCSetUp
PCSetUp 1 1.0 3.6638e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 100 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
--- Event Stage 2: KSP Solve only
MatMult 400 1.0 1.3375e+00 1.3 1.34e+10 1.1 3.7e+05 1.6e+04 0.0e+00 1 55 62 54 0 24 91100100 0 625440 0 0 0.00e+00 0 0.00e+00 100
MatView 2 1.0 4.3457e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 2 1.0 4.9810e+00 1.0 1.48e+10 1.1 3.7e+05 1.6e+04 1.2e+03 6 60 62 54 61 100100100100100 185277 1102535 0 0.00e+00 0 0.00e+00 100
SFPack 400 1.0 2.6830e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFUnpack 400 1.0 2.2198e-04 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecTDot 802 1.0 2.0418e+00 1.2 4.20e+08 1.0 0.0e+00 0.0e+00 8.0e+02 2 2 0 0 40 38 3 0 0 67 13026 112538 0 0.00e+00 0 0.00e+00 100
VecNorm 402 1.0 1.4270e+00 2.4 2.11e+08 1.0 0.0e+00 0.0e+00 4.0e+02 1 1 0 0 20 14 1 0 0 33 9343 134367 0 0.00e+00 0 0.00e+00 100
VecCopy 4 1.0 5.9396e-0324.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 4 1.0 3.7188e-0313.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 800 1.0 7.4812e-0121.6 4.19e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 14 3 0 0 0 35463 73999 0 0.00e+00 0 0.00e+00 100
VecAYPX 398 1.0 2.5369e-01 6.5 2.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 4 1 0 0 0 52028 142028 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 402 1.0 2.9605e-01 3.6 1.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 5 1 0 0 0 22515 70608 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 400 1.0 1.6791e-01 6.0 0.00e+00 0.0 3.7e+05 1.6e+04 0.0e+00 0 0 62 54 0 2 0100100 0 0 0 0 0.00e+00 0 0.00e+00 0
VecScatterEnd 400 1.0 1.0057e+00 7.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 5 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCApply 402 1.0 2.9638e-01 3.6 1.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 5 1 0 0 0 22490 70608 0 0.00e+00 0 0.00e+00 100
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 32 32 18432 0.
SNES 1 1 1540 0.
DMSNES 1 1 688 0.
Krylov Solver 1 1 1664 0.
DMKSP interface 1 1 656 0.
Matrix 75 75 195551600 0.
Distributed Mesh 70 70 7840024 0.
DM Label 172 172 108704 0.
Quadrature 148 148 87616 0.
Mesh Transform 5 5 3780 0.
Index Set 801 801 1598436 0.
IS L to G Mapping 2 2 1102568 0.
Section 249 249 177288 0.
Star Forest Graph 173 173 188592 0.
Discrete System 116 116 111364 0.
Weak Form 117 117 72072 0.
GraphPartitioner 33 33 22704 0.
Vector 54 54 19591688 0.
Linear Space 5 5 3416 0.
Dual Space 26 26 24336 0.
FE Space 2 2 1576 0.
Viewer 2 1 840 0.
Preconditioner 1 1 872 0.
Field over DM 1 1 704 0.
--- Event Stage 1: PCSetUp
--- Event Stage 2: KSP Solve only
========================================================================================================================
Average time to get PetscTime(): 5.31e-08
Average time for MPI_Barrier(): 4.0698e-06
Average time for zero size MPI_Send(): 9.52547e-06
#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 4,4,4
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 5
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 4,4,4
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi true
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --download-hypre-configure-arguments=--enable-unified-memory --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 PETSC_ARCH=arch-olcf-crusher
-----------------------------------------
Libraries compiled on 2022-01-22 03:06:22 on login2
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
-----------------------------------------
Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O3
Using Fortran compiler: ftn -fPIC
-----------------------------------------
Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/opt/rocm-4.5.0/include
-----------------------------------------
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lcrayacc_amdgpu -lopenacc -lmodules -lfi -lcraymath -lf -lu -lcsup -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
-----------------------------------------
#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 4,4,4
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 5
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 4,4,4
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi true
#End of PETSc Option Table entries
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
-------------- next part --------------
DM Object: box 64 MPI processes
type: plex
box in 3 dimensions:
Number of 0-cells per rank: 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625 274625
Number of 1-cells per rank: 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200 811200
Number of 2-cells per rank: 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720 798720
Number of 3-cells per rank: 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144 262144
Labels:
celltype: 4 strata with value/size (0 (274625), 1 (811200), 4 (798720), 7 (262144))
depth: 4 strata with value/size (0 (274625), 1 (811200), 2 (798720), 3 (262144))
marker: 1 strata with value/size (1 (49530))
Face Sets: 3 strata with value/size (1 (16129), 3 (16129), 6 (16129))
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 64 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 64 MPI processes
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 64 MPI processes
type: mpiaijkokkos
rows=133432831, cols=133432831
total: nonzeros=8477185319, allocated nonzeros=8477185319
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 64 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 64 MPI processes
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 64 MPI processes
type: mpiaijkokkos
rows=133432831, cols=133432831
total: nonzeros=8477185319, allocated nonzeros=8477185319
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 64 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 64 MPI processes
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 64 MPI processes
type: mpiaijkokkos
rows=133432831, cols=133432831
total: nonzeros=8477185319, allocated nonzeros=8477185319
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher001 with 64 processors, by adams Fri Jan 21 23:57:45 2022
Using Petsc Development GIT revision: v3.16.3-665-g1012189b9a GIT Date: 2022-01-21 16:28:20 +0000
Max Max/Min Avg Total
Time (sec): 5.510e+02 1.000 5.510e+02
Objects: 2.158e+03 1.163 1.919e+03
Flop: 1.958e+11 1.036 1.936e+11 1.239e+13
Flop/sec: 3.554e+08 1.036 3.514e+08 2.249e+10
MPI Messages: 1.656e+04 3.672 9.423e+03 6.031e+05
MPI Message Lengths: 8.942e+08 2.047 7.055e+04 4.255e+10
MPI Reductions: 1.991e+03 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 5.3734e+02 97.5% 4.9150e+12 39.7% 2.287e+05 37.9% 8.566e+04 46.0% 7.660e+02 38.5%
1: PCSetUp: 2.4256e-01 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
2: KSP Solve only: 1.3391e+01 2.4% 7.4764e+12 60.3% 3.744e+05 62.1% 6.133e+04 54.0% 1.206e+03 60.6%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
CpuToGpu Count: total number of CPU to GPU copies per processor
CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor)
GpuToCpu Count: total number of GPU to CPU copies per processor
GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor)
GPU %F: percent flops on GPU in this event
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F
---------------------------------------------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
PetscBarrier 6 1.0 1.5685e+00 1.0 0.00e+00 0.0 1.4e+04 2.6e+03 2.1e+01 0 0 2 0 1 0 0 6 0 3 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSided 42 1.0 5.7396e+00 6.2 0.00e+00 0.0 1.0e+04 4.0e+00 4.2e+01 1 0 2 0 2 1 0 5 0 5 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSidedF 6 1.0 5.6774e+00 6.4 0.00e+00 0.0 2.2e+03 1.6e+06 6.0e+00 1 0 0 8 0 1 0 1 18 1 0 0 0 0.00e+00 0 0.00e+00 0
MatMult 48589241.7 5.0181e+00 1.2 5.37e+10 1.0 1.9e+05 6.0e+04 2.0e+00 1 27 32 27 0 1 69 84 59 0 675743 0 1 4.48e-01 0 0.00e+00 100
MatAssemblyBegin 43 1.0 6.3288e+00 4.1 0.00e+00 0.0 2.2e+03 1.6e+06 6.0e+00 1 0 0 8 0 1 0 1 18 1 0 0 0 0.00e+00 0 0.00e+00 0
MatAssemblyEnd 43 1.0 2.0404e+00 2.2 4.71e+06 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 0 0 0 0 0 1 110 0 0 0.00e+00 0 0.00e+00 0
MatZeroEntries 3 1.0 5.1707e-03 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
MatView 1 1.0 2.6576e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSetUp 1 1.0 1.8340e-02 4.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 1 1.0 7.6610e+00 1.1 5.91e+10 1.0 1.9e+05 6.1e+04 6.0e+02 1 30 31 27 30 1 76 83 59 79 487953 2891172 1 4.48e-01 0 0.00e+00 100
SNESSolve 1 1.0 1.8233e+02 1.0 6.85e+10 1.0 1.9e+05 7.0e+04 6.1e+02 33 35 32 31 31 34 88 84 68 80 23790 2890722 3 1.83e+01 2 2.92e+01 86
SNESSetUp 1 1.0 5.8476e+01 1.0 0.00e+00 0.0 5.3e+03 7.7e+05 1.8e+01 11 0 1 10 1 11 0 2 21 2 0 0 0 0.00e+00 0 0.00e+00 0
SNESFunctionEval 2 1.0 3.3646e+01 1.1 6.33e+09 1.0 1.7e+03 5.1e+04 3.0e+00 6 3 0 0 0 6 8 1 0 0 12010 152293 3 3.46e+01 2 2.92e+01 0
SNESJacobianEval 2 1.0 4.3741e+02 1.0 1.21e+10 1.0 1.7e+03 2.2e+06 2.0e+00 79 6 0 9 0 81 16 1 19 0 1766 0 0 0.00e+00 2 2.92e+01 0
DMCreateInterp 1 1.0 2.9766e-03 1.1 8.29e+04 1.0 1.1e+03 8.0e+02 1.6e+01 0 0 0 0 1 0 0 0 0 2 1783 0 0 0.00e+00 0 0.00e+00 0
DMCreateMat 1 1.0 5.8460e+01 1.0 0.00e+00 0.0 5.3e+03 7.7e+05 1.8e+01 11 0 1 10 1 11 0 2 21 2 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Partition 1 1.0 3.4383e-03 1.4 0.00e+00 0.0 3.2e+02 1.1e+02 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Migration 1 1.0 8.5458e-03 1.0 0.00e+00 0.0 1.8e+03 8.3e+01 2.9e+01 0 0 0 0 1 0 0 1 0 4 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartSelf 1 1.0 8.8702e-04132.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblInv 1 1.0 1.2961e-03 4.9 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblSF 1 1.0 4.1699e-04 2.9 0.00e+00 0.0 1.3e+02 5.6e+01 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartStrtSF 1 1.0 3.4840e-04 1.6 0.00e+00 0.0 6.3e+01 2.2e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPointSF 1 1.0 1.6911e-03 1.1 0.00e+00 0.0 1.3e+02 2.7e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterp 19 1.0 1.3618e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistribute 1 1.0 1.3544e-02 1.1 0.00e+00 0.0 2.2e+03 9.7e+01 3.7e+01 0 0 0 0 2 0 0 1 0 5 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistCones 1 1.0 6.2767e-04 1.0 0.00e+00 0.0 3.8e+02 1.4e+02 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistLabels 1 1.0 1.4716e-03 1.0 0.00e+00 0.0 9.0e+02 6.6e+01 2.4e+01 0 0 0 0 1 0 0 0 0 3 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistField 1 1.0 6.0844e-03 1.0 0.00e+00 0.0 4.4e+02 5.9e+01 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexStratify 34 1.0 1.1206e-01 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexSymmetrize 34 1.0 1.7210e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPrealloc 1 1.0 5.8338e+01 1.0 0.00e+00 0.0 5.3e+03 7.7e+05 1.6e+01 11 0 1 10 1 11 0 2 21 2 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexResidualFE 2 1.0 3.1420e+01 1.1 6.29e+09 1.0 0.0e+00 0.0e+00 0.0e+00 5 3 0 0 0 5 8 0 0 0 12818 0 0 0.00e+00 0 0.00e+00 0
DMPlexJacobianFE 2 1.0 4.3674e+02 1.0 1.21e+10 1.0 1.1e+03 3.2e+06 2.0e+00 79 6 0 8 0 81 16 0 18 0 1767 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterpFE 1 1.0 2.8815e-03 1.1 8.29e+04 1.0 1.1e+03 8.0e+02 1.6e+01 0 0 0 0 1 0 0 0 0 2 1842 0 0 0.00e+00 0 0.00e+00 0
SFSetGraph 46 1.0 9.6775e-03 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFSetUp 36 1.0 7.7526e-01 1.3 0.00e+00 0.0 1.9e+04 7.8e+04 3.6e+01 0 0 3 3 2 0 0 8 7 5 0 0 0 0.00e+00 0 0.00e+00 0
SFBcastBegin 68 1.0 1.3188e+0062.9 0.00e+00 0.0 1.4e+04 4.8e+04 0.0e+00 0 0 2 2 0 0 0 6 3 0 0 0 1 1.20e+00 4 5.83e+01 0
SFBcastEnd 68 1.0 3.1105e+0013.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 2.48e-02 0 0.00e+00 0
SFReduceBegin 17 1.0 2.0149e-0121.8 4.19e+06 1.0 4.5e+03 3.2e+05 0.0e+00 0 0 1 3 0 0 0 2 7 0 1324 0 2 3.34e+01 0 0.00e+00 100
SFReduceEnd 17 1.0 4.1475e+0023.7 9.91e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1 0 0 0.00e+00 0 0.00e+00 100
SFFetchOpBegin 2 1.0 1.7922e-02255.6 0.00e+00 0.0 5.6e+02 8.2e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 0 0 0.00e+00 0 0.00e+00 0
SFFetchOpEnd 2 1.0 1.2086e-01 2.7 0.00e+00 0.0 5.6e+02 8.2e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 0 0 0.00e+00 0 0.00e+00 0
SFCreateEmbed 9 1.0 4.7403e-0130.7 0.00e+00 0.0 2.3e+03 2.4e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFDistSection 9 1.0 6.8144e-02 2.9 0.00e+00 0.0 4.1e+03 2.3e+04 1.1e+01 0 0 1 0 1 0 0 2 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFSectionSF 17 1.0 2.5789e-01 2.9 0.00e+00 0.0 6.4e+03 7.4e+04 1.7e+01 0 0 1 1 1 0 0 3 2 2 0 0 0 0.00e+00 0 0.00e+00 0
SFRemoteOff 8 1.0 5.0936e-0113.1 0.00e+00 0.0 7.3e+03 4.3e+03 5.0e+00 0 0 1 0 0 0 0 3 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFPack 294 1.0 6.9303e-0124.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 2 5.96e-01 0 0.00e+00 0
SFUnpack 296 1.0 2.1804e-01 9.4 4.29e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1246 0 0 2.48e-02 0 0.00e+00 100
VecTDot 401 1.0 1.4922e+00 1.9 1.68e+09 1.0 0.0e+00 0.0e+00 4.0e+02 0 1 0 0 20 0 2 0 0 52 71714 324873 0 0.00e+00 0 0.00e+00 100
VecNorm 201 1.0 9.2573e-01 3.8 8.43e+08 1.0 0.0e+00 0.0e+00 2.0e+02 0 0 0 0 10 0 1 0 0 26 57943 538637 0 0.00e+00 0 0.00e+00 100
VecCopy 2 1.0 1.6417e-03 8.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 55 1.0 3.4422e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 400 1.0 4.3082e-0112.1 1.68e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 247773 439060 0 0.00e+00 0 0.00e+00 100
VecAYPX 199 1.0 9.2046e-0130.5 8.35e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 57695 69274 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 201 1.0 2.0488e-01 8.9 4.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 130904 305411 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 201 1.0 8.3090e-01 4.9 0.00e+00 0.0 1.9e+05 6.0e+04 2.0e+00 0 0 32 27 0 0 0 84 59 0 0 0 1 4.48e-01 0 0.00e+00 0
VecScatterEnd 201 1.0 2.9748e+00 6.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DualSpaceSetUp 2 1.0 3.8129e-03 1.3 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 30 0 0 0.00e+00 0 0.00e+00 0
FESetUp 2 1.0 5.1432e-03 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCSetUp 1 1.0 1.3717e-05 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCApply 201 1.0 5.3934e-01 1.5 4.22e+08 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 1 0 0 0 49728 237264 0 0.00e+00 0 0.00e+00 100
--- Event Stage 1: PCSetUp
PCSetUp 1 1.0 2.7733e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 100 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
--- Event Stage 2: KSP Solve only
MatMult 400 1.0 9.4001e+00 1.2 1.07e+11 1.0 3.7e+05 6.1e+04 0.0e+00 2 55 62 54 0 65 91100100 0 721451 0 0 0.00e+00 0 0.00e+00 100
MatView 2 1.0 4.4729e-03 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 2 1.0 1.3945e+01 1.1 1.18e+11 1.0 3.7e+05 6.1e+04 1.2e+03 2 60 62 54 60 100100100100100 536128 2881308 0 0.00e+00 0 0.00e+00 100
SFPack 400 1.0 2.4445e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFUnpack 400 1.0 1.2255e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecTDot 802 1.0 2.7256e+00 1.9 3.36e+09 1.0 0.0e+00 0.0e+00 8.0e+02 0 2 0 0 40 14 3 0 0 67 78523 335526 0 0.00e+00 0 0.00e+00 100
VecNorm 402 1.0 1.9145e+00 3.7 1.69e+09 1.0 0.0e+00 0.0e+00 4.0e+02 0 1 0 0 20 6 1 0 0 33 56035 533339 0 0.00e+00 0 0.00e+00 100
VecCopy 4 1.0 6.3156e-03 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 4 1.0 3.8228e-0315.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 800 1.0 9.0587e-0111.1 3.36e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 6 3 0 0 0 235676 444654 0 0.00e+00 0 0.00e+00 100
VecAYPX 398 1.0 1.9393e+0029.6 1.67e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 6 1 0 0 0 54767 65448 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 402 1.0 3.5580e-01 6.2 8.43e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 1 0 0 0 150758 318605 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 400 1.0 1.3900e+0028.9 0.00e+00 0.0 3.7e+05 6.1e+04 0.0e+00 0 0 62 54 0 7 0100100 0 0 0 0 0.00e+00 0 0.00e+00 0
VecScatterEnd 400 1.0 5.8686e+00 6.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCApply 402 1.0 3.5612e-01 6.1 8.43e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 1 0 0 0 150622 318605 0 0.00e+00 0 0.00e+00 100
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 33 33 19008 0.
SNES 1 1 1540 0.
DMSNES 1 1 688 0.
Krylov Solver 1 1 1664 0.
DMKSP interface 1 1 656 0.
Matrix 76 76 1627827176 0.
Distributed Mesh 72 72 58971680 0.
DM Label 180 180 113760 0.
Quadrature 148 148 87616 0.
Mesh Transform 6 6 4536 0.
Index Set 833 833 4238868 0.
IS L to G Mapping 2 2 8590824 0.
Section 256 256 182272 0.
Star Forest Graph 179 179 195360 0.
Discrete System 121 121 116164 0.
Weak Form 122 122 75152 0.
GraphPartitioner 34 34 23392 0.
Vector 55 55 157137560 0.
Linear Space 5 5 3416 0.
Dual Space 26 26 24336 0.
FE Space 2 2 1576 0.
Viewer 2 1 840 0.
Preconditioner 1 1 872 0.
Field over DM 1 1 704 0.
--- Event Stage 1: PCSetUp
--- Event Stage 2: KSP Solve only
========================================================================================================================
Average time to get PetscTime(): 5.31e-08
Average time for MPI_Barrier(): 3.631e-06
Average time for zero size MPI_Send(): 1.02498e-05
#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 4,4,4
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 6
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 4,4,4
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi true
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --download-hypre-configure-arguments=--enable-unified-memory --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 PETSC_ARCH=arch-olcf-crusher
-----------------------------------------
Libraries compiled on 2022-01-22 03:06:22 on login2
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
-----------------------------------------
Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O3
Using Fortran compiler: ftn -fPIC
-----------------------------------------
Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/opt/rocm-4.5.0/include
-----------------------------------------
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/21.08.1.2/CRAY/9.0/x86_64/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lcrayacc_amdgpu -lopenacc -lmodules -lfi -lcraymath -lf -lu -lcsup -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
-----------------------------------------
#PETSc Option Table entries:
-benchmark_it 2
-dm_distribute
-dm_mat_type aijkokkos
-dm_plex_box_faces 4,4,4
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 6
-dm_vec_type kokkos
-dm_view
-ksp_converged_reason
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-ksp_view
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-options_left
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 4,4,4
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi true
#End of PETSc Option Table entries
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01