[petsc-dev] Kokkos/Crusher performance
Mark Adams
mfadams at lbl.gov
Tue Jan 25 11:29:06 CST 2022
> > VecPointwiseMult 201 1.0 1.0471e-02 1.1 3.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 235882 290088 0 0.00e+00 0 0.00e+00 100
> > VecScatterBegin 200 1.0 1.8458e-01 1.1 0.00e+00 0.0 1.1e+04 6.6e+04 1.0e+00 2 0 99 79 0 19 0100100 0 0 0 1 2.04e-04 0 0.00e+00 0
> > VecScatterEnd 200 1.0 1.9007e-02 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
> I'm curious how these change with problem size. (To what extent are we
> latency vs bandwidth limited?)
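
One rough way to approach the latency vs bandwidth question is to turn a single row of the attached -log_view output into a per-call time and an effective memory bandwidth. The sketch below does that for the VecAXPY row of the "KSP Solve only" stage in the first attached log (800 calls, 2.0465e-02 s max time, 4.19e+08 max flop). The function name is hypothetical, the local vector length is inferred from the log's own flop convention (2N flop per VecAXPY), and the three-scalars-per-entry traffic model is an assumption, not something PETSc reports. Time and flop are both maxima over ranks, so the result is only an estimate.

```python
# Rough per-call cost estimate for a streaming vector kernel, using one row
# of the attached -log_view output (VecAXPY, "KSP Solve only" stage).
BYTES_PER_SCALAR = 8  # sizeof(PetscScalar) as reported in the log


def axpy_estimates(count, max_time_s, max_flop):
    """Return (seconds per call, inferred local length, GB/s) for VecAXPY."""
    time_per_call = max_time_s / count
    # Flop convention from the log: VecAXPY on N real entries is 2N flop.
    local_n = max_flop / (2.0 * count)
    # Assumption: y = a*x + y reads x, reads y and writes y (3 scalars per entry).
    bytes_per_call = 3 * BYTES_PER_SCALAR * local_n
    gb_per_s = bytes_per_call / time_per_call / 1e9
    return time_per_call, local_n, gb_per_s


t, n, bw = axpy_estimates(count=800, max_time_s=2.0465e-02, max_flop=4.19e+08)
print(f"{t * 1e6:.1f} us/call, N ~ {n:.0f}, ~{bw:.0f} GB/s per rank")
```

Running the same calculation on the corresponding row of a smaller problem would show whether the time per call shrinks with N (bandwidth limited) or stays roughly flat (latency limited).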
-------------- next part --------------
DM Object: box 8 MPI processes
type: plex
box in 3 dimensions:
Number of 0-cells per rank: 35937 35937 35937 35937 35937 35937 35937 35937
Number of 1-cells per rank: 104544 104544 104544 104544 104544 104544 104544 104544
Number of 2-cells per rank: 101376 101376 101376 101376 101376 101376 101376 101376
Number of 3-cells per rank: 32768 32768 32768 32768 32768 32768 32768 32768
celltype: 4 strata with value/size (0 (35937), 1 (104544), 4 (101376), 7 (32768))
depth: 4 strata with value/size (0 (35937), 1 (104544), 2 (101376), 3 (32768))
marker: 1 strata with value/size (1 (12474))
Face Sets: 3 strata with value/size (1 (3969), 3 (3969), 6 (3969))
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=2048383, cols=2048383
total: nonzeros=127263527, allocated nonzeros=127263527
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=2048383, cols=2048383
total: nonzeros=127263527, allocated nonzeros=127263527
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=2048383, cols=2048383
total: nonzeros=127263527, allocated nonzeros=127263527
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
**************************************** ***********************************************************************************************************************
*** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------
# #
# WARNING!!! #
# #
# This code was compiled with GPU support and you've #
# created PETSc/GPU objects, but you intentionally used #
# -use_gpu_aware_mpi 0, such that PETSc had to copy data #
# from GPU to CPU for communication. To get meaningfull #
# timing results, please use GPU-aware MPI instead. #
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher002 with 8 processors, by adams Tue Jan 25 10:11:42 2022
Using Petsc Development GIT revision: v3.16.3-696-g46640c56cb GIT Date: 2022-01-25 09:20:51 -0500
Max Max/Min Avg Total
Time (sec): 6.781e+01 1.000 6.781e+01
Objects: 1.920e+03 1.028 1.877e+03
Flop: 2.402e+10 1.054 2.340e+10 1.872e+11
Flop/sec: 3.543e+08 1.054 3.451e+08 2.761e+09
MPI Messages: 4.778e+03 1.063 4.552e+03 3.642e+04
MPI Message Lengths: 1.120e+08 1.030 2.416e+04 8.799e+08
MPI Reductions: 1.988e+03 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 6.7422e+01 99.4% 7.4725e+10 39.9% 1.402e+04 38.5% 2.884e+04 45.9% 7.630e+02 38.4%
1: PCSetUp: 1.5260e-02 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
2: KSP Solve only: 3.7025e-01 0.5% 1.1247e+11 60.1% 2.240e+04 61.5% 2.123e+04 54.1% 1.206e+03 60.7%
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
CpuToGpu Count: total number of CPU to GPU copies per processor
CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor)
GpuToCpu Count: total number of GPU to CPU copies per processor
GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor)
GPU %F: percent flops on GPU in this event
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F
--- Event Stage 0: Main Stage
PetscBarrier 5 1.0 1.5071e-01 1.0 0.00e+00 0.0 7.8e+02 9.9e+02 1.8e+01 0 0 2 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSided 40 1.0 2.6699e-0122.5 0.00e+00 0.0 7.1e+02 4.0e+00 4.0e+01 0 0 2 0 2 0 0 5 0 5 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSidedF 6 1.0 2.4194e-0125.8 0.00e+00 0.0 1.5e+02 4.8e+05 6.0e+00 0 0 0 8 0 0 0 1 18 1 0 0 0 0.00e+00 0 0.00e+00 0
MatMult 12109 1.0 1.2425e-01 1.1 6.56e+09 1.1 1.1e+04 2.1e+04 2.0e+00 0 27 32 27 0 0 68 82 59 0 409706 663872 401 5.95e+01 400 5.94e+01 100
MatAssemblyBegin 43 1.0 2.5993e-01 2.6 0.00e+00 0.0 1.5e+02 4.8e+05 6.0e+00 0 0 0 8 0 0 0 1 18 1 0 0 0 0.00e+00 0 0.00e+00 0
MatAssemblyEnd 43 1.0 1.8228e-01 4.1 1.16e+06 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 0 0 0 0 0 1 25 0 0 0.00e+00 0 0.00e+00 0
MatZeroEntries 3 1.0 4.3496e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
MatView 1 1.0 7.9864e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSetUp 1 1.0 1.3476e-03 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 1 1.0 2.7099e-01 1.3 7.24e+09 1.1 1.1e+04 2.1e+04 6.0e+02 0 30 31 27 30 0 75 81 59 79 207524 452242 401 5.95e+01 400 5.94e+01 100
SNESSolve 1 1.0 2.9168e+01 1.0 8.41e+09 1.1 1.1e+04 2.4e+04 6.1e+02 43 35 31 31 31 43 88 82 68 80 2251 451845 405 6.17e+01 406 6.37e+01 86
SNESSetUp 1 1.0 6.0051e+00 1.0 0.00e+00 0.0 3.6e+02 2.3e+05 1.8e+01 9 0 1 10 1 9 0 3 21 2 0 0 0 0.00e+00 0 0.00e+00 0
SNESFunctionEval 2 1.0 1.9772e+00 1.1 7.96e+08 1.0 1.1e+02 1.5e+04 3.0e+00 3 3 0 0 0 3 9 1 0 0 3222 23096 6 4.32e+00 6 4.29e+00 0
SNESJacobianEval 2 1.0 5.8925e+01 1.0 1.52e+09 1.0 1.1e+02 6.5e+05 2.0e+00 87 6 0 8 0 87 16 1 18 0 206 0 0 0.00e+00 6 4.29e+00 0
DMCreateInterp 1 1.0 8.8719e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01 0 0 0 0 1 0 0 1 0 2 748 0 0 0.00e+00 0 0.00e+00 0
DMCreateMat 1 1.0 6.0045e+00 1.0 0.00e+00 0.0 3.6e+02 2.3e+05 1.8e+01 9 0 1 10 1 9 0 3 21 2 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Partition 1 1.0 6.9055e-04 1.1 0.00e+00 0.0 3.5e+01 1.1e+02 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Migration 1 1.0 2.4871e-01 1.0 0.00e+00 0.0 2.0e+02 8.2e+01 2.9e+01 0 0 1 0 1 0 0 1 0 4 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartSelf 1 1.0 8.2599e-0512.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblInv 1 1.0 3.1138e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblSF 1 1.0 2.4519e-04 3.8 0.00e+00 0.0 1.4e+01 5.6e+01 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartStrtSF 1 1.0 9.7758e-05 1.2 0.00e+00 0.0 7.0e+00 2.2e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPointSF 1 1.0 2.1684e-04 1.1 0.00e+00 0.0 1.4e+01 2.7e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterp 19 1.0 5.8227e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistribute 1 1.0 2.4963e-01 1.0 0.00e+00 0.0 2.5e+02 9.7e+01 3.7e+01 0 0 1 0 2 0 0 2 0 5 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistCones 1 1.0 1.0398e-04 1.0 0.00e+00 0.0 4.2e+01 1.4e+02 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistLabels 1 1.0 2.2918e-04 1.0 0.00e+00 0.0 1.0e+02 6.6e+01 2.4e+01 0 0 0 0 1 0 0 1 0 3 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistField 1 1.0 2.4822e-01 1.0 0.00e+00 0.0 4.9e+01 5.9e+01 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexStratify 33 1.0 5.5400e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexSymmetrize 33 1.0 1.2772e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPrealloc 1 1.0 5.9988e+00 1.0 0.00e+00 0.0 3.6e+02 2.3e+05 1.6e+01 9 0 1 10 1 9 0 3 21 2 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexResidualFE 2 1.0 1.5634e+00 1.0 7.87e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 3 0 0 0 2 8 0 0 0 4027 0 0 0.00e+00 0 0.00e+00 0
DMPlexJacobianFE 2 1.0 5.8811e+01 1.0 1.51e+09 1.0 7.6e+01 9.7e+05 2.0e+00 87 6 0 8 0 87 16 1 18 0 205 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterpFE 1 1.0 8.6671e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01 0 0 0 0 1 0 0 1 0 2 766 0 0 0.00e+00 0 0.00e+00 0
SFSetGraph 43 1.0 7.8551e-04 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFSetUp 34 1.0 5.0999e-02 1.2 0.00e+00 0.0 1.3e+03 2.4e+04 3.4e+01 0 0 3 3 2 0 0 9 7 4 0 0 0 0.00e+00 0 0.00e+00 0
SFBcastBegin 65 1.0 1.6080e-0147.2 0.00e+00 0.0 9.8e+02 1.4e+04 0.0e+00 0 0 3 2 0 0 0 7 3 0 0 0 1 2.44e-02 11 8.58e+00 0
SFBcastEnd 65 1.0 2.6412e-0137.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFReduceBegin 16 1.0 1.3437e-0172.5 5.24e+05 1.0 2.9e+02 1.0e+05 0.0e+00 0 0 1 3 0 0 0 2 7 0 30 0 2 4.10e+00 0 0.00e+00 100
SFReduceEnd 16 1.0 1.9717e-0125.8 2.50e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 4 1.95e-01 0 0.00e+00 100
SFFetchOpBegin 2 1.0 5.4462e-04134.6 0.00e+00 0.0 3.8e+01 2.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 0 0 0.00e+00 0 0.00e+00 0
SFFetchOpEnd 2 1.0 2.9438e-03 1.8 0.00e+00 0.0 3.8e+01 2.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 0 0 0.00e+00 0 0.00e+00 0
SFCreateEmbed 8 1.0 9.4267e-02141.4 0.00e+00 0.0 1.4e+02 8.5e+02 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFDistSection 9 1.0 3.2934e-03 2.0 0.00e+00 0.0 3.1e+02 6.5e+03 1.1e+01 0 0 1 0 1 0 0 2 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFSectionSF 16 1.0 2.0571e-02 1.7 0.00e+00 0.0 4.8e+02 2.0e+04 1.6e+01 0 0 1 1 1 0 0 3 2 2 0 0 0 0.00e+00 0 0.00e+00 0
SFRemoteOff 7 1.0 9.5613e-0245.9 0.00e+00 0.0 4.2e+02 1.6e+03 4.0e+00 0 0 1 0 0 0 0 3 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFPack 290 1.0 1.5873e-0160.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 2 9.87e-02 0 0.00e+00 0
SFUnpack 292 1.0 1.3361e-0164.1 5.49e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 31 0 0 0.00e+00 0 0.00e+00 100
VecTDot 401 1.0 3.4812e-02 1.4 2.10e+08 1.0 0.0e+00 0.0e+00 4.0e+02 0 1 0 0 20 0 2 0 0 53 47191 100573 0 0.00e+00 0 0.00e+00 100
VecNorm 201 1.0 8.0375e-02 5.6 1.05e+08 1.0 0.0e+00 0.0e+00 2.0e+02 0 0 0 0 10 0 1 0 0 26 10245 76979 0 0.00e+00 0 0.00e+00 100
VecCopy 2 1.0 1.0589e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 54 1.0 1.5410e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 400 1.0 1.1236e-02 1.1 2.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 145846 203600 0 0.00e+00 0 0.00e+00 100
VecAYPX 199 1.0 5.2408e-03 1.1 1.04e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 155561 226571 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 201 1.0 6.0364e-03 1.1 5.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 68207 98699 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 201 1.0 2.0069e-02 1.5 0.00e+00 0.0 1.1e+04 2.1e+04 2.0e+00 0 0 32 27 0 0 0 82 59 0 0 0 1 7.43e-02 400 5.94e+01 0
VecScatterEnd 201 1.0 1.1790e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 400 5.94e+01 0 0.00e+00 0
DualSpaceSetUp 2 1.0 2.6182e-03 1.0 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6 0 0 0.00e+00 0 0.00e+00 0
FESetUp 2 1.0 9.8481e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCSetUp 1 1.0 4.3290e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCApply 201 1.0 2.7593e-02 1.0 5.27e+07 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 1 0 0 0 14921 40922 0 0.00e+00 0 0.00e+00 100
--- Event Stage 1: PCSetUp
PCSetUp 1 1.0 1.6107e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 100 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
--- Event Stage 2: KSP Solve only
MatMult 400 1.0 2.1590e-01 1.1 1.31e+10 1.1 2.2e+04 2.1e+04 0.0e+00 0 54 62 54 0 56 91100100 0 471565 716907 800 1.19e+02 800 1.19e+02 100
MatView 2 1.0 8.4994e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 2 1.0 4.0167e-01 1.2 1.45e+10 1.1 2.2e+04 2.1e+04 1.2e+03 1 60 62 54 61 100100100100100 280015 509401 800 1.19e+02 800 1.19e+02 100
SFPack 400 1.0 1.2314e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFUnpack 400 1.0 7.6822e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecTDot 802 1.0 6.8937e-02 1.4 4.20e+08 1.0 0.0e+00 0.0e+00 8.0e+02 0 2 0 0 40 15 3 0 0 67 47661 96803 0 0.00e+00 0 0.00e+00 100
VecNorm 402 1.0 9.4944e-02 3.9 2.11e+08 1.0 0.0e+00 0.0e+00 4.0e+02 0 1 0 0 20 17 1 0 0 33 17346 98036 0 0.00e+00 0 0.00e+00 100
VecCopy 4 1.0 2.0016e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 4 1.0 1.6835e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 800 1.0 2.0465e-02 1.1 4.19e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 5 3 0 0 0 160145 232422 0 0.00e+00 0 0.00e+00 100
VecAYPX 398 1.0 1.0591e-02 1.1 2.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 3 1 0 0 0 153947 223967 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 402 1.0 1.1385e-02 1.1 1.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 3 1 0 0 0 72327 105141 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 400 1.0 3.7700e-02 1.6 0.00e+00 0.0 2.2e+04 2.1e+04 0.0e+00 0 0 62 54 0 8 0100100 0 0 0 0 0.00e+00 800 1.19e+02 0
VecScatterEnd 400 1.0 2.1792e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 5 0 0 0 0 0 0 800 1.19e+02 0 0.00e+00 0
PCApply 402 1.0 1.1477e-02 1.1 1.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 3 1 0 0 0 71747 105141 0 0.00e+00 0 0.00e+00 100
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 32 32 18432 0.
SNES 1 1 1540 0.
DMSNES 1 1 688 0.
Krylov Solver 1 1 1664 0.
DMKSP interface 1 1 656 0.
Matrix 75 75 195551600 0.
Distributed Mesh 70 70 7826872 0.
DM Label 172 172 108704 0.
Quadrature 148 148 87616 0.
Mesh Transform 5 5 3780 0.
Index Set 633 633 1440932 0.
IS L to G Mapping 2 2 1100416 0.
Section 249 249 177288 0.
Star Forest Graph 173 173 188592 0.
Discrete System 116 116 111364 0.
Weak Form 117 117 72072 0.
GraphPartitioner 33 33 22704 0.
Vector 54 54 19589336 0.
Linear Space 5 5 3416 0.
Dual Space 26 26 24336 0.
FE Space 2 2 1576 0.
Viewer 2 1 840 0.
Preconditioner 1 1 872 0.
Field over DM 1 1 704 0.
--- Event Stage 1: PCSetUp
--- Event Stage 2: KSP Solve only
Average time to get PetscTime(): 3.4e-08
Average time for MPI_Barrier(): 2.611e-06
Average time for zero size MPI_Send(): 1.07531e-05
#PETSc Option Table entries:
-benchmark_it 2
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 5
-dm_vec_type kokkos
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --COPTFLAGS="-g -O" --CXXOPTFLAGS="-g -O" --FOPTFLAGS=-g --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 --download-p4est=1 --with-zlib-dir=/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4 PETSC_ARCH=arch-olcf-crusher
Libraries compiled on 2022-01-25 14:29:13 on login2
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O
Using Fortran compiler: ftn -fPIC -g
Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/include -I/opt/rocm-4.5.0/include
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -L/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ -L/opt/cray/pe/libsci/ -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lz -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lmodules -lfi -lcraymath -lf -lu -lcsup 
-lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
# #
# WARNING!!! #
# #
# This code was compiled with GPU support and you've #
# created PETSc/GPU objects, but you intentionally used #
# -use_gpu_aware_mpi 0, such that PETSc had to copy data #
# from GPU to CPU for communication. To get meaningfull #
# timing results, please use GPU-aware MPI instead. #
#PETSc Option Table entries:
-benchmark_it 2
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 5
-dm_vec_type kokkos
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
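
As a sanity check on how to read the "Total Mflop/s" column in the log above, the figure for KSPSolve in the "KSP Solve only" stage can be recomputed from other numbers in the same log: the stage's total flop (1.1247e+11, of which KSPSolve accounts for 100%) and the KSPSolve max time (4.0167e-01 s). This is a minimal sketch with a hypothetical helper name. It uses a factor of 1e-6, which is what reproduces the reported value, although the legend prints the factor as "10e-6".

```python
def total_mflops(total_flop, max_time_s):
    """Total Mflop/s per the log legend: summed flop divided by max time."""
    return 1e-6 * total_flop / max_time_s


# Stage 2 ("KSP Solve only") total flop and the KSPSolve max time from the log.
rate = total_mflops(total_flop=1.1247e+11, max_time_s=4.0167e-01)
reported = 280015  # "Total Mflop/s" column of the KSPSolve row in stage 2
print(f"recomputed {rate:.0f} Mflop/s vs reported {reported}")
```

The small remaining difference is consistent with the inputs being rounded to a few significant digits in the printed summary.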
-------------- next part --------------
DM Object: box 8 MPI processes
type: plex
box in 3 dimensions:
Number of 0-cells per rank: 729 729 729 729 729 729 729 729
Number of 1-cells per rank: 1944 1944 1944 1944 1944 1944 1944 1944
Number of 2-cells per rank: 1728 1728 1728 1728 1728 1728 1728 1728
Number of 3-cells per rank: 512 512 512 512 512 512 512 512
celltype: 4 strata with value/size (0 (729), 1 (1944), 4 (1728), 7 (512))
depth: 4 strata with value/size (0 (729), 1 (1944), 2 (1728), 3 (512))
marker: 1 strata with value/size (1 (810))
Face Sets: 3 strata with value/size (1 (225), 3 (225), 6 (225))
Linear solve converged due to CONVERGED_RTOL iterations 60
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=29791, cols=29791
total: nonzeros=1685159, allocated nonzeros=1685159
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve converged due to CONVERGED_RTOL iterations 60
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=29791, cols=29791
total: nonzeros=1685159, allocated nonzeros=1685159
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve converged due to CONVERGED_RTOL iterations 60
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=29791, cols=29791
total: nonzeros=1685159, allocated nonzeros=1685159
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
**************************************** ***********************************************************************************************************************
*** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------
# #
# WARNING!!! #
# #
# This code was compiled with GPU support and you've #
# created PETSc/GPU objects, but you intentionally used #
# -use_gpu_aware_mpi 0, such that PETSc had to copy data #
# from GPU to CPU for communication. To get meaningfull #
# timing results, please use GPU-aware MPI instead. #
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher002 with 8 processors, by adams Tue Jan 25 10:10:23 2022
Using Petsc Development GIT revision: v3.16.3-696-g46640c56cb GIT Date: 2022-01-25 09:20:51 -0500
Max Max/Min Avg Total
Time (sec): 2.126e+00 1.000 2.126e+00
Objects: 1.780e+03 1.031 1.737e+03
Flop: 1.350e+08 1.184 1.240e+08 9.923e+08
Flop/sec: 6.348e+07 1.184 5.835e+07 4.668e+08
MPI Messages: 1.782e+03 1.170 1.574e+03 1.260e+04
MPI Message Lengths: 3.177e+06 1.120 1.871e+03 2.357e+07
MPI Reductions: 7.190e+02 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 2.0779e+00 97.8% 5.4136e+08 54.6% 5.875e+03 46.6% 2.457e+03 61.3% 3.360e+02 46.7%
1: PCSetUp: 1.6083e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
2: KSP Solve only: 4.7610e-02 2.2% 4.5091e+08 45.4% 6.720e+03 53.4% 1.359e+03 38.7% 3.640e+02 50.6%
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
CpuToGpu Count: total number of CPU to GPU copies per processor
CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor)
GpuToCpu Count: total number of GPU to CPU copies per processor
GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor)
GPU %F: percent flops on GPU in this event
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F
--- Event Stage 0: Main Stage
PetscBarrier 3 1.0 4.0715e-03 1.0 0.00e+00 0.0 4.8e+02 1.1e+02 1.2e+01 0 0 4 0 2 0 0 8 0 4 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSided 36 1.0 1.7726e-01 6.7 0.00e+00 0.0 6.3e+02 4.0e+00 3.6e+01 4 0 5 0 5 5 0 11 0 11 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSidedF 6 1.0 1.7505e-01 7.2 0.00e+00 0.0 1.5e+02 2.9e+04 6.0e+00 4 0 1 18 1 4 0 3 30 2 0 0 0 0.00e+00 0 0.00e+00 0
MatMult 737 1.0 1.6693e-02 1.1 2.86e+07 1.3 3.6e+03 1.3e+03 2.0e+00 1 20 29 20 0 1 37 62 32 1 12123 19363 121 1.15e+00 120 1.14e+00 100
MatAssemblyBegin 43 1.0 1.7518e-01 6.3 0.00e+00 0.0 1.5e+02 2.9e+04 6.0e+00 5 0 1 18 1 5 0 3 30 2 0 0 0 0.00e+00 0 0.00e+00 0
MatAssemblyEnd 43 1.0 4.8427e-02 1.3 6.81e+04 0.0 0.0e+00 0.0e+00 9.0e+00 2 0 0 0 1 2 0 0 0 3 5 0 0 0.00e+00 0 0.00e+00 0
MatZeroEntries 3 1.0 9.8439e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
MatView 1 1.0 1.1108e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSetUp 1 1.0 1.7866e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 1 1.0 4.3944e-02 1.1 3.18e+07 1.3 3.5e+03 1.3e+03 1.8e+02 2 23 28 20 26 2 42 59 32 55 5131 8234 121 1.15e+00 120 1.14e+00 100
SNESSolve 1 1.0 8.3242e-01 1.0 5.16e+07 1.2 3.6e+03 1.9e+03 1.9e+02 39 39 29 29 26 40 71 61 47 56 461 8211 125 1.18e+00 126 1.21e+00 59
SNESSetUp 1 1.0 1.1244e-01 1.0 0.00e+00 0.0 3.6e+02 1.4e+04 1.8e+01 5 0 3 21 3 5 0 6 34 5 0 0 0 0.00e+00 0 0.00e+00 0
SNESFunctionEval 2 1.0 3.5627e-01 1.6 1.39e+07 1.0 1.1e+02 9.4e+02 3.0e+00 11 11 1 0 0 11 21 2 1 1 312 550 6 7.30e-02 6 7.15e-02 0
SNESJacobianEval 2 1.0 1.1017e+00 1.2 2.47e+07 1.0 1.1e+02 3.8e+04 2.0e+00 51 20 1 18 0 52 36 2 30 1 179 0 0 0.00e+00 6 7.15e-02 0
DMCreateInterp 1 1.0 2.6462e-02 1.0 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01 1 0 1 0 2 1 0 1 1 5 25 0 0 0.00e+00 0 0.00e+00 0
DMCreateMat 1 1.0 1.1233e-01 1.0 0.00e+00 0.0 3.6e+02 1.4e+04 1.8e+01 5 0 3 21 3 5 0 6 34 5 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Partition 1 1.0 4.7895e-03 1.0 0.00e+00 0.0 3.5e+01 1.1e+02 8.0e+00 0 0 0 0 1 0 0 1 0 2 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Migration 1 1.0 5.4815e-01 1.0 0.00e+00 0.0 2.0e+02 8.2e+01 2.9e+01 26 0 2 0 4 26 0 3 0 9 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartSelf 1 1.0 9.8660e-05 13.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblInv 1 1.0 3.0135e-04 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblSF 1 1.0 1.1005e-03 1.6 0.00e+00 0.0 1.4e+01 5.6e+01 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartStrtSF 1 1.0 3.3112e-03 1.8 0.00e+00 0.0 7.0e+00 2.2e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPointSF 1 1.0 2.1855e-04 1.1 0.00e+00 0.0 1.4e+01 2.7e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterp 19 1.0 6.7416e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistribute 1 1.0 5.5320e-01 1.0 0.00e+00 0.0 2.5e+02 9.7e+01 3.7e+01 26 0 2 0 5 27 0 4 0 11 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistCones 1 1.0 1.5161e-04 1.1 0.00e+00 0.0 4.2e+01 1.4e+02 2.0e+00 0 0 0 0 0 0 0 1 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistLabels 1 1.0 2.9833e-04 1.0 0.00e+00 0.0 1.0e+02 6.6e+01 2.4e+01 0 0 1 0 3 0 0 2 0 7 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistField 1 1.0 5.4747e-01 1.0 0.00e+00 0.0 4.9e+01 5.9e+01 2.0e+00 26 0 0 0 0 26 0 1 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexStratify 31 1.0 1.3614e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexSymmetrize 31 1.0 2.2603e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPrealloc 1 1.0 1.1213e-01 1.0 0.00e+00 0.0 3.6e+02 1.4e+04 1.6e+01 5 0 3 21 2 5 0 6 34 5 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexResidualFE 2 1.0 2.4719e-02 1.0 1.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 10 0 0 0 1 19 0 0 0 4098 0 0 0.00e+00 0 0.00e+00 0
DMPlexJacobianFE 2 1.0 1.0950e+00 1.2 2.35e+07 1.0 7.6e+01 5.6e+04 2.0e+00 48 19 1 18 0 49 35 1 29 1 171 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterpFE 1 1.0 2.6414e-02 1.0 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01 1 0 1 0 2 1 0 1 1 5 25 0 0 0.00e+00 0 0.00e+00 0
SFSetGraph 37 1.0 4.4644e-05 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFSetUp 30 1.0 4.4996e-03 1.2 0.00e+00 0.0 1.1e+03 1.6e+03 3.0e+01 0 0 9 8 4 0 0 19 12 9 0 0 0 0.00e+00 0 0.00e+00 0
SFBcastBegin 59 1.0 1.9148e-01 85.2 0.00e+00 0.0 8.6e+02 1.0e+03 0.0e+00 8 0 7 4 0 8 0 15 6 0 0 0 1 1.49e-03 11 1.43e-01 0
SFBcastEnd 59 1.0 1.9445e-01 69.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFReduceBegin 14 1.0 1.3587e-01 690.3 8.19e+03 1.2 2.5e+02 7.0e+03 0.0e+00 1 0 2 8 0 1 0 4 12 0 0 0 2 5.96e-02 0 0.00e+00 100
SFReduceEnd 14 1.0 1.2185e-02 42.5 1.63e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 4 1.19e-02 0 0.00e+00 100
SFFetchOpBegin 2 1.0 4.5447e-05 15.3 0.00e+00 0.0 3.8e+01 1.5e+04 0.0e+00 0 0 0 2 0 0 0 1 4 0 0 0 0 0.00e+00 0 0.00e+00 0
SFFetchOpEnd 2 1.0 2.4357e-04 1.4 0.00e+00 0.0 3.8e+01 1.5e+04 0.0e+00 0 0 0 2 0 0 0 1 4 0 0 0 0 0.00e+00 0 0.00e+00 0
SFCreateEmbed 6 1.0 5.6716e-03 47.7 0.00e+00 0.0 1.0e+02 8.0e+01 0.0e+00 0 0 1 0 0 0 0 2 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFDistSection 9 1.0 5.1380e-04 1.2 0.00e+00 0.0 3.1e+02 4.5e+02 1.1e+01 0 0 2 1 2 0 0 5 1 3 0 0 0 0.00e+00 0 0.00e+00 0
SFSectionSF 14 1.0 1.3564e-03 1.5 0.00e+00 0.0 4.0e+02 1.4e+03 1.4e+01 0 0 3 2 2 0 0 7 4 4 0 0 0 0.00e+00 0 0.00e+00 0
SFRemoteOff 5 1.0 5.8056e-03 21.8 0.00e+00 0.0 2.7e+02 1.7e+02 2.0e+00 0 0 2 0 0 0 0 5 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFPack 142 1.0 1.8780e-01 581.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 8 0 0 0 0 0 0 2 6.24e-03 0 0.00e+00 0
SFUnpack 144 1.0 1.3578e-01 1442.7 9.83e+03 1.5 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 100
VecTDot 120 1.0 7.3964e-03 1.4 9.83e+05 1.2 0.0e+00 0.0e+00 1.2e+02 0 1 0 0 17 0 1 0 0 36 967 1955 0 0.00e+00 0 0.00e+00 100
VecNorm 61 1.0 8.1882e-03 2.4 5.00e+05 1.2 0.0e+00 0.0e+00 6.1e+01 0 0 0 0 8 0 1 0 0 18 444 1399 0 0.00e+00 0 0.00e+00 100
VecCopy 2 1.0 9.8118e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 52 1.0 1.2121e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 121 1.0 3.3206e-03 1.1 9.91e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2171 3088 0 0.00e+00 0 0.00e+00 100
VecAYPX 59 1.0 1.1340e-03 1.1 4.83e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 3100 5302 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 60 1.0 1.2228e-03 1.1 2.46e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1462 2562 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 61 1.0 3.5653e-03 1.6 0.00e+00 0.0 3.6e+03 1.3e+03 2.0e+00 0 0 29 20 0 0 0 62 32 1 0 0 1 4.76e-03 120 1.14e+00 0
VecScatterEnd 61 1.0 1.4893e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 120 1.14e+00 0 0.00e+00 0
DualSpaceSetUp 2 1.0 2.8444e-03 1.0 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5 0 0 0.00e+00 0 0.00e+00 0
FESetUp 2 1.0 9.4143e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCSetUp 1 1.0 4.3280e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCApply 60 1.0 8.7004e-03 1.1 2.46e+05 1.2 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 1 205 224 0 0.00e+00 0 0.00e+00 100
--- Event Stage 1: PCSetUp
PCSetUp 1 1.0 1.7049e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 98 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
--- Event Stage 2: KSP Solve only
MatMult 120 1.0 1.9656e-02 1.2 5.72e+07 1.3 6.7e+03 1.4e+03 0.0e+00 1 41 53 39 0 38 90 100 100 0 20576 43374 240 2.28e+00 240 2.28e+00 100
MatView 2 1.0 6.7079e-05 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 2 1.0 4.9491e-02 1.1 6.36e+07 1.3 6.7e+03 1.4e+03 3.6e+02 2 45 53 39 50 100 100 100 100 99 9111 17921 240 2.28e+00 240 2.28e+00 100
SFPack 120 1.0 3.4330e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFUnpack 120 1.0 1.4029e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecTDot 240 1.0 1.3717e-02 1.4 1.97e+06 1.2 0.0e+00 0.0e+00 2.4e+02 1 1 0 0 33 25 3 0 0 66 1042 2043 0 0.00e+00 0 0.00e+00 100
VecNorm 122 1.0 1.0250e-02 2.0 9.99e+05 1.2 0.0e+00 0.0e+00 1.2e+02 0 1 0 0 17 18 2 0 0 34 709 2009 0 0.00e+00 0 0.00e+00 100
VecCopy 4 1.0 1.4585e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 4 1.0 1.0522e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 240 1.0 4.5384e-03 1.1 1.97e+06 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 9 3 0 0 0 3151 5457 0 0.00e+00 0 0.00e+00 100
VecAYPX 118 1.0 2.2194e-03 1.1 9.67e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 5 2 0 0 0 3168 5373 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 120 1.0 2.3361e-03 1.1 4.92e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 5 1 0 0 0 1530 2644 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 120 1.0 5.8364e-03 1.6 0.00e+00 0.0 6.7e+03 1.4e+03 0.0e+00 0 0 53 39 0 11 0 100 100 0 0 0 0 0.00e+00 240 2.28e+00 0
VecScatterEnd 120 1.0 2.6910e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 4 0 0 0 0 0 0 240 2.28e+00 0 0.00e+00 0
PCApply 120 1.0 2.3607e-03 1.1 4.92e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 5 1 0 0 0 1514 2644 0 0.00e+00 0 0.00e+00 100
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 30 30 17280 0.
SNES 1 1 1540 0.
DMSNES 1 1 688 0.
Krylov Solver 1 1 1664 0.
DMKSP interface 1 1 656 0.
Matrix 73 73 2596672 0.
Distributed Mesh 66 66 479912 0.
DM Label 156 156 98592 0.
Quadrature 148 148 87616 0.
Mesh Transform 3 3 2268 0.
Index Set 569 569 564740 0.
IS L to G Mapping 2 2 21568 0.
Section 235 235 167320 0.
Star Forest Graph 161 161 175056 0.
Discrete System 106 106 101764 0.
Weak Form 107 107 65912 0.
GraphPartitioner 31 31 21328 0.
Vector 52 52 385592 0.
Linear Space 5 5 3416 0.
Dual Space 26 26 24336 0.
FE Space 2 2 1576 0.
Viewer 2 1 840 0.
Preconditioner 1 1 872 0.
Field over DM 1 1 704 0.
--- Event Stage 1: PCSetUp
--- Event Stage 2: KSP Solve only
Average time to get PetscTime(): 3.51e-08
Average time for MPI_Barrier(): 2.597e-06
Average time for zero size MPI_Send(): 1.01545e-05
#PETSc Option Table entries:
-benchmark_it 2
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 3
-dm_vec_type kokkos
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --COPTFLAGS="-g -O" --CXXOPTFLAGS="-g -O" --FOPTFLAGS=-g --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 --download-p4est=1 --with-zlib-dir=/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4 PETSC_ARCH=arch-olcf-crusher
Libraries compiled on 2022-01-25 14:29:13 on login2
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O
Using Fortran compiler: ftn -fPIC -g
Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/include -I/opt/rocm-4.5.0/include
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -L/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ -L/opt/cray/pe/libsci/ -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lz -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lmodules -lfi -lcraymath -lf -lu -lcsup 
-lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
# #
# WARNING!!! #
# #
# This code was compiled with GPU support and you've #
# created PETSc/GPU objects, but you intentionally used #
# -use_gpu_aware_mpi 0, such that PETSc had to copy data #
# from GPU to CPU for communication. To get meaningfull #
# timing results, please use GPU-aware MPI instead. #
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
-------------- next part --------------
DM Object: box 8 MPI processes
type: plex
box in 3 dimensions:
Number of 0-cells per rank: 274625 274625 274625 274625 274625 274625 274625 274625
Number of 1-cells per rank: 811200 811200 811200 811200 811200 811200 811200 811200
Number of 2-cells per rank: 798720 798720 798720 798720 798720 798720 798720 798720
Number of 3-cells per rank: 262144 262144 262144 262144 262144 262144 262144 262144
celltype: 4 strata with value/size (0 (274625), 1 (811200), 4 (798720), 7 (262144))
depth: 4 strata with value/size (0 (274625), 1 (811200), 2 (798720), 3 (262144))
marker: 1 strata with value/size (1 (49530))
Face Sets: 3 strata with value/size (1 (16129), 3 (16129), 6 (16129))
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=16581375, cols=16581375
total: nonzeros=1045678375, allocated nonzeros=1045678375
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=16581375, cols=16581375
total: nonzeros=1045678375, allocated nonzeros=1045678375
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve did not converge due to DIVERGED_ITS iterations 200
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=16581375, cols=16581375
total: nonzeros=1045678375, allocated nonzeros=1045678375
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
**************************************** ***********************************************************************************************************************
*** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------
# #
# WARNING!!! #
# #
# This code was compiled with GPU support and you've #
# created PETSc/GPU objects, but you intentionally used #
# -use_gpu_aware_mpi 0, such that PETSc had to copy data #
# from GPU to CPU for communication. To get meaningfull #
# timing results, please use GPU-aware MPI instead. #
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher002 with 8 processors, by adams Tue Jan 25 10:20:42 2022
Using Petsc Development GIT revision: v3.16.3-696-g46640c56cb GIT Date: 2022-01-25 09:20:51 -0500
Max Max/Min Avg Total
Time (sec): 5.394e+02 1.000 5.394e+02
Objects: 1.990e+03 1.027 1.947e+03
Flop: 1.940e+11 1.027 1.915e+11 1.532e+12
Flop/sec: 3.596e+08 1.027 3.549e+08 2.839e+09
MPI Messages: 4.806e+03 1.066 4.571e+03 3.657e+04
MPI Message Lengths: 4.434e+08 1.015 9.611e+04 3.515e+09
MPI Reductions: 1.991e+03 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 5.3766e+02 99.7% 6.0875e+11 39.7% 1.417e+04 38.7% 1.143e+05 46.1% 7.660e+02 38.5%
1: PCSetUp: 1.1813e-01 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
2: KSP Solve only: 1.6643e+00 0.3% 9.2287e+11 60.3% 2.240e+04 61.3% 8.459e+04 53.9% 1.206e+03 60.6%
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
CpuToGpu Count: total number of CPU to GPU copies per processor
CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor)
GpuToCpu Count: total number of GPU to CPU copies per processor
GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor)
GPU %F: percent flops on GPU in this event
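As a sanity check on the Mflop/s convention in the legend above, the following minimal Python sketch recomputes the global rate from the totals printed in this summary (total flop 1.532e+12, max time 5.394e+02 s). It assumes the legend's "10e-6" stands for a factor of 10^-6; the variable names are illustrative and are not part of PETSc.

```python
# Totals copied from the summary header of this run.
total_flop = 1.532e12   # "Flop" row, Total column
max_time_s = 5.394e2    # "Time (sec)" row, Max column

# Assumption: the legend's "10e-6" means a factor of 1e-6 (flop/s -> Mflop/s).
total_mflops = 1e-6 * total_flop / max_time_s

print(f"Total Mflop/s ~ {total_mflops:.0f}")
```

The result, about 2840 Mflop/s, is close to the 2.839e+09 flop/s reported in the Flop/sec row, which suggests the factor is 1e-6 rather than 1e-5.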
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F
--- Event Stage 0: Main Stage
PetscBarrier 6 1.0 1.1691e+00 1.0 0.00e+00 0.0 9.3e+02 3.2e+03 2.1e+01 0 0 3 0 1 0 0 7 0 3 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSided 42 1.0 3.1965e+00 14.8 0.00e+00 0.0 7.5e+02 4.0e+00 4.2e+01 0 0 2 0 2 0 0 5 0 5 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSidedF 6 1.0 3.1399e+00 18.9 0.00e+00 0.0 1.5e+02 2.0e+06 6.0e+00 0 0 0 8 0 0 0 1 18 1 0 0 0 0.00e+00 0 0.00e+00 0
MatMult 48589 1.0 7.2097e-01 1.0 5.31e+10 1.0 1.1e+04 8.3e+04 2.0e+00 0 27 31 27 0 0 69 81 59 0 580168 780320 401 2.37e+02 400 2.37e+02 100
MatAssemblyBegin 43 1.0 3.2614e+00 11.1 0.00e+00 0.0 1.5e+02 2.0e+06 6.0e+00 0 0 0 8 0 0 0 1 18 1 0 0 0 0.00e+00 0 0.00e+00 0
MatAssemblyEnd 43 1.0 7.7601e-01 3.4 4.67e+06 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 0 0 0 0 0 1 24 0 0 0.00e+00 0 0.00e+00 0
MatZeroEntries 3 1.0 2.2877e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
MatView 1 1.0 9.9070e-05 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSetUp 1 1.0 4.5815e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 1 1.0 1.2261e+00 1.3 5.85e+10 1.0 1.1e+04 8.4e+04 6.0e+02 0 30 31 27 30 0 76 80 59 79 376334 727165 401 2.37e+02 400 2.37e+02 100
SNESSolve 1 1.0 2.3154e+02 1.0 6.79e+10 1.0 1.1e+04 9.6e+04 6.1e+02 43 35 31 31 31 43 88 81 68 80 2317 726995 405 2.54e+02 406 2.71e+02 86
SNESSetUp 1 1.0 4.9846e+01 1.0 0.00e+00 0.0 3.6e+02 9.4e+05 1.8e+01 9 0 1 10 1 9 0 3 21 2 0 0 0 0.00e+00 0 0.00e+00 0
SNESFunctionEval 2 1.0 1.3484e+01 1.0 6.33e+09 1.0 1.1e+02 6.2e+04 3.0e+00 2 3 0 0 0 2 8 1 0 0 3756 123291 6 3.40e+01 6 3.39e+01 0
SNESJacobianEval 2 1.0 4.7172e+02 1.0 1.21e+10 1.0 1.1e+02 2.6e+06 2.0e+00 87 6 0 9 0 88 16 1 19 0 205 0 0 0.00e+00 6 3.39e+01 0
DMCreateInterp 1 1.0 9.5002e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01 0 0 0 0 1 0 0 1 0 2 698 0 0 0.00e+00 0 0.00e+00 0
DMCreateMat 1 1.0 4.9843e+01 1.0 0.00e+00 0.0 3.6e+02 9.4e+05 1.8e+01 9 0 1 10 1 9 0 3 21 2 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Partition 1 1.0 6.8415e-04 1.1 0.00e+00 0.0 3.5e+01 1.1e+02 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Migration 1 1.0 2.5632e-01 1.0 0.00e+00 0.0 2.0e+02 8.2e+01 2.9e+01 0 0 1 0 1 0 0 1 0 4 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartSelf 1 1.0 8.3180e-05 13.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblInv 1 1.0 3.1569e-04 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblSF 1 1.0 2.4072e-04 4.1 0.00e+00 0.0 1.4e+01 5.6e+01 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartStrtSF 1 1.0 1.0937e-04 1.1 0.00e+00 0.0 7.0e+00 2.2e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPointSF 1 1.0 2.2380e-04 1.1 0.00e+00 0.0 1.4e+01 2.7e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterp 19 1.0 5.7251e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistribute 1 1.0 2.5723e-01 1.0 0.00e+00 0.0 2.5e+02 9.7e+01 3.7e+01 0 0 1 0 2 0 0 2 0 5 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistCones 1 1.0 9.8228e-05 1.0 0.00e+00 0.0 4.2e+01 1.4e+02 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistLabels 1 1.0 2.5452e-04 1.0 0.00e+00 0.0 1.0e+02 6.6e+01 2.4e+01 0 0 0 0 1 0 0 1 0 3 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistField 1 1.0 2.5579e-01 1.0 0.00e+00 0.0 4.9e+01 5.9e+01 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexStratify 34 1.0 3.4337e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexSymmetrize 34 1.0 9.6014e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPrealloc 1 1.0 4.9804e+01 1.0 0.00e+00 0.0 3.6e+02 9.4e+05 1.6e+01 9 0 1 10 1 9 0 3 21 2 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexResidualFE 2 1.0 1.2648e+01 1.0 6.29e+09 1.0 0.0e+00 0.0e+00 0.0e+00 2 3 0 0 0 2 8 0 0 0 3980 0 0 0.00e+00 0 0.00e+00 0
DMPlexJacobianFE 2 1.0 4.7124e+02 1.0 1.21e+10 1.0 7.6e+01 3.9e+06 2.0e+00 87 6 0 8 0 88 16 1 18 0 205 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterpFE 1 1.0 9.2564e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01 0 0 0 0 1 0 0 1 0 2 717 0 0 0.00e+00 0 0.00e+00 0
SFSetGraph 46 1.0 4.0703e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFSetUp 36 1.0 2.4184e-01 1.2 0.00e+00 0.0 1.3e+03 9.1e+04 3.6e+01 0 0 4 3 2 0 0 9 7 5 0 0 0 0.00e+00 0 0.00e+00 0
SFBcastBegin 68 1.0 1.7661e-01 11.8 0.00e+00 0.0 1.0e+03 5.4e+04 0.0e+00 0 0 3 2 0 0 0 7 3 0 0 0 1 9.79e-02 11 6.79e+01 0
SFBcastEnd 68 1.0 5.8333e-01 20.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFReduceBegin 17 1.0 1.4363e-01 20.0 4.19e+06 1.0 3.1e+02 3.9e+05 0.0e+00 0 0 1 3 0 0 0 2 7 0 231 0 2 3.32e+01 0 0.00e+00 100
SFReduceEnd 17 1.0 1.0363e+00 6.1 9.91e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 4 7.83e-01 0 0.00e+00 100
SFFetchOpBegin 2 1.0 2.0131e-03 151.1 0.00e+00 0.0 3.8e+01 1.0e+06 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 0 0 0.00e+00 0 0.00e+00 0
SFFetchOpEnd 2 1.0 1.2970e-02 1.7 0.00e+00 0.0 3.8e+01 1.0e+06 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 0 0 0.00e+00 0 0.00e+00 0
SFCreateEmbed 9 1.0 3.6514e-01 101.5 0.00e+00 0.0 1.6e+02 2.9e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFDistSection 9 1.0 1.5104e-02 3.0 0.00e+00 0.0 3.1e+02 2.6e+04 1.1e+01 0 0 1 0 1 0 0 2 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFSectionSF 17 1.0 7.6311e-02 2.0 0.00e+00 0.0 5.2e+02 7.6e+04 1.7e+01 0 0 1 1 1 0 0 4 2 2 0 0 0 0.00e+00 0 0.00e+00 0
SFRemoteOff 8 1.0 3.7036e-01 38.1 0.00e+00 0.0 4.9e+02 5.3e+03 5.0e+00 0 0 1 0 0 0 0 3 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFPack 294 1.0 1.6329e-01 16.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 2 3.94e-01 0 0.00e+00 0
SFUnpack 296 1.0 1.4535e-01 11.0 4.29e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 231 0 0 0.00e+00 0 0.00e+00 100
VecTDot 401 1.0 7.2354e-02 2.0 1.68e+09 1.0 0.0e+00 0.0e+00 4.0e+02 0 1 0 0 20 0 2 0 0 52 183796 456574 0 0.00e+00 0 0.00e+00 100
VecNorm 201 1.0 2.9087e-01 15.8 8.43e+08 1.0 0.0e+00 0.0e+00 2.0e+02 0 0 0 0 10 0 1 0 0 26 22917 448641 0 0.00e+00 0 0.00e+00 100
VecCopy 2 1.0 1.5781e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 55 1.0 5.0356e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 400 1.0 2.6926e-02 1.0 1.68e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 492649 564421 0 0.00e+00 0 0.00e+00 100
VecAYPX 199 1.0 1.3526e-02 1.1 8.35e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 487921 563124 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 201 1.0 1.4574e-02 1.1 4.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 228688 263803 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 201 1.0 4.6771e-02 1.9 0.00e+00 0.0 1.1e+04 8.3e+04 2.0e+00 0 0 31 27 0 0 0 81 59 0 0 0 1 2.96e-01 400 2.37e+02 0
VecScatterEnd 201 1.0 3.5618e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 400 2.37e+02 0 0.00e+00 0
DualSpaceSetUp 2 1.0 2.5830e-03 1.0 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6 0 0 0.00e+00 0 0.00e+00 0
FESetUp 2 1.0 1.0546e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCSetUp 1 1.0 4.4890e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCApply 201 1.0 1.4387e-01 1.0 4.22e+08 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 1 0 0 0 23166 164721 0 0.00e+00 0 0.00e+00 100
--- Event Stage 1: PCSetUp
PCSetUp 1 1.0 1.1929e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 100 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
--- Event Stage 2: KSP Solve only
MatMult 400 1.0 1.2937e+00 1.0 1.06e+11 1.0 2.2e+04 8.5e+04 0.0e+00 0 55 61 54 0 76 91 100 100 0 646634 788762 800 4.74e+02 800 4.74e+02 100
MatView 2 1.0 8.6066e-05 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 2 1.0 1.7967e+00 1.2 1.17e+11 1.0 2.2e+04 8.5e+04 1.2e+03 0 60 61 54 60 100 100 100 100 100 513632 747377 800 4.74e+02 800 4.74e+02 100
SFPack 400 1.0 1.1277e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFUnpack 400 1.0 6.3893e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecTDot 802 1.0 1.2564e-01 1.6 3.36e+09 1.0 0.0e+00 0.0e+00 8.0e+02 0 2 0 0 40 6 3 0 0 67 211684 463965 0 0.00e+00 0 0.00e+00 100
VecNorm 402 1.0 3.2115e-01 8.5 1.69e+09 1.0 0.0e+00 0.0e+00 4.0e+02 0 1 0 0 20 11 1 0 0 33 41511 486763 0 0.00e+00 0 0.00e+00 100
VecCopy 4 1.0 2.8629e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 4 1.0 1.9327e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 800 1.0 5.3435e-02 1.1 3.36e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 3 3 0 0 0 496498 571546 0 0.00e+00 0 0.00e+00 100
VecAYPX 398 1.0 2.6610e-02 1.1 1.67e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 2 1 0 0 0 496007 563024 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 402 1.0 2.7997e-02 1.1 8.43e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 1 0 0 0 238087 276194 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 400 1.0 8.6966e-02 2.1 0.00e+00 0.0 2.2e+04 8.5e+04 0.0e+00 0 0 61 54 0 4 0 100 100 0 0 0 0 0.00e+00 800 4.74e+02 0
VecScatterEnd 400 1.0 6.0717e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 3 0 0 0 0 0 0 800 4.74e+02 0 0.00e+00 0
PCApply 402 1.0 2.8082e-02 1.1 8.43e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 1 0 0 0 237365 276194 0 0.00e+00 0 0.00e+00 100
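On the question of how these timings change with problem size, a rough per-call comparison of MatMult in the "KSP Solve only" stage can be sketched from the two logs above (-dm_refine 3 and -dm_refine 6). The numbers are copied from the MatMult rows; the dictionary layout and names are illustrative only. This is plain arithmetic on the logged values, not a PETSc API.

```python
# MatMult rows from "Event Stage 2: KSP Solve only" in the two logs.
# count = calls, time_s = max time over ranks, flop = max flop over ranks.
runs = {
    "refine3": {"count": 120, "time_s": 1.9656e-02, "flop": 5.72e07},
    "refine6": {"count": 400, "time_s": 1.2937e+00, "flop": 1.06e11},
}

# Normalize by call count so the two runs can be compared directly.
per_call = {
    name: {"time_s": r["time_s"] / r["count"], "flop": r["flop"] / r["count"]}
    for name, r in runs.items()
}

time_ratio = per_call["refine6"]["time_s"] / per_call["refine3"]["time_s"]
flop_ratio = per_call["refine6"]["flop"] / per_call["refine3"]["flop"]

for name, v in per_call.items():
    print(f"{name}: {v['time_s']:.3e} s/call, {v['flop']:.3e} flop/call")
print(f"time ratio ~ {time_ratio:.1f}x, flop ratio ~ {flop_ratio:.0f}x")
```

Per-call work grows by a factor of several hundred while per-call time grows by only about 20x, which is consistent with the small case being dominated by fixed overheads (latency, kernel launch, host/device copies) rather than by bandwidth. Both runs used -use_gpu_aware_mpi false, so the absolute times include GPU-to-CPU copies.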
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 33 33 19008 0.
SNES 1 1 1540 0.
DMSNES 1 1 688 0.
Krylov Solver 1 1 1664 0.
DMKSP interface 1 1 656 0.
Matrix 76 76 1627827176 0.
Distributed Mesh 72 72 58958528 0.
DM Label 180 180 113760 0.
Quadrature 148 148 87616 0.
Mesh Transform 6 6 4536 0.
Index Set 665 665 4081364 0.
IS L to G Mapping 2 2 8588672 0.
Section 256 256 182272 0.
Star Forest Graph 179 179 195360 0.
Discrete System 121 121 116164 0.
Weak Form 122 122 75152 0.
GraphPartitioner 34 34 23392 0.
Vector 55 55 157135208 0.
Linear Space 5 5 3416 0.
Dual Space 26 26 24336 0.
FE Space 2 2 1576 0.
Viewer 2 1 840 0.
Preconditioner 1 1 872 0.
Field over DM 1 1 704 0.
--- Event Stage 1: PCSetUp
--- Event Stage 2: KSP Solve only
Average time to get PetscTime(): 3.51e-08
Average time for MPI_Barrier(): 2.5748e-06
Average time for zero size MPI_Send(): 1.00542e-05
#PETSc Option Table entries:
-benchmark_it 2
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 6
-dm_vec_type kokkos
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --COPTFLAGS="-g -O" --CXXOPTFLAGS="-g -O" --FOPTFLAGS=-g --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 --download-p4est=1 --with-zlib-dir=/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4 PETSC_ARCH=arch-olcf-crusher
Libraries compiled on 2022-01-25 14:29:13 on login2
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O
Using Fortran compiler: ftn -fPIC -g
Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/include -I/opt/rocm-4.5.0/include
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -L/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ -L/opt/cray/pe/libsci/ -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lz -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lmodules -lfi -lcraymath -lf -lu -lcsup 
-lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
# #
# WARNING!!! #
# #
# This code was compiled with GPU support and you've #
# created PETSc/GPU objects, but you intentionally used #
# -use_gpu_aware_mpi 0, such that PETSc had to copy data #
# from GPU to CPU for communication. To get meaningful #
# timing results, please use GPU-aware MPI instead. #
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
-------------- next part --------------
DM Object: box 8 MPI processes
type: plex
box in 3 dimensions:
Number of 0-cells per rank: 4913 4913 4913 4913 4913 4913 4913 4913
Number of 1-cells per rank: 13872 13872 13872 13872 13872 13872 13872 13872
Number of 2-cells per rank: 13056 13056 13056 13056 13056 13056 13056 13056
Number of 3-cells per rank: 4096 4096 4096 4096 4096 4096 4096 4096
celltype: 4 strata with value/size (0 (4913), 1 (13872), 4 (13056), 7 (4096))
depth: 4 strata with value/size (0 (4913), 1 (13872), 2 (13056), 3 (4096))
marker: 1 strata with value/size (1 (3162))
Face Sets: 3 strata with value/size (1 (961), 3 (961), 6 (961))
Linear solve converged due to CONVERGED_RTOL iterations 122
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=250047, cols=250047
total: nonzeros=15069223, allocated nonzeros=15069223
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve converged due to CONVERGED_RTOL iterations 122
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=250047, cols=250047
total: nonzeros=15069223, allocated nonzeros=15069223
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Linear solve converged due to CONVERGED_RTOL iterations 122
KSP Object: 8 MPI processes
type: cg
maximum iterations=200, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: jacobi
linear system matrix = precond matrix:
Mat Object: 8 MPI processes
type: mpiaijkokkos
rows=250047, cols=250047
total: nonzeros=15069223, allocated nonzeros=15069223
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
**************************************** ***********************************************************************************************************************
*** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------
# #
# WARNING!!! #
# #
# This code was compiled with GPU support and you've #
# created PETSc/GPU objects, but you intentionally used #
# -use_gpu_aware_mpi 0, such that PETSc had to copy data #
# from GPU to CPU for communication. To get meaningful #
# timing results, please use GPU-aware MPI instead. #
/gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tests/data/../ex13 on a arch-olcf-crusher named crusher002 with 8 processors, by adams Tue Jan 25 10:10:33 2022
Using Petsc Development GIT revision: v3.16.3-696-g46640c56cb GIT Date: 2022-01-25 09:20:51 -0500
Max Max/Min Avg Total
Time (sec): 9.250e+00 1.000 9.250e+00
Objects: 1.850e+03 1.029 1.807e+03
Flop: 1.914e+09 1.105 1.821e+09 1.456e+10
Flop/sec: 2.069e+08 1.105 1.968e+08 1.575e+09
MPI Messages: 3.112e+03 1.096 2.895e+03 2.316e+04
MPI Message Lengths: 1.951e+07 1.060 6.497e+03 1.505e+08
MPI Reductions: 1.280e+03 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
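The counting convention above can be inverted to estimate the local vector length from a logged flop count. The sketch below is only an illustration: the helper name `vec_axpy_flop` is hypothetical, and the two input numbers are copied from the VecAXPY row of the "KSP Solve only" stage of this log (max flop per process and call count).

```python
def vec_axpy_flop(n, complex_scalars=False):
    """Flop charged for one VecAXPY (y <- a*x + y) on vectors of length n,
    following the convention quoted in the log: 2N real, 8N complex."""
    return (8 if complex_scalars else 2) * n


# Values read from the VecAXPY row of the "KSP Solve only" stage.
logged_flop = 3.20e7   # max flop on any one process
calls = 488            # number of VecAXPY calls

# Invert flop = calls * 2 * N to estimate the largest local vector length.
n_local = logged_flop / (calls * vec_axpy_flop(1))
print(f"estimated max local length: {n_local:.0f}")
```

The estimate comes out near 3.3e4 entries, which is in line with 250047 rows spread over 8 ranks (about 3.1e4 on average) and the reported max/min ratio of 1.1.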
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 9.1356e+00 98.8% 6.4176e+09 44.1% 9.499e+03 41.0% 8.147e+03 51.4% 5.250e+02 41.0%
1: PCSetUp: 1.2165e-03 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
2: KSP Solve only: 1.1320e-01 1.2% 8.1469e+09 55.9% 1.366e+04 59.0% 5.350e+03 48.6% 7.360e+02 57.5%
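As a sanity check, the %Total columns can be recomputed from the totals printed earlier in this summary. This is a minimal sketch using the rounded values from the log; the dictionary layout and the `percent` helper are illustrative, not part of PETSc.

```python
# Totals from the header of this performance summary.
total_time = 9.250e+00   # "Time (sec)", max over processes
total_flop = 1.456e+10   # "Flop", total over processes

# (average time, flop) per stage, copied from the "Summary of Stages" rows.
stages = {
    "Main Stage":     (9.1356e+00, 6.4176e+09),
    "PCSetUp":        (1.2165e-03, 0.0000e+00),
    "KSP Solve only": (1.1320e-01, 8.1469e+09),
}


def percent(part, whole):
    """Share of `whole` taken by `part`, in percent."""
    return 100.0 * part / whole


for name, (time, flop) in stages.items():
    print(f"{name:15s} time {percent(time, total_time):5.1f}%  "
          f"flop {percent(flop, total_flop):5.1f}%")
```

Because the inputs are already rounded to four digits, the recomputed shares agree with the logged ones only to within about 0.1 percentage points. The point that stands out is that the timed solve stage accounts for roughly 1% of the wall time but more than half of the flop.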
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 1e-6 * (sum of flop over all processors)/(max time over all processors)
GPU Mflop/s: 1e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
CpuToGpu Count: total number of CPU to GPU copies per processor
CpuToGpu Size (Mbytes): 1e-6 * (total size of CPU to GPU copies per processor)
GpuToCpu Count: total number of GPU to CPU copies per processor
GpuToCpu Size (Mbytes): 1e-6 * (total size of GPU to CPU copies per processor)
GPU %F: percent flops on GPU in this event
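The "Total Mflop/s" definition can be checked against a row of this log. The sketch below assumes a scale factor of 1e-6 (one Mflop is 10^6 flop) and uses the KSPSolve row of the "KSP Solve only" stage, where KSPSolve accounts for 100% of the stage flop, so the stage total from the summary of stages stands in for the sum over processes. The function name `mflops` is illustrative.

```python
def mflops(total_flop, max_time):
    """Total Mflop/s: flop summed over all processes divided by the
    maximum time over all processes, scaled to millions."""
    return 1e-6 * total_flop / max_time


# "KSP Solve only" stage: total flop from the summary of stages,
# max time from the KSPSolve row.
stage_flop = 8.1469e+09
ksp_solve_time = 1.2088e-01

rate = mflops(stage_flop, ksp_solve_time)
print(f"KSPSolve: {rate:.0f} Mflop/s")
```

The result lands within one unit of the 67396 Mflop/s printed in the KSPSolve row; the small difference comes from the rounded inputs.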
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F
--- Event Stage 0: Main Stage
PetscBarrier 4 1.0 2.0834e-02 1.0 0.00e+00 0.0 6.3e+02 3.2e+02 1.5e+01 0 0 3 0 1 0 0 7 0 3 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSided 38 1.0 1.8371e-01 31.9 0.00e+00 0.0 6.7e+02 4.0e+00 3.8e+01 1 0 3 0 3 1 0 7 0 7 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSidedF 6 1.0 1.7678e-01 38.7 0.00e+00 0.0 1.5e+02 1.2e+05 6.0e+00 1 0 1 12 0 1 0 2 23 1 0 0 0 0.00e+00 0 0.00e+00 0
MatMult 3007 1.0 3.3355e-02 1.1 4.88e+08 1.1 7.1e+03 5.2e+03 2.0e+00 0 25 31 24 0 0 57 75 47 0 110256 208584 245 9.16e+00 244 9.14e+00 100
MatAssemblyBegin 43 1.0 1.7686e-01 6.6 0.00e+00 0.0 1.5e+02 1.2e+05 6.0e+00 1 0 1 12 0 1 0 2 23 1 0 0 0 0.00e+00 0 0.00e+00 0
MatAssemblyEnd 43 1.0 4.7823e-02 3.0 2.84e+05 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 1 0 0 0 0 2 23 0 0 0.00e+00 0 0.00e+00 0
MatZeroEntries 3 1.0 1.2971e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
MatView 1 1.0 7.6647e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSetUp 1 1.0 2.3969e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 1 1.0 8.4992e-02 1.2 5.40e+08 1.1 6.9e+03 5.3e+03 3.7e+02 1 28 30 24 29 1 63 73 47 70 47928 96102 245 9.16e+00 244 9.14e+00 100
SNESSolve 1 1.0 4.0456e+00 1.0 6.89e+08 1.1 7.1e+03 6.5e+03 3.8e+02 44 36 31 31 29 44 82 75 59 71 1301 95872 249 9.44e+00 250 9.69e+00 77
SNESSetUp 1 1.0 8.1269e-01 1.0 0.00e+00 0.0 3.6e+02 5.7e+04 1.8e+01 9 0 2 14 1 9 0 4 27 3 0 0 0 0.00e+00 0 0.00e+00 0
SNESFunctionEval 2 1.0 5.1573e-01 1.4 1.01e+08 1.0 1.1e+02 3.8e+03 3.0e+00 4 6 0 0 0 4 13 1 1 1 1574 3698 6 5.55e-01 6 5.48e-01 0
SNESJacobianEval 2 1.0 7.5947e+00 1.0 1.91e+08 1.0 1.1e+02 1.6e+05 2.0e+00 82 10 0 12 0 83 24 1 23 0 201 0 0 0.00e+00 6 5.48e-01 0
DMCreateInterp 1 1.0 8.4876e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01 0 0 0 0 1 0 0 1 0 3 782 0 0 0.00e+00 0 0.00e+00 0
DMCreateMat 1 1.0 8.1254e-01 1.0 0.00e+00 0.0 3.6e+02 5.7e+04 1.8e+01 9 0 2 14 1 9 0 4 27 3 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Partition 1 1.0 6.8623e-04 1.1 0.00e+00 0.0 3.5e+01 1.1e+02 8.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0
Mesh Migration 1 1.0 2.4168e-01 1.0 0.00e+00 0.0 2.0e+02 8.2e+01 2.9e+01 3 0 1 0 2 3 0 2 0 6 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartSelf 1 1.0 8.3200e-05 13.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblInv 1 1.0 3.1282e-04 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartLblSF 1 1.0 2.3499e-04 3.8 0.00e+00 0.0 1.4e+01 5.6e+01 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPartStrtSF 1 1.0 1.0389e-04 1.1 0.00e+00 0.0 7.0e+00 2.2e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPointSF 1 1.0 2.1879e-04 1.1 0.00e+00 0.0 1.4e+01 2.7e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterp 19 1.0 5.7343e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistribute 1 1.0 2.4259e-01 1.0 0.00e+00 0.0 2.5e+02 9.7e+01 3.7e+01 3 0 1 0 3 3 0 3 0 7 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistCones 1 1.0 1.0470e-04 1.0 0.00e+00 0.0 4.2e+01 1.4e+02 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistLabels 1 1.0 2.8320e-04 1.0 0.00e+00 0.0 1.0e+02 6.6e+01 2.4e+01 0 0 0 0 2 0 0 1 0 5 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexDistField 1 1.0 2.4112e-01 1.0 0.00e+00 0.0 4.9e+01 5.9e+01 2.0e+00 3 0 0 0 0 3 0 1 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexStratify 32 1.0 1.7826e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexSymmetrize 32 1.0 1.5651e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexPrealloc 1 1.0 8.1109e-01 1.0 0.00e+00 0.0 3.6e+02 5.7e+04 1.6e+01 9 0 2 14 1 9 0 4 27 3 0 0 0 0.00e+00 0 0.00e+00 0
DMPlexResidualFE 2 1.0 1.9393e-01 1.0 9.87e+07 1.0 0.0e+00 0.0e+00 0.0e+00 2 5 0 0 0 2 12 0 0 0 4071 0 0 0.00e+00 0 0.00e+00 0
DMPlexJacobianFE 2 1.0 7.5685e+00 1.0 1.88e+08 1.0 7.6e+01 2.4e+05 2.0e+00 81 10 0 12 0 82 23 1 23 0 199 0 0 0.00e+00 0 0.00e+00 0
DMPlexInterpFE 1 1.0 8.1691e-04 1.1 8.29e+04 1.0 7.6e+01 1.1e+03 1.6e+01 0 0 0 0 1 0 0 1 0 3 812 0 0 0.00e+00 0 0.00e+00 0
SFSetGraph 40 1.0 1.6763e-04 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFSetUp 32 1.0 1.3187e-02 1.1 0.00e+00 0.0 1.2e+03 6.2e+03 3.2e+01 0 0 5 5 2 0 0 13 10 6 0 0 0 0.00e+00 0 0.00e+00 0
SFBcastBegin 62 1.0 1.5768e-01 255.0 0.00e+00 0.0 9.2e+02 3.8e+03 0.0e+00 1 0 4 2 0 1 0 10 4 0 0 0 1 6.05e-03 11 1.10e+00 0
SFBcastEnd 62 1.0 1.8428e-01 59.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFReduceBegin 15 1.0 1.3456e-01 385.3 6.55e+04 1.1 2.7e+02 2.7e+04 0.0e+00 0 0 1 5 0 0 0 3 9 0 4 0 2 5.00e-01 0 0.00e+00 100
SFReduceEnd 15 1.0 5.2284e-02 21.6 6.34e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 4 4.84e-02 0 0.00e+00 100
SFFetchOpBegin 2 1.0 1.3656e-04 43.1 0.00e+00 0.0 3.8e+01 6.1e+04 0.0e+00 0 0 0 2 0 0 0 0 3 0 0 0 0 0.00e+00 0 0.00e+00 0
SFFetchOpEnd 2 1.0 9.1371e-04 2.1 0.00e+00 0.0 3.8e+01 6.1e+04 0.0e+00 0 0 0 2 0 0 0 0 3 0 0 0 0 0.00e+00 0 0.00e+00 0
SFCreateEmbed 7 1.0 2.3430e-02 116.3 0.00e+00 0.0 1.2e+02 2.5e+02 0.0e+00 0 0 1 0 0 0 0 1 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFDistSection 9 1.0 1.0085e-03 1.6 0.00e+00 0.0 3.1e+02 1.7e+03 1.1e+01 0 0 1 0 1 0 0 3 1 2 0 0 0 0.00e+00 0 0.00e+00 0
SFSectionSF 15 1.0 5.5941e-03 1.3 0.00e+00 0.0 4.4e+02 5.4e+03 1.5e+01 0 0 2 2 1 0 0 5 3 3 0 0 0 0.00e+00 0 0.00e+00 0
SFRemoteOff 6 1.0 2.3731e-02 43.8 0.00e+00 0.0 3.4e+02 5.0e+02 3.0e+00 0 0 1 0 0 0 0 4 0 1 0 0 0 0.00e+00 0 0.00e+00 0
SFPack 208 1.0 1.5759e-01 204.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 2 2.48e-02 0 0.00e+00 0
SFUnpack 210 1.0 1.3462e-01 282.9 7.19e+04 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4 0 0 0.00e+00 0 0.00e+00 100
VecTDot 244 1.0 1.5553e-02 1.4 1.60e+07 1.1 0.0e+00 0.0e+00 2.4e+02 0 1 0 0 19 0 2 0 0 46 7846 15819 0 0.00e+00 0 0.00e+00 100
VecNorm 123 1.0 2.2600e-02 3.5 8.06e+06 1.1 0.0e+00 0.0e+00 1.2e+02 0 0 0 0 10 0 1 0 0 23 2722 13020 0 0.00e+00 0 0.00e+00 100
VecCopy 2 1.0 9.1104e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 53 1.0 1.7366e-03 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 245 1.0 5.8172e-03 1.1 1.61e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 21062 31344 0 0.00e+00 0 0.00e+00 100
VecAYPX 121 1.0 2.4363e-03 1.1 7.93e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 24837 41528 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 122 1.0 2.5573e-03 1.1 4.00e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11929 20567 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 123 1.0 7.9911e-03 1.3 0.00e+00 0.0 7.1e+03 5.2e+03 2.0e+00 0 0 31 24 0 0 0 75 47 0 0 0 1 1.87e-02 244 9.14e+00 0
VecScatterEnd 123 1.0 4.0082e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 244 9.14e+00 0 0.00e+00 0
DualSpaceSetUp 2 1.0 2.5765e-03 1.0 1.80e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6 0 0 0.00e+00 0 0.00e+00 0
FESetUp 2 1.0 1.0391e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCSetUp 1 1.0 4.4890e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCApply 122 1.0 9.7869e-03 1.1 4.00e+06 1.1 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 3117 4254 0 0.00e+00 0 0.00e+00 100
--- Event Stage 1: PCSetUp
PCSetUp 1 1.0 1.3192e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 100 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
--- Event Stage 2: KSP Solve only
MatMult 244 1.0 5.1175e-02 1.2 9.76e+08 1.1 1.4e+04 5.3e+03 0.0e+00 1 50 59 49 0 43 90 100 100 0 143700 290154 488 1.83e+01 488 1.83e+01 100
MatView 2 1.0 7.5625e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 2 1.0 1.2088e-01 1.2 1.08e+09 1.1 1.4e+04 5.3e+03 7.3e+02 1 56 59 49 57 100 100 100 100 100 67396 134725 488 1.83e+01 488 1.83e+01 100
SFPack 244 1.0 8.5343e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFUnpack 244 1.0 3.2090e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecTDot 488 1.0 2.8601e-02 1.3 3.20e+07 1.1 0.0e+00 0.0e+00 4.9e+02 0 2 0 0 38 22 3 0 0 66 8533 15861 0 0.00e+00 0 0.00e+00 100
VecNorm 246 1.0 2.7710e-02 2.5 1.61e+07 1.1 0.0e+00 0.0e+00 2.5e+02 0 1 0 0 19 18 2 0 0 33 4440 15567 0 0.00e+00 0 0.00e+00 100
VecCopy 4 1.0 1.7787e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 4 1.0 1.3394e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 488 1.0 9.7102e-03 1.1 3.20e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 8 3 0 0 0 25133 41488 0 0.00e+00 0 0.00e+00 100
VecAYPX 242 1.0 5.0139e-03 1.1 1.59e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 4 1 0 0 0 24138 41738 0 0.00e+00 0 0.00e+00 100
VecPointwiseMult 244 1.0 5.1417e-03 1.1 8.00e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 4 1 0 0 0 11866 21109 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 244 1.0 1.4535e-02 1.3 0.00e+00 0.0 1.4e+04 5.3e+03 0.0e+00 0 0 59 49 0 11 0 100 100 0 0 0 0 0.00e+00 488 1.83e+01 0
VecScatterEnd 244 1.0 7.1686e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 5 0 0 0 0 0 0 488 1.83e+01 0 0.00e+00 0
PCApply 244 1.0 5.1900e-03 1.1 8.00e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 4 1 0 0 0 11756 21109 0 0.00e+00 0 0.00e+00 100
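The thread asks how these timings change with problem size. A rough way to read that off the two logs is to compare per-call work and per-call time for the same vector kernel in the "KSP Solve only" stage of the -dm_refine 6 run and of this -dm_refine 4 run. The sketch below is an illustration only: the table layout and function names are made up, the numbers are the call count, max time and max flop copied from the VecAYPX and VecPointwiseMult rows of the two logs.

```python
# event -> run -> (calls, max time in seconds, max flop per process)
rows = {
    "VecAYPX": {
        "refine6": (398, 2.6610e-02, 1.67e+09),
        "refine4": (242, 5.0139e-03, 1.59e+07),
    },
    "VecPointwiseMult": {
        "refine6": (402, 2.7997e-02, 8.43e+08),
        "refine4": (244, 5.1417e-03, 8.00e+06),
    },
}


def per_call(calls, time, flop):
    """Average time and flop for a single call of the event."""
    return time / calls, flop / calls


def scaling(event):
    """Return (work ratio, time ratio) per call, large run over small run."""
    t_big, f_big = per_call(*rows[event]["refine6"])
    t_small, f_small = per_call(*rows[event]["refine4"])
    return f_big / f_small, t_big / t_small


for event in rows:
    work, time = scaling(event)
    print(f"{event:17s} work x{work:5.1f}  time x{time:4.1f}")
```

For both kernels the per-call work grows by a factor of about 64 (two extra levels of 3D refinement) while the per-call time grows only by a factor of a little over 3. That suggests the smaller problem is dominated by fixed per-call cost such as kernel launch latency rather than by memory bandwidth, though two data points are not enough to separate the two effects cleanly.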
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 31 31 17856 0.
SNES 1 1 1540 0.
DMSNES 1 1 688 0.
Krylov Solver 1 1 1664 0.
DMKSP interface 1 1 656 0.
Matrix 74 74 22701688 0.
Distributed Mesh 68 68 1326768 0.
DM Label 164 164 103648 0.
Quadrature 148 148 87616 0.
Mesh Transform 4 4 3024 0.
Index Set 601 601 757748 0.
IS L to G Mapping 2 2 145664 0.
Section 242 242 172304 0.
Star Forest Graph 167 167 181824 0.
Discrete System 111 111 106564 0.
Weak Form 112 112 68992 0.
GraphPartitioner 32 32 22016 0.
Vector 53 53 2498888 0.
Linear Space 5 5 3416 0.
Dual Space 26 26 24336 0.
FE Space 2 2 1576 0.
Viewer 2 1 840 0.
Preconditioner 1 1 872 0.
Field over DM 1 1 704 0.
--- Event Stage 1: PCSetUp
--- Event Stage 2: KSP Solve only
Average time to get PetscTime(): 3.51e-08
Average time for MPI_Barrier(): 2.7974e-06
Average time for zero size MPI_Send(): 9.373e-06
#PETSc Option Table entries:
-benchmark_it 2
-dm_mat_type aijkokkos
-dm_plex_box_faces 2,2,2
-dm_plex_box_lower 0,0,0
-dm_plex_box_upper 1,1,1
-dm_plex_dim 3
-dm_plex_simplex 0
-dm_refine 4
-dm_vec_type kokkos
-ksp_max_it 200
-ksp_norm_type unpreconditioned
-ksp_rtol 1.e-12
-ksp_type cg
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.05
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-pc_gamg_coarse_eq_limit 100
-pc_gamg_coarse_grid_layout_type compact
-pc_gamg_esteig_ksp_max_it 10
-pc_gamg_esteig_ksp_type cg
-pc_gamg_process_eq_limit 400
-pc_gamg_repartition false
-pc_gamg_reuse_interpolation true
-pc_gamg_square_graph 0
-pc_gamg_threshold -0.01
-pc_type jacobi
-petscpartitioner_simple_node_grid 1,1,1
-petscpartitioner_simple_process_grid 2,2,2
-petscpartitioner_type simple
-potential_petscspace_degree 2
-snes_max_it 1
-snes_rtol 1.e-8
-snes_type ksponly
-use_gpu_aware_mpi false
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=cc --with-cxx=CC --with-fc=ftn --with-fortran-bindings=0 LIBS="-L/opt/cray/pe/mpich/8.1.12/gtl/lib -lmpi_gtl_hsa" --with-debugging=0 --COPTFLAGS="-g -O" --CXXOPTFLAGS="-g -O" --FOPTFLAGS=-g --with-mpiexec="srun -p batch -N 1 -A csc314_crusher -t 00:10:00" --with-hip --with-hipc=hipcc --download-hypre --with-hip-arch=gfx90a --download-kokkos --download-kokkos-kernels --with-kokkos-kernels-tpl=0 --download-p4est=1 --with-zlib-dir=/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4 PETSC_ARCH=arch-olcf-crusher
Libraries compiled on 2022-01-25 14:29:13 on login2
Machine characteristics: Linux-5.3.18-59.16_11.0.39-cray_shasta_c-x86_64-with-glibc2.3.4
Using PETSc directory: /gpfs/alpine/csc314/scratch/adams/petsc
Using PETSc arch: arch-olcf-crusher
Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O
Using Fortran compiler: ftn -fPIC -g
Using include paths: -I/gpfs/alpine/csc314/scratch/adams/petsc/include -I/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/include -I/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/include -I/opt/rocm-4.5.0/include
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -lpetsc -Wl,-rpath,/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -L/gpfs/alpine/csc314/scratch/adams/petsc/arch-olcf-crusher/lib -Wl,-rpath,/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -L/sw/crusher/spack-envs/base/opt/cray-sles15-zen3/cce-13.0.0/zlib-1.2.11-qx5p4iereg4sjvfi5uwk6jn56o6se2q4/lib -Wl,-rpath,/opt/rocm-4.5.0/lib -L/opt/rocm-4.5.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/gtl/lib -L/opt/cray/pe/mpich/8.1.12/gtl/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ -L/opt/cray/pe/libsci/ -Wl,-rpath,/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.12/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.16/lib -L/opt/cray/pe/pmi/6.0.16/lib -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -L/opt/cray/pe/cce/13.0.0/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -L/opt/cray/xpmem/2.3.2-2.2_1.16__g9ea452c.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -L/opt/cray/pe/cce/13.0.0/cce-clang/x86_64/lib/clang/13.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/13.0.0/binutils/x86_64/x86_64-unknown-linux-gnu/lib -lHYPRE -lkokkoskernels -lkokkoscontainers -lkokkoscore -lp4est -lsc -lz -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -ldl -lmpi_gtl_hsa -lmpifort_cray -lmpi_cray -ldsmml -lpmi -lpmi2 -lxpmem -lstdc++ -lpgas-shmem -lquadmath -lmodules -lfi -lcraymath -lf -lu -lcsup 
-lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -ldl -lmpi_gtl_hsa
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There are 14 unused database options. They are:
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-mg_levels_esteig_ksp_type value: cg
Option left: name:-mg_levels_ksp_chebyshev_esteig value: 0,0.05,0,1.05
Option left: name:-mg_levels_ksp_type value: chebyshev
Option left: name:-mg_levels_pc_type value: jacobi
Option left: name:-pc_gamg_coarse_eq_limit value: 100
Option left: name:-pc_gamg_coarse_grid_layout_type value: compact
Option left: name:-pc_gamg_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_esteig_ksp_type value: cg
Option left: name:-pc_gamg_process_eq_limit value: 400
Option left: name:-pc_gamg_repartition value: false
Option left: name:-pc_gamg_reuse_interpolation value: true
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
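All 14 leftover options carry the -mg_levels_ or -pc_gamg_ prefix, which is what one would expect when multigrid settings are passed to a run that selects -pc_type jacobi. A small sketch for pulling such lines out of a log and grouping them by prefix is shown below; the function names are hypothetical and the sample text is a three-line excerpt of the list above.

```python
import re

# Three-line excerpt of the "Option left" list from this log.
log_tail = """\
Option left: name:-mg_levels_esteig_ksp_max_it value: 10
Option left: name:-pc_gamg_square_graph value: 0
Option left: name:-pc_gamg_threshold value: -0.01
"""

OPTION_LEFT = re.compile(r"^Option left: name:(-\S+) value: (.*)$", re.MULTILINE)


def unused_options(text):
    """Map each unused option name to its value string."""
    return dict(OPTION_LEFT.findall(text))


def group_by_prefix(options, prefixes=("-mg_levels_", "-pc_gamg_")):
    """Bucket option names by the first matching prefix; others go to 'other'."""
    groups = {prefix: [] for prefix in prefixes}
    groups["other"] = []
    for name in options:
        key = next((p for p in prefixes if name.startswith(p)), "other")
        groups[key].append(name)
    return groups


options = unused_options(log_tail)
groups = group_by_prefix(options)
for prefix, names in groups.items():
    print(prefix, len(names))
```

An empty "other" bucket indicates that nothing outside the multigrid settings was ignored, so a misspelled solver option is unlikely to be hiding in the list.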