[petsc-users] Performance of SLEPc's Krylov-Schur solver
Walker Andreas
awalker at student.ethz.ch
Fri May 1 03:07:55 CDT 2020
Hi Matthew,
I just ran the same program on a single core; the output of -log_view is below. As far as I can see, most functions, including MatMult and similar ones, show a speedup of around 50 on 128 cores.
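For reference, the per-event speedups can be computed directly from the "Time (sec) Max" column of the two -log_view outputs in this thread (1 process below, 128 processes in the quoted mail). The following is a minimal Python sketch; the dictionary and variable names are illustrative, the numbers are copied from the logs, and the event counts differ slightly between the two runs, so the ratios are only approximate.

```python
# Max time in seconds per event, copied from the two -log_view tables
# (1 process and 128 processes). Names are illustrative only.
serial_time = {
    "EPSSolve": 2.8695e+04,
    "MatMult": 8.2799e+03,
    "MatMultAdd": 8.1229e+03,
    "BVMultVec": 9.8281e+03,
    "BVDotVec": 9.8037e+03,
    "BVOrthogonalizeV": 1.9633e+04,
    "BVMultInPlace": 7.0999e+02,
    "DSSolve": 1.7363e+01,
}
parallel_time = {
    "EPSSolve": 6.1329e+02,
    "MatMult": 2.1963e+02,
    "MatMultAdd": 1.6209e+02,
    "BVMultVec": 1.5007e+02,
    "BVDotVec": 3.2807e+02,
    "BVOrthogonalizeV": 4.0292e+02,
    "BVMultInPlace": 8.0080e+00,
    "DSSolve": 2.0764e+01,
}
n_procs = 128

# Speedup = serial time / parallel time; efficiency = speedup / processes.
speedup = {name: serial_time[name] / parallel_time[name] for name in serial_time}

# Print the events sorted from worst to best speedup.
for name, s in sorted(speedup.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} speedup {s:6.1f}   efficiency {s / n_procs:5.2f}")
```

With these numbers, EPSSolve as a whole comes out at about 47, MatMult at about 38, BVDotVec at about 30 and BVMultVec at about 65, while DSSolve takes slightly longer on 128 processes than on one.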
Best regards,
Andreas
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./Solver on a named eu-a6-011-09 with 1 processor, by awalker Fri May 1 04:03:07 2020
Using Petsc Release Version 3.10.5, Mar, 28, 2019
Max Max/Min Avg Total
Time (sec): 3.092e+04 1.000 3.092e+04
Objects: 6.099e+05 1.000 6.099e+05
Flop: 9.313e+13 1.000 9.313e+13 9.313e+13
Flop/sec: 3.012e+09 1.000 3.012e+09 3.012e+09
MPI Messages: 0.000e+00 0.000 0.000e+00 0.000e+00
MPI Message Lengths: 0.000e+00 0.000 0.000e+00 0.000e+00
MPI Reductions: 0.000e+00 0.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 3.0925e+04 100.0% 9.3134e+13 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 152338 1.0 8.2799e+03 1.0 8.20e+12 1.0 0.0e+00 0.0e+00 0.0e+00 27 9 0 0 0 27 9 0 0 0 990
MatMultAdd 609352 1.0 8.1229e+03 1.0 8.20e+12 1.0 0.0e+00 0.0e+00 0.0e+00 26 9 0 0 0 26 9 0 0 0 1010
MatConvert 30 1.0 1.5797e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatScale 10 1.0 4.7172e-02 1.0 6.73e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1426
MatAssemblyBegin 516 1.0 2.0695e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 516 1.0 2.8933e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatZeroEntries 2 1.0 3.6038e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 10 1.0 2.4422e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAXPY 40 1.0 3.1595e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatMatMult 60 1.0 1.3723e+01 1.0 1.24e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 90
MatMatMultSym 100 1.0 1.3651e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatMatMultNum 100 1.0 7.5159e+00 1.0 2.06e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 274
MatMatMatMult 40 1.0 1.8674e+01 1.0 1.66e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 89
MatMatMatMultSym 40 1.0 1.1848e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatMatMatMultNum 40 1.0 6.8266e+00 1.0 1.66e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 243
MatPtAP 40 1.0 1.9042e+01 1.0 1.66e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 87
MatTrnMatMult 40 1.0 7.7990e+00 1.0 8.24e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 106
DMPlexStratify 1 1.0 5.1223e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
DMPlexPrealloc 2 1.0 1.5242e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 914053 1.0 1.4929e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyBegin 1 1.0 1.3411e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 1 1.0 8.0094e-08 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 1 1.0 2.6399e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSetRandom 10 1.0 8.6088e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
EPSSetUp 10 1.0 2.9988e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
EPSSolve 10 1.0 2.8695e+04 1.0 9.31e+13 1.0 0.0e+00 0.0e+00 0.0e+00 93 100 0 0 0 93 100 0 0 0 3246
STSetUp 10 1.0 9.7291e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
STApply 152338 1.0 8.2803e+03 1.0 8.20e+12 1.0 0.0e+00 0.0e+00 0.0e+00 27 9 0 0 0 27 9 0 0 0 990
BVCopy 1814 1.0 1.1076e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
BVMultVec 304639 1.0 9.8281e+03 1.0 3.34e+13 1.0 0.0e+00 0.0e+00 0.0e+00 32 36 0 0 0 32 36 0 0 0 3397
BVMultInPlace 1824 1.0 7.0999e+02 1.0 1.79e+13 1.0 0.0e+00 0.0e+00 0.0e+00 2 19 0 0 0 2 19 0 0 0 25213
BVDotVec 304639 1.0 9.8037e+03 1.0 3.36e+13 1.0 0.0e+00 0.0e+00 0.0e+00 32 36 0 0 0 32 36 0 0 0 3427
BVOrthogonalizeV 152348 1.0 1.9633e+04 1.0 6.70e+13 1.0 0.0e+00 0.0e+00 0.0e+00 63 72 0 0 0 63 72 0 0 0 3411
BVScale 152348 1.0 3.7888e+01 1.0 5.32e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1403
BVSetRandom 10 1.0 8.6364e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
DSSolve 1824 1.0 1.7363e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
DSVectors 2797 1.0 1.2353e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
DSOther 1824 1.0 9.8627e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 1 1 584 0.
Distributed Mesh 1 1 5184 0.
GraphPartitioner 1 1 624 0.
Matrix 320 320 3469402576 0.
Index Set 53 53 2777932 0.
IS L to G Mapping 1 1 249320 0.
Section 13 11 7920 0.
Star Forest Graph 6 6 4896 0.
Discrete System 1 1 936 0.
Vector 609405 609405 857220847896 0.
Vec Scatter 1 1 704 0.
Viewer 22 11 9328 0.
EPS Solver 10 10 86360 0.
Spectral Transform 10 10 8400 0.
Basis Vectors 10 10 530336 0.
PetscRandom 10 10 6540 0.
Region 10 10 6800 0.
Direct Solver 10 10 9838880 0.
Krylov Solver 10 10 13920 0.
Preconditioner 10 10 10080 0.
========================================================================================================================
Average time to get PetscTime(): 2.50991e-08
#PETSc Option Table entries:
-config=benchmark3.json
-eps_converged_reason
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --prefix=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit --with-ssl=0 --download-c2html=0 --download-sowing=0 --download-hwloc=0 CFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2" FFLAGS= CXXFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2" --with-cc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc --with-cxx=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpic++ --with-fc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90 --with-precision=double --with-scalar-type=real --with-shared-libraries=1 --with-debugging=0 --with-64-bit-indices=0 COPTFLAGS= FOPTFLAGS= CXXOPTFLAGS= --with-blaslapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib/libopenblas.so --with-x=0 --with-cxx-dialect=C++11 --with-boost=1 --with-clanguage=C --with-scalapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib/libscalapack.so --with-scalapack=1 --with-metis=1 --with-metis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk --with-hdf5=1 --with-hdf5-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5 --with-hypre=1 --with-hypre-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne --with-parmetis=1 --with-parmetis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4 --with-mumps=1 --with-mumps-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b --with-trilinos=1 
--with-trilinos-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo --with-fftw=0 --with-cxx-dialect=C++11 --with-superlu_dist-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include --with-superlu_dist-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib/libsuperlu_dist.a --with-superlu_dist=1 --with-suitesparse-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include --with-suitesparse-lib="/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libumfpack.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libklu.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcholmod.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libbtf.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libccolamd.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcolamd.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcamd.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libamd.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libsuitesparseconfig.so /lib64/librt.so" --with-suitesparse=1 --with-zlib-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include 
--with-zlib-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib/libz.so --with-zlib=1
-----------------------------------------
Libraries compiled on 2020-01-22 15:21:53 on eu-c7-051-02
Machine characteristics: Linux-3.10.0-862.14.4.el7.x86_64-x86_64-with-centos-7.5.1804-Core
Using PETSc directory: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit
Using PETSc arch:
-----------------------------------------
Using C compiler: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc -ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2
Using Fortran compiler: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
-----------------------------------------
Using include paths: -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include
-----------------------------------------
Using C linker: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc
Using Fortran linker: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
Using libraries: -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib -lpetsc -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib /lib64/librt.so -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib 
-L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib -Wl,-rpath,/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib -L/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib:/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0 -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0 -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 
-Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib -lmuelu-adapters -lmuelu-interface -lmuelu -lstratimikos -lstratimikosbelos -lstratimikosaztecoo -lstratimikosamesos -lstratimikosml -lstratimikosifpack -lModeLaplace -lanasaziepetra -lanasazi -lmapvarlib -lsuplib_cpp -lsuplib_c -lsuplib -lsupes -laprepro_lib -lchaco -lio_info_lib -lIonit -lIotr -lIohb -lIogs -lIogn -lIovs -lIopg -lIoexo_fac -lIopx -lIofx -lIoex -lIoss -lnemesis -lexoIIv2for32 -lexodus_for -lexodus -lmapvarlib -lsuplib_cpp -lsuplib_c -lsuplib -lsupes -laprepro_lib -lchaco -lio_info_lib -lIonit -lIotr -lIohb -lIogs -lIogn -lIovs -lIopg -lIoexo_fac -lIopx -lIofx -lIoex -lIoss -lnemesis -lexoIIv2for32 -lexodus_for -lexodus -lbelosxpetra -lbelosepetra -lbelos -lml -lifpack -lpamgen_extras -lpamgen -lamesos -lgaleri-xpetra -lgaleri-epetra -laztecoo -lisorropia -lxpetra-sup -lxpetra -lthyraepetraext -lthyraepetra -lthyracore -lthyraepetraext -lthyraepetra -lthyracore -lepetraext -ltrilinosss -ltriutils -lzoltan -lepetra -lsacado -lrtop -lkokkoskernels -lteuchoskokkoscomm -lteuchoskokkoscompat -lteuchosremainder -lteuchosnumerics -lteuchoscomm -lteuchosparameterlist -lteuchosparser -lteuchoscore -lteuchoskokkoscomm -lteuchoskokkoscompat -lteuchosremainder -lteuchosnumerics -lteuchoscomm -lteuchosparameterlist -lteuchosparser -lteuchoscore -lkokkosalgorithms -lkokkoscontainers -lkokkoscore -lkokkosalgorithms -lkokkoscontainers -lkokkoscore -lgtest -lpthread -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -lsuperlu_dist -lHYPRE -lopenblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lparmetis -lmetis -lm -lz -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread 
-lstdc++ -ldl
-----------------------------------------
On 30.04.2020 at 17:14, Matthew Knepley <knepley at gmail.com> wrote:
On Thu, Apr 30, 2020 at 10:55 AM Walker Andreas <awalker at student.ethz.ch> wrote:
Hello everyone,
I have used SLEPc successfully on a FEM-related project. Although it is very powerful overall, the speedup I measure is somewhat below my expectations. Relative to a single core, the speedup is around 1.8 on two cores, but only about 50-60 on 128 cores and about 70-80 on 256 cores. Some details about my problem:
- The problem is based on meshes with up to 400k degrees of freedom. DMPlex is used for organizing it.
- ParMetis is used to partition the mesh. This yields a stiffness matrix in which the vast majority of entries lie in the diagonal blocks (i.e. looking at the rows owned by a core, there is a very dense square region around the diagonal and some loosely scattered nonzeros in the other columns).
- The actual matrix whose eigenvalues I need is a 2x2 block matrix, stored as a MATNEST matrix. Each of the four blocks is computed from the stiffness matrix and has a similar size and nonzero pattern. For a mesh with 200k dofs, one such block has a size of about 174k x 174k and on average about 40 nonzeros per row.
- I use the default Krylov-Schur solver and look for the 100 smallest eigenvalues.
- The output of -log_view for the 200k-dof mesh described above, run on 128 cores, is at the end of this mail.
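The speedups quoted above can be restated as parallel efficiencies. The following is a small Python sketch; the function names are illustrative, and the Karp-Flatt metric (an experimentally determined serial fraction) is added here only as one common way to summarize such numbers, not as something used in the original measurements.

```python
def efficiency(speedup, procs):
    """Parallel efficiency: speedup divided by the number of processes."""
    return speedup / procs


def karp_flatt(speedup, procs):
    """Karp-Flatt metric: experimentally determined serial fraction."""
    return (1.0 / speedup - 1.0 / procs) / (1.0 - 1.0 / procs)


# (processes, lower speedup estimate, upper speedup estimate) as quoted in the mail.
measured = [(2, 1.8, 1.8), (128, 50.0, 60.0), (256, 70.0, 80.0)]

for procs, lo, hi in measured:
    print(
        f"{procs:4d} procs: efficiency {efficiency(lo, procs):.2f}-{efficiency(hi, procs):.2f}, "
        f"serial fraction {karp_flatt(hi, procs):.4f}-{karp_flatt(lo, procs):.4f}"
    )
```

In terms of efficiency, this corresponds to 0.90 on 2 cores, roughly 0.39 to 0.47 on 128 cores, and roughly 0.27 to 0.31 on 256 cores.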
I noticed that the problem matrices are not perfectly balanced, i.e. the number of rows per core might vary between 2500 and 3000, for example. But I am not sure if this is the main reason for the poor speedup.
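One way to gauge how much the imbalance alone can cost is to compare the average and maximum per-process flop counts from the 128-process -log_view summary in this thread (Flop: Max 9.230e+11, Max/Min 1.816, Avg 7.212e+11). The Python sketch below assumes, as a simplification, that the time of a compute-bound phase is proportional to the flop count of the most loaded process; the variable names are illustrative.

```python
# Values copied from the "Flop:" line of the 128-process -log_view summary.
n_procs = 128
flop_max = 9.230e11      # most loaded process
flop_avg = 7.212e11      # average over all processes
max_over_min = 1.816     # reported Max/Min ratio

# Assumption: a compute-bound phase finishes when the most loaded process
# finishes, so efficiency cannot exceed avg/max.
efficiency_bound = flop_avg / flop_max
speedup_bound = n_procs * efficiency_bound
flop_min = flop_max / max_over_min

print(f"efficiency bound from flop imbalance: {efficiency_bound:.2f}")
print(f"speedup bound on {n_procs} processes:   {speedup_bound:.0f}")
print(f"least loaded process:                 {flop_min:.3e} flop")
```

Under that assumption the bound is an efficiency of roughly 0.78, or a speedup of about 100 on 128 processes, so the flop imbalance could account for part of the gap but not all of it. The Max/Min flop ratio of about 1.8 is also larger than a spread of 2500 to 3000 rows would suggest, so the number of nonzeros per process may be less evenly distributed than the number of rows.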
I tried reducing the subspace size, but this had no effect. I also attempted to use the shift-and-invert spectral transformation, but the MATNEST type prevents this.
Are there any suggestions to improve the speedup further or is this the maximum speedup that I can expect?
Can you also give us the performance for this problem on one node using the same number of cores per node? Then we can calculate speedup
and look at which functions are not speeding up.
Thanks,
Matt
Thanks a lot in advance,
Andreas Walker
m&m group
D-MAVT
ETH Zurich
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./Solver on a named eu-g1-050-2 with 128 processors, by awalker Thu Apr 30 15:50:22 2020
Using Petsc Release Version 3.10.5, Mar, 28, 2019
Max Max/Min Avg Total
Time (sec): 6.209e+02 1.000 6.209e+02
Objects: 6.068e+05 1.001 6.063e+05
Flop: 9.230e+11 1.816 7.212e+11 9.231e+13
Flop/sec: 1.487e+09 1.816 1.161e+09 1.487e+11
MPI Messages: 1.451e+07 2.999 8.265e+06 1.058e+09
MPI Message Lengths: 6.062e+09 2.011 5.029e+02 5.321e+11
MPI Reductions: 1.512e+06 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 6.2090e+02 100.0% 9.2309e+13 100.0% 1.058e+09 100.0% 5.029e+02 100.0% 1.512e+06 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
BuildTwoSided 20 1.0 2.3249e-01 2.2 0.00e+00 0.0 2.2e+04 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
BuildTwoSidedF 317 1.0 8.5016e-01 4.8 0.00e+00 0.0 2.1e+04 1.4e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatMult 150986 1.0 2.1963e+02 1.3 8.07e+10 1.8 1.1e+09 5.0e+02 1.2e+06 31 9 100 100 80 31 9 100 100 80 37007
MatMultAdd 603944 1.0 1.6209e+02 1.4 8.07e+10 1.8 1.1e+09 5.0e+02 0.0e+00 23 9 100 100 0 23 9 100 100 0 50145
MatConvert 30 1.0 1.6488e-02 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatScale 10 1.0 1.0347e-03 3.9 6.68e+05 1.8 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 65036
MatAssemblyBegin 916 1.0 8.6715e-01 1.4 0.00e+00 0.0 2.1e+04 1.4e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 916 1.0 2.0682e-01 1.1 0.00e+00 0.0 4.7e+05 1.3e+02 1.5e+03 0 0 0 0 0 0 0 0 0 0 0
MatZeroEntries 42 1.0 7.2787e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 10 1.0 1.4816e+00 1.0 0.00e+00 0.0 6.4e+03 1.3e+05 3.0e+01 0 0 0 0 0 0 0 0 0 0 0
MatAXPY 40 1.0 1.0752e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatTranspose 80 1.0 3.0198e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatMatMult 60 1.0 3.0391e-01 1.0 7.82e+06 1.6 3.8e+05 2.8e+02 7.8e+02 0 0 0 0 0 0 0 0 0 0 2711
MatMatMultSym 60 1.0 2.4238e-01 1.0 0.00e+00 0.0 3.3e+05 2.4e+02 7.2e+02 0 0 0 0 0 0 0 0 0 0 0
MatMatMultNum 60 1.0 5.8508e-02 1.0 7.82e+06 1.6 4.7e+04 5.7e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 14084
MatPtAP 40 1.0 4.5617e-01 1.0 1.59e+07 1.6 3.3e+05 1.0e+03 6.4e+02 0 0 0 0 0 0 0 0 0 0 3649
MatPtAPSymbolic 40 1.0 2.6002e-01 1.0 0.00e+00 0.0 1.7e+05 6.5e+02 2.8e+02 0 0 0 0 0 0 0 0 0 0 0
MatPtAPNumeric 40 1.0 1.9293e-01 1.0 1.59e+07 1.6 1.5e+05 1.5e+03 3.2e+02 0 0 0 0 0 0 0 0 0 0 8629
MatTrnMatMult 40 1.0 2.3801e-01 1.0 6.09e+06 1.8 1.8e+05 1.0e+03 6.4e+02 0 0 0 0 0 0 0 0 0 0 2442
MatTrnMatMultSym 40 1.0 1.6962e-01 1.0 0.00e+00 0.0 1.7e+05 4.4e+02 6.4e+02 0 0 0 0 0 0 0 0 0 0 0
MatTrnMatMultNum 40 1.0 6.9000e-02 1.0 6.09e+06 1.8 9.7e+03 1.1e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 8425
MatGetLocalMat 240 1.0 4.9149e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 160 1.0 2.0470e-02 1.6 0.00e+00 0.0 3.3e+05 4.1e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatTranspose_SeqAIJ_FAST 80 1.0 2.9940e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
Mesh Partition 1 1.0 1.4825e+00 1.0 0.00e+00 0.0 9.8e+04 6.9e+01 6.0e+00 0 0 0 0 0 0 0 0 0 0 0
Mesh Migration 1 1.0 3.6680e-02 1.0 0.00e+00 0.0 1.5e+03 1.4e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 0
DMPlexDistribute 1 1.0 1.5269e+00 1.0 0.00e+00 0.0 1.0e+05 3.5e+02 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
DMPlexDistCones 1 1.0 1.8845e-02 1.2 0.00e+00 0.0 1.0e+03 1.7e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
DMPlexDistLabels 1 1.0 9.7280e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0
DMPlexDistData 1 1.0 3.1499e-01 1.4 0.00e+00 0.0 9.8e+04 4.3e+01 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
DMPlexStratify 2 1.0 9.3421e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
DMPlexPrealloc 2 1.0 3.5980e-02 1.0 0.00e+00 0.0 4.0e+04 1.8e+03 3.0e+01 0 0 0 0 0 0 0 0 0 0 0
SFSetGraph 20 1.0 1.6069e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFSetUp 20 1.0 2.8043e-01 1.9 0.00e+00 0.0 6.7e+04 5.0e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFBcastBegin 25 1.0 3.9653e-02 2.5 0.00e+00 0.0 6.1e+04 4.9e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFBcastEnd 25 1.0 9.0128e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFReduceBegin 10 1.0 4.3473e-04 5.5 0.00e+00 0.0 7.4e+03 4.0e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFReduceEnd 10 1.0 5.7962e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFFetchOpBegin 2 1.0 1.6069e-04 34.7 0.00e+00 0.0 1.8e+03 4.4e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFFetchOpEnd 2 1.0 8.9251e-04 2.6 0.00e+00 0.0 1.8e+03 4.4e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 302179 1.0 1.3128e+00 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyBegin 1 1.0 1.3844e-03 7.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 1 1.0 3.4710e-05 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 603945 1.0 2.2874e+01 4.4 0.00e+00 0.0 1.1e+09 5.0e+02 1.0e+00 2 0 100 100 0 2 0 100 100 0 0
VecScatterEnd 603944 1.0 8.2651e+01 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 7 0 0 0 0 0
VecSetRandom 11 1.0 2.7061e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
EPSSetUp 10 1.0 5.0371e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+01 0 0 0 0 0 0 0 0 0 0 0
EPSSolve 10 1.0 6.1329e+02 1.0 9.23e+11 1.8 1.1e+09 5.0e+02 1.5e+06 99 100 100 100 100 99 100 100 100 100 150509
STSetUp 10 1.0 2.5475e-04 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
STApply 150986 1.0 2.1997e+02 1.3 8.07e+10 1.8 1.1e+09 5.0e+02 1.2e+06 31 9 100 100 80 31 9 100 100 80 36950
BVCopy 1791 1.0 5.1953e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
BVMultVec 301925 1.0 1.5007e+02 3.1 3.31e+11 1.8 0.0e+00 0.0e+00 0.0e+00 14 36 0 0 0 14 36 0 0 0 220292
BVMultInPlace 1801 1.0 8.0080e+00 1.8 1.78e+11 1.8 0.0e+00 0.0e+00 0.0e+00 1 19 0 0 0 1 19 0 0 0 2222543
BVDotVec 301925 1.0 3.2807e+02 1.4 3.33e+11 1.8 0.0e+00 0.0e+00 3.0e+05 47 36 0 0 20 47 36 0 0 20 101409
BVOrthogonalizeV 150996 1.0 4.0292e+02 1.1 6.64e+11 1.8 0.0e+00 0.0e+00 3.0e+05 62 72 0 0 20 62 72 0 0 20 164619
BVScale 150996 1.0 4.1660e-01 3.2 5.27e+08 1.8 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 126494
BVSetRandom 10 1.0 2.5061e-03 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
DSSolve 1801 1.0 2.0764e+01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
DSVectors 2779 1.0 1.2691e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
DSOther 1801 1.0 1.2944e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 1 1 584 0.
Distributed Mesh 6 6 29160 0.
GraphPartitioner 2 2 1244 0.
Matrix 1104 1104 136615232 0.
Index Set 930 930 9125912 0.
IS L to G Mapping 3 3 2235608 0.
Section 28 26 18720 0.
Star Forest Graph 30 30 25632 0.
Discrete System 6 6 5616 0.
PetscRandom 11 11 7194 0.
Vector 604372 604372 8204816368 0.
Vec Scatter 203 203 272192 0.
Viewer 21 10 8480 0.
EPS Solver 10 10 86360 0.
Spectral Transform 10 10 8400 0.
Basis Vectors 10 10 530848 0.
Region 10 10 6800 0.
Direct Solver 10 10 9838880 0.
Krylov Solver 10 10 13920 0.
Preconditioner 10 10 10080 0.
========================================================================================================================
Average time to get PetscTime(): 3.49944e-08
Average time for MPI_Barrier(): 5.842e-06
Average time for zero size MPI_Send(): 8.72551e-06
#PETSc Option Table entries:
-config=benchmark3.json
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --prefix=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit --with-ssl=0 --download-c2html=0 --download-sowing=0 --download-hwloc=0 CFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2" FFLAGS= CXXFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2" --with-cc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc --with-cxx=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpic++ --with-fc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90 --with-precision=double --with-scalar-type=real --with-shared-libraries=1 --with-debugging=0 --with-64-bit-indices=0 COPTFLAGS= FOPTFLAGS= CXXOPTFLAGS= --with-blaslapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib/libopenblas.so --with-x=0 --with-cxx-dialect=C++11 --with-boost=1 --with-clanguage=C --with-scalapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib/libscalapack.so --with-scalapack=1 --with-metis=1 --with-metis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk --with-hdf5=1 --with-hdf5-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5 --with-hypre=1 --with-hypre-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne --with-parmetis=1 --with-parmetis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4 --with-mumps=1 --with-mumps-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b --with-trilinos=1 
--with-trilinos-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo --with-fftw=0 --with-cxx-dialect=C++11 --with-superlu_dist-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include --with-superlu_dist-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib/libsuperlu_dist.a --with-superlu_dist=1 --with-suitesparse-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include --with-suitesparse-lib="/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libumfpack.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libklu.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcholmod.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libbtf.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libccolamd.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcolamd.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcamd.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libamd.so /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libsuitesparseconfig.so /lib64/librt.so" --with-suitesparse=1 --with-zlib-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include 
--with-zlib-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib/libz.so --with-zlib=1
-----------------------------------------
Libraries compiled on 2020-01-22 15:21:53 on eu-c7-051-02
Machine characteristics: Linux-3.10.0-862.14.4.el7.x86_64-x86_64-with-centos-7.5.1804-Core
Using PETSc directory: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit
Using PETSc arch:
-----------------------------------------
Using C compiler: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc -ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2
Using Fortran compiler: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
-----------------------------------------
Using include paths: -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/include -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include
-----------------------------------------
Using C linker: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc
Using Fortran linker: /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
Using libraries: -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib -lpetsc -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib /lib64/librt.so -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib 
-L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib -Wl,-rpath,/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib -L/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib:/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0 -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0 -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 
-Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib -lmuelu-adapters -lmuelu-interface -lmuelu -lstratimikos -lstratimikosbelos -lstratimikosaztecoo -lstratimikosamesos -lstratimikosml -lstratimikosifpack -lModeLaplace -lanasaziepetra -lanasazi -lmapvarlib -lsuplib_cpp -lsuplib_c -lsuplib -lsupes -laprepro_lib -lchaco -lio_info_lib -lIonit -lIotr -lIohb -lIogs -lIogn -lIovs -lIopg -lIoexo_fac -lIopx -lIofx -lIoex -lIoss -lnemesis -lexoIIv2for32 -lexodus_for -lexodus -lmapvarlib -lsuplib_cpp -lsuplib_c -lsuplib -lsupes -laprepro_lib -lchaco -lio_info_lib -lIonit -lIotr -lIohb -lIogs -lIogn -lIovs -lIopg -lIoexo_fac -lIopx -lIofx -lIoex -lIoss -lnemesis -lexoIIv2for32 -lexodus_for -lexodus -lbelosxpetra -lbelosepetra -lbelos -lml -lifpack -lpamgen_extras -lpamgen -lamesos -lgaleri-xpetra -lgaleri-epetra -laztecoo -lisorropia -lxpetra-sup -lxpetra -lthyraepetraext -lthyraepetra -lthyracore -lthyraepetraext -lthyraepetra -lthyracore -lepetraext -ltrilinosss -ltriutils -lzoltan -lepetra -lsacado -lrtop -lkokkoskernels -lteuchoskokkoscomm -lteuchoskokkoscompat -lteuchosremainder -lteuchosnumerics -lteuchoscomm -lteuchosparameterlist -lteuchosparser -lteuchoscore -lteuchoskokkoscomm -lteuchoskokkoscompat -lteuchosremainder -lteuchosnumerics -lteuchoscomm -lteuchosparameterlist -lteuchosparser -lteuchoscore -lkokkosalgorithms -lkokkoscontainers -lkokkoscore -lkokkosalgorithms -lkokkoscontainers -lkokkoscore -lgtest -lpthread -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -lsuperlu_dist -lHYPRE -lopenblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lparmetis -lmetis -lm -lz -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread 
-lstdc++ -ldl
-----------------------------------------
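To compare per-event timings between two such logs (for example a serial and a parallel run), the event rows can be parsed mechanically. The following is only a rough sketch, not part of PETSc or SLEPc: the names `EventRow` and `parse_event_row` are made up for illustration, and it assumes each row is an event name followed by the 20 whitespace-separated numeric columns that `-log_view` prints (count, count ratio, time, time ratio, flop, flop ratio, messages, average length, reductions, five global percentages, five stage percentages, Mflop/s). Rows whose columns have run together are rejected rather than guessed at.

```python
from typing import NamedTuple


class EventRow(NamedTuple):
    """Selected columns of one -log_view event row (hypothetical helper type)."""
    name: str
    count: int
    time_max: float    # max time over ranks, in seconds
    time_ratio: float  # max/min time ratio over ranks
    flop_max: float
    reductions: float
    pct_time: int      # %T, global share of run time
    mflops: float


def parse_event_row(line: str) -> EventRow:
    """Parse a row such as 'BVDotVec 301925 1.0 3.2807e+02 ...'.

    Assumes an event name with no spaces followed by exactly 20 numbers.
    """
    parts = line.split()
    if len(parts) != 21:
        raise ValueError(f"expected name + 20 columns, got {len(parts)} fields")
    nums = [float(p) for p in parts[1:]]
    return EventRow(
        name=parts[0],
        count=int(nums[0]),
        time_max=nums[2],
        time_ratio=nums[3],
        flop_max=nums[4],
        reductions=nums[8],
        pct_time=int(nums[9]),
        mflops=nums[19],
    )


def speedup(time_reference: float, time_other: float) -> float:
    """Ratio of two wall-clock times for the same event in two runs."""
    return time_reference / time_other


if __name__ == "__main__":
    # Row copied from the log above.
    row = parse_event_row(
        "BVDotVec 301925 1.0 3.2807e+02 1.4 3.33e+11 1.8 0.0e+00 0.0e+00 "
        "3.0e+05 47 36 0 0 20 47 36 0 0 20 101409"
    )
    print(row.name, row.time_max, row.pct_time)
```

Calling `speedup` on the `time_max` fields of the same event from two logs gives the per-event ratio; the time ratio column is worth checking alongside it, since a value well above 1.0 points to load imbalance rather than slow arithmetic.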
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/