[petsc-dev] GPU performance of MatSOR()
Han Tran
hantran at cs.utah.edu
Wed Jul 27 18:47:48 CDT 2022
Hello,
Running my example with VECMPICUDA for VecSetType() and MATMPIAIJCUSP for MatSetType(), I get the profiling results shown below. MatSOR() shows a GPU %F of 0 and has only GpuToCpu count and size entries. Is it correct that PETSc currently does not have MatSOR() implemented on the GPU? I would appreciate an explanation of how MatSOR() currently uses the GPU. In this example, MatSOR() takes considerable time compared to the other functions.
Thank you.
-Han
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F
---------------------------------------------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
BuildTwoSided 220001 1.0 3.9580e+02139.9 0.00e+00 0.0 2.0e+00 4.0e+00 2.2e+05 4 0 0 0 20 4 0 0 0 20 0 0 0 0.00e+00 0 0.00e+00 0
BuildTwoSidedF 220000 1.0 3.9614e+02126.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.2e+05 4 0 0 0 20 4 0 0 0 20 0 0 0 0.00e+00 0 0.00e+00 0
VecMDot 386001 1.0 6.3426e+01 1.5 1.05e+11 1.0 0.0e+00 0.0e+00 3.9e+05 1 11 0 0 35 1 11 0 0 35 3311 26012 386001 1.71e+05 0 0.00e+00 100
VecNorm 496001 1.0 5.0877e+01 1.2 5.49e+10 1.0 0.0e+00 0.0e+00 5.0e+05 1 6 0 0 45 1 6 0 0 45 2159 3707 110000 4.87e+04 0 0.00e+00 100
VecScale 496001 1.0 7.9951e+00 1.0 2.75e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 6869 13321 0 0.00e+00 0 0.00e+00 100
VecCopy 110000 1.0 1.9323e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecSet 330017 1.0 5.4319e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecAXPY 110000 1.0 1.5820e+00 1.0 1.22e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 15399 35566 0 0.00e+00 0 0.00e+00 100
VecMAXPY 496001 1.0 1.1505e+01 1.0 1.48e+11 1.0 0.0e+00 0.0e+00 0.0e+00 0 16 0 0 0 0 16 0 0 0 25665 39638 0 0.00e+00 0 0.00e+00 100
VecAssemblyBegin 110000 1.0 1.2021e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+05 0 0 0 0 10 0 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0
VecAssemblyEnd 110000 1.0 1.5988e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecScatterBegin 496001 1.0 1.3002e+01 1.0 0.00e+00 0.0 9.9e+05 1.3e+04 1.0e+00 0 0100100 0 0 0100100 0 0 0 110000 4.87e+04 0 0.00e+00 0
VecScatterEnd 496001 1.0 1.8988e+01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecNormalize 496001 1.0 5.8797e+01 1.1 8.24e+10 1.0 0.0e+00 0.0e+00 5.0e+05 1 9 0 0 45 1 9 0 0 45 2802 4881 110000 4.87e+04 0 0.00e+00 100
VecCUDACopyTo 716001 1.0 3.4483e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 716001 3.17e+05 0 0.00e+00 0
VecCUDACopyFrom 1211994 1.0 5.1752e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 1211994 5.37e+05 0
MatMult 386001 1.0 4.8436e+01 1.0 1.90e+11 1.0 7.7e+05 1.3e+04 0.0e+00 1 21 78 78 0 1 21 78 78 0 7862 16962 0 0.00e+00 0 0.00e+00 100
MatMultAdd 110000 1.0 6.2666e+01 1.1 6.03e+10 1.0 2.2e+05 1.3e+04 1.0e+00 1 7 22 22 0 1 7 22 22 0 1926 16893 440000 3.39e+05 0 0.00e+00 100
MatSOR 496001 1.0 5.1821e+02 1.1 2.83e+11 1.0 0.0e+00 0.0e+00 0.0e+00 10 31 0 0 0 10 31 0 0 0 1090 0 0 0.00e+00 991994 4.39e+05 0
MatAssemblyBegin 110000 1.0 3.9732e+02109.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+05 4 0 0 0 10 4 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0
MatAssemblyEnd 110000 1.0 5.3015e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
MatZeroEntries 110000 1.0 1.3179e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
MatCUSPARSCopyTo 220000 1.0 3.2805e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 220000 2.41e+05 0 0.00e+00 0
KSPSetUp 110000 1.0 3.5344e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
KSPSolve 110000 1.0 6.8304e+02 1.0 8.20e+11 1.0 7.7e+05 1.3e+04 8.8e+05 13 89 78 78 80 13 89 78 78 80 2401 14311 496001 2.20e+05 991994 4.39e+05 66
KSPGMRESOrthog 386001 1.0 7.2820e+01 1.4 2.10e+11 1.0 0.0e+00 0.0e+00 3.9e+05 1 23 0 0 35 1 23 0 0 35 5765 30176 386001 1.71e+05 0 0.00e+00 100
PCSetUp 110000 1.0 1.8825e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
PCApply 496001 1.0 5.1857e+02 1.1 2.83e+11 1.0 0.0e+00 0.0e+00 0.0e+00 10 31 0 0 0 10 31 0 0 0 1090 0 0 0.00e+00 991994 4.39e+05 0
SFSetGraph 1 1.0 2.0936e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFSetUp 1 1.0 2.5347e-03 1.0 0.00e+00 0.0 4.0e+00 3.3e+03 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFPack 496001 1.0 3.0026e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
SFUnpack 496001 1.0 1.1296e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
---------------------------------------------------------------------------------------------------------------------------------------------------------------
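To put a number on "considerable time", here is a small sketch that compares a few rows of the table above against KSPSolve. The times and counts are copied from the log; the selection of events and the helper name share_of_solve are only illustrative choices, not part of PETSc. Logged events are nested (for example, PCApply includes MatSOR) and the times are maxima over ranks, so the shares are rough estimates and are not expected to sum to 100%.

```python
# Max time in seconds per event, copied from the -log_view table above.
times = {
    "KSPSolve": 6.8304e+02,
    "MatSOR": 5.1821e+02,
    "KSPGMRESOrthog": 7.2820e+01,
    "MatMultAdd": 6.2666e+01,
    "MatMult": 4.8436e+01,
}

def share_of_solve(event, times):
    """Fraction of KSPSolve wall time attributed to one logged event."""
    return times[event] / times["KSPSolve"]

# Share of the solve time for every event except KSPSolve itself.
shares = {name: share_of_solve(name, times)
          for name in times if name != "KSPSolve"}

# Print events from most to least expensive.
for name, frac in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {100 * frac:5.1f}% of KSPSolve time")

# GpuToCpu copies per MatSOR call, from the Count columns of the MatSOR row.
matsor_calls = 496001
matsor_gpu_to_cpu = 991994
copies_per_call = matsor_gpu_to_cpu / matsor_calls
print(f"GpuToCpu copies per MatSOR call: {copies_per_call:.2f}")
```

With these numbers MatSOR accounts for roughly three quarters of the KSPSolve time, and each MatSOR call is accompanied by about two GpuToCpu copies, which is consistent with the GPU %F of 0 reported for that row.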