[petsc-dev] GPU performance of MatSOR()

Han Tran hantran at cs.utah.edu
Wed Jul 27 18:47:48 CDT 2022


Hello,

I ran my example with VECMPICUDA for VecSetType() and MATMPIAIJCUSP for MatSetType(); the profiling results are shown below. MatSOR() shows a GPU %F of 0, and its only GPU-related entries are the GpuToCpu count and size. Is it correct that PETSc currently does not have MatSOR() implemented on the GPU? I would appreciate an explanation of how MatSOR() currently uses the GPU. In this example, MatSOR() takes considerable time compared to the other functions.

Thank you.

-Han

------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total   GPU    - CpuToGpu -   - GpuToCpu - GPU
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s Mflop/s Count   Size   Count   Size  %F
---------------------------------------------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSided     220001 1.0 3.9580e+02 139.9 0.00e+00 0.0 2.0e+00 4.0e+00 2.2e+05  4  0  0  0 20   4  0  0  0 20     0       0      0 0.00e+00    0 0.00e+00  0
BuildTwoSidedF    220000 1.0 3.9614e+02 126.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.2e+05  4  0  0  0 20   4  0  0  0 20     0       0      0 0.00e+00    0 0.00e+00  0
VecMDot           386001 1.0 6.3426e+01 1.5 1.05e+11 1.0 0.0e+00 0.0e+00 3.9e+05  1 11  0  0 35   1 11  0  0 35  3311   26012   386001 1.71e+05    0 0.00e+00 100
VecNorm           496001 1.0 5.0877e+01 1.2 5.49e+10 1.0 0.0e+00 0.0e+00 5.0e+05  1  6  0  0 45   1  6  0  0 45  2159    3707   110000 4.87e+04    0 0.00e+00 100
VecScale          496001 1.0 7.9951e+00 1.0 2.75e+10 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0  6869   13321      0 0.00e+00    0 0.00e+00 100
VecCopy           110000 1.0 1.9323e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecSet            330017 1.0 5.4319e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecAXPY           110000 1.0 1.5820e+00 1.0 1.22e+10 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 15399   35566      0 0.00e+00    0 0.00e+00 100
VecMAXPY          496001 1.0 1.1505e+01 1.0 1.48e+11 1.0 0.0e+00 0.0e+00 0.0e+00  0 16  0  0  0   0 16  0  0  0 25665   39638      0 0.00e+00    0 0.00e+00 100
VecAssemblyBegin  110000 1.0 1.2021e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+05  0  0  0  0 10   0  0  0  0 10     0       0      0 0.00e+00    0 0.00e+00  0
VecAssemblyEnd    110000 1.0 1.5988e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecScatterBegin   496001 1.0 1.3002e+01 1.0 0.00e+00 0.0 9.9e+05 1.3e+04 1.0e+00  0  0 100 100  0   0  0 100 100  0     0       0   110000 4.87e+04    0 0.00e+00  0
VecScatterEnd     496001 1.0 1.8988e+01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
VecNormalize      496001 1.0 5.8797e+01 1.1 8.24e+10 1.0 0.0e+00 0.0e+00 5.0e+05  1  9  0  0 45   1  9  0  0 45  2802    4881   110000 4.87e+04    0 0.00e+00 100
VecCUDACopyTo     716001 1.0 3.4483e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0       0   716001 3.17e+05    0 0.00e+00  0
VecCUDACopyFrom  1211994 1.0 5.1752e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0       0      0 0.00e+00 1211994 5.37e+05  0
MatMult           386001 1.0 4.8436e+01 1.0 1.90e+11 1.0 7.7e+05 1.3e+04 0.0e+00  1 21 78 78  0   1 21 78 78  0  7862   16962      0 0.00e+00    0 0.00e+00 100
MatMultAdd        110000 1.0 6.2666e+01 1.1 6.03e+10 1.0 2.2e+05 1.3e+04 1.0e+00  1  7 22 22  0   1  7 22 22  0  1926   16893   440000 3.39e+05    0 0.00e+00 100
MatSOR            496001 1.0 5.1821e+02 1.1 2.83e+11 1.0 0.0e+00 0.0e+00 0.0e+00 10 31  0  0  0  10 31  0  0  0  1090       0      0 0.00e+00 991994 4.39e+05  0
MatAssemblyBegin  110000 1.0 3.9732e+02 109.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+05  4  0  0  0 10   4  0  0  0 10     0       0      0 0.00e+00    0 0.00e+00  0
MatAssemblyEnd    110000 1.0 5.3015e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
MatZeroEntries    110000 1.0 1.3179e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
MatCUSPARSCopyTo  220000 1.0 3.2805e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0       0   220000 2.41e+05    0 0.00e+00  0
KSPSetUp          110000 1.0 3.5344e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
KSPSolve          110000 1.0 6.8304e+02 1.0 8.20e+11 1.0 7.7e+05 1.3e+04 8.8e+05 13 89 78 78 80  13 89 78 78 80  2401   14311   496001 2.20e+05 991994 4.39e+05 66
KSPGMRESOrthog    386001 1.0 7.2820e+01 1.4 2.10e+11 1.0 0.0e+00 0.0e+00 3.9e+05  1 23  0  0 35   1 23  0  0 35  5765   30176   386001 1.71e+05    0 0.00e+00 100
PCSetUp           110000 1.0 1.8825e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
PCApply           496001 1.0 5.1857e+02 1.1 2.83e+11 1.0 0.0e+00 0.0e+00 0.0e+00 10 31  0  0  0  10 31  0  0  0  1090       0      0 0.00e+00 991994 4.39e+05  0
SFSetGraph             1 1.0 2.0936e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFSetUp                1 1.0 2.5347e-03 1.0 0.00e+00 0.0 4.0e+00 3.3e+03 1.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFPack            496001 1.0 3.0026e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
SFUnpack          496001 1.0 1.1296e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
---------------------------------------------------------------------------------------------------------------------------------------------------------------
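
To put the MatSOR() cost in perspective, here is a minimal Python sketch that works only from numbers copied out of the table above (the Max time column and the GpuToCpu count). The dictionary and the helper name share_of_solve are illustrative and not part of PETSc. It reports each event's time as a fraction of KSPSolve, plus the average number of GPU-to-CPU copies per MatSOR() call. The times are inclusive, so the shares are only a rough guide and need not sum to 100%.

```python
# Max time in seconds, copied from the -log_view table above.
times = {
    "KSPSolve": 6.8304e+02,
    "MatSOR": 5.1821e+02,
    "KSPGMRESOrthog": 7.2820e+01,
    "MatMultAdd": 6.2666e+01,
    "MatMult": 4.8436e+01,
}

# MatSOR call count and GpuToCpu copy count, also from the table.
matsor_calls = 496001
matsor_gpu_to_cpu_copies = 991994


def share_of_solve(event, table=times):
    """Return the event's time as a fraction of the KSPSolve time."""
    return table[event] / table["KSPSolve"]


if __name__ == "__main__":
    for name in times:
        if name != "KSPSolve":
            print(f"{name:15s} {100 * share_of_solve(name):5.1f}% of KSPSolve time")
    copies_per_call = matsor_gpu_to_cpu_copies / matsor_calls
    print(f"GpuToCpu copies per MatSOR call: {copies_per_call:.2f}")
```

With these numbers MatSOR() accounts for roughly three quarters of the KSPSolve time, with about two GPU-to-CPU copies per call. That would be consistent with the vectors being copied back to the host for each sweep, though the log alone does not confirm where the sweep runs.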