 (Type)   Total Time, Call Count, Avg. Time per Call, %Total Time in Kernels, %Total Program Time
-------------------------------------------------------------------------

Regions:

- KokkosSparse_sptrsv[lower]
 (REGION)   21.695720 6419 0.003380 47.891934 40.962555
- KokkosSparse_sptrsv[upper]
 (REGION)   21.634439 6419 0.003370 47.756660 40.846853
- KokkosBlas::axpby[ETI]
 (REGION)   0.502906 25676 0.000020 1.110133 0.949510
- KokkosBlas::nrm1[TPL_ROCBLAS,double]
 (REGION)   0.318630 7219 0.000044 0.703355 0.601588
- KokkosSparse::spmv[NATIVE,double]
 (REGION)   0.318328 8019 0.000040 0.702689 0.601018
- KokkosBlas::fill
 (REGION)   0.117830 404 0.000292 0.260103 0.222469
- KokkosBlas::sum[ETI]
 (REGION)   0.109400 400 0.000274 0.241494 0.206553
- KokkosBlas::scal[TPL_ROCBLAS,double]
 (REGION)   0.007711 8019 0.000001 0.017021 0.014559

-------------------------------------------------------------------------
Kernels:

- parfor_l_team
 (ParFor)   20.907269 950012 0.000022 46.151477 39.473921
- parfor_u_team
 (ParFor)   20.859433 950012 0.000022 46.045882 39.383604
- parfor_tp1
 (ParFor)   2.024781 59200 0.000034 4.469575 3.822882
- KokkosSparse::spmv
 (ParFor)   0.305497 8019 0.000038 0.674365 0.576792
- KokkosBlas::Axpby::S11
 (ParFor)   0.175647 12038 0.000015 0.387730 0.331630
- KokkosBlas::Axpby::S14
 (ParFor)   0.173657 12838 0.000014 0.383337 0.327873
- VecMDot2
 (ParRed)   0.171695 6019 0.000029 0.379005 0.324167
- VecTDot
 (ParRed)   0.137500 1200 0.000115 0.303522 0.259606
- Kokkos::ViewFill-1D
 (ParFor)   0.127742 1204 0.000106 0.281983 0.241183
- KokkosBlas::Sum::S0
 (ParRed)   0.107688 400 0.000269 0.237714 0.203320
- MatSetValuesCOO_MPIAIJKokkos(_p_Mat*, double const*, InsertMode)::{lambda(long)#1}
 (ParFor)   0.105651 400 0.000264 0.233219 0.199475
- KokkosBlas::Axpby::S6
 (ParFor)   0.105045 400 0.000263 0.231880 0.198330
- sort_crs_graph
 (ParFor)   0.032297 4 0.008074 0.071293 0.060978
- Kokkos::View::initialization [_mirror] via memset
 (ParFor)   0.022489 1608 0.000014 0.049642 0.042460
- MatSetValuesCOO_MPIAIJKokkos(_p_Mat*, double const*, InsertMode)::{lambda(long)#2}
 (ParFor)   0.016265 400 0.000041 0.035903 0.030708
- KokkosBlas::Axpby::S9
 (ParFor)   0.006131 400 0.000015 0.013535 0.011576
- Kokkos::View::initialization [level_list_mirror] via memset
 (ParFor)   0.005256 800 0.000007 0.011602 0.009923
- Kokkos::View::initialization [host nodes_grouped_by_level] via memset
 (ParFor)   0.005173 800 0.000006 0.011418 0.009766
- Kokkos::View::initialization [host nodes_per_level] via memset
 (ParFor)   0.004910 800 0.000006 0.010840 0.009271
- Kokkos::View::initialization [lp] via memset
 (ParFor)   0.004853 800 0.000006 0.010713 0.009163
- Kokkos::View::initialization [DualView::modified_flags] via memset
 (ParFor)   0.001331 2044 0.000001 0.002938 0.002513
- Kokkos::View::initialization [h_lev] via memset
 (ParFor)   0.000287 2 0.000143 0.000633 0.000542
- Kokkos::View::initialization [] via memset
 (ParFor)   0.000242 16 0.000015 0.000533 0.000456
- MatSetValuesCOO_MPIAIJKokkos(_p_Mat*, double const*, InsertMode)::{lambda(long)#3}
 (ParFor)   0.000238 400 0.000001 0.000525 0.000449
- Kokkos::ViewCopy-1D
 (ParFor)   0.000082 4 0.000021 0.000182 0.000155
- Kokkos::View::initialization [h_llev] via memset
 (ParFor)   0.000041 2 0.000021 0.000091 0.000078
- Kokkos::View::initialization [h_iw] via memset
 (ParFor)   0.000040 2 0.000020 0.000088 0.000076
- Kokkos::View::initialization [h_iL] via memset
 (ParFor)   0.000040 2 0.000020 0.000088 0.000076
- Kokkos::View::initialization [workVector] via memset
 (ParFor)   0.000040 2 0.000020 0.000088 0.000076
- Kokkos::View::initialization [level_ptr] via memset
 (ParFor)   0.000028 2 0.000014 0.000062 0.000053
- Kokkos::View::initialization [level_idx] via memset
 (ParFor)   0.000025 2 0.000012 0.000055 0.000047
- Kokkos::View::initialization [level_list] via memset
 (ParFor)   0.000023 2 0.000011 0.000051 0.000043
- Kokkos::View::initialization [hlevel_ptr] via memset
 (ParFor)   0.000013 2 0.000007 0.000029 0.000025
- Kokkos::View::initialization [level_nchunks] via memset
 (ParFor)   0.000001 2 0.000000 0.000002 0.000002
- Kokkos::View::initialization [level_nrowsperchunk] via memset
 (ParFor)   0.000000 2 0.000000 0.000000 0.000000

-------------------------------------------------------------------------
Summary:

Total Execution Time (incl. Kokkos + non-Kokkos):      52.96476 seconds
Total Time in Kokkos kernels:                          45.30141 seconds
   -> Time outside Kokkos kernels:                      7.66336 seconds
   -> Percentage in Kokkos kernels:                       85.53 %
Total Calls to Kokkos Kernels:                          2009840

-------------------------------------------------------------------------