[petsc-dev] Kokkos/Crusher performance

Jed Brown jed at jedbrown.org
Sun Jan 23 22:47:43 CST 2022


Barry Smith via petsc-dev <petsc-dev at mcs.anl.gov> writes:

>   PetscLogGpuTimeBegin()/End() was written by Hong so that it works with events to get a GPU timing. It is not supposed to include the CPU kernel launch times or the time to move the scalar arguments to the GPU. It may not be perfect, but it is the best we can do to capture the time the GPU is actively doing the numerics, which is what we want.

As we discussed at the time, collecting the results can be asynchronous and this would be useful to reduce the negative impact of profiling on end-to-end performance.

But I think what's proposed here is okay because PetscLogGpuTimeBegin() starts counting when the device reaches that point in the stream, not when the call is issued on the host.
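
To make the distinction concrete, here is a toy model in Python (not PETSc code). It assumes a single in-order device stream, a fixed host-side launch cost per kernel, and zero-cost timing marks. The names `simulate`, `stream`, and `launch_cost` are made up for illustration. Each mark records both the host time at which it was issued and the device time at which the stream reached it.

```python
def simulate(stream, launch_cost):
    """Toy model of one in-order GPU stream fed asynchronously by the host.

    stream: list of ("kernel", duration) or ("mark", name) items.
    Returns {name: (host_time_issued, device_time_reached)} for each mark.
    """
    host_t = 0.0    # host clock: advances as kernels are enqueued
    device_t = 0.0  # time at which the device finishes everything queued so far
    marks = {}
    for kind, value in stream:
        if kind == "kernel":
            host_t += launch_cost                     # host pays the launch cost, then moves on
            device_t = max(device_t, host_t) + value  # device runs it after all prior work
        else:
            # The device reaches a mark only after all earlier work has completed.
            marks[value] = (host_t, max(device_t, host_t))
    return marks


stream = [
    ("kernel", 10.0),   # earlier, unrelated work still occupying the device
    ("mark", "begin"),
    ("kernel", 2.0),
    ("kernel", 2.0),
    ("mark", "end"),
]
marks = simulate(stream, launch_cost=1.0)
host_interval = marks["end"][0] - marks["begin"][0]    # 2.0: launch costs only
device_interval = marks["end"][1] - marks["begin"][1]  # 4.0: the two timed kernels
```

In this model the host-side stamps see only the two launches (2.0), while the device-side stamps see the 4.0 units of work between the marks. The unrelated 10.0 kernel queued earlier is excluded from both, because the "begin" mark is not reached on the device until that kernel has finished.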

