[petsc-dev] Kokkos/Crusher performance

Barry Smith bsmith at petsc.dev
Tue Jan 25 11:40:38 CST 2022



> On Jan 25, 2022, at 12:25 PM, Jed Brown <jed at jedbrown.org> wrote:
> 
> Barry Smith <bsmith at petsc.dev> writes:
> 
>>> On Jan 25, 2022, at 11:55 AM, Jed Brown <jed at jedbrown.org> wrote:
>>> 
>>> Barry Smith <bsmith at petsc.dev> writes:
>>> 
>>>> Thanks Mark, far more interesting. I've improved the formatting to make it easier to read (and fixed width font for email reading)
>>>> 
>>>> * Can you do same run with say 10 iterations of Jacobi PC?
>>>> 
>>>> * PCApply performance (looks like GAMG) is terrible! Problems too small?
>>> 
>>> This is -pc_type jacobi.
>> 
>>  Dang, how come it doesn't warn about all the gamg arguments passed to the program? I saw them and jumped to the wrong conclusion.
> 
> We don't have -options_left by default. Mark has a big .petscrc or PETSC_OPTIONS.
> 
>>  How come PCApply is so low while Pointwise mult (which should be all of PCApply) is high?
> 
> I also think that's weird.
> 
>>> 
>>>> * VecScatter time is completely dominated by SFPack! Junchao what's up with that? Lots of little kernels in the PCApply? PCJACOBI run will help clarify where that is coming from.
>>> 
>>> It's all in MatMult.
>>> 
>>> I'd like to see a run that doesn't wait for the GPU.
>> 
>>  Indeed
> 
> What is the command line option to turn PetscLogGpuTimeBegin/PetscLogGpuTimeEnd into a no-op even when -log_view is on? I know it'll mess up attribution, but it'll still tell us how long the solve took.

  We don't have an API for this yet. It is slightly tricky because turning it off will also break the regular -log_view timings for operations such as VecAXPY(), that is, anything that does not otherwise need to synchronize with the CPU.

  Because of this, I think Mark should just put a PetscTime() call before and after KSPSolve(), run without -log_view, and compare that number to the one from -log_view to see how much overhead the synchronization in PetscLogGpuTimeBegin/PetscLogGpuTimeEnd is causing. Ad hoc, yes, but a quick and easy way to get the information.
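
  A minimal sketch of that comparison in plain C, so it compiles without PETSc. All names here (wall_seconds, time_solve, stand_in_solve, logging_overhead_fraction) are hypothetical and only illustrate the pattern; in the real code the two timestamps would come from PetscTime() and the timed call would be KSPSolve().

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Wall-clock seconds from a monotonic clock. Hypothetical stand-in for
   the role PetscTime() would play in the real application. */
static double wall_seconds(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Signature of the operation being timed; returns 0 on success. */
typedef int (*solve_fn)(void *ctx);

/* Take one timestamp before and one after the solve, with no logging or
   extra synchronization in between. Stores the elapsed seconds in
   *elapsed and returns the solve's own error code. */
static int time_solve(solve_fn solve, void *ctx, double *elapsed)
{
  assert(solve != NULL && elapsed != NULL);
  double t0  = wall_seconds();
  int    err = solve(ctx);
  *elapsed   = wall_seconds() - t0;
  return err;
}

/* Stand-in workload (assumption): sums 1/k for k = 1..n, where ctx
   points to n. In the real test this would be the KSPSolve() call. */
static int stand_in_solve(void *ctx)
{
  int             n   = *(int *)ctx;
  volatile double sum = 0.0;
  for (int k = 1; k <= n; k++) sum += 1.0 / (double)k;
  return 0;
}

/* Fraction of the -log_view solve time that disappears when the same
   solve is timed without -log_view. */
static double logging_overhead_fraction(double t_log_view, double t_plain)
{
  assert(t_log_view > 0.0);
  return (t_log_view - t_plain) / t_log_view;
}
```

  Calling time_solve(stand_in_solve, &n, &elapsed) gives the plain wall-clock number; feeding it, together with the KSPSolve time reported by -log_view, to logging_overhead_fraction() gives a rough estimate of how much of the logged time is due to the extra synchronization.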

> 
> Also, can we make WaitForKokkos a no-op? I don't think it's necessary for correctness (docs indicate Kokkos::fence synchronizes).


