[petsc-dev] Kokkos/Crusher performance

Jed Brown jed at jedbrown.org
Tue Jan 25 11:25:48 CST 2022


Barry Smith <bsmith at petsc.dev> writes:

>> On Jan 25, 2022, at 11:55 AM, Jed Brown <jed at jedbrown.org> wrote:
>> 
>> Barry Smith <bsmith at petsc.dev> writes:
>> 
>>>  Thanks, Mark, this is far more interesting. I've improved the formatting to make it easier to read (use a fixed-width font when reading this in email)
>>> 
>>>  * Can you do the same run with, say, 10 iterations of a Jacobi PC?
>>> 
>>>  * PCApply performance (looks like GAMG) is terrible! Problems too small?
>> 
>> This is -pc_type jacobi.
>
>   Dang, how come it doesn't warn about all the gamg arguments passed to the program? I saw them and jumped to the wrong conclusion.

We don't have -options_left by default. Mark has a big .petscrc or PETSC_OPTIONS.
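
For anyone following along, -options_left makes PETSc report unused options at exit, roughly like this (output paraphrased from memory, so the exact wording may vary by version; ./app and the GAMG option are just placeholders):

    $ ./app -pc_type jacobi -pc_gamg_threshold 0.01 -options_left
    ...
    WARNING! There are options you set that were not used!
    Option left: name:-pc_gamg_threshold value: 0.01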

>   How come the PCApply time is so low while VecPointwiseMult (which should account for all of PCApply) is high?

I also think that's weird.

>> 
>>>  * VecScatter time is completely dominated by SFPack! Junchao, what's up with that? Lots of little kernels in PCApply? A PCJACOBI run will help clarify where that is coming from.
>> 
>> It's all in MatMult.
>> 
>> I'd like to see a run that doesn't wait for the GPU.
>
>   Indeed

What is the command-line option to turn PetscLogGpuTimeBegin/PetscLogGpuTimeEnd into no-ops even when -log_view is on? I know it'll mess up per-event attribution, but it'll still tell us how long the solve took.
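
For context, here's roughly the pattern in question (a minimal sketch; MyMatMult_Device is hypothetical, but PetscLogGpuTimeBegin/End are the real logging calls):

    #include <petscmat.h>

    static PetscErrorCode MyMatMult_Device(Mat A, Vec x, Vec y)
    {
      PetscErrorCode ierr;

      PetscFunctionBegin;
      ierr = PetscLogGpuTimeBegin();CHKERRQ(ierr);
      /* ... launch the device kernel asynchronously ... */
      ierr = PetscLogGpuTimeEnd();CHKERRQ(ierr); /* waits on the GPU so the active log event gets an accurate time */
      PetscFunctionReturn(0);
    }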

Also, can we make WaitForKokkos a no-op? I don't think it's necessary for correctness (the docs indicate Kokkos::fence() synchronizes).
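
A minimal illustration of the fence semantics (plain Kokkos, not PETSc code; Kokkos::fence() blocks the host until all outstanding device work completes):

    #include <Kokkos_Core.hpp>

    int main(int argc, char **argv)
    {
      Kokkos::initialize(argc, argv);
      {
        Kokkos::View<double *> x("x", 1000);
        // parallel_for may return before the kernel finishes on the device
        Kokkos::parallel_for("fill", x.extent(0), KOKKOS_LAMBDA(const int i) { x(i) = 2.0 * i; });
        // the host now waits for the device; a separate wait after this should be redundant
        Kokkos::fence();
      }
      Kokkos::finalize();
      return 0;
    }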

