[petsc-dev] PETSc issue I cannot post combine WaitForCUDA(); inside PetscLogGpuTimeEnd();

Fri Aug 28 10:26:23 CDT 2020


> On Aug 28, 2020, at 5:18 PM, Barry Smith <bsmith at petsc.dev> wrote:
> 
> 
> 
>> On Aug 28, 2020, at 5:35 AM, Karl Rupp <rupp at iue.tuwien.ac.at> wrote:
>> 
>> Hi,
>> 
>>>  Since we cannot post issues (reported here https://forum.gitlab.com/t/creating-new-issue-gives-cannot-create-issue-getting-whoops-something-went-wrong-on-our-end/41966?u=bsmith) here is my issue so I don't forget it.
>>>  I think
>>> err  = WaitForCUDA();CHKERRCUDA(err);
>>> ierr = PetscLogGpuTimeEnd();CHKERRQ(ierr);
>>> should be changed so that WaitForCUDA() (actually WaitForDevice()) is called inside PetscLogGpuTimeEnd().
>>> Currently the WaitForCUDA() is missing in a few places, resulting in bad timings.
>>> Also, some _SeqCUDA() routines don't have the PetscLogGpuTimeEnd() and need to be fixed.
>>> The current model is a maintenance nightmare.
>>> Does anyone see a problem with making this change?
>> 
>> I'm fine with this change, as the maintenance benefits outweigh the performance cost for typical use cases.
>> 
>> I propose to also add WaitForDevice() to PetscLogGpuTimeBegin(). This will ensure that no previous GPU kernel executions spill over into the timed section.
> 
>  Might this incur extra overhead from checking the device? Or will it always be true that, if there are no outstanding kernels, the check returns immediately without going to the GPU?

If we want a two-barrier model, I propose we log the time spent waiting at the first barrier separately.
> 
> Barry
> 
>> 
>> Best regards,
>> Karli
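
To make the two-barrier idea concrete, here is a minimal sketch in C. It is not PETSc code: every name in it (sim_launch, sim_wait_for_device, GpuTimer, gpu_time_begin, gpu_time_end) is hypothetical, and the device is replaced by a simulated clock so the accounting can be checked without a GPU. In a real build the wait would presumably be WaitForDevice(), and the two accumulators would feed whatever the logging layer records.

```c
#include <assert.h>

/* Hypothetical simulation state; none of this is a PETSc or CUDA API. */
static double sim_now     = 0.0; /* simulated wall clock, in seconds */
static double sim_pending = 0.0; /* simulated outstanding GPU work, in seconds */

/* Stand-in for an asynchronous kernel launch: queues work and returns at once. */
void sim_launch(double seconds) { sim_pending += seconds; }

/* Stand-in for WaitForDevice(): advances the clock until queued work has drained. */
void sim_wait_for_device(void)
{
  sim_now    += sim_pending;
  sim_pending = 0.0;
}

typedef struct {
  double gpu_time;   /* time attributed to the timed section */
  double spill_time; /* time spent at the first barrier on earlier kernels */
  double t_start;    /* clock value when the timed section began */
  int    active;     /* nonzero between begin and end */
} GpuTimer;

/* First barrier: drain earlier work and log that wait separately. */
void gpu_time_begin(GpuTimer *t)
{
  double t0 = sim_now;
  assert(!t->active);
  sim_wait_for_device();
  t->spill_time += sim_now - t0;
  t->t_start     = sim_now;
  t->active      = 1;
}

/* Second barrier: the wait is folded into the end call, so callers cannot forget it. */
void gpu_time_end(GpuTimer *t)
{
  assert(t->active);
  sim_wait_for_device();
  t->gpu_time += sim_now - t->t_start;
  t->active    = 0;
}
```

With a zero-initialized GpuTimer, launching an untimed kernel before gpu_time_begin() charges its duration to spill_time, and only work launched between begin and end lands in gpu_time. In this simulation a wait on an empty queue costs nothing; whether a real device synchronization is equally cheap is exactly the open question raised above.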
