[petsc-users] Cannot eagerly initialize cuda, as doing so results in cuda error 35 (cudaErrorInsufficientDriver) : CUDA driver version is insufficient for CUDA runtime version

Jed Brown jed at jedbrown.org
Thu Jan 20 23:49:35 CST 2022


Junchao Zhang <junchao.zhang at gmail.com> writes:

> I don't see values using PetscUnlikely() today.

It's usually premature optimization, and PetscUnlikelyDebug makes it too easy to skip important checks. But at the time I added PetscUnlikely, it was important for CHKERRQ(ierr). Specifically, without PetscUnlikely, many compilers (even at high optimization) would put the error-handling code (a few instructions for every source line) inline with the error-free path, using forward jumps to bypass it. Most CPUs predict that backward jumps are taken and forward jumps are not, so this hurts both branch prediction and instruction cache locality.

With PetscUnlikely, the error-handling code is reliably deposited at the end of the function, where it usually won't have to enter the instruction cache and the branch to it is never predicted taken.
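
As a rough illustration of the pattern described above, here is a minimal sketch in C. The names MY_UNLIKELY, MY_CHKERR, step, and run_steps are hypothetical stand-ins invented for this example, not PETSc's actual definitions. The sketch assumes a GCC- or Clang-compatible compiler that provides __builtin_expect, and falls back to a plain condition elsewhere.

```c
#include <stdio.h>

/* Hypothetical stand-in for a branch hint macro (not PETSc's definition).
   On GCC/Clang, __builtin_expect tells the compiler the condition is
   expected to be false, so the guarded block can be moved off the hot path. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#  define MY_UNLIKELY(cond) (cond)
#endif

/* Hypothetical error check in the spirit of CHKERRQ(ierr): report and
   return early on a nonzero error code. */
#define MY_CHKERR(ierr) \
  do { \
    if (MY_UNLIKELY((ierr) != 0)) { \
      fprintf(stderr, "error %d at %s:%d\n", (ierr), __FILE__, __LINE__); \
      return (ierr); \
    } \
  } while (0)

/* Toy operation: returns 1 (an error code) when asked to fail, else 0. */
static int step(int fail) { return fail ? 1 : 0; }

/* One check per source line, as in typical error-checked code.
   fail_at selects which step (0, 1, or 2) fails; other values succeed. */
int run_steps(int fail_at)
{
  int ierr;
  ierr = step(fail_at == 0); MY_CHKERR(ierr);
  ierr = step(fail_at == 1); MY_CHKERR(ierr);
  ierr = step(fail_at == 2); MY_CHKERR(ierr);
  return 0;
}
```

Whether the error blocks actually end up at the end of the function depends on the compiler and optimization level; comparing the generated assembly (for example with -O2 -S) with and without the hint is one way to check.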

