[petsc-users] 32-bit vs 64-bit GPU support

Fri Aug 11 14:38:45 CDT 2023

Rohan Yadav <rohany at alumni.cmu.edu> writes:

> With modern GPU sizes, for example A100's with 80GB of memory, a vector of
> length 2^31 is not that much memory -- one could conceivably run a CG solve
> with local vectors > 2^31.

Yeah, each vector would be 8 GB (single precision) or 16 GB (double). You can't store a matrix of this size, and probably not a "mesh", but it's possible to create such a problem if everything is matrix-free (possibly with matrix-free geometric multigrid). This is more likely to show up in a benchmark than any real science or engineering probelm. We should support it, but it still seems hypothetical and not urgent.

> Thanks Junchao, I might look into that. However, I currently am not trying
> to solve such a large problem -- these questions just came from wondering
> why the cuSPARSE kernel PETSc was calling was running faster than mine.

Hah, bandwidth doesn't like. ;-)