[petsc-dev] [petsc-maint] running CUDA on SUMMIT
Jed Brown
jed at jedbrown.org
Wed Aug 14 17:58:09 CDT 2019
"Smith, Barry F." <bsmith at mcs.anl.gov> writes:
>> On Aug 14, 2019, at 2:37 PM, Jed Brown <jed at jedbrown.org> wrote:
>>
>> Mark Adams via petsc-dev <petsc-dev at mcs.anl.gov> writes:
>>
>>> On Wed, Aug 14, 2019 at 2:35 PM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>>>
>>>>
>>>> Mark,
>>>>
>>>> Would you be able to make one run using single precision? Just single
>>>> everywhere since that is all we support currently?
>>>>
>>>>
>>> Experience in engineering, at least, is that single does not work for FE
>>> elasticity. I tried it many years ago and have heard this from others.
>>> This problem is pretty simple other than using Q2. I suppose I could try
>>> it, but just be aware the FE people might say that single sucks.
>>
>> When they say that single sucks, is it for the definition of the
>> operator or the preconditioner?
>>
>> As a point of reference, we can apply Q2 elasticity operators in double
>> precision at nearly a billion dofs/second per GPU.
>
> And in single you get what?

I don't have exact numbers, but <2x faster on V100, and it sort of
doesn't matter because preconditioning cost will dominate. The big win
of single is on consumer-grade GPUs, which DOE doesn't install and
NVIDIA forbids to be used in data centers (because they're so
cost-effective ;-)).

>> I'm skeptical of big wins in preconditioning (especially setup) due to
>> the cost and irregularity of indexing being large compared to the
>> bandwidth cost of the floating point values.
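
A rough back-of-the-envelope sketch of the indexing-versus-values point above. It assumes CSR-like storage with one 32-bit column index per stored nonzero, and it ignores row pointers, vector traffic, and irregular access. None of these assumptions are stated in the thread, and the function name is made up for illustration.

```python
def csr_bytes_per_nonzero(value_bytes, index_bytes=4):
    """Bytes streamed per stored nonzero: one value plus one column index.

    Assumption: 32-bit column indices, so the index cost is the same
    in single and double precision.
    """
    return value_bytes + index_bytes


double_cost = csr_bytes_per_nonzero(8)  # 8-byte value + 4-byte index
single_cost = csr_bytes_per_nonzero(4)  # 4-byte value + 4-byte index

# If the kernel is purely bandwidth-bound, the byte ratio bounds the speedup.
speedup_bound = double_cost / single_cost

print(f"double: {double_cost} B/nnz, single: {single_cost} B/nnz, "
      f"bandwidth-bound speedup <= {speedup_bound:.2f}x")
```

Under those assumptions, halving the value size cuts the streamed bytes by only a third, so the best case for a bandwidth-bound sparse kernel is about 1.5x rather than 2x, before accounting for the irregularity of the indexing itself.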