[petsc-dev] [petsc-maint] running CUDA on SUMMIT

Jed Brown jed at jedbrown.org
Wed Aug 14 15:54:08 CDT 2019


Mark Adams <mfadams at lbl.gov> writes:

> On Wed, Aug 14, 2019 at 3:37 PM Jed Brown <jed at jedbrown.org> wrote:
>
>> Mark Adams via petsc-dev <petsc-dev at mcs.anl.gov> writes:
>>
>> > On Wed, Aug 14, 2019 at 2:35 PM Smith, Barry F. <bsmith at mcs.anl.gov>
>> wrote:
>> >
>> >>
>> >>   Mark,
>> >>
>> >>    Would you be able to make one run using single precision? Just single
>> >> everywhere since that is all we support currently?
>> >>
>> >>
>> > Experience in engineering, at least, is that single precision does not
>> > work for FE elasticity. I tried it many years ago and have heard the
>> > same from others. This problem is pretty simple other than using Q2. I
>> > suppose I could try it, but just be aware that the FE people might say
>> > single sucks.
>>
>> When they say that single sucks, is it for the definition of the
>> operator or the preconditioner?
>>
>
> Operator.
>
> And I've seen GMRES stagnate when using single precision in communication
> in parallel Gauss-Seidel. Roundoff is nonlinear.

Fair; single may still be useful in the preconditioner while using
double for operator and Krylov.
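The split suggested here (single precision inside the preconditioner, double for the operator and the Krylov method) is the classic mixed-precision pattern. A minimal NumPy sketch of the idea via iterative refinement, where the float32 inner solve is a stand-in for a single-precision preconditioner — all names are illustrative, not PETSc API:

```python
import numpy as np

def refine_with_single_inner_solve(A, b, tol=1e-12, max_iter=50):
    """Iterative refinement: the inner solve runs in float32 (a stand-in
    for a single-precision preconditioner), while residuals and the
    solution are accumulated in float64."""
    A32 = A.astype(np.float32)
    x = np.zeros_like(b)
    bnorm = np.linalg.norm(b)
    for _ in range(max_iter):
        r = b - A @ x                      # residual computed in double
        if np.linalg.norm(r) < tol * bnorm:
            break
        dx = np.linalg.solve(A32, r.astype(np.float32))
        x += dx.astype(np.float64)         # correction applied in double
    return x

# An SPD test system with condition number ~1e4: a pure float32 solve
# loses several digits, but refinement with the same float32 inner
# solve recovers close to double-precision accuracy.
rng = np.random.default_rng(0)
n = 200
Q = np.linalg.qr(rng.standard_normal((n, n)))[0]
A = Q @ np.diag(np.logspace(0, 4, n)) @ Q.T
x_true = rng.standard_normal(n)
b = A @ x_true

x32 = np.linalg.solve(A.astype(np.float32),
                      b.astype(np.float32)).astype(np.float64)
x_mp = refine_with_single_inner_solve(A, b)

print("pure float32 error:", np.linalg.norm(x32 - x_true) / np.linalg.norm(x_true))
print("mixed precision error:", np.linalg.norm(x_mp - x_true) / np.linalg.norm(x_true))
```

The point of the sketch: the single-precision solve alone is limited to roughly cond(A) times single-precision roundoff, while wrapping it in a double-precision residual/update loop recovers accuracy close to double, at the memory-bandwidth cost of float32 for the expensive inner work.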

Do you have any applications that specifically want Q2 (versus Q1)
elasticity or have some test problems that would benefit?

>> As point of reference, we can apply Q2 elasticity operators in double
>> precision at nearly a billion dofs/second per GPU.
>
>
>> I'm skeptical of big wins in preconditioning (especially setup) due to
>> the cost and irregularity of indexing being large compared to the
>> bandwidth cost of the floating point values.
>>

