[petsc-dev] Kernel fusion for vector operations

Karl Rupp rupp at mcs.anl.gov
Fri May 31 08:08:05 CDT 2013


Hi Jed,

>> I went through the iterative solvers today, looking for places where we
>> can "save" memory bandwidth. For example, the two operations
>>     v <- v - alpha * u
>>     v <- v - beta  * w
>> are currently performed using two BLAS-like VecAXPY() calls, requiring v
>> to be read and written twice. Ideally, this can be fused into a single
>> operation
>>     v <- v - alpha * u - beta * w
>> with less pressure on the memory link.
>
>    VecAXPBYPCZ(v,-alpha,-beta,1.0,u,w);

Yes, it exists now, but the older solver implementations have not been
updated to use it ;-)
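
To illustrate the bandwidth argument, here is a minimal sketch in plain C,
assuming real double-precision data in contiguous arrays. The names
axpy_twice() and axpy_fused() are hypothetical and not part of PETSc; they
only mimic what two VecAXPY() calls and a single fused call do to memory
traffic. With VecAXPBYPCZ(z,alpha,beta,gamma,x,y) computing
z = alpha*x + beta*y + gamma*z, the call quoted above corresponds to the
fused variant.

```c
#include <assert.h>
#include <stddef.h>

/* Unfused: two AXPY-style passes, so v is read and written twice. */
void axpy_twice(size_t n, double *v, double alpha, const double *u,
                double beta, const double *w)
{
    assert(v && u && w);
    for (size_t i = 0; i < n; ++i) v[i] -= alpha * u[i];
    for (size_t i = 0; i < n; ++i) v[i] -= beta * w[i];
}

/* Fused: a single pass, so v is read and written only once. */
void axpy_fused(size_t n, double *v, double alpha, const double *u,
                double beta, const double *w)
{
    assert(v && u && w);
    for (size_t i = 0; i < n; ++i) v[i] -= alpha * u[i] + beta * w[i];
}
```

Counting vector traversals, the unfused version moves six vectors (four
reads, two writes) and the fused one moves four (three reads, one write).
The two variants can differ in the last bits because the floating-point
operations are grouped differently.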


>> The following operations occur in several solvers, i.e. we might want to
>> consider adding a dedicated MatXXX/VecYYY-routine:
>>
>>    - Matrix-Vector-product followed by an inner product:
>>       v     <- A x
>>       alpha <- (v, w) or (w, v)
>>      where w can either be x or a different vector.
>>
>>    - Application of a preconditioner followed by an inner product:
>>       z     <- M^-1 r
>>       alpha <- (r, z)
>
> There is also the combination
>
>    D^{-1} A x
>
> where D is either the diagonal (Jacobi) or point-block diagonal.  This
> is important in GAMG, which uses polynomial smoothers.

Good point. It certainly makes sense for the Jacobi case. The benefit
will diminish with increasing block size, though.
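
For the record, a rough sketch of what two of these fused kernels could
look like for a CSR matrix with real scalars. Everything here is
hypothetical: csr_matvec_dot() and csr_jacobi_matvec() are not existing
PETSc routines, the CSR arrays (rowptr, col, val) are plain C arrays, and
inv_diag is assumed to hold the precomputed reciprocals 1/A(i,i).

```c
#include <assert.h>
#include <stddef.h>

/* v <- A x and return (v, w), both in one sweep over the rows of A.
   w may alias x; v must not alias x. */
double csr_matvec_dot(size_t nrows, const size_t *rowptr, const size_t *col,
                      const double *val, const double *x, const double *w,
                      double *v)
{
    double dot = 0.0;
    assert(rowptr && x && w && v);
    for (size_t i = 0; i < nrows; ++i) {
        double sum = 0.0;
        for (size_t k = rowptr[i]; k < rowptr[i + 1]; ++k)
            sum += val[k] * x[col[k]];
        v[i] = sum;
        dot += sum * w[i]; /* row result is reused here, v is not re-read */
    }
    return dot;
}

/* y <- D^{-1} A x with D = diag(A), scaling each row result before the
   store. y must not alias x. */
void csr_jacobi_matvec(size_t nrows, const size_t *rowptr, const size_t *col,
                       const double *val, const double *inv_diag,
                       const double *x, double *y)
{
    assert(rowptr && inv_diag && x && y);
    for (size_t i = 0; i < nrows; ++i) {
        double sum = 0.0;
        for (size_t k = rowptr[i]; k < rowptr[i + 1]; ++k)
            sum += val[k] * x[col[k]];
        y[i] = inv_diag[i] * sum;
    }
}
```

In both cases the fusion saves one extra read of the result vector (plus
one write in the Jacobi case). For complex scalars, (v, w) and (w, v)
differ by conjugation, so the inner product would need two variants.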


>> Other operations are rather custom to the respective solver. FBCGSR
>> optimizes for data reuse already by computing the vector operations
>> directly inside the solver without using VecXYZ(), which makes it fairly
>> slow when using GPUs.
>
> Yeah, those methods have really custom operations that would involve a
> lot more traversals otherwise.  It can be abstracted to allow a GPU
> implementation, but we'd want to move the whole custom operation.

Yes, there are not that many custom operations where it really pays off,
so dealing with them 'manually' is probably a better choice than an
automatic machinery that introduces other problems elsewhere - I hope
Barry won't shoot me for this statement ;-)

Best regards,
Karli



