[petsc-dev] Supporting OpenCL matrix assembly
Karl Rupp
rupp at mcs.anl.gov
Mon Sep 23 15:49:17 CDT 2013
Hi Jed,
> We have some motivated users that would like a way to assemble matrices
> on a device, without needing to store all the element matrices to global
> memory or to transfer them to the CPU. Given GPU execution models, this
> means we need something that can be done on-the-spot in kernels. So
> what about a function that can be called by device threads?
>
> PetscErrorCode MatOpenCLGetSetValuesSource(Mat, synchronization_mechanism, char **);
>
> The user concatenates this type-specialized code into their source and
> calls MatSetValues(). The users I'm talking to here synchronize by
> coordinating threads using coloring of a sort. The user still needs to
> call MatAssemblyBegin/End from outside a kernel, though that function
> may or may not need to invoke its own kernel.
>
> Crazy?
a)
I think this calls for a second look at how we manage the raw OpenCL
buffers. My suggestion last year was to 'wrap' pointers to raw
memory buffers in something like
    struct generic_ptr {
      void * cpu_ptr;
      void * cuda_ptr;
      cl_mem opencl_ptr;
    };
underneath the 'special pointer' for Vec and Mat, but we then decided on
using a library-specific dispatch, i.e. spptr points to whatever a
library needs. For MatOpenCLGetSetValuesSource() we would have to be
very careful in the way the buffers are passed to the kernel, as
different OpenCL backends may expect slightly different semantics.
Currently we only have ViennaCL for that purpose, but even though it is
'my own' library, there is no point in being restrictive here.
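To make the contrast between the two designs concrete, here is a minimal sketch in plain C. Everything except generic_ptr is hypothetical and only for illustration: mat_like is a stand-in and not the PETSc Mat, viennacl_like_data is not ViennaCL's actual layout, and the cl_mem typedef merely mirrors the opaque handle from CL/cl.h so that the sketch compiles without OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the opaque buffer handle declared in CL/cl.h (assumption:
   used here only so the sketch builds without an OpenCL installation). */
typedef struct _cl_mem *cl_mem;

/* Design 1: one wrapper that knows about every memory space. */
struct generic_ptr {
    void  *cpu_ptr;
    void  *cuda_ptr;
    cl_mem opencl_ptr;
};

/* Design 2: library-specific dispatch. The object carries an opaque
   pointer (in the spirit of spptr) plus a tag naming the owning library. */
enum backend_lib { LIB_NONE, LIB_VIENNACL_LIKE };   /* hypothetical tags */

struct viennacl_like_data {   /* hypothetical per-library payload */
    cl_mem values;
};

struct mat_like {             /* hypothetical stand-in, not the PETSc Mat */
    enum backend_lib lib;
    void            *spptr;
};

/* Return the OpenCL buffer if the owning library exposes one, else NULL.
   Each additional OpenCL backend would need its own branch here, which is
   where differing buffer semantics would have to be handled. */
static cl_mem mat_like_get_cl_buffer(const struct mat_like *A)
{
    assert(A != NULL);
    if (A->lib == LIB_VIENNACL_LIKE && A->spptr != NULL) {
        return ((const struct viennacl_like_data *)A->spptr)->values;
    }
    return NULL;
}
```

With the first design a kernel launcher can read opencl_ptr directly; with the second it has to go through a per-library accessor such as the one above before it can pass a buffer to a kernel.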
b)
Other than that, I'm not sure whether I understand the semantics of the
proposed function correctly. For the code returned by
MatOpenCLGetSetValuesSource() to be callable by device threads, it has
to be embedded entirely in the OpenCL sources, which means that it has
no knowledge of any of the PETSc types. If, on the other hand, this is
supposed to be a PETSc function, then I don't know what
'synchronization_mechanism' is supposed to do. In addition, the OpenCL
context and command queue should be passed as parameters to
MatOpenCLGetSetValuesSource().
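One possible reading of the proposal, sketched in plain C under stated assumptions: the host-side function only generates a string of OpenCL C, specialized for the scalar type, and the user prepends it to their own kernel source. The names make_setvalues_source, concat_sources and mat_add_value_csr are hypothetical and not part of PETSc. The CSR layout and the "no atomics, the caller's coloring keeps writers apart" choice are assumptions as well. Because the device code cannot see PETSc types, it receives raw buffers as arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generator: returns malloc'ed OpenCL C source for a device
   function that adds one value to an existing entry of a CSR matrix.
   scalar_type is pasted in verbatim, e.g. "float". Caller frees the result. */
static char *make_setvalues_source(const char *scalar_type)
{
    static const char fmt[] =
        "void mat_add_value_csr(__global const int *row_ptr,\n"
        "                       __global const int *col_idx,\n"
        "                       __global %s *vals,\n"
        "                       int row, int col, %s v)\n"
        "{\n"
        "  /* No atomics: assumes the caller's coloring keeps writers apart. */\n"
        "  for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k)\n"
        "    if (col_idx[k] == col) { vals[k] += v; return; }\n"
        "}\n";
    assert(scalar_type != NULL);
    int n = snprintf(NULL, 0, fmt, scalar_type, scalar_type);
    if (n < 0) return NULL;
    char *src = malloc((size_t)n + 1);
    if (src == NULL) return NULL;
    snprintf(src, (size_t)n + 1, fmt, scalar_type, scalar_type);
    return src;
}

/* Place the generated device function ahead of the user's kernel source so
   that the kernel can call it. Caller frees the result. */
static char *concat_sources(const char *lib_src, const char *user_src)
{
    assert(lib_src != NULL && user_src != NULL);
    size_t n1 = strlen(lib_src), n2 = strlen(user_src);
    char *out = malloc(n1 + n2 + 1);
    if (out == NULL) return NULL;
    memcpy(out, lib_src, n1);
    memcpy(out + n1, user_src, n2 + 1);   /* also copies the terminating NUL */
    return out;
}
```

The combined string would then be handed to clCreateProgramWithSource(), which takes a cl_context. That is one reason the context (and, for launching kernels, the command queue) has to be agreed upon between PETSc and the user code.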
Best regards,
Karli