[petsc-dev] OpenCL platform and device query routines

Fri Apr 12 16:51:19 CDT 2013

Dear PETScians,

In order to make proper use of OpenCL functionality, we need some
diagnostics for the user so that the correct device is used. Similar
functionality is also desirable for CUDA, but it is less urgent there
(CUDA only addresses NVIDIA GPUs anyway).

OpenCL defines platforms (think of it as SDKs from the various vendors) 
and devices with one type out of {CPU, GPU, ACCELERATOR}. Each platform 
may support multiple devices, but not necessarily all OpenCL-enabled 
devices on the machine. For example, the AMD SDK (platform) does not 
provide support for NVIDIA GPUs, but it supports Intel CPUs (x86 ftw!). 
Since multiple SDKs can be installed in parallel, knowing how platforms
and devices are enumerated is essential for picking the correct device.

Example: A machine equipped with an Intel CPU and an NVIDIA GPU with 
OpenCL SDKs from Intel, AMD, and NVIDIA installed. Within OpenCL one 
will 'see' the following:

- Platform 0:
   - Vendor: Intel
   - Device 0: Intel i7 whatever (CPU)

- Platform 1:
   - Vendor: AMD
   - Device 0: Intel i7 whatever (CPU)

- Platform 2:
   - Vendor: NVIDIA
   - Device 0: NVIDIA GTX whatever (GPU)

(Maybe in different order. Matters can get worse with Xeon Phi, AMD 
APUs, etc.)
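
As a rough illustration of the listing above, here is a minimal C sketch
that formats such a platform/device table into a string. All type and
function names (DevInfo, PlatformInfo, format_view) are hypothetical and
not part of PETSc or OpenCL. The table is assumed to be filled already;
an actual implementation would presumably populate it through the OpenCL
query routines clGetPlatformIDs, clGetPlatformInfo, clGetDeviceIDs, and
clGetDeviceInfo rather than from hard-coded data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical device categories mirroring {CPU, GPU, ACCELERATOR}. */
typedef enum { DEV_CPU, DEV_GPU, DEV_ACCELERATOR } DevType;

typedef struct {
  const char *name;
  DevType     type;
} DevInfo;

typedef struct {
  const char    *vendor;
  const DevInfo *devices;  /* devices exposed by this platform */
  size_t         ndevices;
} PlatformInfo;

static const char *dev_type_name(DevType t)
{
  switch (t) {
  case DEV_CPU:         return "CPU";
  case DEV_GPU:         return "GPU";
  case DEV_ACCELERATOR: return "ACCELERATOR";
  }
  return "UNKNOWN";
}

/* Append s to the NUL-terminated string in buf (capacity len bytes).
   Returns 0 on success, -1 if s does not fit. */
static int append(char *buf, size_t len, const char *s)
{
  size_t used = strlen(buf), n = strlen(s);
  if (n >= len - used) return -1;
  memcpy(buf + used, s, n + 1); /* copies the terminating NUL as well */
  return 0;
}

/* Format all platforms and their devices into buf.
   Returns 0 on success, -1 if the output would be truncated. */
static int format_view(const PlatformInfo *p, size_t np, char *buf, size_t len)
{
  char line[256];
  int  n;

  assert(buf != NULL && len > 0);
  buf[0] = '\0';
  for (size_t i = 0; i < np; ++i) {
    n = snprintf(line, sizeof line, "- Platform %zu:\n   - Vendor: %s\n",
                 i, p[i].vendor);
    if (n < 0 || (size_t)n >= sizeof line || append(buf, len, line) != 0)
      return -1;
    for (size_t j = 0; j < p[i].ndevices; ++j) {
      n = snprintf(line, sizeof line, "   - Device %zu: %s (%s)\n", j,
                   p[i].devices[j].name,
                   dev_type_name(p[i].devices[j].type));
      if (n < 0 || (size_t)n >= sizeof line || append(buf, len, line) != 0)
        return -1;
    }
  }
  return 0;
}
```

Note that the same physical CPU shows up under two platforms in the
example, which is why the sketch indexes devices per platform instead of
globally.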

To provide the necessary diagnostics, I suggest, in line with -vec_view,
the flag
   -opencl_view
to print the OpenCL infrastructure available on the system.
Is there any better naming scheme/proposal? -cuda_view and maybe some 
time later -threadcomm_view (-numa_view?) would follow from this choice. 
Note that this should be independent of external linear algebra 
libraries such as CUSP, ViennaCL, etc. to avoid unnecessary code 
duplication. However, the actual platform/device *setter* flags (e.g. 
pick device 0 from platform 1) need to be package-specific.
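
To make the setter idea concrete, here is a minimal C sketch of how a
package might validate a "pick device D from platform P" request against
the enumerated device counts. The "P:D" string syntax and the function
name parse_selection are purely hypothetical; no existing PETSc, CUSP, or
ViennaCL option is implied.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse a hypothetical "P:D" selection string (platform index, device
   index) and validate it against ndev[], where ndev[i] is the number of
   devices on platform i and np is the number of platforms.
   Returns 0 and sets *plat and *dev on success; returns -1 on malformed
   input or out-of-range indices, leaving *plat and *dev unchanged. */
static int parse_selection(const char *s, const size_t *ndev, size_t np,
                           size_t *plat, size_t *dev)
{
  char *end;
  long  p, d;

  assert(ndev != NULL && plat != NULL && dev != NULL);
  if (s == NULL) return -1;

  errno = 0;
  p = strtol(s, &end, 10);                     /* platform index */
  if (end == s || *end != ':' || errno != 0 || p < 0) return -1;

  s = end + 1;
  d = strtol(s, &end, 10);                     /* device index */
  if (end == s || *end != '\0' || errno != 0 || d < 0) return -1;

  /* Device indices are only meaningful relative to their platform. */
  if ((size_t)p >= np || (size_t)d >= ndev[p]) return -1;

  *plat = (size_t)p;
  *dev  = (size_t)d;
  return 0;
}
```

With the three single-device platforms from the example, "1:0" would
select the Intel CPU through the AMD platform, while "2:1" would be
rejected because the NVIDIA platform exposes only device 0.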

Best regards,
Karli