[petsc-dev] A closer look at the Xeon Phi

Shri abhyshr at hawk.iit.edu
Tue Feb 12 21:05:39 CST 2013


On Feb 12, 2013, at 7:43 PM, Karl Rupp wrote:

> Hi Jed,
> 
>>        Which crossover are you referring to? The CPU versus GTX285 at
>>        about 20k dofs, but with only very small gains for another order
>>        of magnitude?
>> 
>> 
>>    I assume it's the cross-over of Xeon Phi vs. Xeon,
>> 
>> 
>> MIC is slower than Xeon by more than an order of magnitude at 10k dofs.
> 
> Tim was referring to the cross-over at >10k...
> 
> 
>>    but almost all cross-overs happen in the 10k-100k region and are due
>>    to PCI-Express latency.
>> 
>> 
>> Why is PCI-Express latency important here? Can't the MIC code run
>> entirely on the device?
> 
> Almost all (OpenCL, CUDA). Native mode ought to be the exception, but there it is the OpenMP overhead that limits performance. Single-core on the MIC is not really an option either...
> 
> It would be interesting to play with a pthreads-threadpool implementation on the MIC to see how much performance can really be obtained for smallish problems.

You can try running the example threadcomm/examples/tutorials/ex5.c. It gives a measure of the overhead in launching kernels with OpenMP and with pthreads.
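
For a rough idea of what such a launch-overhead measurement involves, here is a minimal standalone sketch. It is not the code in ex5.c and does not use PETSc; the function names (pthread_launch_overhead, empty_kernel, now_seconds) and the MAX_THREADS limit are made up for illustration. It times repeated create/join cycles around an empty kernel, so it most likely overestimates what a persistent thread pool would cost per launch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define MAX_THREADS 64 /* hypothetical upper limit for this sketch */

/* Empty "kernel": it does no real work, so the time measured around
 * it is almost entirely launch and join overhead. */
static void *empty_kernel(void *arg)
{
    int *ran = (int *)arg;
    *ran = 1; /* record that the kernel was actually executed */
    return NULL;
}

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Average time in seconds for one "launch", i.e. creating nthreads
 * threads that run the empty kernel and joining them again.
 * Returns -1.0 on invalid arguments or if thread creation fails. */
double pthread_launch_overhead(int nthreads, int nlaunches)
{
    pthread_t threads[MAX_THREADS];
    int ran[MAX_THREADS];

    if (nthreads < 1 || nthreads > MAX_THREADS || nlaunches < 1)
        return -1.0;

    double t0 = now_seconds();
    for (int l = 0; l < nlaunches; l++) {
        int created = 0;
        for (int i = 0; i < nthreads; i++) {
            ran[i] = 0;
            if (pthread_create(&threads[i], NULL, empty_kernel, &ran[i]) != 0)
                break;
            created++;
        }
        /* Join whatever was created, even on a partial failure. */
        for (int i = 0; i < created; i++)
            pthread_join(threads[i], NULL);
        if (created != nthreads)
            return -1.0;
        /* After the join, every kernel must have run. */
        for (int i = 0; i < nthreads; i++)
            assert(ran[i] == 1);
    }
    double t1 = now_seconds();

    return (t1 - t0) / (double)nlaunches;
}
```

Calling pthread_launch_overhead(4, 1000), for example, returns the average cost of one 4-thread launch. Timing an empty "#pragma omp parallel" region the same way would give the OpenMP number to compare against.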
> 
> Best regards,
> Karli
> 



