[petsc-dev] A closer look at the Xeon Phi
Shri
abhyshr at hawk.iit.edu
Tue Feb 12 21:05:39 CST 2013
On Feb 12, 2013, at 7:43 PM, Karl Rupp wrote:
> Hi Jed,
>
>> Which crossover are you referring to? The CPU versus GTX285 at about 20k
>> dofs, but with only very small gains for another order of magnitude?
>>
>>
>> I assume it's the cross-over of Xeon Phi vs. Xeon,
>>
>>
>> MIC is slower than Xeon by more than an order of magnitude at 10k dofs.
>
> Tim was referring to the cross-over at >10k...
>
>
>> but almost all cross-overs happen in the 10k-100k region and are due
>> to PCI-Express latency.
>>
>>
>> Why is PCI-Express latency important here? Can't the MIC code run
>> entirely on the device?
>
> Almost all of it (OpenCL, CUDA). Native mode ought to be the exception, but there it is the OpenMP overhead that becomes the limit. Running on a single MIC core is not really an option either...
>
> It would be interesting to play with a pthreads-threadpool implementation on the MIC to see how much performance can really be obtained for smallish problems.
You can try running the example threadcomm/examples/tutorials/ex5.c. It measures the overhead of launching kernels with OpenMP and pthreads.
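
To give a rough idea of what such a measurement looks like, here is a minimal sketch in plain C. It is not the code from ex5.c and does not use threadcomm; the names `pthread_launch_overhead`, `empty_kernel`, `wall_time` and `MAX_THREADS` are made up for illustration. It times how long it takes to spawn and join a set of pthreads that run an empty kernel, so everything measured is launch overhead. A real thread pool keeps its threads alive between kernels, so this should be read as a pessimistic estimate for the pthreads case, not as what a pool would achieve.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Hypothetical upper limit on threads per launch (illustration only). */
#define MAX_THREADS 64

/* Empty "kernel": it does no work, so any time measured is pure overhead. */
static void *empty_kernel(void *arg)
{
  (void)arg;
  return NULL;
}

/* Wall-clock time in seconds, taken from a monotonic clock. */
static double wall_time(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Average time in seconds for one "launch", i.e. creating nthreads threads
   that run the empty kernel and joining them again.
   Returns -1.0 on invalid arguments or if a thread cannot be created. */
double pthread_launch_overhead(int nthreads, int nlaunches)
{
  pthread_t threads[MAX_THREADS];
  double    start;
  int       i, j, created;

  if (nthreads < 1 || nthreads > MAX_THREADS || nlaunches < 1) return -1.0;

  start = wall_time();
  for (i = 0; i < nlaunches; i++) {
    created = 0;
    for (j = 0; j < nthreads; j++) {
      if (pthread_create(&threads[j], NULL, empty_kernel, NULL) != 0) break;
      created++;
    }
    /* Join whatever was created, even on failure, so no thread is leaked. */
    for (j = 0; j < created; j++) pthread_join(threads[j], NULL);
    if (created != nthreads) return -1.0;
  }
  return (wall_time() - start) / nlaunches;
}
```

Calling, for example, `pthread_launch_overhead(4, 1000)` from a small driver and comparing the result with the run time of the actual kernel at 10k dofs gives a first impression of whether the launch cost matters at that problem size. Depending on the system, linking may need `-pthread`.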
>
> Best regards,
> Karli
>