[petsc-dev] http://www.hpcwire.com/hpcwire/2012-11-12/intel_brings_manycore_x86_to_market_with_knights_corner.html
Paul Mullowney
paulm at txcorp.com
Tue Nov 13 12:11:14 CST 2012
Sparse solves. MKL has an option to use multiple CPU cores for its
sparse triangular solve via:
mkl_set_num_threads()
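For concreteness, a minimal sketch (mine, not from the original post) of how one
might request threads and then call MKL's CSR triangular solver; it assumes the
one-based, 3-array CSR variant that mkl_dcsrtrsv expects, and the tiny
lower-triangular matrix is purely illustrative:

    /* Sketch: solve L*y = x with MKL's CSR triangular solver after
     * requesting multiple threads.  One-based CSR indices as expected
     * by mkl_dcsrtrsv; matrix and data are illustrative only. */
    #include <stdio.h>
    #include <mkl.h>

    int main(void)
    {
        /* L = [1 0 0; 2 3 0; 0 4 5] in CSR, one-based indices */
        double  a[]  = {1.0, 2.0, 3.0, 4.0, 5.0};
        MKL_INT ja[] = {1, 1, 2, 2, 3};
        MKL_INT ia[] = {1, 2, 4, 6};
        MKL_INT m    = 3;
        double  x[]  = {1.0, 8.0, 23.0};   /* right-hand side */
        double  y[3];                      /* solution */

        mkl_set_num_threads(4);            /* the knob discussed above */
        mkl_dcsrtrsv("L", "N", "N", &m, a, ia, ja, x, y);

        printf("y = %g %g %g\n", y[0], y[1], y[2]);  /* expect 1 2 3 */
        return 0;
    }

Compile and link against MKL in the usual way for your setup; whether the solve
actually uses those threads is exactly what is at issue below.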
Under the hood, the MKL implementation uses the level-scheduler
algorithm to extract some amount of parallelism. We've tested this
on many matrices and never seen scalability on a Sandy Bridge; I don't
know the reason for this. For some matrices, the level-scheduler
algorithm exposes a modest amount of parallelism, so I would expect some
benefit from going to multiple cores.
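To make the level-scheduler idea concrete, here is a hedged sketch (the names
are mine, not MKL's, and this is not MKL's actual implementation): rows of a
lower-triangular CSR matrix are grouped into levels so that each row depends
only on rows from earlier levels, and all rows within one level could be
solved in parallel:

    /* Sketch of level scheduling for a lower-triangular CSR matrix
     * (zero-based indices here).  Returns the number of levels;
     * level[i] gives the level of row i.  Rows sharing a level have
     * no dependencies on each other and can be solved concurrently. */
    int compute_levels(int n, const int *rowptr, const int *colind, int *level)
    {
        int nlevels = 0;
        for (int i = 0; i < n; ++i) {
            int lev = 0;
            for (int k = rowptr[i]; k < rowptr[i + 1]; ++k) {
                int j = colind[k];
                if (j < i && level[j] + 1 > lev)  /* row i depends on row j */
                    lev = level[j] + 1;
            }
            level[i] = lev;
            if (lev + 1 > nlevels) nlevels = lev + 1;
        }
        return nlevels;
    }

The available parallelism is bounded by the size of the largest level; for
matrices with long dependency chains the levels stay small, which would be
consistent with the poor scaling described above.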
-Paul
> On 11/13/12 2:54 AM, Paul Mullowney wrote:
>> Every test we've done shows that the MKL triangular solve doesn't
>> scale at all on a Sandy Bridge multi-core. I doubt it will be any
>> different on the Xeon Phi.
>>
>> -Paul
> Do you mean sparse or dense solves? Sparse triangular solves are
> sequential in MKL. PARDISO also performs them sequentially.
>
> Anton
>
>>>>
>>>>>
>>>>> In terms of raw numbers, $2,649 for 320 GB/sec and 8 GB of memory
>>>>> is quite a lot compared to $500 for a Radeon HD 7970 GHz
>>>>> Edition at 288 GB/sec and 3 GB of memory. My hope is that Xeon Phi
>>>>> can do better than GPUs in kernels requiring frequent global
>>>>> synchronizations, e.g. ILU substitutions.
>>>>
>>>> But, but, but it runs the Intel instruction set, that is
>>>> clearly worth 5+ times the price :-)
>>>
>>> I'm tempted to say 'yes', but on second thought I'm not so sure
>>> that any of us is actually programming in x86 assembly (again).
>>> Part of the GPU/accelerator hype is arguably due to a rediscovery of
>>> programming close to the hardware, even though it was/is non-x86. With
>>> Xeon Phi we might now observe some sort of compiler war instead of
>>> low-level kernel tuning - is this what we want?
>>>
>>> Best regards,
>>> Karli
>>>
>>
>>