[petsc-dev] http://www.hpcwire.com/hpcwire/2012-11-12/intel_brings_manycore_x86_to_market_with_knights_corner.html
Karl Rupp
rupp at mcs.anl.gov
Mon Nov 12 22:16:10 CST 2012
Hi John,
> (...)
> I fully second Jed. Computational scientists are already fighting
> with getting scalable performance on a 'standard' multi-core
> architecture, so I doubt that one can really obtain a gain on an
> accelerator-architecture for any real-world application just by
> recompilation of existing code. Also, add the extra issue of
> PCI-Express latency.
>
>
> Two key points here:
>
> 1) the application will have to be threaded to get good performance on
> the Xeon Phi. I know that PETSc is moving in this direction. My thought
> was that you would have 1 MPI process on the card and 1 on each CPU and
> use threads.
I'd be happy if it were that simple, but I doubt it. Even Intel
describes the Xeon Phi as an accelerator architecture rather than a
multi-core architecture.
> 2) The recompilation is needed to run in "Native mode". This is not an
> offloaded computation in the GPU sense. The entire program runs on the
> card. All the memory is local. You run one binary on the card, a
> different binary on the CPU. The only thing that has to cross the bus
> is MPI communication, which should be faster than even the fastest
> network cards because it only has to cross the bus.
Hmm, that could indeed get past the latency issue to a large extent.
Some OS functionality is probably not available on the Xeon Phi, though,
so some redesign would still be required. Let's see...
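
To make the latency point a bit more concrete, here is a minimal sketch of
the usual latency/bandwidth cost model for a single message,
t(n) = latency + n / bandwidth. The struct and function names are
hypothetical, and any numbers plugged in are placeholders rather than
measurements of PCI-Express or the Xeon Phi.

```c
#include <stddef.h>

/* Hypothetical link description. The values are supplied by the caller
 * and are assumptions, not measured hardware figures. */
typedef struct {
    double latency_s;      /* fixed startup cost per message, in seconds */
    double bandwidth_Bps;  /* sustained bandwidth, in bytes per second   */
} link_model;

/* Estimated time to move one message of nbytes across the link:
 * startup latency plus payload size divided by bandwidth. */
double message_time(link_model link, size_t nbytes)
{
    return link.latency_s + (double)nbytes / link.bandwidth_Bps;
}

/* Message size at which the latency term equals the transfer term.
 * Below this size the message cost is dominated by latency. */
double break_even_bytes(link_model link)
{
    return link.latency_s * link.bandwidth_Bps;
}
```

With an assumed latency of 1 microsecond and an assumed bandwidth of
1 GB/s, break_even_bytes gives about 1000 bytes, so messages much smaller
than that pay mostly for latency, whichever link they cross.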
Best regards,
Karli