[petsc-users] Building with MKL 10.3
Natarajan CS
csnataraj at gmail.com
Tue Mar 15 17:20:25 CDT 2011
Thanks Eric and Rob.
Indeed! Was MKL_DYNAMIC set to its default (true)? It looks like using one
thread per core (sequential MKL) is the right thing to do as a baseline.
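(For reference, roughly what I mean by that baseline, assuming Linux-style
environment variables and a placeholder ./app binary, so take this as a
sketch rather than a recipe:

    export MKL_DYNAMIC=FALSE    # stop MKL from adjusting the thread count at run time
    export MKL_NUM_THREADS=1    # one MKL thread per MPI process
    export OMP_NUM_THREADS=1    # keep any OpenMP regions single-threaded too
    mpiexec -n 8 ./app          # one MPI rank per core; 8 cores assumed here

i.e. every core runs exactly one MPI rank with sequential BLAS.)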
I would think the #cores = num_mpi_processes * num_mkl_threads case would
perform no better than (<=) the #cores = num_mpi_processes case, with the
total core count held constant, unless some cache effects come into play
(not sure what those would be; I would expect the MKL installation to weed
such issues out).
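To make that comparison concrete, on a hypothetical 12-core node (numbers
purely illustrative, ./app again a placeholder) the two cases I have in
mind would be something like:

    # pure MPI: 12 ranks x 1 MKL thread = 12 cores
    MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 mpiexec -n 12 ./app

    # hybrid: 4 ranks x 3 MKL threads = 12 cores
    MKL_NUM_THREADS=3 OMP_NUM_THREADS=3 mpiexec -n 4 ./app

so num_mpi_processes * num_mkl_threads stays fixed at 12 in both runs.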
P.S.:
Out of curiosity, have you also tested your app on Nehalem? Any difference
between Nehalem and Westmere for similar bandwidth?
On Tue, Mar 15, 2011 at 4:35 PM, Jed Brown <jed at 59a2.org> wrote:
> On Tue, Mar 15, 2011 at 22:30, Robert Ellis <Robert.Ellis at geosoft.com> wrote:
>
>> Regardless of setting the number of threads for MKL or OMP, the MKL
>> performance was worse than simply using --download-f-blas-lapack=1.
>
>
> Interesting. Does this statement include using just one thread, perhaps
> with a non-threaded MKL? Also, when you used threading, were you putting an
> MPI process on every core or were you making sure that you had enough cores
> for num_mpi_processes * num_mkl_threads?
>