[petsc-users] Building with MKL 10.3
Rob Ellis
Robert.G.Ellis at Shaw.ca
Tue Mar 15 17:32:44 CDT 2011
Yes, MKL_DYNAMIC was set to true. No, I haven't tested on Nehalem. I'm
currently comparing sequential MKL with --download-f-blas-lapack=1.
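For reference, the sequential-MKL runs force one MKL thread via the usual environment variables, and the baseline build uses the downloaded reference BLAS/LAPACK. Roughly like the following (illustrative, not the exact commands I'm running):

  # run MKL serially within each MPI rank
  export MKL_NUM_THREADS=1
  export MKL_DYNAMIC=FALSE

  # baseline PETSc build against the reference BLAS/LAPACK instead of MKL
  ./configure --download-f-blas-lapack=1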
Rob
From: petsc-users-bounces at mcs.anl.gov
[mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Natarajan CS
Sent: Tuesday, March 15, 2011 3:20 PM
To: PETSc users list
Cc: Robert Ellis
Subject: Re: [petsc-users] Building with MKL 10.3
Thanks Eric and Rob.
Indeed! Was MKL_DYNAMIC set to its default (true)? It looks like using 1 thread
per core (sequential MKL) is the right thing to do as a baseline.
I would think that, for a constant number of cores, the hybrid case (#cores =
num_mpi_processes * num_mkl_threads) would perform no better than the pure-MPI
case (#cores = num_mpi_processes), unless some cache effects come into play
(I'm not sure which; I would think the MKL installation should weed those
issues out).
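As a concrete example, on a 12-core node the two cases I have in mind would be launched roughly like this (./app and the mpiexec options are placeholders; process pinning details depend on the MPI implementation):

  # pure MPI: 12 ranks x 1 MKL thread = 12 cores
  MKL_NUM_THREADS=1 mpiexec -n 12 ./app

  # hybrid: 6 ranks x 2 MKL threads = 12 cores
  MKL_NUM_THREADS=2 mpiexec -n 6 ./app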
P.S.: Out of curiosity, have you also tested your app on Nehalem? Any
difference between Nehalem and Westmere for similar bandwidth?
On Tue, Mar 15, 2011 at 4:35 PM, Jed Brown <jed at 59a2.org> wrote:
On Tue, Mar 15, 2011 at 22:30, Robert Ellis <Robert.Ellis at geosoft.com>
wrote:
Regardless of setting the number of threads for MKL or OMP, the MKL
performance was worse than simply using --download-f-blas-lapack=1.
Interesting. Does this statement include using just one thread, perhaps with
a non-threaded MKL? Also, when you used threading, were you putting an MPI
process on every core or were you making sure that you had enough cores for
num_mpi_processes * num_mkl_threads?
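By a non-threaded MKL I mean linking against MKL's sequential layer rather than the threaded one. A sketch of such a configure line, assuming an intel64 layout (the exact library list depends on the MKL version and install):

  ./configure --with-blas-lapack-lib="-L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread"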