[petsc-users] Building with MKL 10.3

Natarajan CS csnataraj at gmail.com
Tue Mar 15 17:20:25 CDT 2011


Thanks Eric and Rob.

Indeed! Was MKL_DYNAMIC left at its default (true)? It looks like using one
thread per core (sequential MKL) is the right baseline.
 With the total core count held constant, I would expect the hybrid case
(#cores = num_mpi_processes * num_mkl_threads) to perform no better than the
pure MPI case (#cores = num_mpi_processes), unless some cache effects come
into play (I'm not sure which, and I would think the MKL installation should
weed those issues out). A sketch of the comparison is below.
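For concreteness, here is a minimal sketch of how the two configurations might
be launched on a 12-core node; the core count, the ./app executable name, and
the mpiexec launcher are only assumptions for illustration. MKL_NUM_THREADS
and MKL_DYNAMIC are the standard MKL environment variables.

    # Baseline: one MPI process per core, sequential MKL
    export MKL_NUM_THREADS=1
    export MKL_DYNAMIC=FALSE
    mpiexec -n 12 ./app

    # Hybrid: fewer MPI processes, MKL threads fill the remaining cores
    # (keep num_mpi_processes * num_mkl_threads <= number of cores)
    export MKL_NUM_THREADS=4
    export MKL_DYNAMIC=FALSE
    mpiexec -n 3 ./app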


P.S.:
Out of curiosity, have you also tested your app on Nehalem? Any difference
between Nehalem and Westmere for similar bandwidth?

On Tue, Mar 15, 2011 at 4:35 PM, Jed Brown <jed at 59a2.org> wrote:

> On Tue, Mar 15, 2011 at 22:30, Robert Ellis <Robert.Ellis at geosoft.com> wrote:
>
>> Regardless of setting the number of threads for MKL or OMP, the MKL
>> performance was worse than simply using --download-f-blas-lapack=1.
>
>
> Interesting. Does this statement include using just one thread, perhaps
> with a non-threaded MKL? Also, when you used threading, were you putting an
> MPI process on every core or were you making sure that you had enough cores
> for num_mpi_processes * num_mkl_threads?
>


More information about the petsc-users mailing list