[petsc-dev] PETSc OpenMP benchmarking

"C. Bergström" cbergstrom at pathscale.com
Sat Mar 17 03:55:40 CDT 2012


On 03/17/12 03:33 PM, Gerard Gorman wrote:
> Hi
>
> We have profiled on Cray compute nodes with two 16-core AMD Opteron
> 2.3GHz Interlagos processors, using the same matrix but this time with
> -ksp_type cg and -pc_type jacobi. Attached are the logs with the 32 MPI
> processes and the 32 OpenMP threads tests.
>
> Most of the time is in stage 2. As seen previously, MatMult is
> performing well, but the overall performance in KSPSolve drops for
> OpenMP. I have attached a plot of the (hybrid mpi+openmp time)/(pure
> openmp) where all 32 cores are always used. What the graph shows is that
> we are always getting better performance in MatMult for pure OpenMP but
> there is something additional in KSPSolve that degrades the OpenMP
> performance.
>
> So far we have profiled with oprofile, measuring the event
> CPU_CLK_UNHALTED, but this has not revealed the bottleneck, so more
> digging is required.
>
> Any suggestions/comments gratefully received.
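For reference, the two configurations described above could be reproduced roughly as follows. The binary name and matrix file are placeholders (the original mail does not name them); the solver options are the ones stated, and -log_summary is PETSc's timing-summary option of that era.

```shell
# Pure-MPI baseline: 32 ranks across the two 16-core Interlagos sockets.
# "./solver" is a placeholder executable name, not from the original mail.
mpiexec -n 32 ./solver -ksp_type cg -pc_type jacobi -log_summary

# Threaded variant on the same 32 cores: one MPI rank, 32 OpenMP threads.
OMP_NUM_THREADS=32 mpiexec -n 1 ./solver -ksp_type cg -pc_type jacobi -log_summary
```

Comparing the per-stage tables from the two -log_summary outputs (MatMult vs. the rest of KSPSolve, e.g. VecDot/VecNorm reductions in CG) is one way to localize where the OpenMP run loses time.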
Why are you using gcc? (I'm biased, but it's a serious question.)  Did you
post your CFLAGS and FFLAGS?  PathScale is happy to work with you on this.
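For context, compiler and optimization flags are normally passed to PETSc at configure time, so they end up recorded in the build logs. A sketch with gcc targeting Interlagos (the flag values here are illustrative assumptions, not taken from the thread):

```shell
# Illustrative only: compilers and flags are choices, not from the original mail.
# -march=bdver1 is gcc's target for AMD Bulldozer/Interlagos cores.
./configure --with-cc=gcc --with-fc=gfortran \
    --with-debugging=0 \
    COPTFLAGS="-O3 -march=bdver1" \
    FOPTFLAGS="-O3 -march=bdver1"
```

Posting the equivalent flags used for the benchmark builds would make the MPI-vs-OpenMP comparison easier to interpret.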

Best,

./C
