[petsc-dev] PETSc OpenMP benchmarking

Gerard Gorman g.gorman at imperial.ac.uk
Sat Mar 17 05:31:26 CDT 2012


"C. Bergström" emailed the following on 17/03/12 08:55:
> On 03/17/12 03:33 PM, Gerard Gorman wrote:
>> Hi
>>
>> We have profiled on Cray compute nodes with two 16-core AMD Opteron
>> 2.3GHz Interlagos processors, using the same matrix but this time with
>> -ksp_type cg and -pc_type jacobi. Attached are the logs for the
>> 32-MPI-process and 32-OpenMP-thread tests.
>>
>> Most of the time is spent in stage 2. As seen previously, MatMult
>> performs well, but the overall performance of KSPSolve drops for
>> OpenMP. I have attached a plot of (hybrid MPI+OpenMP time)/(pure
>> OpenMP time), where all 32 cores are always in use. The graph shows
>> that pure OpenMP always gives better MatMult performance, but
>> something additional in KSPSolve degrades the OpenMP performance.
>>
>> So far we have profiled with oprofile, measuring the event
>> CPU_CLK_UNHALTED, but this has not revealed the bottleneck, so more
>> digging is required.
>>
>> Any suggestions/comments gratefully received.
> Why are you using GCC? (I'm biased, but it's a serious question.) Did
> you post your CFLAGS and FFLAGS? PathScale is happy to work with you
> on this.
>
> Best,
>
> ./C
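For reference, the legacy oprofile (opcontrol) workflow for sampling the event mentioned above looks roughly like this; the sample count and the `./ex_solver` binary name are illustrative assumptions, not values from this thread:

```shell
# Illustrative opcontrol session (requires root); ./ex_solver is a
# hypothetical benchmark binary, not one named in this thread.
opcontrol --reset
opcontrol --event=CPU_CLK_UNHALTED:500000   # sample every 500k unhalted cycles
opcontrol --start
./ex_solver -ksp_type cg -pc_type jacobi    # run the benchmark under sampling
opcontrol --dump
opcontrol --shutdown
opreport -l ./ex_solver                     # per-symbol time breakdown
```

The symbol-level `opreport -l` output can indicate whether the extra KSPSolve time lands in the OpenMP runtime (e.g. barrier or spin routines in libgomp) rather than in the PETSc kernels themselves.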

Thanks for your help - I'll give PathScale a go this evening.

I used GCC simply because it gave me the least trouble building in the
Cray environment. Even so, I still had to compile my own Valgrind
installation and was given the magic scrub_headers script to remove some
troublesome headers generated by the PETSc configure. I didn't set any
additional flags - I just configured with --with-debugging=0.
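For completeness, PETSc's configure accepts per-language optimization flags directly, so CFLAGS/FFLAGS need not be exported in the environment. A minimal sketch, where --with-debugging=0 comes from this thread but the -O3/-march=bdver1 values (GCC's name for the Interlagos/Bulldozer architecture) are illustrative assumptions:

```shell
# Sketch of an optimized PETSc configure run; only --with-debugging=0
# is from the thread, the optimization flags are assumptions.
./configure --with-debugging=0 \
    COPTFLAGS='-O3 -march=bdver1' \
    FOPTFLAGS='-O3 -march=bdver1'
```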

Cheers
Gerard
