[petsc-users] [petsc-maint] Speedup problem when using OpenMP?

Karl Rupp rupp at mcs.anl.gov
Mon Nov 4 08:51:20 CST 2013


> I have a question on the speedup of PETSc when using OpenMP. I can get
> good speedup when using MPI, but no speedup when using OpenMP.
> The example is ex2f with m=100 and n=100. The number of available
> processors is 16 (32 threads) and the OS is Windows Server 2012. The log
> files for 4 and 8 processors are attached.
> The commands I used to run with 4 processors are as follows:
> Run using MPI:
>   mpiexec -n 4 Petsc-windows-ex2f.exe -m 100 -n 100 -log_summary log_100x100_mpi_p4.log
> Run using OpenMP:
>   Petsc-windows-ex2f.exe -threadcomm_type openmp -threadcomm_nthreads 4 -m 100 -n 100 -log_summary log_100x100_openmp_p4.log
> The PETSc build used for this test is PETSc for Windows
> (http://www.mic-tc.ch/downloads/PETScForWindows.zip), but I doubt this
> is the cause, because the same problem occurs when I use PETSc-dev in
> Cygwin. I don't know whether the problem also exists on Linux. Could
> anybody help test?

For the 100x100 case considered, the execution times per call are
somewhere in the millisecond to sub-millisecond range (e.g. 1.3 ms for 68
calls to VecScale with 4 processors). I'd say this is too small to see
any reasonable performance gain when running multiple threads. Consider
problem sizes of about 1000x1000 instead.
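
To put numbers on this, here is a rough back-of-the-envelope sketch in
Python. It reads the 1.3 ms quoted above as the total for all 68 VecScale
calls and assumes 8-byte (double-precision real) scalars; the helper names
are made up for illustration and are not part of PETSc.

```python
# Back-of-the-envelope estimate; helper names are hypothetical, not PETSc API.

def per_call_time_us(total_seconds, n_calls):
    """Average time per call in microseconds."""
    return total_seconds / n_calls * 1e6

def vector_bytes(m, n, bytes_per_scalar=8):
    """Storage for one vector on an m x n grid.

    bytes_per_scalar=8 is an assumption (double-precision real scalars).
    """
    return m * n * bytes_per_scalar

if __name__ == "__main__":
    # Figures quoted in the reply: 1.3 ms for 68 calls to VecScale.
    print("avg time per VecScale call (us):", per_call_time_us(1.3e-3, 68))
    # Vector size for the tested grid and for the suggested larger grid.
    print("100x100 vector (bytes):  ", vector_bytes(100, 100))
    print("1000x1000 vector (bytes):", vector_bytes(1000, 1000))
```

With only tens of microseconds of work per call and a vector of roughly
80 KB, there is very little to divide among threads; the 1000x1000 case
has 100 times more data per vector.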

Moreover, keep in mind that you typically won't get perfectly linear
scaling with the number of processor cores, because ultimately the
memory bandwidth is the limiting factor for standard vector operations.
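
The effect of the bandwidth limit can be illustrated with a very
simplified model: each core can stream data at some rate, but all cores
together cannot exceed the bandwidth of the shared memory system. The
bandwidth figures below (10 GB/s per core, 40 GB/s in total) are
placeholder assumptions chosen only for illustration, not measurements
of the machine discussed here, and the function name is hypothetical.

```python
# Simplified model of a bandwidth-bound vector operation.
# All bandwidth values are illustrative assumptions, not measured data.

def bandwidth_bound_speedup(cores, per_core_bw_gbs, total_bw_gbs):
    """Ideal speedup over one core when shared memory bandwidth is the limit."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Cores scale linearly until their combined demand hits the shared limit.
    achievable_bw = min(cores * per_core_bw_gbs, total_bw_gbs)
    return achievable_bw / per_core_bw_gbs

if __name__ == "__main__":
    for p in (1, 2, 4, 8, 16):
        print(p, "cores -> speedup", bandwidth_bound_speedup(p, 10.0, 40.0))
```

In this model the speedup grows linearly only until the combined demand
of the cores reaches the shared bandwidth, and stays flat after that no
matter how many more cores are added.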

Best regards,

