[petsc-users] Strange mpi timing and CPU load when -np > 2

Barry Smith bsmith at petsc.dev
Mon Sep 26 11:58:15 CDT 2022



  It is important to check out https://petsc.org/main/faq/?highlight=faq#why-is-my-parallel-solver-slower-than-my-sequential-solver-or-i-have-poor-speed-up

  In particular, you will need to set appropriate process binding for mpiexec, and you should run the streams benchmark (make mpistreams) with that binding to find the potential memory-bandwidth performance of the system.
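  For example, a rough sketch only (the binding flags assume the OpenMPI you installed via --download-openmpi, and the process count is just taken from your report):

      cd $PETSC_DIR
      make mpistreams                                               # memory-bandwidth benchmark mentioned above
      mpiexec -n 3 --bind-to core --map-by core ./main -log_view    # pin each MPI rank to its own core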

  If you are using a thread-enabled BLAS/LAPACK that utilizes all the cores, you can get oversubscription and thus slow performance during BLAS/LAPACK calls. We try not to link with a thread-enabled BLAS/LAPACK by default. See https://petsc.org/main/docs/manual/blas-lapack/?highlight=thread%20blas
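  If a threaded BLAS/LAPACK did end up in your link, a quick check is to force one thread per MPI rank and rerun. This is only a sketch, assuming a bash shell; which variable matters depends on which BLAS/LAPACK your build actually linked:

      export OMP_NUM_THREADS=1          # OpenMP-threaded BLAS implementations
      export OPENBLAS_NUM_THREADS=1     # OpenBLAS
      export MKL_NUM_THREADS=1          # Intel MKL
      mpiexec -n 3 ./main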

   Barry



> On Sep 26, 2022, at 12:39 PM, Duan Junming via petsc-users <petsc-users at mcs.anl.gov> wrote:
> 
> Dear all,
> 
> I am using PETSc 3.17.4 on a Linux server, compiling with --download-exodus --download-hdf5 --download-openmpi --download-triangle --with-fc=0 --with-debugging=0 PETSC_ARCH=arch-linux-c-opt COPTFLAGS="-g -O3" CXXOPTFLAGS="-g -O3".
> The strange thing is that when I run my code with mpirun -np 1 ./main, the CPU time is 30s.
> When I use mpirun -np 2 ./main, the CPU time is 16s, which is fine.
> But when I use more than 2 processes, e.g. mpirun -np 3 ./main, the CPU time is back to 30s.
> The output of the time command is: real 0m30.189s, user 9m3.133s, sys 10m55.715s.
> I can also see that the CPU load is about 100% per process when np = 2, but it jumps to 2000%, 1000%, and 1000% for the three processes when np = 3 (the server has 40 CPUs).
> Do you have any idea about this?
> 
> Thanks in advance!
