[petsc-users] MatCreate performance

Ale Foggia amfoggia at gmail.com
Mon Mar 11 07:22:39 CDT 2019


Hello all,

Thanks for your answers.

1) I'm working with a matrix with a linear size of 2**34, but it's a sparse
matrix, and the number of nonzero elements is 4,320,707,274. I know that
the distribution of these elements is not balanced across the processes:
the matrix is more densely populated in the middle part.

2) I initialize SLEPc. Then I create the basis elements of the system (this
part does not involve PETSc/SLEPc, and every process computes -and
owns- an equal number of basis elements). Then I call:
ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
ierr = MatSetType(A, MATMPIAIJ); CHKERRQ(ierr);
ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, size, size); CHKERRQ(ierr);
ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); CHKERRQ(ierr);
ierr = MatZeroEntries(A); CHKERRQ(ierr);
After this, I compute the elements of the matrix and set the values with
MatSetValues. Then I call EPSSolve (with KrylovSchur, setting the problem
type to EPS_HEP).
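Schematically, this assembly and solve step looks roughly like the
following (a simplified sketch, not the exact code; ComputeRow() is just a
stand-in for the actual element computation):

PetscInt rstart, rend;
ierr = MatGetOwnershipRange(A, &rstart, &rend); CHKERRQ(ierr);
for (PetscInt i = rstart; i < rend; i++) {
  PetscInt    ncols;
  PetscInt    *cols;
  PetscScalar *vals;
  ComputeRow(i, &ncols, &cols, &vals);  /* stand-in for the real computation */
  ierr = MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES); CHKERRQ(ierr);
}
ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

EPS eps;
ierr = EPSCreate(PETSC_COMM_WORLD, &eps); CHKERRQ(ierr);
ierr = EPSSetOperators(eps, A, NULL); CHKERRQ(ierr);
ierr = EPSSetProblemType(eps, EPS_HEP); CHKERRQ(ierr);
ierr = EPSSetType(eps, EPSKRYLOVSCHUR); CHKERRQ(ierr);
ierr = EPSSolve(eps); CHKERRQ(ierr);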

3) There are a few more things that seem strange to me. I measure the
execution time of these parts both with a PetscLogStage and with a
std::chrono clock (in nanoseconds). I understand that the time given by the
log is an average over the processes, right? In the case of std::chrono,
I'm only printing the times from process 0 (no average over processes).
What I see is the following:
                  1024 procs        2048 procs        4096 procs        8192 procs
                  Log      std      Log      std      Log      std      Log      std
MatCreate         68.42    122.7    67.08    121.2    62.29    116      73.36    127.4
preallocation     140.36   140.3    76.45    76.45    40.31    40.3     21.13    21.12
MatSetValues      237.79   237.7    116.6    116.6    60.59    60.59    35.32    35.32
EPSSolve          162.8    160      95.8     94.2     62.17    60.63    41.16    40.24

- So, all the times (including the total execution time, which I'm not
showing here) agree between the PetscLogStage and the std::chrono clock,
except for the MatCreate part. Maybe that part is very unbalanced?
- The MatCreate time given by the PetscLogStage does not change with the
number of processes.
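For reference, each section is timed roughly like this (a simplified
sketch; the stage name and the rank variable are placeholders, and the
stage timing is then read off from -log_view at the end of the run):

#include <chrono>
#include <iostream>

PetscLogStage stageCreate;
ierr = PetscLogStageRegister("MatCreate", &stageCreate); CHKERRQ(ierr);

auto t0 = std::chrono::steady_clock::now();
ierr = PetscLogStagePush(stageCreate); CHKERRQ(ierr);
ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
ierr = MatSetType(A, MATMPIAIJ); CHKERRQ(ierr);
ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, size, size); CHKERRQ(ierr);
ierr = PetscLogStagePop(); CHKERRQ(ierr);
auto t1 = std::chrono::steady_clock::now();

// The std::chrono number is printed on rank 0 only; the log stage is
// collective over all processes.
if (rank == 0)
  std::cout << "MatCreate: "
            << std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count()
            << " ns" << std::endl;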

Ale

On Fri, Mar 8, 2019 at 17:00, Jed Brown (<jed at jedbrown.org>) wrote:

> This is very unusual.  MatCreate() does no work, merely dup'ing a
> communicator (or referencing an inner communicator if this is not the
> first PetscObject on the provided communicator).  What size matrices are
> you working with?  Can you send some performance data and (if feasible)
> a reproducer?
>
> Ale Foggia via petsc-users <petsc-users at mcs.anl.gov> writes:
>
> > Hello all,
> >
> > I have a problem with the scaling of the MatCreate() function. I wrote a
> > code to diagonalize sparse matrices and I'm running it in parallel. I've
> > observed a very bad speedup of the code, and it comes from the MatCreate
> > part: for a fixed matrix size, when I increase the number of processes,
> > the time taken by the function also increases. I wanted to know if you
> > expect this behavior or if maybe there's something wrong with my code.
> > When I go to (what I consider) very big matrix sizes, and depending on
> > the number of MPI processes, in some cases MatCreate takes more time
> > than the solver takes to find one eigenvalue, or more time than it
> > takes to set up the values.
> >
> > Ale
>