On Mon, Mar 11, 2019 at 8:23 AM Ale Foggia via petsc-users <petsc-users@mcs.anl.gov> wrote:

> Hello all,
>
> Thanks for your answers.
>
> 1) I'm working with a matrix with a linear size of 2**34, but it's a
> sparse matrix, and the number of nonzero elements is 4,320,707,274. I
> know that the distribution of these elements is not balanced across the
> processes: the matrix is more populated in the middle part.
>
> 2) I initialize SLEPc. Then I create the basis elements of the system
> (this part does not involve PETSc/SLEPc, and every process computes,
> and owns, an equal number of basis elements). Then I call:
>
>   ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
>   ierr = MatSetType(A, MATMPIAIJ); CHKERRQ(ierr);
>   ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, size, size); CHKERRQ(ierr);
>   ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); CHKERRQ(ierr);
>   ierr = MatZeroEntries(A); CHKERRQ(ierr);
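>
> (A rough sketch of how the d_nnz/o_nnz counts are built beforehand;
> cols_in_row() stands in for the application routine that lists the
> nonzero column indices of one global row:)
>
>   PetscInt nlocal = PETSC_DECIDE, rstart, rend, *d_nnz, *o_nnz;
>   ierr = PetscSplitOwnership(PETSC_COMM_WORLD, &nlocal, &size); CHKERRQ(ierr);
>   ierr = MPI_Scan(&nlocal, &rend, 1, MPIU_INT, MPI_SUM, PETSC_COMM_WORLD); CHKERRQ(ierr);
>   rstart = rend - nlocal;                    /* this rank owns rows [rstart, rend) */
>   ierr = PetscCalloc2(nlocal, &d_nnz, nlocal, &o_nnz); CHKERRQ(ierr);
>   for (PetscInt i = rstart; i < rend; ++i) {
>     PetscInt ncols, *cols;
>     cols_in_row(i, &ncols, &cols);           /* application-specific */
>     for (PetscInt j = 0; j < ncols; ++j) {
>       if (cols[j] >= rstart && cols[j] < rend) d_nnz[i-rstart]++; /* diagonal block */
>       else                                     o_nnz[i-rstart]++; /* off-process cols */
>     }
>   }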
>
> After this, I compute the elements of the matrix and set the values with
> MatSetValues. Then I call EPSSolve (with Krylov-Schur, setting the
> problem type to EPS_HEP).
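>
> (Roughly, the assembly and solve sequence looks like this; a sketch,
> with the eigenvalue extraction omitted:)
>
>   EPS eps;
>   ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>   ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>   ierr = EPSCreate(PETSC_COMM_WORLD, &eps); CHKERRQ(ierr);
>   ierr = EPSSetOperators(eps, A, NULL); CHKERRQ(ierr);    /* standard problem */
>   ierr = EPSSetProblemType(eps, EPS_HEP); CHKERRQ(ierr);  /* Hermitian */
>   ierr = EPSSetType(eps, EPSKRYLOVSCHUR); CHKERRQ(ierr);
>   ierr = EPSSolve(eps); CHKERRQ(ierr);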
>
> 3) There are a few more things that are strange to me. I measure the
> execution time of these parts both with a PetscLogStage and with a
> std::chrono clock (in nanoseconds). I understand that the time given by
> the log is an average over the processes, right? In the case of
> std::chrono, I'm only printing the times from process 0 (no average over
> processes).
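>
> (How the two timers wrap a section; a sketch, needing <chrono> and
> <iostream>, with rank obtained earlier from MPI_Comm_rank:)
>
>   PetscLogStage stage;
>   ierr = PetscLogStageRegister("MatCreate", &stage); CHKERRQ(ierr);
>   auto t0 = std::chrono::steady_clock::now();
>   ierr = PetscLogStagePush(stage); CHKERRQ(ierr);
>   ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
>   ierr = PetscLogStagePop(); CHKERRQ(ierr);
>   if (rank == 0)
>     std::cout << std::chrono::duration_cast<std::chrono::nanoseconds>(
>                    std::chrono::steady_clock::now() - t0).count() << " ns\n";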
>
> What I see is the following:
>
>                1024 procs      2048 procs      4096 procs      8192 procs
>                Log     std     Log     std     Log     std     Log     std
> MatCreate      68.42   122.7   67.08   121.2   62.29   116     73.36   127.4
> preallocation  140.36  140.3   76.45   76.45   40.31   40.3    21.13   21.12
> MatSetValues   237.79  237.7   116.6   116.6   60.59   60.59   35.32   35.32
> EPSSolve       162.8   160     95.8    94.2    62.17   60.63   41.16   40.24
>
> - So, all the times (including the total execution time, which I'm not
> showing here) are the same between the PetscLogStage and the std::chrono
> clock, except for the MatCreate part. Maybe that part is very
> unbalanced?

MatCreate() does nothing at all, but it does have a synchronization (to
check the comm). So you must be very imbalanced _coming into_ MatCreate.
It also appears that rank 0 is more imbalanced than the rest, so maybe you
are doing serial work on rank 0 that no one else does before you call
MatCreate.
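
For instance, a rough sketch of one way to see that skew (ranks that reach
a barrier early wait for the slowest one, so the maximum wait approximates
the imbalance):

  double t0 = MPI_Wtime(), twait, tmax;
  MPI_Barrier(PETSC_COMM_WORLD);
  twait = MPI_Wtime() - t0;
  MPI_Reduce(&twait, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, PETSC_COMM_WORLD);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "max wait before MatCreate: %g s\n", tmax); CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);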
   Matt

> - The time of MatCreate given by the PetscLogStage is not changing.
>
> Ale
>
> On Fri, Mar 8, 2019 at 17:00, Jed Brown (<jed@jedbrown.org>) wrote:
>
>> This is very unusual. MatCreate() does no work, merely dup'ing a
>> communicator (or referencing an inner communicator if this is not the
>> first PetscObject on the provided communicator). What size matrices are
>> you working with? Can you send some performance data and (if feasible)
>> a reproducer?
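>>
>> (Schematically, for two objects on the same communicator; a sketch of
>> the caching that description implies:)
>>
>>   Mat A, B;
>>   MatCreate(PETSC_COMM_WORLD, &A); /* first PETSc object on this comm:
>>                                       dups it, which synchronizes      */
>>   MatCreate(PETSC_COMM_WORLD, &B); /* inner comm already cached: cheap */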
>>
>> Ale Foggia via petsc-users <petsc-users@mcs.anl.gov> writes:
>>
>>> Hello all,
>>>
>>> I have a problem with the scaling of the MatCreate() function. I wrote
>>> a code to diagonalize sparse matrices, and I'm running it in parallel.
>>> I've observed a very bad speedup of the code, and it comes from the
>>> MatCreate part: for a fixed matrix size, when I increase the number of
>>> processes, the time taken by the function also increases. I wanted to
>>> know if you expect this behavior, or if maybe there's something wrong
>>> with my code. When I go to (what I consider) very big matrix sizes,
>>> and depending on the number of MPI processes, in some cases MatCreate
>>> takes more time than the solver takes to compute one eigenvalue, or
>>> more time than it takes to set up the values.
>>>
>>> Ale
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/