[petsc-users] Enhancing MatScale computing time

Matthew Knepley knepley at gmail.com
Thu Oct 22 15:28:28 CDT 2020


On Thu, Oct 22, 2020 at 4:17 PM Antoine Côté <Antoine.Cote3 at usherbrooke.ca>
wrote:

> Hi Sir,
>
> MatScale in "Main Stage" is indeed called 6 times for 0% of the run time. In
> stage "Stiff_Adj", though, we get:
>
> MatScale            8192 1.0 7.1185e+01 1.0 3.43e+10 1.0 0.0e+00 0.0e+00
> 0.0e+00 50 46  0  0  0  80 98  0  0  0   482
>
> MatMult is indeed expensive (23% of the run time) and should be improved, but
> MatScale in "Stiff_Adj" is still taking 50% of the run time.
>

I was a little surprised that MatScale gets only 450 MFlops. However, it
looks like you are running the debugging version of PETSc. Could you
configure
a version without debugging:

  $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/reconfigure-$PETSC_ARCH.py
--with-debugging=0 --PETSC_ARCH=arch-master-opt

and rerun the timings?
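
For example, something along these lines (here "yourprogram" is just a
placeholder for your own makefile target/executable, assuming the makefile
picks PETSC_ARCH up from the environment):

  export PETSC_ARCH=arch-master-opt
  make yourprogram
  ./yourprogram -log_view -info :mat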

  Thanks,

     Matt


> Thanks,
>
> Antoine
> ------------------------------
> *From:* Barry Smith <bsmith at petsc.dev>
> *Sent:* October 22, 2020 16:09
> *To:* Antoine Côté <Antoine.Cote3 at USherbrooke.ca>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Enhancing MatScale computing time
>
>
> MatMult             9553 1.0 3.2824e+01 1.0 3.54e+10 1.0 0.0e+00 0.0e+00
> 0.0e+00 23 48  0  0  0  61 91  0  0  0  1079
> MatScale               6 1.0 5.3896e-02 1.0 2.52e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0   467
>
> Though the flop rate of MatScale is not so high (467 MFlops), it is taking
> very little time (0 percent of the run time, while MatMult takes 23 percent).
>
> So the main cost related to the matrices is MatMult, because it is called many
> times (9553). You might think about the algorithms you are using and whether
> there are improvements to be made.
>
> It looks like you are using some kind of multigrid and solving 6 problems
> with 1357 total iterations, which is over 200 iterations per solve. This is
> absolutely HUGE for multigrid; you need to tune the multigrid for your
> problem to bring that down to at most a couple dozen iterations per solve.
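>
> A sketch of what I would look at (the option names are standard, but what to
> tune depends entirely on your setup): run with
>
>   -ksp_view -ksp_converged_reason -ksp_monitor_true_residual
>
> to see exactly which multigrid configuration you are using and how the
> iteration counts behave as you adjust it.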
>
>   Barry
>
> On Oct 22, 2020, at 3:02 PM, Antoine Côté <Antoine.Cote3 at USherbrooke.ca>
> wrote:
>
> Hi,
>
> See attached files for both outputs. Tell me if you need any
> clarification. It was run with a DMDA of 33x17x17 nodes (creating
> 32x16x16=8192 elements). With 3 dof per node, the problem has a total of
> 28611 dof.
>
> Note : Stage "Stiff_Adj" is the part of the code modifying Mat
> K. PetscLogStagePush/Pop was used.
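>
> Roughly, the stage wraps that part of the code like this (a simplified
> sketch of what is actually done):
>
>   PetscLogStage stage;
>   PetscLogStageRegister("Stiff_Adj", &stage);
>   PetscLogStagePush(stage);
>   /* ... code that rebuilds/modifies Mat K ... */
>   PetscLogStagePop();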
>
> Regards,
>
> Antoine
> ------------------------------
> *From:* Matthew Knepley <knepley at gmail.com>
> *Sent:* October 22, 2020 15:35
> *To:* Antoine Côté <Antoine.Cote3 at USherbrooke.ca>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Enhancing MatScale computing time
>
> On Thu, Oct 22, 2020 at 3:23 PM Antoine Côté <Antoine.Cote3 at usherbrooke.ca>
> wrote:
>
> Hi,
>
> I'm working with a 3D DMDA, with 3 dof per "node", used to create a sparse
> matrix Mat K. The Mat is modified repeatedly by the program, using the
> commands (in that order):
>
> MatZeroEntries(K)
> In a for loop : MatSetValuesLocal(K, 24, irow, 24, icol, vals, ADD_VALUES)
> MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY)
> MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY)
> MatDiagonalScale(K, vec1, vec1)
> MatDiagonalSet(K, vec2, ADD_VALUES)
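>
> In C, the update loop looks roughly like this (a simplified sketch; nElem and
> the per-element irow/icol/vals arrays come from the rest of the code):
>
>   MatZeroEntries(K);
>   for (PetscInt e = 0; e < nElem; ++e) {
>     /* 24 = 8 nodes per element x 3 dof per node */
>     MatSetValuesLocal(K, 24, irow, 24, icol, vals, ADD_VALUES);
>   }
>   MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY);
>   MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY);
>   MatDiagonalScale(K, vec1, vec1);
>   MatDiagonalSet(K, vec2, ADD_VALUES);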
>
> Computing time seems high and I would like to improve it. Running tests
> with "-log_view" tells me that MatScale() is the bottleneck (50% of total
> computing time). From the manual pages, I've tried a few tweaks (a sketch of
> the corresponding calls follows the list):
>
>    - DMSetMatType(da, MATMPIBAIJ) : "For problems with multiple degrees
>    of freedom per node, ... BAIJ can significantly enhance performance",
>    Chapter 14.2.4
>    - Used MatMissingDiagonal() to confirm there are no missing diagonal
>    entries: "If the matrix Y is missing some diagonal entries this routine
>    can be very slow", MatDiagonalSet() manual
>    - Tried MatSetOption()
>       - MAT_NEW_NONZERO_LOCATIONS == PETSC_FALSE : to increase assembly
>       efficiency
>       - MAT_NEW_NONZERO_LOCATION_ERR == PETSC_TRUE : "When true, assembly
>       processes have one less global reduction"
>       - MAT_NEW_NONZERO_ALLOCATION_ERR == PETSC_TRUE : "When true,
>       assembly processes have one less global reduction"
>       - MAT_USE_HASH_TABLE == PETSC_TRUE : "Improve the searches during
>       matrix assembly"
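>
> In code, the above corresponds roughly to this sketch (the DMCreateMatrix()
> call and the missing/dd variables are only illustrative):
>
>   DMSetMatType(da, MATMPIBAIJ);
>   DMCreateMatrix(da, &K);
>
>   PetscBool missing;
>   PetscInt  dd;
>   MatMissingDiagonal(K, &missing, &dd);   /* missing comes back PETSC_FALSE */
>
>   MatSetOption(K, MAT_NEW_NONZERO_LOCATIONS, PETSC_FALSE);
>   MatSetOption(K, MAT_NEW_NONZERO_LOCATION_ERR, PETSC_TRUE);
>   MatSetOption(K, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE);
>   MatSetOption(K, MAT_USE_HASH_TABLE, PETSC_TRUE);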
>
> According to "-log_view", assembly is fast (0% of total time), and the use
> of a DMDA makes me believe preallocation isn't the cause of the performance
> issue.
>
> I would like to know how I could improve MatScale(). What are the best
> practices (during allocation, when defining Vecs and Mats, the DMDA, etc.)?
> Instead of MatDiagonalScale(), should I use another command to obtain the
> same result faster?
>
>
> Something is definitely strange. Can you please send the output of
>
>   -log_view -info :mat
>
>   Thanks,
>
>      Matt
>
>
> Thank you very much!
>
> Antoine Côté
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <LogView.out><mat.0>
>
>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>