[petsc-dev] MatTransposeMatMult and MatTranspose

Jed Brown jedbrown at mcs.anl.gov
Sun Oct 14 17:31:35 CDT 2012


I added proper preallocation for MatTranspose_MPIAIJ(), which speeds it up
greatly.

https://bitbucket.org/petsc/petsc-dev/changeset/486d000050ec62fbd732c0049cb5f09b2b5709b8
https://bitbucket.org/petsc/petsc-dev/changeset/75fca7ed1efa754ca010596a8ba69319501baf52 (oops)
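
For readers unfamiliar with the term, here is a minimal serial sketch of what
"preallocation" usually means for a sparse transpose: count the entries that
will land in each output row first, so the result can be sized exactly before
any value is inserted. This is not the code from the changesets above. The
function name csr_transpose and the plain-list CSR layout are assumptions made
for illustration, and the MPI side of MatTranspose_MPIAIJ() is not modelled.

```python
def csr_transpose(n_rows, n_cols, indptr, indices, data):
    """Transpose a CSR matrix, sizing the output with a counting pass first."""
    nnz = indptr[n_rows]

    # Pass 1 (the "preallocation"): the number of entries in column j of A
    # is the length of row j of A^T.
    counts = [0] * n_cols
    for j in indices[:nnz]:
        counts[j] += 1

    # Prefix sum of the counts gives the row pointers of A^T.
    t_indptr = [0] * (n_cols + 1)
    for j in range(n_cols):
        t_indptr[j + 1] = t_indptr[j] + counts[j]

    # Pass 2: every entry goes straight into its final slot, so the output
    # arrays never have to grow or be reallocated.
    t_indices = [0] * nnz
    t_data = [0.0] * nnz
    next_slot = t_indptr[:-1]  # slicing copies; next free slot per output row
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            dest = next_slot[j]
            t_indices[dest] = i
            t_data[dest] = data[k]
            next_slot[j] += 1
    return t_indptr, t_indices, t_data


# A = [[1, 0, 2],
#      [0, 3, 0]]
t_indptr, t_indices, t_data = csr_transpose(
    2, 3, [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
)
```

Without the counting pass, a transpose that inserts entries one at a time has
to guess row lengths and grow storage as it goes, which is the usual reason an
unpreallocated sparse assembly is slow.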

Testing on cg
$ mpirun -n 64 ./ex56 -pc_type gamg -ksp_monitor -ksp_rtol 1e-1 \
    -log_summary -mattransposematmult_viamatmatmult 1

*Before:*
-ne 99
MatTranspose           3 1.0 *1.3230e+00* 1.0 0.00e+00 0.0 1.0e+04 2.7e+03 5.1e+01 17  0  3  2  4  33  0  6  7  4     0
MatTrnMatMult          3 1.0 1.8360e+00 1.0 2.26e+07 1.1 2.3e+04 6.0e+03 1.2e+02 24  2  6 12  9  46 10 13 35 10   765
-ne 119
MatTranspose           3 1.0 *2.3402e+00* 1.0 0.00e+00 0.0 1.3e+04 3.1e+03 5.1e+01 16  0  3  2  4  34  0  6  7  4     0
MatTrnMatMult          3 1.0 3.2240e+00 1.0 3.91e+07 1.1 2.8e+04 6.9e+03 1.2e+02 23  2  6 12  9  46 10 13 35 10   759

*After:*
-ne 99
MatTranspose           3 1.0 *9.5813e-02* 1.0 0.00e+00 0.0 1.0e+04 2.7e+03 4.8e+01  1  0  3  2  4   3  0  6  7  4     0
MatTrnMatMult          3 1.0 6.0673e-01 1.0 2.26e+07 1.1 2.3e+04 6.0e+03 1.2e+02  8  2  6 12  9  21 10 13 35 10  2316
-ne 119
MatTranspose           3 1.0 *1.8572e-01* 1.0 0.00e+00 0.0 1.3e+04 3.1e+03 4.8e+01  2  0  3  2  4   4  0  6  7  4     0
MatTrnMatMult          3 1.0 1.0656e+00 1.0 3.91e+07 1.1 2.8e+04 6.9e+03 1.2e+02 10  2  6 12  9  23 10 13 35 10  2297
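
To put a number on "greatly", the speedups can be worked out from the time
column of the Before and After rows. The snippet below is only arithmetic on
the values logged above; the dictionary layout and variable names are
illustrative.

```python
# Times in seconds, copied from the Before and After rows above.
before = {
    ("MatTranspose", 99): 1.3230e+00,
    ("MatTranspose", 119): 2.3402e+00,
    ("MatTrnMatMult", 99): 1.8360e+00,
    ("MatTrnMatMult", 119): 3.2240e+00,
}
after = {
    ("MatTranspose", 99): 9.5813e-02,
    ("MatTranspose", 119): 1.8572e-01,
    ("MatTrnMatMult", 99): 6.0673e-01,
    ("MatTrnMatMult", 119): 1.0656e+00,
}

# Speedup = old time / new time for each (event, -ne) pair.
speedup = {key: before[key] / after[key] for key in before}

for (event, ne), factor in sorted(speedup.items()):
    print(f"{event:14s} -ne {ne:3d}: {factor:5.1f}x")
```

MatTranspose comes out roughly 13x faster in both runs, and MatTrnMatMult,
which includes the transpose when run via MatMatMult, roughly 3x faster.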

*Reference* (-mattransposematmult_viamatmatmult 0):
-ne 99
MatTrnMatMult          3 1.0 8.0196e-01 1.0 1.02e+08 1.1 1.3e+04 1.3e+04 8.7e+01 13 10  4 15  7  28 33  8 40  7  7831
-ne 119
MatTrnMatMult          3 1.0 1.3759e+00 1.0 1.78e+08 1.1 1.6e+04 1.6e+04 8.7e+01 12 10  4 15  7  27 33  8 40  8  7999

I don't know why the reference implementation claims to have done so many
more flops.
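
The size of that gap can be quantified from the MatTrnMatMult rows. This is
again only arithmetic on the logged values. It assumes the fifth numeric column
is the maximum flop count over processes, which is how -log_summary normally
lays out its rows, and it does not explain where the extra flops come from.

```python
# (time in seconds, logged flops) from the MatTrnMatMult rows above.
via_matmatmult = {99: (6.0673e-01, 2.26e+07), 119: (1.0656e+00, 3.91e+07)}
reference = {99: (8.0196e-01, 1.02e+08), 119: (1.3759e+00, 1.78e+08)}

# How many times more flops the reference path logs, and how much longer
# it takes than the transpose-then-multiply path after the fix.
flop_ratio = {ne: reference[ne][1] / via_matmatmult[ne][1] for ne in reference}
time_ratio = {ne: reference[ne][0] / via_matmatmult[ne][0] for ne in reference}

for ne in sorted(reference):
    print(f"-ne {ne}: {flop_ratio[ne]:.2f}x flops, {time_ratio[ne]:.2f}x time")
```

The reference path logs about 4.5x the flops in both runs while taking about
1.3x as long.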

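With the faster transpose, going through an explicit transpose is now quicker
than the reference path in both runs. This suggests that it may make sense for
MatPtAP to do an explicit transpose and then RAP, unless we can find a fast
data structure for A^T * B.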
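
A toy sketch of the two ways to form A^T * B may help show why the data
structure matters. It stores each matrix as a list of {column: value} rows,
which is an assumption made for brevity and not PETSc's AIJ format. The
function names are hypothetical and nothing here is parallel.

```python
def transpose(rows, n_cols):
    """Explicit transpose of a matrix stored as a list of {col: value} rows."""
    t = [dict() for _ in range(n_cols)]
    for i, row in enumerate(rows):
        for j, v in row.items():
            t[j][i] = v
    return t


def matmat(a_rows, b_rows):
    """C = A * B row by row: row i of C is built only from row i of A."""
    c = []
    for a_row in a_rows:
        acc = {}
        for k, a_ik in a_row.items():
            for j, b_kj in b_rows[k].items():
                acc[j] = acc.get(j, 0.0) + a_ik * b_kj
        c.append(acc)
    return c


def transpose_matmat_direct(a_rows, b_rows, n_cols_a):
    """C = A^T * B without forming A^T: each row of A scatters into many rows of C."""
    c = [dict() for _ in range(n_cols_a)]
    for a_row, b_row in zip(a_rows, b_rows):
        for i, a_ki in a_row.items():
            target = c[i]  # output row chosen by a column index of A
            for j, b_kj in b_row.items():
                target[j] = target.get(j, 0.0) + a_ki * b_kj
    return c


# A = [[1, 2], [0, 3], [4, 0]],  B = [[1, 0], [0, 1], [2, 2]]
A = [{0: 1.0, 1: 2.0}, {1: 3.0}, {0: 4.0}]
B = [{0: 1.0}, {1: 1.0}, {0: 2.0, 1: 2.0}]

via_explicit_transpose = matmat(transpose(A, 2), B)
direct = transpose_matmat_direct(A, B, 2)
```

Both routes give the same product. The row-by-row version finishes one output
row at a time, whereas the direct version keeps updating output rows selected
by the column indices of A; with rows distributed across processes, those
output rows need not be local. Under the same toy layout, the
transpose-then-RAP idea would read matmat(matmat(transpose(P, n), A), P), with
n the number of columns of P.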