[petsc-users] RE : How to get bandwidth peak out of PETSc log ?
Jed Brown
jedbrown at mcs.anl.gov
Fri Jun 21 05:41:11 CDT 2013
HOUSSEN Franck <Franck.Houssen at cea.fr> writes:
> MatMult 13216 1.0 5.4614e+01 1.0 5.52e+10 1.0 0.0e+00 0.0e+00 0.0e+00 21 56 0 0 0 21 56 0 0 0 1010
Much of your time is spent here, in MatMult, which is bandwidth limited.
It needs 6 bytes per flop (plus a little, if the vector is perfectly
reused), so 1010 Mflop/s corresponds to about 6 GB/s.
> MatMult 13550 1.0 3.8204e+01 1.0 2.85e+10 1.0 2.7e+04 3.5e+04 0.0e+00 23 55 79 93 0 23 55 79 93 0 1494
Here with two processes, you have about 9 GB/s.
Was your STREAM test (getting 11 GB/s) using multiple threads/processes?
Can you send STREAM results for one and for two threads? 50% of STREAM
is not very good (though it's actually the best you can do on some funny
architectures), 70-85% is what we expect.
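
The estimates above can be reproduced with a short calculation. This is only a sketch: it assumes the last column of each MatMult line is the Mflop/s rate, uses the 6 bytes per flop figure from this message, and compares against the 11 GB/s STREAM number from the thread. The function names are made up for illustration, and the comparison is only meaningful if STREAM was run with the same number of threads/processes.

```python
# Rough memory-bandwidth estimate from the Mflop/s column of a PETSc log line.
BYTES_PER_FLOP = 6.0  # figure quoted in this message for MatMult


def implied_bandwidth_gbs(mflops, bytes_per_flop=BYTES_PER_FLOP):
    """Convert a Mflop/s rate into approximate GB/s of memory traffic."""
    return mflops * 1e6 * bytes_per_flop / 1e9


def stream_fraction(bandwidth_gbs, stream_gbs):
    """Fraction of the measured STREAM bandwidth that is achieved."""
    return bandwidth_gbs / stream_gbs


STREAM_GBS = 11.0  # STREAM result mentioned in the thread (thread count unknown)

one_proc = implied_bandwidth_gbs(1010)  # MatMult rate with one process
two_proc = implied_bandwidth_gbs(1494)  # MatMult rate with two processes

print(f"1 process:   {one_proc:.1f} GB/s, "
      f"{100 * stream_fraction(one_proc, STREAM_GBS):.0f}% of STREAM")
print(f"2 processes: {two_proc:.1f} GB/s, "
      f"{100 * stream_fraction(two_proc, STREAM_GBS):.0f}% of STREAM")
```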
If you're getting a low fraction of peak in MatMult, try reordering your
matrix to have lower (matrix) bandwidth. You can use MatGetOrdering with
RCM (MATORDERINGRCM) for this.
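
To illustrate what an RCM reordering does to matrix bandwidth, here is a small pure-Python sketch. It is not PETSc's implementation and does not call PETSc. It is a textbook-style reverse Cuthill-McKee on a hypothetical 6x6 symmetric sparsity pattern, with helper names invented for this example.

```python
from collections import deque


def bandwidth(adj, order):
    """Largest |i - j| over nonzeros after renumbering order[k] -> k."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def reverse_cuthill_mckee(adj):
    """RCM ordering for a symmetric pattern given as adjacency lists."""
    visited = set()
    order = []
    # Handle each connected component, starting from a low-degree vertex.
    for start in sorted(adj, key=lambda v: len(adj[v])):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Enqueue unvisited neighbours by increasing degree.
            for w in sorted(adj[v], key=lambda u: len(adj[u])):
                if w not in visited:
                    visited.add(w)
                    queue.append(w)
    order.reverse()  # the "reverse" in reverse Cuthill-McKee
    return order


# Hypothetical pattern: a chain of 6 unknowns numbered in a scattered order.
edges = [(0, 3), (3, 1), (1, 4), (4, 2), (2, 5)]
adj = {v: [] for v in range(6)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

natural = list(range(6))
rcm = reverse_cuthill_mckee(adj)
print("bandwidth, natural ordering:", bandwidth(adj, natural))
print("bandwidth, RCM ordering:    ", bandwidth(adj, rcm))
```

With a narrower band, the entries of the input vector touched by neighbouring rows sit closer together in memory, which helps the vector reuse mentioned above.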