[petsc-dev] Feed back on report on performance of vector operations on Summit requested
Zhang, Junchao
jczhang at mcs.anl.gov
Thu Oct 10 11:48:49 CDT 2019
*It would be better to have an abstract so readers know your intention and conclusions up front.
*p.5 "We also launch all jobs using the --launch_distribution cyclic option so that MPI ranks are assigned to resource sets in a circular fashion, which we deem appropriate for most high performance computing (HPC) algorithms."
Cyclic distribution is fine for these simple Vec ops since there is almost no communication, but it cannot be deemed appropriate for most HPC algorithms; I assume a packed distribution is better for locality.
*Fig. 1 Left. I would use the diagram on p.11 of https://press3.mcs.anl.gov/atpesc/files/2018/08/ATPESC_2018_Track-1_6_7-30_130pm_Hill-Summit_at_ORNL.pdf, which is more informative and contains many numbers we can compare with your results, e.g., the peak bandwidths, which you mentioned but did not list.
*2.1 "cudaMemcopy"? Should this be cudaMemcpy?
For the two bullets on VecAXPY and VecDot, you should clearly list how you counted their flops and memory traffic, since those counts are what you used to calculate the bandwidth and performance numbers in the report; a sketch of the convention I assume follows below.
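As an illustration, here is a minimal C sketch of the counting I assume (the vector length and timings are hypothetical, not from your report), so you can confirm whether it matches your convention:

  /* Sketch of assumed flop/byte counts for VecAXPY and VecDot,
     double precision, vector length n. Not from the report. */
  #include <stdio.h>

  int main(void)
  {
    const double n      = 1e8;   /* hypothetical vector length */
    const double t_axpy = 5e-3;  /* hypothetical measured time (s) */
    const double t_dot  = 4e-3;  /* hypothetical measured time (s) */

    /* VecAXPY: y = alpha*x + y -> 2n flops; reads x and y, writes y -> 3*8n bytes */
    double axpy_flops = 2.0*n, axpy_bytes = 3.0*8.0*n;

    /* VecDot: sum of x_i*y_i -> 2n flops (2n-1 exactly); reads x and y -> 2*8n bytes */
    double dot_flops  = 2.0*n, dot_bytes  = 2.0*8.0*n;

    printf("VecAXPY: %.1f GF/s, %.1f GB/s\n", axpy_flops/t_axpy/1e9, axpy_bytes/t_axpy/1e9);
    printf("VecDot : %.1f GF/s, %.1f GB/s\n", dot_flops/t_dot/1e9,  dot_bytes/t_dot/1e9);
    return 0;
  }

If you count differently (e.g., ignoring the write-back of y in VecAXPY), please state that explicitly in the report.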
*p.12 "VecACPY"? Presumably this should be VecAXPY.
*p.12 I don't understand the difference between the two GPU launch times.
*Where appropriate, could you draw a line for the hardware peak bandwidth or peak FLOPS/s in the figures?
*p.13, some bullets are not important and could be mentioned earlier, in your experimental setup.
Bullet 4: I think the reason is that, to get peak CPU->GPU bandwidth, the CPU buffer has to be pinned (i.e., non-pageable); see the sketch below.
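To illustrate what I mean, here is a minimal CUDA C sketch (my own, not from the report; the buffer size is arbitrary) comparing host-to-device copy bandwidth from a pageable malloc() buffer versus a pinned cudaMallocHost() buffer:

  /* Pageable vs. pinned host memory for H2D transfers. Compile with nvcc.
     Copies from pageable memory are staged through an internal pinned
     buffer, so they usually fall well short of peak PCIe/NVLink bandwidth. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <cuda_runtime.h>

  int main(void)
  {
    const size_t nbytes = (size_t)256 * 1024 * 1024;  /* 256 MB transfer */
    double *pageable = (double*)malloc(nbytes);
    double *pinned, *d_buf;
    cudaMallocHost((void**)&pinned, nbytes);   /* page-locked host buffer */
    cudaMalloc((void**)&d_buf, nbytes);
    memset(pageable, 0, nbytes);               /* touch pages before timing */
    memset(pinned,   0, nbytes);

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0); cudaEventCreate(&t1);
    float ms;

    cudaEventRecord(t0);
    cudaMemcpy(d_buf, pageable, nbytes, cudaMemcpyHostToDevice);
    cudaEventRecord(t1); cudaEventSynchronize(t1);
    cudaEventElapsedTime(&ms, t0, t1);
    printf("pageable H2D: %.1f GB/s\n", nbytes/ms/1e6);

    cudaEventRecord(t0);
    cudaMemcpy(d_buf, pinned, nbytes, cudaMemcpyHostToDevice);
    cudaEventRecord(t1); cudaEventSynchronize(t1);
    cudaEventElapsedTime(&ms, t0, t1);
    printf("pinned   H2D: %.1f GB/s\n", nbytes/ms/1e6);

    free(pageable); cudaFreeHost(pinned); cudaFree(d_buf);
    return 0;
  }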
--Junchao Zhang
On Wed, Oct 9, 2019 at 5:34 PM Smith, Barry F. via petsc-dev <petsc-dev at mcs.anl.gov> wrote:
We've prepared a short report on the performance of vector operations on Summit and would appreciate any feedback, including on inconsistencies, lack of clarity, incorrect notation or terminology, etc.
Thanks
Barry, Hannah, and Richard