[petsc-users] matsetvaluesblocked4_
Jed Brown
jed at jedbrown.org
Wed May 27 18:34:43 CDT 2020
Mark Adams <mfadams at lbl.gov> writes:
> Nvidia's Nsight with 2D Q3 and bs=10. (attached).
Thanks; this is basically the same as on a CPU -- the cost is searching the
sorted rows for the next entry. I've long thought we should optimize
the implementations to fast-path when the next column index in the
sparse matrix equals the next index in the provided block. It'd just
take a good CPU test to demonstrate that payoff.
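
To make the idea concrete, here is a minimal sketch of such a fast path for a
single sorted row. It is not the PETSc implementation: the helper names
(find_col, add_row_values), the plain int/double types, and the choice to skip
entries missing from the pattern are all assumptions made for illustration.
Before searching, it checks whether the column right after the last hit
matches the next provided index, and only falls back to a binary search when
it does not.

```c
#include <stddef.h>

/* Hypothetical helper (not PETSc code): binary search for col in the
   sorted array cols[0..n). Returns its position, or -1 if absent. */
static int find_col(const int *cols, int n, int col)
{
  int lo = 0, hi = n;
  while (lo < hi) {
    int mid = lo + (hi - lo) / 2;
    if (cols[mid] < col) lo = mid + 1;
    else hi = mid;
  }
  return (lo < n && cols[lo] == col) ? lo : -1;
}

/* Hypothetical sketch: add vals[k] to the row entry whose column is in[k],
   for k = 0..m-1. cols[0..n) holds the row's sorted column indices and
   a[0..n) the matching values. Returns how many of the m lookups were
   resolved by the fast path instead of a search. Indices that are not in
   the row's pattern are skipped (an assumption of this sketch). */
static int add_row_values(const int *cols, double *a, int n,
                          const int *in, const double *vals, int m)
{
  int next = 0; /* position just after the previous hit */
  int fast = 0; /* number of fast-path hits */
  for (int k = 0; k < m; k++) {
    int pos;
    if (next < n && cols[next] == in[k]) {
      pos = next; /* fast path: next stored column equals next input index */
      fast++;
    } else {
      pos = find_col(cols, n, in[k]); /* slow path: search the sorted row */
    }
    if (pos < 0) continue; /* not in the pattern; skipped in this sketch */
    a[pos] += vals[k];
    next = pos + 1;
  }
  return fast;
}
```

When the provided indices are contiguous in the row, as is typical for element
blocks, only the first lookup per row needs the search. A CPU test could
compare timings with the fast-path branch enabled and disabled.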