[petsc-users] matsetvaluesblocked4_

Jed Brown jed at jedbrown.org
Wed May 27 18:34:43 CDT 2020


Mark Adams <mfadams at lbl.gov> writes:

> Nvidia's Nsight with 2D Q3 and bs=10. (attached).

Thanks; this is basically the same as on a CPU -- the cost is searching the
sorted rows for the next entry.  I've long thought we should optimize
the implementations to fast-path when the next column index in the
sparse matrix equals the next index in the provided block.  It'd just
take a good CPU test to demonstrate that payoff.
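
A minimal sketch of that fast path, for a single row with scalar column
indices. This is not PETSc code: add_row_sorted is a hypothetical helper,
and it assumes both the stored row and the provided indices are sorted and
that entries outside the existing pattern are skipped. Block column indices
could be handled the same way, one block per index.

```c
#include <assert.h>

/* Hypothetical helper, not a PETSc function.
 * Adds vals[0..n) at columns cols[0..n) into one sparse row whose stored
 * column indices rowcols[0..nrow) are sorted ascending. cols[] is assumed
 * sorted ascending too. Columns missing from the row are skipped.
 * Returns how many entries were located by the fast path. */
static int add_row_sorted(int nrow, const int *rowcols, double *rowvals,
                          int n, const int *cols, const double *vals)
{
  int pos = 0;  /* next candidate position in the stored row */
  int fast = 0; /* number of fast-path hits */

  assert(nrow >= 0 && n >= 0);
  for (int k = 0; k < n; k++) {
    if (pos < nrow && rowcols[pos] == cols[k]) {
      fast++; /* fast path: next stored column equals next provided column */
    } else {
      /* fallback: binary search in the remainder rowcols[pos..nrow) */
      int lo = pos, hi = nrow;
      while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (rowcols[mid] < cols[k]) lo = mid + 1;
        else hi = mid;
      }
      pos = lo;
      if (pos == nrow || rowcols[pos] != cols[k]) continue; /* not stored */
    }
    rowvals[pos++] += vals[k];
  }
  return fast;
}
```

When the provided indices are contiguous in the stored row, only the first
one needs the search; the rest hit the fast path. A CPU test would time this
against the search-only variant to show whether the payoff is real.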

