<div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br><br>

With a graph and a row-based algorithm, I would partition it using METIS or similar, then apply an RCM ordering within each block. This will give better locality of access to the vector x, which has been known (since at least the PETSc-FUN3D work in the late 90s) to be a major factor in SpMV performance.<br>

<br></blockquote><div><br></div><div>I can just add that we (at Berkeley) also looked at SpMV ordering in the 90s. Sivan Toledo has a nice paper on it and a grad student made a PhD thesis in the topic ("Sparsity" package).</div><div>Anyway we found that if you did something reasonable, RCM, Metis, performance was pretty similar (we used IBMs a lot also), and random ordering was about 3x slower. FWIW.</div></div></div>