<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hello Matt and PETSc users,<br><div><blockquote type="cite"><div><br>1) With any performance question, please send the output of -log_summary</div></blockquote><div><br></div><div>You'll find the output attached</div><div>(but there is _really_ not much to see).</div><div><br></div><blockquote type="cite"><div><br>2) I think it is unlikely that cache misses are responsible for this<br>performance. It is<br> much more likely that bandwidth limitations are responsible.</div></blockquote><div><br></div><div>As far as I can see, there are neither bandwidth limitations nor latency problems (since there is an InfiniBand interconnect). </div><div>MPI performance, as measured with VampirTrace and Scalasca, looks good (late senders/receivers, barriers, etc.). 
</div><div>The PAPI instrumentation, however, points to cache misses.</div><br><blockquote type="cite"><div><br>Please see the paper<br> by Kaushik and Gropp which models sparse matvec performance (on<br>Dinesh's website).<br><br></div></blockquote><div><br></div><div>Which paper, and on which website? Please send a link.</div><div><br></div><blockquote type="cite"><div>3) You would see better performance using a block method. Sparse matvec without<br> blocks will never see good percentages of peak (ditto for backsolve).<br></div></blockquote></div><div><br></div><div>How do I use the block methods? </div><div>Since I rely on the "user-level" interfaces (KSPSolve etc.), I don't see how I could influence this.</div><div>You'll find the basic source code attached.</div><div><br></div><div>Sincerely,</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>Christoph<br></div><div><br></div><br><div><div>--</div><div>Christoph Statz</div><div><br></div><div>Institut für Nachrichtentechnik</div><div>Technische Universität Dresden</div><div>01062 Dresden</div><div><br></div><div>Email: <a href="mailto:christoph.statz@mailbox.tu-dresden.de">christoph.statz@mailbox.tu-dresden.de</a></div><div>Phone: +49 351 463 32287</div><div><br></div><div></div></div></body></html>