<html>


<head>


<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


</head>


<body text="#000000" bgcolor="#FFFFFF">


<br>


<br>


<div class="moz-cite-prefix">On 10/3/19 1:12 AM, Karl Rupp wrote:<br>


</div>


<blockquote type="cite" cite="mid:c5f058f0-6958-6d2a-8ded-53a1f5e8b338@iue.tuwien.ac.at">


<blockquote type="cite">Do you have any experience with nsparse? <br>


<br>


<a class="moz-txt-link-freetext" href="https://github.com/EBD-CREST/nsparse">https://github.com/EBD-CREST/nsparse</a>


<br>


<br>


I've seen claims that it is much faster than cuSPARSE for sparse <br>


matrix-matrix products. <br>


</blockquote>


<br>


I haven't tried nsparse, no. <br>


<br>


But since the performance comes from a hardware feature (cache), I would be surprised if there is a big performance leap over ViennaCL. (There's certainly potential for tweaking ViennaCL's kernels; but note that even ViennaCL is much faster than cuSPARSE's SpGEMM on average). <br>


<br>


With the libaxb-wrapper we can just add nsparse as an operations backend and then easily try it out and compare against the other packages. In the end it doesn't matter which package provides the best performance; we just want to leverage it :-)


<br>


</blockquote>


I'd be happy to add support for this (though I suppose I should play with it first to verify that it is, in fact, worthwhile). Karl, is your branch with libaxb ready for people to start using it, or should we wait for you to do more with it? (Or, would you


 like any help with it?)<br>


<br>


I'd like to try to add support for a few things like cuSPARSE SpGEMM before I go to the Summit hackathon, but I don't want to write a bunch of code that will be thrown away once your libaxb approach is in place.<br>


<br>


--Richard<br>


<blockquote type="cite" cite="mid:c5f058f0-6958-6d2a-8ded-53a1f5e8b338@iue.tuwien.ac.at">


<br>


Best regards, <br>


Karli <br>


<br>


<br>


<br>


<blockquote type="cite"><br>


Karl Rupp via petsc-dev <a class="moz-txt-link-rfc2396E" href="mailto:petsc-dev@mcs.anl.gov">


&lt;petsc-dev@mcs.anl.gov&gt;</a> writes: <br>


<br>


<blockquote type="cite">Hi Richard, <br>


<br>


CPU SpGEMM is about twice as fast even on the GPU-friendly case of a <br>


single rank: <a class="moz-txt-link-freetext" href="http://viennacl.sourceforge.net/viennacl-benchmarks-spmm.html">


http://viennacl.sourceforge.net/viennacl-benchmarks-spmm.html</a> <br>


<br>


I agree that it would be good to have a GPU-MatMatMult for the sake of <br>


experiments. Under these performance constraints it's not top priority, <br>


though. <br>


<br>


Best regards, <br>


Karli <br>


<br>


<br>


On 10/3/19 12:00 AM, Mills, Richard Tran via petsc-dev wrote: <br>


<blockquote type="cite">Fellow PETSc developers, <br>


<br>


I am wondering why the AIJCUSPARSE and AIJVIENNACL matrix types do not <br>


support the sparse matrix-matrix multiplication (SpGEMM, or MatMatMult() <br>


in PETSc parlance) routines provided by cuSPARSE and ViennaCL, <br>


respectively. Is there a good reason that I shouldn't add those? My <br>


guess is that support was not added because SpGEMM is hard to do well on <br>


a GPU compared to many CPUs (it is hard to compete with, say, Intel Xeon <br>


CPUs with their huge caches) and it has been the case that one would <br>


generally be better off doing these operations on the CPU. Since the <br>


trend at the big supercomputing centers seems to be to put more and more <br>


of the computational power into GPUs, I'm thinking that I should add the <br>


option to use the GPU library routines for SpGEMM, though. Is there some <br>


good reason to *not* do this that I am not aware of? (Maybe the CPUs are <br>


better for this even on a machine like Summit, but I think we're at the <br>


point that we should at least be able to experimentally verify this.) <br>


<br>


--Richard <br>


</blockquote>


</blockquote>


</blockquote>


</blockquote>


<br>


</body>


</html>