[petsc-dev] How does this compare...

Jack Poulson jack.poulson at gmail.com
Wed Mar 28 22:28:27 CDT 2012


Actually, please excuse my misreading: the quoted 2-3x speedup was versus
their own implementation (from hardware prefetching), not versus SGEMM, so the
algorithm does appear to be asymptotically better than SGEMM (they quote the
crossover at n ~= 1000).

Jack

On Wed, Mar 28, 2012 at 10:23 PM, Jack Poulson <jack.poulson at gmail.com> wrote:

> I just took a look at the paper and, after reading a bit, I am baffled
> that there are no citations of the H-matrix literature (Ctrl+F "Hackbusch"
> turns up nothing...). The paper seems to be in a similar spirit to the
> well-known H-matrix/H-matrix multiplication algorithms, which take
> O(N log^2 N) work for O(1) ranks, except that it uses truncation instead
> of low-rank approximations within clusters. Since only factors of 2-3x are
> reported versus SGEMM, which is O(N^3), that suggests this algorithm (or,
> at least, its implementation) is also O(N^3).
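
The truncation being contrasted with low-rank approximation here is, roughly, a
norm-based pruning of sub-products in a recursive blocked multiply. A minimal
sketch of that idea in C, assuming a hypothetical quadtree node layout (this is
one reading of the approach, not the paper's code):

    #include <stddef.h>

    /* One quadtree node: either four children or an n x n dense leaf. */
    typedef struct Node {
      double       norm;     /* Frobenius norm of this block         */
      struct Node *child[4]; /* 2x2 children; NULL at a leaf         */
      double      *data;     /* n x n dense row-major data at a leaf */
      size_t       n;        /* leaf block dimension                 */
    } Node;

    /* C += A*B on dense leaf blocks */
    static void leaf_gemm(Node *C, const Node *A, const Node *B)
    {
      size_t n = C->n;
      for (size_t i = 0; i < n; ++i)
        for (size_t k = 0; k < n; ++k)
          for (size_t j = 0; j < n; ++j)
            C->data[i*n + j] += A->data[i*n + k] * B->data[k*n + j];
    }

    /* C += A*B, skipping any sub-product with ||A_ik|| * ||B_kj|| < tau */
    static void spamm(Node *C, const Node *A, const Node *B, double tau)
    {
      if (A->norm * B->norm < tau) return;              /* truncation */
      if (!A->child[0]) { leaf_gemm(C, A, B); return; } /* leaf level */
      for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
          for (int k = 0; k < 2; ++k)
            spamm(C->child[2*i + j], A->child[2*i + k], B->child[2*k + j], tau);
    }

For matrices with off-diagonal decay, the intent is that most sub-products fail
the norm test at a coarse level, which is where any savings over a full SGEMM
would come from.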
>
> I do not think a comparison can be drawn to sparse matrix-matrix multiply
> algorithms. I believe Gilbert and Buluc have the most recent publication on
> parallel sparse mat-mat. I have done some work on parallel
> H-matrix/H-matrix multiplication (the code is here:
> http://bitbucket.org/poulson/dmhm), but I haven't pushed the paper out
> the door yet. See http://www.ices.utexas.edu/~poulson/slides/DMHM.pdf for
> the talk I gave on it. I suspect that the ideas from parallel
> H-matrix/H-matrix multiplication could be used to generate a
> communication-efficient sparse mat-mat if it is okay to generate the result
> in a more general data-sparse form (i.e., low-rank blocks) rather than
> explicitly in a classically sparse format.
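
Why low-rank blocks keep such a product cheap to form and to communicate: the
product of two rank-r blocks is again a block of rank at most r, obtained
through an r x r intermediate, so only thin n x r factors are ever computed or
moved. A minimal sketch in C (hypothetical names, not dmhm's API):

    #include <stddef.h>

    /* W (m x p) += X (m x kk) * Y (kk x p), all dense row-major */
    static void gemm(size_t m, size_t kk, size_t p,
                     const double *X, const double *Y, double *W)
    {
      for (size_t i = 0; i < m; ++i)
        for (size_t k = 0; k < kk; ++k)
          for (size_t j = 0; j < p; ++j)
            W[i*p + j] += X[i*kk + k] * Y[k*p + j];
    }

    /* (UA*VAt) * (UB*VBt) = UC*VBt, where UC = UA*(VAt*UB).
     * Work is O(n r^2) versus O(n^3) for multiplying the same blocks
     * densely, and the result keeps rank <= r. */
    static void lowrank_mult(size_t n, size_t r,
                             const double *UA,  /* n x r                 */
                             const double *VAt, /* r x n                 */
                             const double *UB,  /* n x r                 */
                             double *T,         /* r x r scratch, zeroed */
                             double *UC)        /* n x r output, zeroed  */
    {
      gemm(r, n, r, VAt, UB, T); /* T  = VA^T * UB */
      gemm(n, r, r, UA,  T, UC); /* UC = UA * T    */
      /* the right factor of the result is just VB^T, reused as-is */
    }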
>
> Jack
>
>
> On Wed, Mar 28, 2012 at 10:07 PM, Hong Zhang <hzhang at mcs.anl.gov> wrote:
>
>> The entire paper discusses SpAMM and compares it with SGEMM.
>> Sorry, I do not see anything about SpMV.
>> Hong
>>
>> On Wed, Mar 28, 2012 at 9:52 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>
>>> On Wed, Mar 28, 2012 at 9:39 PM, Hong Zhang <hzhang at mcs.anl.gov> wrote:
>>>
>>>> Matt:
>>>>
>>>>> to the new scalable MatMatMult() that is done in PETSc? Is it even
>>>>> possible to compare the flop rates sensibly?
>>>>>
>>>>>   http://arxiv.org/abs/1203.1692
>>>>>
>>>> "An Optimized Sparse Approximate Matrix Multiply"
>>>> computes  Approximate matrix product for matrices with decay
>>>>                    ^^^^^^^^^^^^
>>>>       ^^^^^^^^^^
>>>> It is for special application in chemistry(?),
>>>> not applicable for general sparse matrix product.
>>>> No comparison can be drawn to petsc's C=A*B.
>>>>
>>>
>>> This is the wrong conclusion. What he does is sparsify the result, so he
>>> is computing SOME sort of sparse matrix-vector product. The question is:
>>>
>>>    Does his SpMV use the same algorithm as PETSc?
>>>
>>> That is why I asked.
>>>
>>>     Matt
>>>
>>>
>>>> Hong
>>>>
>>>>>
>>>>> This is what Jeff is planning on using in his SciDAC.
>>>>>
>>>>>       Matt
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which their
>>>>> experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>
>>
>

