[petsc-dev] Fwd: Heaps

Jed Brown jedbrown at mcs.anl.gov
Sat Jun 9 16:35:12 CDT 2012

Forwarding here in case anyone has comments on this approach to
MatMatMultSymbolic. The idea is to use a heap as a priority queue for the
next nontrivial entry in the rows of B that need to be merged.

The asymptotics are basically the same as if you have a list with O(log n)
insertion. There are two versions, the _Heap uses the heap to make sure
that all duplicate entries come to the top at the same time. The _BTHeap
version uses a PetscBT to avoid putting duplicates into the heap (so each
nonzero in the merged version involves one heap operation).

It's possible that merging lists (accelerated by BT) is still the best
option in the long run. It seems to be somewhat more memory-friendly.
Alternatively, making the heap fatter might give enough of a boost to pay

The attached callgrind/kcachegrind profiles were moved to

http://i.imgur.com/j7yOh.png (btheap)
http://i.imgur.com/1ORcH.png (standard)

---------- Forwarded message ----------
From: Jed Brown <jedbrown at mcs.anl.gov>
Date: Sat, Jun 9, 2012 at 4:18 PM
Subject: Re: Heaps
To: Hong Zhang <hzhang at mcs.anl.gov>

On Sat, Jun 9, 2012 at 3:51 PM, Hong Zhang <hzhang at mcs.anl.gov> wrote:

> Jed :
> Wow, you are so fast!
>> I implemented a heap-based MatMatMultSymbolic_SeqAIJ_SeqAIJ_Heap like we
>> discussed earlier, but it's slower than the PetscLLCondensed stuff.
>> I left the code in case you want to look at it or in case the heap can be
>> used elsewhere.
>> http://petsc.cs.iit.edu/petsc/petsc-dev/rev/d2e6d8577ed8
>> http://petsc.cs.iit.edu/petsc/petsc-dev/rev/437bcb349f01
> I'll look at this while supervising the high school kid :-)
> The algorithm gives optimal complexity than current linear search.
> We need investigate and optimize it though. This is a good project for
> the kid.

I just pushed a better one which has competitive performance (now there is
one heap operation per nonzero in C rather than one per entry in the rows
of B that need to be merged).


This could perhaps be optimized more.

$ valgrind --tool=callgrind ./ex94 -f0
/home/jed/petsc/datafiles/matrices/arco3 -f1
/home/jed/petsc/datafiles/matrices/arco3 -viewer_binary_skip_info

See attached profiles.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20120609/1dda185d/attachment.html>

More information about the petsc-dev mailing list