<div class="gmail_quote">Forwarding here in case anyone has comments on this approach to MatMatMultSymbolic. The idea is to use a heap as a priority queue for the next nontrivial entry in the rows of B that need to be merged.<div>

<br></div><div>The asymptotics are basically the same as if you have a list with O(log n) insertion. There are two versions, the _Heap uses the heap to make sure that all duplicate entries come to the top at the same time. The _BTHeap version uses a PetscBT to avoid putting duplicates into the heap (so each nonzero in the merged version involves one heap operation).</div>


<div><br></div><div>It's possible that merging lists (accelerated by BT) is still the best option in the long run. It seems to be somewhat more memory-friendly. Alternatively, making the heap fatter might give enough of a boost to pay off.</div>

<div><br></div><div>The attached callgrind/kcachegrind profiles were moved to</div><div><br></div><div><div><a href="http://i.imgur.com/j7yOh.png">http://i.imgur.com/j7yOh.png</a> (btheap)</div><div><a href="http://i.imgur.com/1ORcH.png">http://i.imgur.com/1ORcH.png</a> (standard)</div>

<div><div class="h5"><br>

<br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Jed Brown</b> <span dir="ltr"><<a href="mailto:jedbrown@mcs.anl.gov" target="_blank">jedbrown@mcs.anl.gov</a>></span><br>

Date: Sat, Jun 9, 2012 at 4:18 PM<br>

Subject: Re: Heaps<br>To: Hong Zhang <<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>><br><br><br><div class="gmail_quote"><div><div>On Sat, Jun 9, 2012 at 3:51 PM, Hong Zhang <span dir="ltr"><<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Jed :<div><br></div><div>Wow, you are so fast!<br><div class="gmail_quote"><div><div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I implemented a heap-based MatMatMultSymbolic_SeqAIJ_SeqAIJ_Heap like we discussed earlier, but it's slower than the PetscLLCondensed stuff.<div>


<br></div><div>I left the code in case you want to look at it or in case the heap can be used elsewhere.</div>

<div><br></div><div><a href="http://petsc.cs.iit.edu/petsc/petsc-dev/rev/d2e6d8577ed8" target="_blank">http://petsc.cs.iit.edu/petsc/petsc-dev/rev/d2e6d8577ed8</a></div><div><a href="http://petsc.cs.iit.edu/petsc/petsc-dev/rev/437bcb349f01" target="_blank">http://petsc.cs.iit.edu/petsc/petsc-dev/rev/437bcb349f01</a></div>


</blockquote><div><br></div></div></div><div>I'll look at this while supervising the high school kid :-)</div><div>The algorithm gives optimal complexity than current linear search.</div><div>We need investigate and optimize it though. This is a good project for</div>


<div>the kid.</div></div></div></blockquote><div><br></div></div></div><div>I just pushed a better one which has competitive performance (now there is one heap operation per nonzero in C rather than one per entry in the rows of B that need to be merged).</div>


<div><br></div><div><a href="http://petsc.cs.iit.edu/petsc/petsc-dev/rev/27307154f71e" target="_blank">http://petsc.cs.iit.edu/petsc/petsc-dev/rev/27307154f71e</a></div><div><a href="http://petsc.cs.iit.edu/petsc/petsc-dev/rev/4ab803940a7f" target="_blank">http://petsc.cs.iit.edu/petsc/petsc-dev/rev/4ab803940a7f</a></div>


<div><br></div><div>This could perhaps be optimized more.</div><div><br></div><div>$ valgrind --tool=callgrind ./ex94 -f0 /home/jed/petsc/datafiles/matrices/arco3 -f1 /home/jed/petsc/datafiles/matrices/arco3 -viewer_binary_skip_info -matmatmult_btheap</div>


<div><br></div><div>See attached profiles.</div></div>

</div><br></div></div></div>

</div><br>