[petsc-dev] Triangular solve on the GPU
Matthew Knepley
knepley at gmail.com
Mon Aug 16 07:06:41 CDT 2010
I am no longer as pessimistic as I once was. To start, this is a good report
on prefix-sum in parallel:
http://www.cs.cmu.edu/~blelloch/papers/Ble93.pdf
It can solve first-order recurrences optimally. Higher order recurrences can
be reduced to first-order,
however it is not work optimal (too big by a factor of the order, m), which
seems alright to me. Thus
we should be able to solve bandwidth m matrices efficiently. Here is a nice
implementation of prefix-sum
http://code.google.com/p/cudpp/
This all leads me to believe that for a bounded number of nonzeros per row,
we can get an efficient
algorithm for triangular solve.
Matt
--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20100816/615d43d0/attachment.html>
More information about the petsc-dev
mailing list