[petsc-dev] Triangular solve on the GPU

Chris Cooper chris.cooper at drdstudios.com
Mon Aug 16 16:47:30 CDT 2010


You're probably aware that the Thrust library, which you're already including,
also provides GPU algorithms for prefix-sum, reduce, sort, etc., e.g.
http://thrust.googlecode.com/svn/tags/1.2.0/doc/html/modules.html
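
To make the first-order recurrence idea from the quoted message concrete, here
is a rough sketch of the simplest case, a lower bidiagonal solve, written as a
scan over affine maps. It is only an illustration under my own assumptions: it
uses the C++17 standard library on the host rather than the GPU, and the names
Affine, compose and solve_lower_bidiagonal are made up for this example (they
are not PETSc or Thrust API).

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Affine map x -> a*x + c. One row of a lower bidiagonal system,
//   sub[i]*x[i-1] + diag[i]*x[i] = b[i],
// rearranges to x[i] = a*x[i-1] + c with a = -sub[i]/diag[i], c = b[i]/diag[i].
struct Affine {
    double a;
    double c;
};

// Compose two maps: apply `first`, then `second`.
//   second(first(x)) = second.a*(first.a*x + first.c) + second.c
// Composition is associative (but not commutative), which is the property
// a parallel scan needs.
inline Affine compose(const Affine& first, const Affine& second) {
    return Affine{second.a * first.a, second.a * first.c + second.c};
}

// Hypothetical helper: solve L x = b for lower bidiagonal L.
// diag[i] = L(i,i), sub[i] = L(i,i-1); sub[0] is ignored.
std::vector<double> solve_lower_bidiagonal(const std::vector<double>& diag,
                                           const std::vector<double>& sub,
                                           const std::vector<double>& b) {
    const std::size_t n = diag.size();
    assert(sub.size() == n && b.size() == n);

    // Build one affine map per row; row 0 has no predecessor, so a = 0.
    std::vector<Affine> maps(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double a = (i == 0) ? 0.0 : -sub[i] / diag[i];
        maps[i] = Affine{a, b[i] / diag[i]};
    }

    // Inclusive scan under composition: afterwards maps[i] is the
    // composition of rows 0..i, applied in row order.
    std::inclusive_scan(maps.begin(), maps.end(), maps.begin(), compose);

    // Row 0 has a = 0, so every composed map is constant and x[i] is its offset.
    std::vector<double> x(n);
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = maps[i].c;
    }
    return x;
}
```

For example, diag = {2, 4, 5}, sub = {0, 1, 2} and b = {2, 9, 19} should give
x close to {1, 2, 3}. On the GPU the same operator could presumably be handed
to thrust::inclusive_scan, which takes an analogous (first, last, result,
binary_op) argument list, with compose turned into a __host__ __device__
functor; I have not tried that here.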

On Mon, Aug 16, 2010 at 10:06 PM, Matthew Knepley <knepley at gmail.com> wrote:

> I am no longer as pessimistic as I once was. To start, this is a good
> report on prefix-sum in parallel:
>
>   http://www.cs.cmu.edu/~blelloch/papers/Ble93.pdf
>
> It can solve first-order recurrences optimally. Higher-order recurrences
> can be reduced to first-order ones,
> but the result is not work-optimal (too much work by a factor of the
> order, m), which seems alright to me. Thus
> we should be able to solve matrices of bandwidth m efficiently. Here is a nice
> implementation of prefix-sum:
>
>   http://code.google.com/p/cudpp/
>
> This all leads me to believe that for a bounded number of nonzeros per row,
> we can get an efficient
> algorithm for triangular solve.
>
>   Matt
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>



-- 
Chris Cooper
Dr. D Studios