[petsc-dev] Non-scalable matrix operations

Mark F. Adams mark.adams at columbia.edu
Fri Dec 23 13:37:41 CST 2011


On Dec 23, 2011, at 2:20 PM, Matthew Knepley wrote:

> 
> 
> On Fri, Dec 23, 2011 at 12:55 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> On Fri, Dec 23, 2011 at 12:27, Mark F. Adams <mark.adams at columbia.edu> wrote:
> A more interesting thing is to partition down to the thread level and keep about 100 vertices per thread (this might be too big for a GPU...)
> 
> It's fine to have more partitions than threads.
>  
> and then use locks of some sort for the shared memory synchronization
> 
> It can be lock-free: your thread just waits until a buffer has been marked as updated. Since the reader/writer relationships are predefined, it's not actually a lock. (You can do more general methods lock-free too.)
> 
> You could use
> 
>   a) coloring and stream events (easy)
> 
>   b) what John Cohen does which I still do not understand
> 
> We should talk to him

Also, my algorithm exploits static partitions, where you know who is processing your ghost nodes (so you can avoid stepping on each other).  That information might not be available here.  Coloring looks OK for 7-point stencils, but the number of colors gets very large at higher order: you need about 13 colors for a 3D hex mesh, which is not very high order at all.  Inside the compute node, though, there are sets of vertices that need processing and you can do whatever you want on these sets, so if you need to color then so be it.
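
To make the coloring point concrete, here is a minimal sketch of greedy vertex coloring on a small structured grid, assuming either 7-point (face neighbors only) or 27-point (face, edge and corner neighbors) coupling.  The function names and the grid size are made up for illustration and are not PETSc API, and the color count it reports depends on the visiting order and the mesh, so it is not meant to reproduce the figure of about 13 above.

```python
from itertools import product


def stencil_offsets(kind):
    """Neighbor offsets for a 7-point or 27-point stencil (hypothetical helper)."""
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    if kind == 7:
        # Keep only the six face neighbors.
        return [d for d in offsets if sum(abs(c) for c in d) == 1]
    if kind == 27:
        # Face, edge and corner neighbors: 26 offsets in total.
        return offsets
    raise ValueError("kind must be 7 or 27")


def neighbors(v, offsets):
    """Yield the candidate neighbors of vertex v (may lie outside the grid)."""
    for d in offsets:
        yield tuple(a + b for a, b in zip(v, d))


def greedy_color(n, kind):
    """Greedily color an n x n x n grid, visiting vertices in lexicographic order."""
    offsets = stencil_offsets(kind)
    color = {}
    for v in product(range(n), repeat=3):
        # Colors already taken by neighbors that have been visited.
        used = {color[w] for w in neighbors(v, offsets) if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def is_valid(color, kind):
    """True if no two coupled vertices share a color."""
    offsets = stencil_offsets(kind)
    return all(
        color[v] != color[w]
        for v in color
        for w in neighbors(v, offsets)
        if w in color
    )


def color_classes(color):
    """Group vertices by color; each group can be processed without conflicts."""
    classes = {}
    for v, c in color.items():
        classes.setdefault(c, []).append(v)
    return classes


if __name__ == "__main__":
    for kind in (7, 27):
        coloring = greedy_color(4, kind)
        print(kind, "point stencil:", len(color_classes(coloring)), "colors")
```

Within one color class no two vertices are coupled, so a thread can update a whole class without synchronizing against its neighbors; the cost is one synchronization point per color, which is why a growing color count hurts.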

Mark

> 
>    Matt 
>  
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
