[petsc-dev] programming model for PETSc

Matthew Knepley knepley at gmail.com
Thu Nov 24 17:09:48 CST 2011


On Thu, Nov 24, 2011 at 4:49 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:

> On Thu, Nov 24, 2011 at 16:41, Matthew Knepley <knepley at gmail.com> wrote:
>>
>> Let's start with the "lowest" level, or at least the smallest. I think
>> the only sane way to program for portable performance here
>> is using CUDA-type vectorization. This SIMT style is explained well here
>> http://www.yosefk.com/blog/simd-simt-smt-parallelism-in-nvidia-gpus.html
>> I think this is much easier and more portable than the intrinsics for
>> Intel, and more performant and less error prone than threads.
>> I think you can show that it will accomplish anything we want to do.
>> OpenCL seems to have capitulated on this point. Do we agree
>> here?
>>
>
> Moving from the other thread, I asked how far we could get with an API for
> high-level data movement combined with CUDA/OpenCL kernels. Matt wrote
>
> I think it will get you quite far, and the point for me will be
> how will the user describe a communication pattern, and how will we
> automate the generation of MPI from that specification. Sieve has an
> attempt to do this buried in it, inspired by the "manifold" idea.
> Now that CUDA supports function pointers and similar, we can write real
> code in it. Whenever OpenCL gets around to supporting them, we'll be able
> to write real code for multicore and see how it performs. To unify the
> distributed and manycore aspects, we need some sort of hierarchical
> abstraction for NUMA and a communicator-like object to maintain scope.
> After applying a local-distribution filter, we might be able to express
> this using coloring plus the parallel primitives that I have been
> suggesting in the other thread.
>

One key operation which has not yet been discussed is the "push forward" of
a mapping, as Dmitry put it. Here is a scenario:
We have a matching of mesh points between processes. In order to
construct a ghost communication (VecScatter), I
need to compose the mapping between mesh points with the mapping of mesh
points to data. I think this operation is generic
and important. For example, it turns a mesh point partition into a topology
distribution, or if you like, a row partition into a matrix
distribution. I think this might be the right operation to take any
partition to a data distribution.
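To make the composition concrete, here is a minimal sketch in plain Python.
All names (`matched_points`, `point_to_dofs`, `compose_to_data_indices`) are
illustrative, not PETSc API; the idea is only that pushing a point matching
forward through a point-to-data layout yields the remote data indices a
VecScatter-style ghost exchange would need.

```python
def compose_to_data_indices(matched_points, point_to_dofs):
    """Push a point matching forward through a data layout.

    matched_points: list of (local_point, (owner_rank, remote_point))
        pairs, i.e. the matching of mesh points between processes.
    point_to_dofs: dict mapping a remote mesh point to its
        (offset, ndof) in the owner's data layout.

    Returns a dict owner_rank -> flat list of remote dof indices to
    gather, which is what a ghost-communication setup would consume.
    """
    ghost_indices = {}
    for local_point, (owner, remote_point) in matched_points:
        offset, ndof = point_to_dofs[remote_point]
        ghost_indices.setdefault(owner, []).extend(
            range(offset, offset + ndof))
    return ghost_indices

# Example: local points 7 and 9 are ghosts owned by ranks 1 and 2;
# the owners store 2 and 3 dofs for them, respectively.
matched = [(7, (1, 3)), (9, (2, 0))]
layout = {3: (12, 2), 0: (0, 3)}  # remote point -> (dof offset, ndof)
print(compose_to_data_indices(matched, layout))
# -> {1: [12, 13], 2: [0, 1, 2]}
```

The same composition, applied to a partition instead of a ghost matching,
is what turns a row partition into a matrix distribution as described above.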

    Matt


> I'll think more on this and see if I can put together a concrete API
> proposal.
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener