[petsc-dev] IMP prototype

Jed Brown jedbrown at mcs.anl.gov
Thu Jan 2 22:12:57 CST 2014


Victor Eijkhout <eijkhout at tacc.utexas.edu> writes:

> On Jan 2, 2014, at 10:50 AM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>
>> I find simple demonstrations as unconvincing as most patents.  99% of
>> the work remains in extending the idea into something practical.  It may
>> or may not pan out, but we can't say anything from the simple
>> demonstration alone.
>
> Maybe you and I disagree on what I'm demonstrating. My goal was to
> show that my notion of parallelism generalizes MPI & tasking
> notions. Not that I have a better notation for VecScatters.

My impression is that your transformation recognizes a common pattern
(communication into temporary buffers, followed by computation, followed
by post-communication) and puts a declarative syntax on it, with
callbacks for the computation.  The same operations can also be written
imperatively, and I'm not seeing the profound advantage of converting to
your callback system.
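
For comparison, the imperative form of that pattern is only a few lines
with the existing VecScatter API.  This is just a sketch: the function
and variable names (ApplyWithScatter, LocalKernel, scat, the vectors)
are made up for illustration, and setup and most error handling are
elided.

  #include <petscvec.h>

  /* Hypothetical node-local computation; not part of PETSc or IMP. */
  extern PetscErrorCode LocalKernel(Vec xlocal,Vec ylocal);

  PetscErrorCode ApplyWithScatter(VecScatter scat,Vec xglobal,Vec xlocal,
                                  Vec ylocal,Vec yglobal)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    /* pre-communication: gather remote values into a temporary (ghosted) buffer */
    ierr = VecScatterBegin(scat,xglobal,xlocal,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = VecScatterEnd(scat,xglobal,xlocal,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
    /* computation: the node-local code sits here in place, no callback registration */
    ierr = LocalKernel(xlocal,ylocal);CHKERRQ(ierr);
    /* post-communication: push contributions back to their owners */
    ierr = VecScatterBegin(scat,ylocal,yglobal,ADD_VALUES,SCATTER_REVERSE);CHKERRQ(ierr);
    ierr = VecScatterEnd(scat,ylocal,yglobal,ADD_VALUES,SCATTER_REVERSE);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }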


> And from this demonstration we can definitely say something: namely
> that I've shown how one API can address multiple types of
> parallelism. That's more than any other system I know of. 

There are systems like AMPI that run MPI programs on top of threads, and
the MPI implementations already optimize shared-memory communication.
If you want separate work buffers per thread, those systems will give
you a single abstraction.  But the reason people want native interfaces
to threads is so that they can use large shared data structures and
techniques like cooperative prefetch.  Your abstraction is not uniform
if you need to index into owned parts of shared data structures or
perform optimizations like cooperative prefetch.  If you're going to use
separate buffers, why is a system like MPI not sufficient?
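
Concretely, the distinction I mean is roughly the following (plain C,
with made-up names and sizes): a thread either indexes directly into its
owned slice of one shared array, or it copies into a private buffer,
computes, and copies back, which is message passing with extra steps.

  #include <string.h>

  #define N 1024
  static double u[N];                 /* one large shared data structure */

  /* native-threads style: compute directly on the owned part of the shared array */
  void owned_slice_update(int tid,int nthreads)
  {
    int lo = tid*(N/nthreads), hi = (tid+1)*(N/nthreads);
    for (int i=lo; i<hi; i++) u[i] *= 2.0;
  }

  /* separate-buffer style: copy out, compute, copy back -- effectively message buffers */
  void private_buffer_update(int tid,int nthreads,double *buf /* length N/nthreads */)
  {
    int lo = tid*(N/nthreads), n = N/nthreads;
    memcpy(buf,&u[lo],n*sizeof(double));
    for (int i=0; i<n; i++) buf[i] *= 2.0;
    memcpy(&u[lo],buf,n*sizeof(double));
  }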

What semantic does your abstraction provide for hybrid
distributed/shared memory that imperative communication systems cannot?

> But let's be constructive: I want to use this demonstration to get
> funding. NSF/DOE/Darpa, I don't know. Now if you can't say anything
> from this simple demonstration, then what would convince you as a
> reviewer?

Make a precise and concrete (falsifiable) statement about what semantic
your system can provide that others cannot.  Show examples that are hard
to express with existing solutions (such as MPI), but are cleanly
represented by your system.  Be fair and show examples of the converse
if they exist (silver bullets are rare).

>> another layer of callbacks
>
> If you have mentioned that objection before it escaped my attention. 

See earlier message on "Callback Hell".

> Yes, I agree that in that respect (which has little to do with the
> parallelism part) my demonstration is not optimal. The unification of
> MPI & tasks is going too far there. For MPI it would be possible to
> have calls like VecScatterBegin/End and instead of a callback just
> have the local node code in place. For task models that is not
> possible (afaik). See for instance Quark, where each task contains a
> function pointer and a few data pointers.

Quark does this because it is a dynamic scheduler.  (When people compare
them, static schedules are usually as good or better, though if the
dependency graph changes from run to run, you may still want to write it
as a DAG and have callbacks.)  But your model is not a general DAG and
the execution model is BSP, so it's not clear what is gained by
switching to callbacks.
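
To make that concrete, the task shape Victor describes, and that a
dynamic scheduler needs, is basically a function pointer plus a few data
pointers.  The struct and kernel below are a generic sketch of that
shape, not the actual QUARK API.

  /* Generic callback-style task: a function pointer plus a few data pointers. */
  typedef struct {
    void (*kernel)(void **args);   /* the callback holding the node-local code */
    void  *args[4];                /* a few data pointers bound at task-insertion time */
  } Task;

  /* y += a*x written as a callback, instead of inline between Begin/End calls */
  static void axpy_task(void **args)
  {
    double  a = *(double*)args[0];
    double *x = (double*)args[1];
    double *y = (double*)args[2];
    long    n = *(long*)args[3];
    for (long i=0; i<n; i++) y[i] += a*x[i];
  }

A dynamic scheduler needs that indirection so it can defer and reorder
execution; in a BSP model the same code can simply be written inline.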