[petsc-dev] IMP prototype

Jed Brown jedbrown at mcs.anl.gov
Tue Dec 31 15:10:55 CST 2013


Victor Eijkhout <eijkhout at tacc.utexas.edu> writes:

> On Dec 31, 2013, at 2:37 AM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>
>> How would you propose to handle
>
> Look, what I have now is basically like a re-implementation of a
> VecScatter between two distributed vectors. It's only partly about
> coming up with cute new algorithms under the hood: my foremost claim
> is about the integrative API.

Hmm, I don't really see the API for specifying the distributions.
rightshift->operate(">>1") is a pretty special-purpose interface.
Presumably this part is non-collective, but if you only ever accept a
one-sided specification, it implies that setup costs will be relatively
higher than with a two-sided one.

The BSP model also seems inconvenient for summing contributions in the
delta-epsilon stage.


Have you looked at PetscSF?

http://59A2.org/files/StarForest.pdf
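
For calibration, the specification there is also one-sided: each leaf
names the (rank, offset) of its root, and the setup and transport
choices happen internally.  A rough sketch (signatures from memory, so
treat them as illustrative rather than authoritative):

  #include <petscsf.h>

  PetscErrorCode sf_bcast_demo(MPI_Comm comm, PetscInt nroots, PetscInt nleaves,
                               PetscSFNode *iremote,
                               const PetscInt *rootdata, PetscInt *leafdata)
  {
    PetscSF sf;
    PetscSFCreate(comm, &sf);
    /* one-sided specification: each leaf names (rank, offset) of its root */
    PetscSFSetGraph(sf, nroots, nleaves, NULL, PETSC_COPY_VALUES,
                    iremote, PETSC_COPY_VALUES);
    /* move root values to leaves; the transport is chosen internally */
    PetscSFBcastBegin(sf, MPIU_INT, rootdata, leafdata);
    PetscSFBcastEnd(sf, MPIU_INT, rootdata, leafdata);
    PetscSFDestroy(&sf);
    return 0;
  }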

>> irregular communication patterns
>
> Sparse matrices? Basically a VecScatter.

Sure, but what I'm interested in is how expressive the specification of
that irregular pattern will be, so that good algorithms can be used to
perform that operation.  For example, you might want to use Alltoallv in
some cases, MPI-3 neighborhood collectives in others, one-sided in
others, and plain point-to-point messages in other cases.
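
For example, when both sides of the pattern are known, one natural
mapping is an MPI-3 neighborhood collective; a hedged sketch (names are
illustrative, not IMP or PETSc code):

  #include <mpi.h>

  /* Encode the two-sided neighbor lists in a distributed graph topology
     and exchange with a sparse collective.  An implementation could
     equally pick MPI_Alltoallv, one-sided, or point-to-point messages,
     depending on the density and persistence of the pattern. */
  static void exchange(MPI_Comm comm,
                       int nsrc, const int src[], int ndst, const int dst[],
                       const double *sendbuf, const int scnt[], const int sdsp[],
                       double *recvbuf, const int rcnt[], const int rdsp[])
  {
    MPI_Comm nbr;
    MPI_Dist_graph_create_adjacent(comm, nsrc, src, MPI_UNWEIGHTED,
                                   ndst, dst, MPI_UNWEIGHTED,
                                   MPI_INFO_NULL, 0, &nbr);
    MPI_Neighbor_alltoallv(sendbuf, scnt, sdsp, MPI_DOUBLE,
                           recvbuf, rcnt, rdsp, MPI_DOUBLE, nbr);
    MPI_Comm_free(&nbr);
  }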

> Small number of dense rows: are you implying that a matrix-vector
> product with such a matrix should be a combination of regular sparse
> VecScatter and a reduction? 

Yes, for example.  The point is that a naive mapping to MPI doesn't make
sense all the time, and we'd like the input data describing the
communication pattern to be scalable, as well as to provide enough
semantic information for the implementation to choose a good mapping.
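
Concretely, the few dense rows can be applied as local partial dot
products followed by a reduction, while the sparse remainder goes
through the usual VecScatter + local SpMV path.  A sketch (the data
layout and names are assumptions):

  #include <mpi.h>

  /* Each rank forms the dense rows' partial dot products against its
     local segment of x; a reduction then sums the contributions, which
     scales, unlike gathering x onto every rank. */
  static void dense_rows_apply(MPI_Comm comm, int ndense, int nlocal,
                               const double *rows,    /* ndense x nlocal, row-major */
                               const double *xlocal,  /* local segment of x */
                               double *ydense)        /* length ndense */
  {
    int i, j;
    for (i = 0; i < ndense; i++) {
      double sum = 0.0;
      for (j = 0; j < nlocal; j++) sum += rows[i*nlocal + j] * xlocal[j];
      ydense[i] = sum;
    }
    MPI_Allreduce(MPI_IN_PLACE, ydense, ndense, MPI_DOUBLE, MPI_SUM, comm);
  }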

>> I think that the setup costs are actually critical.  They are not always
>> amortized in either dynamic execution models or methods with frequent
>> adaptivity, particle transport, self-contact, etc.  The MPI model may
>> already have two-sided knowledge about the new communication pattern or
>> know a good way to use collectives.  It's more work for the programmer
>> to specify both sides of the communication, but dynamic exchange is
>> significantly more expensive.
>
> I'm not really sure what you're arguing here, but I'll offer the following.
> 1. any setup algorithm you have in Petsc I can straightaway copy to my software

What would the interface look like for two-sided specification of a beta
distribution?

MatAssembly would be a useful test case for one-sided specification.  In
my new implementation, we use MPI_Issend/MPI_Ibarrier/MPI_Iprobe with
small constant-sized messages for the initial rendezvous.  As those
messages are received, the actual payload receives are posted.  On
reassembly (with MAT_SUBSET_OFF_PROC_ENTRIES), the rendezvous is not
repeated; zero-sized messages are sent for empty contributions instead.
Receives are processed using MPI_Waitsome() and unpacked into the matrix
data structure rather than going into an intermediate buffer.
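
For concreteness, the rendezvous stage is the familiar dynamic
sparse-data-exchange pattern; a stripped-down sketch (illustrative
names, not the actual PETSc code):

  #include <mpi.h>
  #include <stdlib.h>

  /* Each rank Issends a small constant-sized header to the ranks it
     contributes to; receivers discover senders by probing.  Issend
     completes only once matched, so after all local sends complete we
     enter a nonblocking barrier; when the barrier completes everywhere,
     every header has been delivered. */
  void rendezvous(MPI_Comm comm, int ndest, const int dest[], const int hdr[],
                  void (*on_header)(int source, int header, void *ctx), void *ctx)
  {
    MPI_Request *sreq = malloc((ndest > 0 ? ndest : 1) * sizeof(MPI_Request));
    MPI_Request barrier = MPI_REQUEST_NULL;
    int i, sends_done = 0, barrier_done = 0;

    for (i = 0; i < ndest; i++)
      MPI_Issend(&hdr[i], 1, MPI_INT, dest[i], 0, comm, &sreq[i]);

    while (!barrier_done) {
      int flag;
      MPI_Status status;
      MPI_Iprobe(MPI_ANY_SOURCE, 0, comm, &flag, &status);
      if (flag) {  /* header arrived: caller posts the payload receive */
        int header;
        MPI_Recv(&header, 1, MPI_INT, status.MPI_SOURCE, 0, comm,
                 MPI_STATUS_IGNORE);
        on_header(status.MPI_SOURCE, header, ctx);
      }
      if (!sends_done) {
        MPI_Testall(ndest, sreq, &sends_done, MPI_STATUSES_IGNORE);
        if (sends_done) MPI_Ibarrier(comm, &barrier);
      } else {
        MPI_Test(&barrier, &barrier_done, MPI_STATUS_IGNORE);
      }
    }
    free(sreq);
  }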

It seems hard to express such a weakly synchronous algorithm with the
interface you propose (or with PetscSF, for that matter, which is why
I'm not using PetscSF for this operation, despite it being useful in
many other places).

> 2. adaptivity: rather than doing a full setup you could do an update
> of an existing one. 

Great, but what would that interface look like?  I know you see this as
something for the distant future, but I think issues like these are the
hard part if you want something general.

The concept of doing some communication to work space, some computation,
and perhaps post-communication is pervasive, but there are lots of ways
to organize it.  You have inverted control by adding callbacks.  So
instead of imperative code that looks like

  buffers = malloc(...);                 /* work space for remote contributions */
  pre_communicate(global_in, buffers);   /* gather remote data into work space */
  local_work(buffers);                   /* compute on local data plus work space */
  post_communicate(buffers, global_out); /* scatter/accumulate results */

you have something that looks like:

  buffers = malloc(...);
  ctx = create_context();
  define_communication(ctx, buffers, callback_that_does_the_work, global_in, global_out);
  run_computation(ctx);                  /* the library decides when the callback runs */

While callbacks are useful and powerful, they should not be used
gratuitously, and you can find numerous rants about "Callback Hell".
The reason Callback Hell is unpopular is the same reason goto is
difficult to reason about: information in the call stack and clarity
about dependencies are obscured from the user.  Reasoning about
misbehavior/bugs and error conditions is more complicated, especially
for a user who doesn't also understand the library implementation (or
the compiler, if you promote these concepts to that level).

Your sample code has a lot more indirection and boilerplate than a
standard VecScatter implementation would.  If you are going to argue
that we should add a layer of callbacks to every computation we do,
you'll have to provide a significant tangible benefit.  I think one of
the most important features is that incremental complexity should be
accommodated very simply and should be easy to debug.  And of course,
all the important algorithms need to be expressible in a similarly
concise manner and perform comparably to other approaches.


BTW, there is a typo in

  for (int step=0; step<=nsteps; ++step) {

which should be

  for (int step=1; step<=nsteps; ++step) {