[petsc-dev] PETSc - MPI3 functionality

Sat Sep 8 07:58:59 CDT 2018

On Sat, Sep 8, 2018 at 4:43 AM Tamara Dancheva <tamaradanceva19933 at gmail.com>
wrote:

> Hi Barry,
>
> I see the issue.
>
>  In the FEM library and solver that I am working on, PETSc is used
> throughout for data distribution, synchronization of functions, and
> assembly. There is also an alternative UPC variant that uses the JANPACK
> linear algebra backend (http://www.csc.kth.se/~njansson/janpack/), which
> gives increased performance. My project explores another optimization
> pathway, given that this software targets large-scale computations: an
> asynchronous version of the algorithm, for which I have implemented a
> Block-Jacobi method with inner Krylov solvers (the inner solve uses PETSc).
> This version aims for a speedup factor of about 1.7-2.0 (a figure from the
> literature, although not from exactly the same context), and the
> motivation is the same as that behind ExaFLOW (http://exaflow-project.eu/),
> I would say. This still requires me to modify the ghost exchange routines
> so that the processes can advance out of sync. I could implement this
> outside of PETSc, but that would significantly increase the memory
> footprint, since the necessary data is currently fed to PETSc and then
> discarded. Since PETSc also works with and stores MPI requests, I can
> reuse and extend its implementation, which is close to the approach I have
> in mind (using a limited-size circular buffer of MPI requests and
> non-blocking collectives). I had also considered not using PETSc at all, to
> avoid all the blocking regions; however, given the scope of my project, I
> judged that it would take too long to implement and validate.
>

Okay, it would help to have a better idea of

  a) the structure of the computation

  b) how you hope to achieve speedup

  c) experiments you have that support your theory for b)

First, what kind of synchronization exists in your library now? Is the code
implicit or explicit? I am guessing that it's explicit, since implicit codes
typically have a lot of synchronization in the solver itself. However,
explicit codes typically have a synchronization each timestep when
determining the stable timestep. What synchronization are you seeking to
avoid?

For an explicit code, you can try to measure wait time, as we do in PETSc,
but these measurements are sometimes hard to interpret. You could remove
the synchronizations in your current code and then run, ignoring the answer,
to see how much speedup you get; that would be some sort of upper bound.
Have you done this?
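
To make the upper-bound idea concrete, here is a minimal sketch in Python.
It assumes a deliberately simple model: removing the synchronizations
eliminates all of the measured wait time and changes nothing else in the
run. The function names are hypothetical and are not part of PETSc.

```python
def speedup_upper_bound(total_time, wait_time):
    """Best-case speedup if all wait time vanished (simple model, see above)."""
    if not 0.0 <= wait_time < total_time:
        raise ValueError("need 0 <= wait_time < total_time")
    return total_time / (total_time - wait_time)


def required_wait_fraction(target_speedup):
    """Fraction of the runtime that must be wait time to allow target_speedup."""
    if target_speedup < 1.0:
        raise ValueError("target_speedup must be >= 1")
    return 1.0 - 1.0 / target_speedup


if __name__ == "__main__":
    # Targets quoted in this thread: a speedup factor of about 1.7-2.0.
    for s in (1.7, 2.0):
        frac = required_wait_fraction(s)
        print(f"target {s:.1f}x needs wait fraction >= {frac:.3f}")
```

Under this model, a speedup of 1.7-2.0 is only reachable if roughly 41% to
50% of the current runtime is spent waiting at synchronization points, which
is the kind of number the experiment above would tell you.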

  Thanks,

     Matt

> Hope this sums it up well,
> Tamara
>
>
> On Sat, Sep 8, 2018 at 4:28 AM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>
>>
>>
>>     Tamara,
>>
>>        The VecScatter routines are in a big state of flux now as we try
>> to move from a monolithic implementation (where many cases were handled
>> with cumbersome if checks in the code) to simpler independent standalone
>> implementations that easily allow new implementations orthogonal to the
>> current implementations. So it is not a good time to dive in.
>>
>>     We are trying to do the refactorization but it is a bit frustrating
>> and slow.
>>
>>      Can you tell us why you feel you need a custom implementation? Is
>> the current implementation too slow (how do you know it is too slow?)?
>>
>>     Barry
>>
>> > On Sep 7, 2018, at 12:26 PM, Tamara Dancheva <
>> tamaradanceva19933 at gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I am developing an asynchronous method for a FEM solver, and need a
>> custom implementation of the VecScatterBegin and VecScatterEnd routines.
>> Since PETSc uses its own limited set of MPI functions, could you tell me
>> the best way to extend it and use, for example, the non-blocking
>> collectives (MPI_Igatherv and so on)?
>> >
>> > I hope the question is specific enough; let me know if anything is
>> unclear and I can provide more information. I would very much appreciate
>> any help, thanks in advance!
>> >
>> > Best,
>> > Tamara
>>
>>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>