[petsc-dev] Integrating PFLOTRAN, PETSC & SAMRAI

Boyce Griffith griffith at cims.nyu.edu
Tue Jun 7 13:17:54 CDT 2011



On 6/7/11 1:43 PM, Jed Brown wrote:
> On Tue, Jun 7, 2011 at 19:05, Boyce Griffith <griffith at cims.nyu.edu> wrote:
>
>     Another might be easier access to CUDA.
>
>
> What do SAMRAI matrices look like? In particular, how are they constructed?

There is no notion of matrices in SAMRAI --- only vectors.  (I 
believe that SAMRAI was designed for and is still used, at least at 
LLNL, mainly for hyperbolic problems, so they never have to solve large 
systems of equations.)

I'm not familiar with the details of Bobby's PFLOTRAN SAMRAI 
implementation, but in my code everything is matrix free, including 
preconditioners.
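
Roughly, that means the operators get exposed to the PETSc solvers as 
shell matrices.  A minimal sketch of the pattern (the context struct and 
the body of ApplyOperator are placeholders for whatever the SAMRAI-side 
code actually does):

#include <petscmat.h>

/* Hypothetical context carrying whatever SAMRAI hierarchy information
   the apply routine needs. */
typedef struct {
  void *samrai_hierarchy;
} AppCtx;

/* Matrix-free action y = A x; the real version would loop over the
   SAMRAI patch hierarchy here. */
static PetscErrorCode ApplyOperator(Mat A, Vec x, Vec y)
{
  AppCtx        *ctx;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatShellGetContext(A, &ctx);CHKERRQ(ierr);
  /* ... apply the discrete operator patch by patch ... */
  PetscFunctionReturn(0);
}

/* Wrap the matrix-free operator so KSP/PC see an ordinary Mat. */
PetscErrorCode CreateShellOperator(MPI_Comm comm, PetscInt nlocal, AppCtx *ctx, Mat *A)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatCreateShell(comm, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE, ctx, A);CHKERRQ(ierr);
  ierr = MatShellSetOperation(*A, MATOP_MULT, (void (*)(void))ApplyOperator);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}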

At one point, I tried implementing the local part of the matrix-free 
matrices using PETSc Mats, but performance was modestly lower than a pure 
Fortran implementation, and so I abandoned this.  This was for a 
cell-centered Poisson problem using a 5/7-point finite-volume stencil. 
I think this even held in the case where there was only one patch per 
processor --- i.e., contiguous local storage --- although I may be 
misremembering.
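
For a sense of what that comparison involved, here is a sketch of a 
per-patch 5-point assembly into a PETSc Mat (not the actual code: nx, 
ny, and h are placeholders, and boundary and coarse-fine handling are 
omitted):

#include <petscmat.h>

/* Sketch: assemble a 5-point cell-centered -Laplacian for one nx x ny
   patch into a sequential AIJ matrix (interior couplings only). */
PetscErrorCode AssemblePatchLaplacian(PetscInt nx, PetscInt ny, PetscReal h, Mat *A)
{
  PetscErrorCode ierr;
  PetscInt       i, j;

  PetscFunctionBegin;
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, nx*ny, nx*ny, 5, NULL, A);CHKERRQ(ierr);
  for (j = 0; j < ny; j++) {
    for (i = 0; i < nx; i++) {
      PetscInt    row = j*nx + i, cols[5];
      PetscScalar vals[5];
      PetscInt    n = 0;
      cols[n] = row;          vals[n++] =  4.0/(h*h);
      if (i > 0)    { cols[n] = row-1;  vals[n++] = -1.0/(h*h); }
      if (i < nx-1) { cols[n] = row+1;  vals[n++] = -1.0/(h*h); }
      if (j > 0)    { cols[n] = row-nx; vals[n++] = -1.0/(h*h); }
      if (j < ny-1) { cols[n] = row+nx; vals[n++] = -1.0/(h*h); }
      ierr = MatSetValues(*A, 1, &row, n, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}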

I also used to build local patch-based Mats for use with MatRelax within 
SAMRAI-based FAC preconditioners, but found that it was modestly faster 
to use a specialized Fortran implementation of Gauss-Seidel.
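
For comparison, the kind of kernel meant here is a plain lexicographic 
Gauss-Seidel sweep over one patch; a sketch in C rather than Fortran 
(interior cells only, using the -Laplacian convention above):

#include <petscsys.h>

/* Sketch: one lexicographic Gauss-Seidel sweep for the 5-point
   -Laplacian on an nx x ny patch; u and f are the patch arrays
   (no ghost cells, boundary rows skipped). */
static void PatchGaussSeidel(PetscInt nx, PetscInt ny, PetscReal h,
                             PetscScalar *u, const PetscScalar *f)
{
  PetscInt i, j;

  for (j = 1; j < ny-1; j++) {
    for (i = 1; i < nx-1; i++) {
      PetscInt k = j*nx + i;
      u[k] = 0.25*(u[k-1] + u[k+1] + u[k-nx] + u[k+nx] + h*h*f[k]);
    }
  }
}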

One place where using PETSc Mats to provide the local part of the 
matrix seems to win is for Vanka-like smoothers for incompressible 
Stokes.  In this case, performing local solves using PETSc seems to be 
faster than whatever I cobbled together.
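
A sketch of that kind of local solve, using current PETSc calling 
sequences (the extraction of the Vanka block into Ablock/bblock is 
assumed, and in practice the KSP would be created once and reused across 
blocks of the same size):

#include <petscksp.h>

/* Sketch: direct solve of one small dense Vanka block.  Ablock is the
   nb x nb local system in column-major order and bblock the local
   right-hand side; both are assumed to have been extracted already. */
PetscErrorCode SolveVankaBlock(PetscInt nb, PetscScalar *Ablock,
                               PetscScalar *bblock, PetscScalar *xblock)
{
  Mat            A;
  Vec            b, x;
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatCreateSeqDense(PETSC_COMM_SELF, nb, nb, Ablock, &A);CHKERRQ(ierr);
  ierr = VecCreateSeqWithArray(PETSC_COMM_SELF, 1, nb, bblock, &b);CHKERRQ(ierr);
  ierr = VecCreateSeqWithArray(PETSC_COMM_SELF, 1, nb, xblock, &x);CHKERRQ(ierr);
  ierr = KSPCreate(PETSC_COMM_SELF, &ksp);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}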

> Your mention of CUDA offers a different approach. Instead of using the
> non-contiguous storage directly, you could copy it to contiguous
> storage. Thus you can make a Vec type (or adorn an existing Vec) with
> SAMRAI information. Then you would have an interface with two functions
>
> VecSAMRAIGetVector(Vec,SAMRAIVector*);
> VecSAMRAIRestoreVector(Vec,SAMRAIVector*);

Getting this right can be a lot of work, because parallel AMR data 
indexing is not so simple for data centerings other than cell-centered, 
but I have started to do something like this in places where I need to 
explicitly compute matrices that look like P^t A P.
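
Once A and P exist as assembled PETSc Mats, the triple product itself is 
essentially one call; the hard part is building A and P from the AMR 
hierarchy, which is assumed in this sketch:

#include <petscmat.h>

/* Sketch: form the Galerkin operator Ac = P^t A P explicitly, assuming
   A and P have already been assembled as PETSc Mats. */
PetscErrorCode FormGalerkinOperator(Mat A, Mat P, Mat *Ac)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatPtAP(A, P, MAT_INITIAL_MATRIX, 2.0, Ac);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}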

An alternative, which I had in mind when I said that I was interested in 
trying to use PETSc to provide the storage for the SAMRAI data 
structures, is to use SAMRAI to provide basic grid management, and to 
use PETSc to provide data storage.  One would allocate appropriately 
sized PETSc vectors and set various pointers so that the 
SAMRAI data would continue to "look like" regular SAMRAI data.  Because 
the data is actually stored in a PETSc Vec, it would also "look like" 
regular PETSc data from the standpoint of a PETSc solver.  Doing this 
would mean including some redundant ghost cell data in the PETSc Vec 
that provides the actual storage.
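
A sketch of what that might look like; the hook for pointing SAMRAI 
patch data at external memory (SetPatchDataPointer below) is 
hypothetical, since stock SAMRAI patch data allocates its own arrays:

#include <petscvec.h>

/* Sketch: allocate one contiguous Vec large enough for every local
   patch, ghost cells included, and hand each patch a pointer into it.
   SetPatchDataPointer() is a hypothetical SAMRAI-side hook; the Vec
   array would have to stay "placed" for the lifetime of the patch data. */
PetscErrorCode AttachVecStorage(MPI_Comm comm, PetscInt npatches,
                                const PetscInt *patch_size_with_ghosts,
                                Vec *v)
{
  PetscErrorCode ierr;
  PetscInt       p, nlocal = 0, offset = 0;
  PetscScalar   *a;

  PetscFunctionBegin;
  for (p = 0; p < npatches; p++) nlocal += patch_size_with_ghosts[p];
  ierr = VecCreateMPI(comm, nlocal, PETSC_DETERMINE, v);CHKERRQ(ierr);
  ierr = VecGetArray(*v, &a);CHKERRQ(ierr);
  for (p = 0; p < npatches; p++) {
    /* SetPatchDataPointer(patch[p], a + offset);   hypothetical hook */
    offset += patch_size_with_ghosts[p];
  }
  ierr = VecRestoreArray(*v, &a);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}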

There are some cases where retaining such redundant data in the Vec 
might be a bad thing, and for such cases, it ought to be relatively 
straightforward to map the "ghosted" Vec to a "nonghosted" Vec that is 
actually used in the solver.
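
That mapping is just a VecScatter over the owned (non-redundant) 
indices; a sketch, assuming an index set owned_is selecting the 
non-ghost entries has already been built from the patch layout:

#include <petscvec.h>

/* Sketch: copy the owned (non-ghost) entries of the ghosted Vec into a
   compact Vec for the solver; owned_is selects the non-redundant entries
   in the order the solver Vec expects. */
PetscErrorCode GhostedToSolverVec(Vec ghosted, IS owned_is, Vec solver)
{
  VecScatter     scatter;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = VecScatterCreate(ghosted, owned_is, solver, NULL, &scatter);CHKERRQ(ierr);
  ierr = VecScatterBegin(scatter, ghosted, solver, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(scatter, ghosted, solver, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(&scatter);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}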

> The first of these would actually create SAMRAI storage if it didn't
> exist and copy the data there from the Vec. If the SAMRAIVector is modified,
> a flag is set in the Vec indicating that the copy "on the SAMRAI" is up
> to date. If the Vec is accessed on the PETSc side via VecGetArray() or
> similar, the data would be copied over. If it's modified on the PETSc
> side, the SAMRAI copy would be marked as invalid.
>
> Reasons for this instead of using a single copy in discontiguous storage:
>
> 1. All PETSc solver components would work.
>
> 2. Some operations will be faster using contiguous storage.
>
> 3. You can easily have a run-time option to switch.
>
> The downside is possibly using more memory, but in most use cases, only
> a couple Vecs are actually touched by the application. Most of the Vecs
> are in the Krylov space or stored time steps. Those would be lighter and
> faster using contiguous storage (PETSc native), and since
> VecSAMRAIGetVector() would only be called on the ones the user interacts
> directly with, it probably only costs a couple vectors to maintain the
> multiple copies (which could mean less storage in total when summed over
> the whole Krylov space).

I am already sort-of heading in this direction, although it is pretty 
low on my TODO list right now.
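
For concreteness, a skeleton of the get/restore caching described above 
might look like the following (illustrative only, using current PETSc 
calling sequences; void* stands in for the real SAMRAI vector type, the 
SAMRAI-side create/copy helpers are assumed, and invalidation on 
PETSc-side writes is omitted):

#include <stdlib.h>
#include <petscvec.h>

/* Illustrative only: cache a lazily created SAMRAI-side copy on the Vec
   and track whether it is up to date.  CreateSAMRAIStorageFor,
   CopyVecToSAMRAI, and CopySAMRAIToVec are assumed helpers. */
typedef struct {
  void      *samrai_vec;    /* SAMRAI-side storage, created on first use */
  PetscBool  samrai_valid;  /* SAMRAI copy is up to date                 */
} VecSAMRAILink;

PetscErrorCode VecSAMRAIGetVector(Vec v, void **samrai_vec)
{
  PetscContainer  container;
  VecSAMRAILink  *link;
  PetscErrorCode  ierr;

  PetscFunctionBegin;
  ierr = PetscObjectQuery((PetscObject)v, "VecSAMRAILink", (PetscObject *)&container);CHKERRQ(ierr);
  if (!container) {
    /* first access: create the link and attach it to the Vec */
    link = (VecSAMRAILink *)calloc(1, sizeof(*link));
    /* link->samrai_vec = CreateSAMRAIStorageFor(v);   assumed helper */
    ierr = PetscContainerCreate(PETSC_COMM_SELF, &container);CHKERRQ(ierr);
    ierr = PetscContainerSetPointer(container, link);CHKERRQ(ierr);
    ierr = PetscObjectCompose((PetscObject)v, "VecSAMRAILink", (PetscObject)container);CHKERRQ(ierr);
    ierr = PetscContainerDestroy(&container);CHKERRQ(ierr); /* the Vec keeps its own reference */
    ierr = PetscObjectQuery((PetscObject)v, "VecSAMRAILink", (PetscObject *)&container);CHKERRQ(ierr);
  }
  ierr = PetscContainerGetPointer(container, (void **)&link);CHKERRQ(ierr);
  if (!link->samrai_valid) {
    /* CopyVecToSAMRAI(v, link->samrai_vec);   assumed helper */
    link->samrai_valid = PETSC_TRUE;
  }
  *samrai_vec = link->samrai_vec;
  PetscFunctionReturn(0);
}

PetscErrorCode VecSAMRAIRestoreVector(Vec v, void **samrai_vec)
{
  PetscFunctionBegin;
  /* the SAMRAI copy may have been modified; a real implementation would
     either copy back here or mark the PETSc copy stale so the copy
     happens on the next VecGetArray()-style access */
  /* CopySAMRAIToVec(*samrai_vec, v);   assumed helper */
  *samrai_vec = NULL;
  PetscFunctionReturn(0);
}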

> What do SAMRAI matrices look like? What is the assembly code like? Can
> we preserve that API and get the data to go directly into a PETSc format?

No, we can't, but only because there is not really any API there to 
preserve.  ;-)  (Again, speaking only for my code, not Bobby's.)

-- Boyce


