[Petsc-trilinos-discussion] Scope and requirements

Bartlett, Roscoe A. bartlettra at ornl.gov
Fri Nov 22 10:26:31 CST 2013


Barry,

Okay, very reasonable observations here.  Assuming that Hydra-TH keeps using PETSc/ML, we would need to:

1) Upgrade Hydra-TH to use the updated version of PETSc used by VERA (which is newer than the version Hydra-TH currently uses).

2) Ensure that the current Trilinos version of ML works with this updated version of PETSc.

3) Before Hydra-TH gets coupled into any other code, remove the direct insertion of ML support into the PETSc lib and defer it to Hydra-TH or some other downstream lib that uses dependency injection to overcome the link-order issue (a sketch of what this could look like follows below).
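
For (3), here is a minimal sketch of what the downstream registration could look like, assuming a current PETSc where PCRegister() takes a type name and a constructor function (the wrapper function name below is illustrative; PCCreate_ML is the existing constructor that would move out of the PETSc lib):

    #include <petscpc.h>

    /* Constructor for the ML preconditioner wrapper.  In this scheme it is
       compiled into a downstream library that links both PETSc and ML,
       rather than into the PETSc lib itself. */
    extern PetscErrorCode PCCreate_ML(PC);

    /* Illustrative injection point: call once at startup from the downstream
       lib.  Afterwards -pc_type ml works as it does today, but PETSc itself
       never links against ML, so the link-order problem disappears. */
    PetscErrorCode RegisterDownstreamSolvers(void)
    {
      return PCRegister("ml", PCCreate_ML);
    }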

I can update the existing CASL Kanban ticket with this info and see if the Hydra-TH developers are interested in pursuing this.  If they are, what level of support could the PETSc team give them if they ask?

However, this may be a problem we never need to solve, if Hydra-TH just moves over to PETSc/HYPRE or pure Trilinos, or if Hydra-TH never gets coupled with anything that intersects with the parts of VERA that use Trilinos.  For the current FY, I think the plan is only to couple Hydra-TH with a small LANL code called MAMBA, which does not use Trilinos, so there would be no issue with Hydra-TH's usage of an independent PETSc/ML.  (The TriBITS build system allows packages to use incompatible versions of libraries as long as no package brings them together, and currently none does in the case of Hydra-TH.)

-Ross

> -----Original Message-----
> From: Barry Smith [mailto:bsmith at mcs.anl.gov]
> Sent: Thursday, November 21, 2013 5:26 PM
> To: Bartlett, Roscoe A.
> Cc: Jed Brown; petsc-trilinos-discussion at lists.mcs.anl.gov
> Subject: Re: [Petsc-trilinos-discussion] Scope and requirements
>
>
> >  It will be a large amount of work to re-verify all of their Exodus-based gold-
> standard regression tests for a change like this (and so they have put this off
> for a year!).
>
>    It sounds like the difficulty here is that the current version of ML in Trilinos
> would generate possibly different convergence results than the version used
> by the application code that generated their regression tests. Thus switching
> to the latest ML in Trilinos is the problem. Even if they had used the older ML
> directly in their code (and not through PETSc) the same problem would exist
> (they could not just relink to the latest ML; they would have to redo all their
> gold-standard tests, roughly a year of work).
>
>     This is an interesting problem, but pointing to PETSc (or PETSc/Trilinos) in
> any way related to this problem is a red herring.
>
>     It is also a problem distinct from keeping consistent interfaces or changing
> interfaces, because it has nothing to do with the interfaces but instead with
> the actual details of the implementation.
>
>    So let's talk about how this problem could be dealt with.
>
> 1) Never change the implementation of "finished" packages once they are
> released. Thus they could start using the latest/greatest packages of Trilinos
> but still use ML, which would produce the same "results" since it was not
> changed. This is not a great solution.
>
> 2) Name-space each "implementation" of each package so one can
> download the latest Trilinos (which may have a new ML) but link in the
> appropriate older implementation version of ML, so they get the same
> numerical results (since they are using the same ML), but newer parts of
> Trilinos are also available and newer versions of ML are also available. With
> the right design this is actually possible; essentially the Trilinos source would
> be the current source plus previous source, all carefully name-spaced (or
> handled with clever dynamic loading, or both), so the application can pick and
> choose what "version" of packages it uses. Ideally (and this is possible) the
> same application could use different versions in different places.
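>
>    To make the "clever dynamic loading" variant concrete, here is a minimal
> sketch using plain dlopen()/dlsym() to pick an implementation version at run
> time (the library and symbol names are purely illustrative, not real ML API):
>
>     #include <dlfcn.h>
>     #include <stdio.h>
>
>     /* Hypothetical: each released implementation of a package is built into
>        its own shared library and exports the same entry point. */
>     typedef int (*SetupFn)(void *problem);
>
>     int LoadPackageVersion(const char *libname, SetupFn *setup)
>     {
>       void *handle = dlopen(libname, RTLD_NOW | RTLD_LOCAL);
>       if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }
>       *setup = (SetupFn) dlsym(handle, "pkg_setup");  /* illustrative symbol */
>       return *setup ? 0 : 1;
>     }
>
>     /* usage: LoadPackageVersion("libml_v5.so", &f) reproduces the old
>        numerical results; "libml_v6.so" gives the current implementation */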
>
>   Note that a really good regression suite should tolerate reasonable changes
> in convergence of iterative methods. Since even upgrading the compiler can
> change convergence rates/behaviors, expecting nearly identical results in the
> regression test is not reasonable. The problem is that the theory of how to
> design such regression tests is very immature, and virtually all regression
> tests do not tolerate reasonable variation, so many users do expect virtually
> "identical results".
>
>    So is there any way (short of taking option 2 to the extreme) we can help
> users with this conundrum?
>
>    Barry
>
>
>
>
> On Nov 21, 2013, at 11:05 AM, Bartlett, Roscoe A. <bartlettra at ornl.gov>
> wrote:
>
> > Okay, so I have some concrete use cases from the CASL VERA effort that I
> am involved with that have issues with PETSc and Trilinos.  We have a situation
> where we need to couple together multiple Fortran and C++ codes into
> single executables that use mixes of PETSc and Trilinos, and it is a mess, though
> not really because of any interfacing issues.  Here is why ...
> >
> > The LANL code Hydra-TH is using PETSc and some old version of ML.
> Therefore we can't even link Hydra-TH in with code that uses current
> versions of Trilinos.  We have not even tried to use the up-to-date version of
> ML under PETSc, and to do so would create a nasty circular dependency in our
> build process the way it is defined now.  More on this issue below.
> >
> > The INL code MOOSE, with the app Peregrine (PNNL), uses PETSc with HYPRE.
> >
> > We have two parallel Fortran codes using PETSc, MPACT (Univ. of Mich.)
> and COBRA-TF (Penn. State), that have overlapping sets of processes, and
> they can't currently figure out how to initialize PETSc to work in these cases
> (they may actually ask for some help).  Also, what about nested PETSc solves
> from different applications?  What does that output look like, if you could
> even get it to run (which they have not yet)?
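> >
> > If I understand the PETSc docs correctly, the supported pattern here is to set
> > PETSC_COMM_WORLD to the subcommunicator before calling PetscInitialize(); a
> > minimal sketch (the split criterion is illustrative):
> >
> >     #include <petsc.h>
> >
> >     int main(int argc, char **argv)
> >     {
> >       MPI_Comm subcomm;
> >       int      rank, color;
> >       MPI_Init(&argc, &argv);
> >       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >       color = (rank < 4) ? 0 : 1;       /* illustrative grouping */
> >       MPI_Comm_split(MPI_COMM_WORLD, color, rank, &subcomm);
> >       PETSC_COMM_WORLD = subcomm;       /* must precede PetscInitialize() */
> >       PetscInitialize(&argc, &argv, NULL, NULL);
> >       /* ... each group now runs its own PETSc-based code ... */
> >       PetscFinalize();
> >       MPI_Comm_free(&subcomm);
> >       MPI_Finalize();
> >       return 0;
> >     }
> >
> > Whether that composes cleanly when several such codes overlap is exactly the
> > question they would want help with.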
> >
> > The ORNL code Insilico (part of the Exnihilo repo that includes Denovo)
> uses up-to-date solvers in Trilinos.
> >
> > CASL VERA has a top-level driver code called Tiamat that couples together
> COBRA-TF (which uses PETSc), Peregrine/MOOSE (which also uses
> PETSc/HYPRE), and Insilico (which uses up-to-date Trilinos).  In addition, it
> runs these codes in different clusters of processes that may or may not
> overlap (that is a runtime decision on startup).  The output from this coupled
> code dumped to STDOUT is currently incomprehensible.
> >
> > As of right now, it is completely impractical to couple Hydra-TH into Insilico
> because of their use of PETSc and an old version of ML from Trilinos.  As a
> matter of fact, it is agreed by all parties that before that happens, Hydra-TH
> needs to use either only PETSc/HYPRE or only Trilinos, with no mixing of the two at all.
> It is recognized that changing Hydra-TH to use PETSc/HYPRE or all Trilinos will
> be a huge job because it will change all of their tests.  It will be a large amount
> of work to re-verify all of their Exodus-based gold-standard regression tests
> for a change like this (and so they have put this off for a year!).
> >
> > Therefore, before there is any discussion of new fancy interfaces, we have
> to resolve the following issues first:
> >
> > 1) The current version of PETSc must be compatible with the current
> version of Trilinos so they can be linked into a single executable, and we must
> remove link-order dependencies.  Also, users need to know the ranges of
> versions of Trilinos and PETSc that are link- and runtime-compatible.  The
> better the mature capabilities in Trilinos and PETSc maintain backward
> compatibility over longer ranges of time/versions, the easier this
> gets.  The ideas for doing this are described in the section "Regulated
> Backward Compatibility" in
> http://web.ornl.gov/~8vt/TribitsLifecycleModel_v1.0.pdf .  Also, PETSc
> should support dependency inversion and dependency injection so that
> people can add ML support into PETSc without ML having to be directly
> upstream of PETSc.  We can do this already with Stratimikos in Trilinos, so
> we could add a PETSc solver or preconditioner (or any other preconditioner
> or solver) in a downstream library.  This is already being used in production
> code in CASL in Insilico.  There is a little interface issue here, but not much.
> >
> > 2) The user needs to have complete control of where output goes on an
> object-by-object basis on each process, period.  Otherwise, multiphysics
> codes (either all PETSc, all Trilinos, or mixes) create incomprehensible output.
> This also applies to nested solves (i.e. what does output from a GMRES solve
> nested inside another GMRES solve look like?).  We have suggested standards
> for this in Trilinos that, if every code followed them, would solve this
> problem (see GCG 18 in
> http://web.ornl.gov/~8vt/TrilinosCodingDocGuidelines.pdf ).  While this is
> not an official standard in Trilinos, I think it is followed pretty well by the
> more modern basic linear solvers and preconditioners.  How do
> you allow users to customize how PETSc, Trilinos, and their own objects
> create and intermix output such that it is useful for them?  In a complex
> multi-physics application, this is very hard.
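> >
> > To illustrate the level of control I mean, PETSc already gets most of the way
> > there per solver object with a custom monitor writing to its own viewer (a
> > sketch; the tag, file name, and subcommunicator are illustrative):
> >
> >     /* route this solver's convergence output to its own file on its own
> >        communicator instead of interleaving everything on STDOUT */
> >     static PetscErrorCode TaggedMonitor(KSP ksp, PetscInt it, PetscReal rnorm, void *ctx)
> >     {
> >       PetscViewer v = (PetscViewer) ctx;
> >       return PetscViewerASCIIPrintf(v, "[physicsA] it %D rnorm %g\n", it, (double) rnorm);
> >     }
> >
> >     PetscViewer viewer;
> >     PetscViewerASCIIOpen(subcomm, "physicsA_solver.log", &viewer);
> >     KSPMonitorSet(ksp, TaggedMonitor, viewer, NULL);
> >
> > The hard part is getting every object in every package to support this kind
> > of redirection uniformly.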
> >
> > From the standpoint of CASL, if all these physics codes used the more
> modern Trilinos preconditioners and solvers, all of the above problems would
> go away.  But that is just not feasible right now for many of the reasons listed
> above and below.
> >
> > NOTE: The reason the Fortran codes use PETSc is that Trilinos has no
> acceptable Fortran interface, and you need to be a C++ programmer to write
> a customized Fortran-to-C++ interface to Trilinos.  And in our experience, if a
> programming team knows C++ in addition to Fortran, they would be writing a
> lot of their coordination code in C++ in the first place, where they could just
> directly be using Trilinos from C++ (and therefore there would be no need for
> a Fortran interface for Trilinos).  The lack of a general portable Fortran
> interface to basic Trilinos data structures and other facilities makes every
> Fortran-only team go to PETSc.  That is a no-brainer.  But herein we have the
> current status quo and why it is not feasible to switch all of the CASL VERA
> codes over to Trilinos.
> >
> > Therefore, I would say that before there is any talk of more detailed
> interfaces or interoperability between Trilinos and PETSc, we should first
> solve the basic problems of version compatibility, dependency injection, and
> output control.  While these problems are much easier than the more
> challenging interfacing work, they will still require a lot of ongoing effort.
> >
> > Cheers,
> >
> > -Ross
> >
> >> -----Original Message-----
> >> From: Barry Smith [mailto:bsmith at mcs.anl.gov]
> >> Sent: Wednesday, November 20, 2013 11:16 PM
> >> To: Jed Brown
> >> Cc: Bartlett, Roscoe A.; petsc-trilinos-discussion at lists.mcs.anl.gov
> >> Subject: Re: [Petsc-trilinos-discussion] Scope and requirements
> >>
> >>
> >>   Hmm, the PETSc wrapper for ML is rather clunky and fragile (meaning it is
> >> difficult to change safely, that is, without breaking something else in the
> >> process). This could be for several reasons:
> >>
> >>    1) ML has evolved and doesn't have the clearest documentation
> >>    2) there isn't a great match between the abstractions/code organization
> >> in ML/Trilinos and PETSc
> >>    3) the interface was done "by hand" as needed each time to get a bit
> >> more functionality across and hence is a bit ad hoc.
> >>
> >>   I hate to think of having many of these fragile ad hoc interfaces hanging
> >> around. So how could this be done in a scalable, maintainable way? If we
> >> understand the fundamental design principles of the two packages to
> >> determine commonalities (and possibly troublesome huge differences), we
> >> may be able to see what changes could be made to make it easier to mate
> >> plugins from the two sides. So here is a brief discussion of PETSc object
> >> life cycles for the Trilinos folks. If they can produce something similar
> >> for Trilinos, that would help us see the sticky points.
> >>
> >>     Most important PETSc objects have the following life cycle (I'll use Mat
> >> as the example class; the same thing holds for other classes, like nonlinear
> >> solvers, preconditioners, ...):
> >>
> >>     Mat mat;
> >>     MatCreate(comm, &mat);
> >>     MatSetXXX()           // set some generic properties of the matrix, like size
> >>     MatSetType()          // instantiate the actual class, like compressed sparse row, or matrix free, or ....
> >>     MatSetYYY()           // set generic properties of the matrix, or ones specific to the actual class instantiated
> >>     MatSetFromOptions()   // allow setting options from the command line etc for the matrix
> >>     MatSetUp()            // "set up" the matrix so that actual methods may be called on it
> >>
> >>         In some sense all of the steps above are part of the basic constructor of
> >> the object (note at this point we still don't have any entries in the matrix).
> >>         Also at this point the "size" and parallel layout of the matrix (for solvers,
> >> the size of the vectors and matrices it uses) is set in stone and cannot be
> >> changed (without an XXXReset()).
> >>
> >>      MatSetValues()
> >>      MatAssemblyBegin/End()    // phases to put values into matrices with
> >> explicit storage of entries
> >>
> >>         Once this is done one can perform operations with the matrix
> >>
> >>      MatMult()
> >>      etc
> >>
> >>      MatDestroy() or
> >>      MatReset()         // cleans out the object of everything related to the
> >> size/parallel layout of the vectors/matrices but leaves the type and options
> >> that have been set; this allows one to use the same (solver) object
> >> again for a different size problem (due to grid refinement or whatever)
> >> without needing to recreate the object from scratch.
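> >>
> >>      Put together as actual code, the life cycle above looks roughly like this
> >> (a sketch assembling a 1D Laplacian; error checking omitted):
> >>
> >>      Mat A;
> >>      PetscInt i, n = 100;
> >>      MatCreate(PETSC_COMM_WORLD, &A);
> >>      MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
> >>      MatSetType(A, MATAIJ);        /* instantiate the actual class */
> >>      MatSetFromOptions(A);         /* let the command line override */
> >>      MatSetUp(A);
> >>      for (i = 1; i < n - 1; i++) { /* interior rows only, for brevity */
> >>        PetscInt    cols[3] = {i - 1, i, i + 1};
> >>        PetscScalar vals[3] = {-1.0, 2.0, -1.0};
> >>        MatSetValues(A, 1, &i, 3, cols, vals, INSERT_VALUES);
> >>      }
> >>      MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
> >>      MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
> >>      /* ... MatMult() etc ... */
> >>      MatDestroy(&A);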
> >>
> >>     There are a few other things, like serialization, but I don't think they
> >> matter in this discussion. There is reference counting so you can pass objects
> >> into other objects etc., and the objects will be kept around automatically
> >> until the reference counts go down to zero. If you have a class that provides
> >> these various stages then it is not terribly difficult to wrap it up to look
> >> like a PETSc object (what Jed called a plugin). In fact for ML we have
> >> PCCreate_ML(), PCSetUp_ML(), PCDestroy_ML(), etc.
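> >>
> >>      (For a class that does not ship with PETSc, PCSHELL gives much the same
> >> effect without writing a full plugin; a minimal sketch, where "mysolver"
> >> stands for some external preconditioner object:)
> >>
> >>      /* wrap an existing external preconditioner as a PETSc PC */
> >>      static PetscErrorCode MyApply(PC pc, Vec x, Vec y)
> >>      {
> >>        void *mysolver;
> >>        PCShellGetContext(pc, &mysolver);
> >>        /* ... apply the external preconditioner: y = M^{-1} x ... */
> >>        return 0;
> >>      }
> >>
> >>      PCSetType(pc, PCSHELL);
> >>      PCShellSetContext(pc, mysolver);
> >>      PCShellSetApply(pc, MyApply);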
> >>
> >>      So one way to program with PETSc is to write code that manages the life
> >> cycles of whatever objects are needed. For example you create a linear
> >> solver object, a matrix object, some vector objects, set the right options, fill
> >> them up appropriately, call the solver, and then clean up. Same with
> >> nonlinear solvers, ODE integrators, eigensolvers.
> >>
> >>      For "straightforward" applications this model can often be fine. When
> one
> >> wants to do more complicated problems or use algorithms that require
> more
> >> information about the problem such as geometric multigrid and "block"
> >> solvers (here I mean block solvers like for Stokes equation and "multi
> >> physics" problems not block Jacobi or multiple right hand side "block
> solvers")
> >> requiring the user to mange the life cycles of all the vectors, matrices,
> >> auxiliary vector and matrices (creating them, giving them appropriate
> >> dimensions, hooking them together, filling them with appropriate values,
> >> and finally destroying them) is asking too much of the users.  PETSc has
> the
> >> DM object, which can be thought of as a "factory" for the correctly sized
> sub
> >> vectors and matrices needed for the particular problem being solved. The
> >> DM is given to the solver then the solver queries the DM to create
> whatever
> >> of those objects it needs in the solution process. For example with
> geometric
> >> multigrid the PCMG asks the DM for each coarser grid matrix. For a Stokes
> >> problem PCFIELDSPLIT can ask for (0,0) part of the operator, or the (1,1)
> etc
> >> and build up the solver from the objects provided by the DM.  Thus a
> typical
> >> PETSc program creates an appropriate solver object (linear, nonlinear,
> ODE,
> >> eigen), creates a DM for the problem, passes the DM to the solver and
> >> during the solver set up it obtains all the information it needs from the
> DM
> >> and solves the problem. Obviously the DM needs information about the
> >> mesh and PDE being solved.
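> >>
> >>   In code, such a program is roughly the following (a sketch with a
> >> structured-grid DMDA; ComputeMatrix and ComputeRHS are user callbacks, and
> >> the exact spelling of the boundary-type arguments varies by PETSc version):
> >>
> >>      KSP ksp;
> >>      DM  da;
> >>      DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
> >>                   DMDA_STENCIL_STAR, 65, 65, PETSC_DECIDE, PETSC_DECIDE,
> >>                   1, 1, NULL, NULL, &da);
> >>      DMSetFromOptions(da);
> >>      DMSetUp(da);
> >>      KSPCreate(PETSC_COMM_WORLD, &ksp);
> >>      KSPSetDM(ksp, da);                       /* hand the factory to the solver */
> >>      KSPSetComputeOperators(ksp, ComputeMatrix, NULL);
> >>      KSPSetComputeRHS(ksp, ComputeRHS, NULL);
> >>      KSPSetFromOptions(ksp);                  /* e.g. -pc_type mg -pc_mg_levels 4 */
> >>      KSPSolve(ksp, NULL, NULL);               /* Mat and Vecs come from the DM */
> >>      KSPDestroy(&ksp);
> >>      DMDestroy(&da);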
> >>
> >>   I am not writing this to advocate making Trilinos follow the same model
> >> (though I think you should :-)) but instead to develop a common
> >> understanding of what is common in our approaches (and can be leveraged)
> >> and what is fundamentally different (and could be troublesome). For
> >> example, the fact that you have a different approach to creating Trilinos
> >> objects (using C++ factories) is superficially very different from what PETSc
> >> does, but that may not really matter.
> >>
> >>   Barry
> >>
> >>
> >>
> >> On Nov 20, 2013, at 5:16 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> >>
> >>> Barry Smith <bsmith at mcs.anl.gov> writes:
> >>>>  One "suggestion" is, could there be a "higher-level" interface that
> >>>>  people could use that incorporated either PETSc or Trilinos
> >>>>  underneath? A difficulty I see with that approach is that design
> >>>>  decisions (either in C or C++) at one level of the stack permeate
> >>>>  the entire stack and users have to see more than just the top level
> >>>>  of the stack from our libraries. For example,
> >>>
> >>> I think that if the program managers want better integration and an
> >>> easier transition between packages, that instead of creating a new
> >>> high-level interface, we should build out our cross-package plugin
> >>> support.  For example, ML is a popular solver among PETSc users, but we
> >>> could add support for more Trilinos preconditioners and solvers, and
> >>> even support assembling directly into a [TE]petra matrix (PETSc matrices
> >>> are more dynamic, but I still think this can be done reasonably).  We
> >>> probably need to talk some about representing block systems, but I don't
> >>> think the issues are insurmountable.
> >>>
> >>> Then applications can freely choose whichever package they find more
> >>> convenient (based on experience, types of solvers, implementation
> >>> language, etc) with confidence that they will be able to access any
> >>> unique features of the other package.  When coupling multiple
> >>> applications using different solver packages, the coupler should be able
> >>> to choose either package to define the outer solve, with the same code
> >>> assembling either a monolithic matrix or a split/nested matrix with
> >>> native preconditioners within blocks.
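> >>>
> >>> (On the PETSc side, MatNest is the sort of split/nested representation I
> >>> have in mind; a sketch, with A00..A11 being blocks each application
> >>> assembled natively:
> >>>
> >>>     Mat blocks[4] = {A00, A01, A10, A11};   /* row-major 2x2 layout */
> >>>     Mat A;
> >>>     MatCreateNest(PETSC_COMM_WORLD, 2, NULL, 2, NULL, blocks, &A);
> >>>     /* PCFIELDSPLIT can then precondition blockwise, per-block natively */
> >>>
> >>> so the outer Krylov method never needs a monolithic copy.)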
> >>>
> >>> As I see it, there is a fair amount of duplicated effort by packages
> >>> (such as libMesh, Deal.II, FEniCS) that ostensibly support both Trilinos
> >>> and PETSc, but were not written by "solvers people" and are managing the
> >>> imperfect correspondence themselves.  The unified interfaces that these
> >>> projects have built are generally less capable than had they committed
> >>> entirely to either one of our interfaces.
> >>>
> >>>
> >>> It will take some effort to implement this interoperability and it's
> >>> hard to sneak in under "basic research" like a lot of other software
> >>> maintenance tasks, but I think that providing it within our own plugin
> >>> systems is less work and can be supported better than any alternative.
> >


