[petsc-dev] PETSc factories 101 and 102

Sat Sep 8 05:39:27 CDT 2012

   Some discussion came up today about the use of "factories" in C++ (and other OO languages) code, and I thought it would be useful to relate it to how PETSc handles the same issue, since some numerical libraries (the next generation of Trilinos's ML and Chombo) are using factories extensively and recklessly (and will, I am confident, alienate a lot of users). I love it when our competitors make the mistake of chasing every new idea down the wrong road.

    "In object-oriented computer programming, a factory is an object for creating other objects ", "it deals with the problem of creating objects (products) without specifying the exact class of object that will be created."   http://en.wikipedia.org/wiki/Factory_method_pattern http://en.wikipedia.org/wiki/Factory_(software_concept)

    So for example if in my code I need a KSP solver object I could do something like

     void myroutine( …..) {
        KSPObject  *kspobject = new KSPGMRESObject;

   where KSPObject is my abstract class and KSPGMRESObject is a specific implementation someone has written.  Now in the code I can go off and use the KSPObject to solve something, and the rest of my code does not need to "know" that the KSPObject is actually a KSPGMRESObject.  But since I cannot instantiate an abstract class and can only create a specific implementation, the line that creates the object has hardwired KSPGMRESObject into my code.  Factories are a way of "avoiding" this hardwiring.  I can introduce a new class KSPObjectFactory that has a method newKSPObject() and then reorganize my code as

    void myroutine(KSPObjectFactory *factory, ….) {
       KSPObject *kspobject = factory->newKSPObject();

  and I've removed the explicit use of a particular implementation constructor from my routine.  The actual decision of what type of KSPObject to create is "pushed up higher in the code" and involves the factory.   So for example I could have a KSPObjectFactory() that produces a particular implementation of a KSPObject by setting a string name into the factory.  So if BarrysKSPObjectFactory is a particular implementation of KSPObjectFactory then I could write "higher up in my code" 

       BarrysKSPObjectFactory  *kspobjectfactory = new BarrysKSPObjectFactory;
        kspobjectfactory->setimplementationbyname("gmres");
        myroutine(kspobjectfactory); 

One could also consider a method on BarrysKSPObjectFactory, setimplementationbyCommandLineArgs(argsc,args); now I can "push up" the decision of what KSPObject to actually use all the way to runtime, as a command-line option.
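
To make the fragments above concrete, here is a minimal self-contained sketch of the factory arrangement. The class names follow the ones used above; KSPCGObject, the name() method, and the string comparison inside the factory are assumptions added purely for illustration, not anything taken from a real library.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Abstract solver interface (hypothetical; stands in for KSPObject above).
class KSPObject {
public:
  virtual ~KSPObject() = default;
  virtual std::string name() const = 0;  // illustrative method only
};

class KSPGMRESObject : public KSPObject {
public:
  std::string name() const override { return "gmres"; }
};

// Second implementation, assumed here so there is something to choose between.
class KSPCGObject : public KSPObject {
public:
  std::string name() const override { return "cg"; }
};

// Abstract factory: the only creation interface myroutine() sees.
class KSPObjectFactory {
public:
  virtual ~KSPObjectFactory() = default;
  virtual std::unique_ptr<KSPObject> newKSPObject() const = 0;
};

// Concrete factory whose product is selected by a string set at runtime.
class BarrysKSPObjectFactory : public KSPObjectFactory {
public:
  void setimplementationbyname(const std::string &n) { impl_ = n; }

  std::unique_ptr<KSPObject> newKSPObject() const override {
    if (impl_ == "gmres") return std::make_unique<KSPGMRESObject>();
    if (impl_ == "cg") return std::make_unique<KSPCGObject>();
    throw std::invalid_argument("unknown KSP implementation: " + impl_);
  }

private:
  std::string impl_ = "gmres";
};

// No concrete solver class is named anywhere in this routine.
std::string myroutine(const KSPObjectFactory &factory) {
  std::unique_ptr<KSPObject> kspobject = factory.newKSPObject();
  assert(kspobject != nullptr);
  return kspobject->name();
}
```

Note that even this toy version needs two class hierarchies (solvers and factories) where the hardwired version needed one.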

    A drawback to factories is that they can easily double the number of different classes that users have to deal with, and many people (at least Mark and I) find them cumbersome.

PETSc factories 101
----------------------------

     In PETSc because all objects are essentially delegator objects ("the delegation pattern is a design pattern in object-oriented programming where an object, instead of performing one of its stated tasks, delegates that task to an associated helper object" http://en.wikipedia.org/wiki/Delegation_pattern)  when we "create" a PETSc object we have not yet actually created the delegated object and thus do not need traditional factories for the purpose listed above. For example

      KSP  ksp;
      KSPCreate(comm,&ksp); 

      gives me a KSP solver object that I can pass around to other code, keep references to, and even set options on but it does not have a specific implementation of a solver wired to it yet.  When I call 

      KSPSetType(ksp,"gmres");  or  KSPSetFromOptions(ksp);  

     what happens is the KSP object looks for the "gmres factory" that has been registered with KSPRegister("gmres",KSPCreate_GMRES,..) and calls KSPCreate_GMRES() to generate the delegate that will actually do the solving. 

      Since the delegate is completely encapsulated inside the ksp object I can change the delegate at a later time in the code to have a different implementation by just calling 

      KSPSetType(ksp,"cg");

      The old delegate is freed and the new solver implementation is put in place. And all references to the ksp object continue to work (just using the new solver).

      So you see the design of PETSc allows "pushing up" the specific choice of implementations of classes in essentially the same way as factory objects do but without the need for users to explicitly create and manipulate the factories.
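
The delegation-plus-registry idea can be sketched in a few lines of self-contained C++. This is only a toy model of the mechanism described above, not PETSc's actual implementation: the KSP class, its Register/SetType/Solve methods, the KSPImpl delegate types, and the RegisterAll helper are all hypothetical names chosen to echo KSPRegister() and KSPSetType().

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hidden delegate interface: the object that actually does the solving.
struct KSPImpl {
  virtual ~KSPImpl() = default;
  virtual std::string solve() const = 0;
};
struct GMRESImpl : KSPImpl {
  std::string solve() const override { return "solved with gmres"; }
};
struct CGImpl : KSPImpl {
  std::string solve() const override { return "solved with cg"; }
};

// User-visible object. It exists (and can be passed around) before any
// implementation has been chosen.
class KSP {
public:
  using Creator = std::function<std::unique_ptr<KSPImpl>()>;

  // Plays the role of KSPRegister(): maps a type name to a creation routine.
  static void Register(const std::string &type, Creator create) {
    registry()[type] = std::move(create);
  }

  // Plays the role of KSPSetType(): the old delegate is freed and a new one
  // is installed. The KSP object itself, and references to it, stay valid.
  void SetType(const std::string &type) {
    auto it = registry().find(type);
    if (it == registry().end())
      throw std::invalid_argument("unregistered KSP type: " + type);
    impl_ = it->second();
  }

  bool HasType() const { return impl_ != nullptr; }

  std::string Solve() const {
    assert(impl_ && "SetType() must be called before Solve()");
    return impl_->solve();
  }

private:
  // The "factories" live here, invisible to the user.
  static std::map<std::string, Creator> &registry() {
    static std::map<std::string, Creator> r;
    return r;
  }
  std::unique_ptr<KSPImpl> impl_;
};

// Hypothetical helper that registers the two toy implementations.
inline void RegisterAll() {
  KSP::Register("gmres", [] { return std::make_unique<GMRESImpl>(); });
  KSP::Register("cg", [] { return std::make_unique<CGImpl>(); });
}
```

In this sketch the user only ever touches KSP; the creation routines sit in the registry and are looked up by name, which is the sense in which the factory is transparent.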

PETSc factories 102
---------------------------

     The other place PETSc uses factories is to allow "mesh information" to determine algebraic objects that are created within algebraic solvers. This is done with the DM abstract class which you can think of as a factory for Vecs and Mats (though it does other things as well). 

      Consider the nonlinear solver SNES in PETSc.  I can use 

       SNESSetJacobian(snes,J,J,func,ctx) 

to provide the matrix that Newton's method is going to use.  But this means that my code that creates the SNES object and sets its various parameters has to explicitly know how to create the J Mat object.  If I am using a complicated meshing package that generated some grid and is going to use finite elements to compute the Jacobian J, I'd like the figuring out of the size and sparsity pattern of J to be handled by the meshing package.  Thus I would make an implementation of the DM abstract class (say MattsDM) that does all this yucky figuring out. Then I could call 

DMCreateMatrix(dm,&J)   /* now my solver code sees nothing of the yuckyness of particular mesh details */

and then pass the J to 

SNESSetJacobian(snes,J,J,func,ctx) 

Similarly I can do the same thing to create vectors. 

We can take this one step further. So far I've been assuming that the application programmer is explicitly creating the algebraic objects (Mats and Vecs) and passing them to the solver object. 

Once solver objects start getting complicated, for example with multigrid methods, it is painful for the user to create ALL the vectors and matrices needed for the multiple levels and provide them all to PCMG (though it is possible and we provide interfaces for doing that).  

Instead we can create DMs that can generate appropriately sized vectors and matrices and give those DMs to the solver object; the solver then calls the DM's methods to get new vectors and matrices wherever it needs them.   PCMG would still need several DMs, one for each level. But rather than requiring the user to create these several DMs, the user can create a single DM, and the DM objects have methods that generate coarser or finer DMs, which in turn can be used to generate the vectors and matrices on the other levels.  This is why a simple call of SNESSetDM(snes,dm) allows the nonlinear and linear solver objects (including all the levels of multigrid) to create all their various needed vectors and matrices.
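
As a rough illustration of a DM acting as a factory for a multilevel solver, here is a self-contained C++ toy. Everything in it is an assumption made for the sketch: the Vec and Mat structs only record sizes, Grid1dDM is a made-up one-dimensional DM that coarsens by keeping every other point, and SetUpLevels stands in for what a multigrid solver might do internally. The method names loosely echo DMCreateGlobalVector(), DMCreateMatrix(), and DMCoarsen(), but none of this is PETSc code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy algebraic objects: they only record their sizes.
struct Vec { std::size_t n; };
struct Mat { std::size_t rows, cols; };

// Abstract DM: a factory for Vecs, Mats, and coarser DMs.
class DM {
public:
  virtual ~DM() = default;
  virtual Vec CreateGlobalVector() const = 0;
  virtual Mat CreateMatrix() const = 0;
  virtual std::unique_ptr<DM> Coarsen() const = 0;
};

// Hypothetical DM for a 1D grid with n points; coarsening keeps every
// other point, so n points become (n + 1) / 2.
class Grid1dDM : public DM {
public:
  explicit Grid1dDM(std::size_t n) : n_(n) {}
  Vec CreateGlobalVector() const override { return Vec{n_}; }
  Mat CreateMatrix() const override { return Mat{n_, n_}; }
  std::unique_ptr<DM> Coarsen() const override {
    return std::make_unique<Grid1dDM>((n_ + 1) / 2);
  }

private:
  std::size_t n_;
};

struct Level { Vec x; Mat A; };

// Stand-in for the solver side: given only the finest DM, build the vector
// and matrix for every level without knowing anything about the mesh.
std::vector<Level> SetUpLevels(const DM &fine, int nlevels) {
  assert(nlevels >= 1);
  std::vector<Level> levels;
  std::unique_ptr<DM> owned;  // owns the current coarse DM, if any
  const DM *dm = &fine;
  for (int l = 0; l < nlevels; ++l) {
    levels.push_back(Level{dm->CreateGlobalVector(), dm->CreateMatrix()});
    if (l + 1 < nlevels) {
      owned = dm->Coarsen();  // new DM is built before the old one is freed
      dm = owned.get();
    }
  }
  return levels;
}
```

The user-side code in this model would create one Grid1dDM and hand it over; the sizes on every level come from the DM, not from the caller.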

When composing complicated solvers this approach can be extremely powerful. One can envision DMs being able to generate sub-DMs that represent just pieces of the physics, and using those sub-DMs to generate the vectors and matrices for the solver associated with the sub-physics (in PCFieldSplit). Thus we can generate all the algebraic pieces for very complicated nested solvers, with multigrid inside fieldsplit and fieldsplit inside multigrid etc., for any number of levels with one simple paradigm.

In conclusion, you can think of PETSc as having one important visible factory class, the DM, and then one factory completely transparent to the user for each abstract PETSc object: IS, Vec, Mat, KSP, SNES, and yes, even DM. IMHO factories are a powerful and useful tool for large libraries, but they should be used sparingly, and most of them, through thoughtful design, need never be seen by the users (because every factory the users do see steepens the learning curve a great deal).

   Questions, clarifications?

    Barry

We actually have another factory in PETSc, MatGetVecs() that returns for a given matrix appropriately sized vectors. I don't have some grand philosophical reason for it to be around, but it is a very useful utility since often when you have a matrix you need vectors to perform operations with it and it is cumbersome to have to pass some vector through several layers of routines to get it to where it is needed to interact with the matrix.