<html>

  <head>


    <meta http-equiv="content-type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    Hello,<br>

    I'm trying to convert some PETSc examples from C to Python. I noted

    a strange behavior of the snes solver. It seems that the same

    example in Python and C are characterized by a different number of

    function evaluations. I converted ex25.c example in the "

    snes/examples/tutorials" folder.  The number of function evaluations

    is always bigger in Python and increases with the number of grid

    nodes. The increased number of function evaluations affects the

    performance and in other examples can drastically increase the

    execution time (for example in ex30.c " Steady-state 2D

    subduction...").<br>

    This is my Python version of the simple ex25.c:<br>

    <br>

    ## # SNES/Tutorials/ex25<br>

    import sys<br>

    import petsc4py<br>

    petsc4py.init(sys.argv)<br>

    from petsc4py import PETSc<br>

    import numpy as np<br>

    <br>

    class UserContext(object):<br>

        def __init__(self, da):<br>

            self.da = da<br>

            self.localX = da.createLocalVec()<br>

            <br>

        def formFunction(self, snes, X, F):<br>

            <br>

            self.da.globalToLocal(X, self.localX)<br>

            x = self.da.getVecArray(self.localX)<br>

            f = self.da.getVecArray(F)<br>

            <br>

            (x0, x1), (y0, y1)= self.da.getRanges()<br>

            hx = 1/float(x1-1)<br>

            hy = 1/float(y1-1)<br>

            <br>

            for j in range(y0, y1):<br>

                for i in range(x0, x1):<br>

                    if i == 0 or i == (x1-1) or j == 0 or j == (y1-1 ):<br>

                        f[j,i] = x[j,i] - (1.0 - (2.0 * hx * i - 1.0) *

    (2.0 * hx * i -1.0))<br>

                    else:<br>

                        gradup    = (x[j+1, i] -x[j, i])/hy<br>

                        graddown  = (x[j, i]   -x[j-1, i])/hy<br>

                        gradright = (x[j, i+1] -x[j, i])/hx<br>

                        gradleft  = (x[j, i]   -x[j, i-1])/hx<br>

                        <br>

                        gradx = 0.5 * (x[j, i+1] - x[j, i-1])/hx<br>

                        grady = 0.5 * (x[j+1, i] - x[j-1, i])/hy<br>

                        <br>

                        coeffup = 1.0/np.sqrt(1.0 + gradup * gradup +

    gradx * gradx)<br>

                        coeffdown = 1.0/np.sqrt(1.0 + graddown *

    graddown + gradx * gradx)<br>

                        <br>

                        coeffleft  = 1.0/np.sqrt(1.0 + gradleft *

    gradleft + grady * grady)<br>

                        coeffright = 1.0/np.sqrt(1.0 + gradright *

    gradright + grady * grady)<br>

            <br>

                        f[j, i] = (coeffup * gradup - coeffdown *

    graddown) * hx + (coeffright * gradright - coeffleft * gradleft) *

    hy<br>

    <br>

    <br>

    snes = PETSc.SNES().create()<br>

    da = PETSc.DMDA().create(dim=(-5, -5),<br>

                            stencil_type = PETSc.DMDA.StencilType.STAR,<br>

                            stencil_width =1,<br>

                            setup=False)<br>

    da.setFromOptions()<br>

    da.setUp()<br>

    <br>

    snes.setDM(da)<br>

    <br>

    ctx = UserContext(da)<br>

    da.setAppCtx(ctx)<br>

    <br>

    F = da.createGlobalVec()<br>

    snes.setFunction(ctx.formFunction, F)<br>

    <br>

    snes.setFromOptions()<br>

    <br>

    x = da.createGlobalVector()<br>

    <br>

    snes.solve(None, x)<br>

    its = snes.getIterationNumber()<br>

    lits = snes.getLinearSolveIterations()<br>

    <br>

    print "Number of SNES iterations = %d" % its<br>

    print "Number of Linear iterations = %d" % lits<br>

    <br>

    litspit = lits/float(its)<br>

    print "Average Linear its / SNES = %e" % float(litspit)<br>

    <br>

    #####################################<br>

    <br>

    I executed the code with the following options:<br>

    <br>

    python ex25.py <span style="font-family:monospace"><span

        style="color:#000000;background-color:#ffffff;">-snes_view

        -snes_monitor -da_grid_x 50 -da_grid_y 50</span></span><br>

    <br>

    This is the output from Python:<br>

    <br>

    <title>Konsole output</title>

    <div>

      <span style="font-family:monospace"><span

          style="color:#000000;background-color:#ffffff;">  0 SNES

          Function norm 7.229568285876e+00  </span><br>

         1 SNES Function norm 1.970854029907e-02  <br>

         2 SNES Function norm 5.721344031051e-03  <br>

         3 SNES Function norm 1.179738644817e-03  <br>

         4 SNES Function norm 1.028879149924e-04  <br>

         5 SNES Function norm 7.472012969660e-07  <br>

         6 SNES Function norm 4.744184684658e-11  <br>

        SNES Object: 1 MPI processes

        <br>

         type: newtonls

        <br>

         maximum iterations=50, maximum function evaluations=10000

        <br>

         tolerances: relative=1e-08, absolute=1e-50, solution=1e-08

        <br>

         total number of linear solver iterations=223

        <br>

         total number of function evaluations=37

        <br>

         norm schedule ALWAYS

        <br>

         SNESLineSearch Object:   1 MPI processes

        <br>

           type: bt

        <br>

             interpolation: cubic

        <br>

             alpha=1.000000e-04

        <br>

           maxstep=1.000000e+08, minlambda=1.000000e-12

        <br>

           tolerances: relative=1.000000e-08, absolute=1.000000e-15,

        lambda=1.000000e-08

        <br>

           maximum iterations=40

        <br>

         KSP Object:   1 MPI processes

        <br>

           type: gmres

        <br>

             GMRES: restart=30, using Classical (unmodified)

        Gram-Schmidt Orthogonalization with no iterative refinement

        <br>

             GMRES: happy breakdown tolerance 1e-30

        <br>

           maximum iterations=10000, initial guess is zero

        <br>

           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000

        <br>

           left preconditioning

        <br>

           using PRECONDITIONED norm type for convergence test

        <br>

         PC Object:   1 MPI processes

        <br>

           type: ilu

        <br>

             ILU: out-of-place factorization

        <br>

             0 levels of fill

        <br>

             tolerance for zero pivot 2.22045e-14

        <br>

             matrix ordering: natural

        <br>

             factor fill ratio given 1, needed 1

        <br>

               Factored matrix follows:

        <br>

                 Mat Object:           1 MPI processes

        <br>

                   type: seqaij

        <br>

                   rows=2500, cols=2500

        <br>

                   package used to perform factorization: petsc

        <br>

                   total: nonzeros=12300, allocated nonzeros=12300

        <br>

                   total number of mallocs used during MatSetValues

        calls =0

        <br>

                     not using I-node routines

        <br>

           linear system matrix = precond matrix:

        <br>

           Mat Object:     1 MPI processes

        <br>

             type: seqaij

        <br>

             rows=2500, cols=2500

        <br>

             total: nonzeros=12300, allocated nonzeros=12300

        <br>

             total number of mallocs used during MatSetValues calls =0

        <br>

               not using I-node routines

        <br>

        Number of SNES iterations = 6

        <br>

        Number of Linear iterations = 223

        <br>

        Average Linear its / SNES = 3.716667e+01<br>

      </span></div>

    <br>

    This the output from the original C code with the same options

    (./ex25  <span style="font-family:monospace"><span

        style="color:#000000;background-color:#ffffff;">-snes_view

        -snes_monitor -da_grid_x 50 -da_grid_y 50)</span></span>:<br>

    <br>

    <title>Konsole output</title>

    <div>

      <span style="font-family:monospace"><span

          style="color:#000000;background-color:#ffffff;">  0 SNES

          Function norm 7.229568285876e+00  </span><br>

         1 SNES Function norm 1.970854029907e-02  <br>

         2 SNES Function norm 5.721344035538e-03  <br>

         3 SNES Function norm 1.179738636852e-03  <br>

         4 SNES Function norm 1.028879138425e-04  <br>

         5 SNES Function norm 7.472012249912e-07  <br>

         6 SNES Function norm 4.743884624164e-11  <br>

        SNES Object: 1 MPI processes

        <br>

         type: newtonls

        <br>

         maximum iterations=50, maximum function evaluations=10000

        <br>

         tolerances: relative=1e-08, absolute=1e-50, solution=1e-08

        <br>

         total number of linear solver iterations=223

        <br>

         total number of function evaluations=7

        <br>

         norm schedule ALWAYS

        <br>

         SNESLineSearch Object:   1 MPI processes

        <br>

           type: bt

        <br>

             interpolation: cubic

        <br>

             alpha=1.000000e-04

        <br>

           maxstep=1.000000e+08, minlambda=1.000000e-12

        <br>

           tolerances: relative=1.000000e-08, absolute=1.000000e-15,

        lambda=1.000000e-08

        <br>

           maximum iterations=40

        <br>

         KSP Object:   1 MPI processes

        <br>

           type: gmres

        <br>

             GMRES: restart=30, using Classical (unmodified)

        Gram-Schmidt Orthogonalization with no iterative refinement

        <br>

             GMRES: happy breakdown tolerance 1e-30

        <br>

           maximum iterations=10000, initial guess is zero

        <br>

           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000

        <br>

           left preconditioning

        <br>

           using PRECONDITIONED norm type for convergence test

        <br>

         PC Object:   1 MPI processes

        <br>

           type: ilu

        <br>

             ILU: out-of-place factorization

        <br>

             0 levels of fill

        <br>

             tolerance for zero pivot 2.22045e-14

        <br>

             matrix ordering: natural

        <br>

             factor fill ratio given 1, needed 1

        <br>

               Factored matrix follows:

        <br>

                 Mat Object:           1 MPI processes

        <br>

                   type: seqaij

        <br>

                   rows=2500, cols=2500

        <br>

                   package used to perform factorization: petsc

        <br>

                   total: nonzeros=12300, allocated nonzeros=12300

        <br>

                   total number of mallocs used during MatSetValues

        calls =0

        <br>

                     not using I-node routines

        <br>

           linear system matrix = precond matrix:

        <br>

           Mat Object:     1 MPI processes

        <br>

             type: seqaij

        <br>

             rows=2500, cols=2500

        <br>

             total: nonzeros=12300, allocated nonzeros=12300

        <br>

             total number of mallocs used during MatSetValues calls =0

        <br>

               not using I-node routines

        <br>

        Number of SNES iterations = 6

        <br>

        Number of Linear iterations = 223

        <br>

        Average Linear its / SNES = 3.716667e+01<br>

      </span></div>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

    <br>

    Why the number of function evaluations is so different?<br>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

    I'm using last version of PETSc and petsc4py.<br>

    Thank you.<br>

    <br>

    Nicola<br>

    <br>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">

    <pre class="moz-signature" cols="72">-- 

_____________________________________________________________________

Nicola Creati

Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS

IRI (Ricerca Teconologica e Infrastrutture) Department

B.go Grotta Gigante - Brisciki 42/c

34010 Sgonico - Zgonik (TS) - Italy

Tel. +39-0402140213

Fax  +39-040327307</pre>

  </body>

</html>