<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Feb 8, 2016 at 2:41 AM, Nicola Creati <span dir="ltr"><<a href="mailto:ncreati@inogs.it" target="_blank">ncreati@inogs.it</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  
  <div text="#000000" bgcolor="#FFFFFF">

    Hello,<br>

    I'm trying to convert some PETSc examples from C to Python. I noted

    a strange behavior of the snes solver. It seems that the same

    example in Python and C are characterized by a different number of

    function evaluations. I converted ex25.c example in the "

    snes/examples/tutorials" folder.  The number of function evaluations

    is always bigger in Python and increases with the number of grid

    nodes. The increased number of function evaluations affects the

    performance and in other examples can drastically increase the

    execution time (for example in ex30.c " Steady-state 2D

    subduction...").<br>

    This is my Python version of the simple ex25.c:<br></div></blockquote><div><br></div><div>You did not provide the Jacobian function, resulting in a finite difference approximation that uses more function evaluations.</div><div><br></div><div>   Matt</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF">

    ## # SNES/Tutorials/ex25<br>

    import sys<br>

    import petsc4py<br>

    petsc4py.init(sys.argv)<br>

    from petsc4py import PETSc<br>

    import numpy as np<br>

    <br>

    class UserContext(object):<br>

        def __init__(self, da):<br>

            self.da = da<br>

            self.localX = da.createLocalVec()<br>

            <br>

        def formFunction(self, snes, X, F):<br>

            <br>

            self.da.globalToLocal(X, self.localX)<br>

            x = self.da.getVecArray(self.localX)<br>

            f = self.da.getVecArray(F)<br>

            <br>

            (x0, x1), (y0, y1)= self.da.getRanges()<br>

            hx = 1/float(x1-1)<br>

            hy = 1/float(y1-1)<br>

            <br>

            for j in range(y0, y1):<br>

                for i in range(x0, x1):<br>

                    if i == 0 or i == (x1-1) or j == 0 or j == (y1-1 ):<br>

                        f[j,i] = x[j,i] - (1.0 - (2.0 * hx * i - 1.0) *

    (2.0 * hx * i -1.0))<br>

                    else:<br>

                        gradup    = (x[j+1, i] -x[j, i])/hy<br>

                        graddown  = (x[j, i]   -x[j-1, i])/hy<br>

                        gradright = (x[j, i+1] -x[j, i])/hx<br>

                        gradleft  = (x[j, i]   -x[j, i-1])/hx<br>

                        <br>

                        gradx = 0.5 * (x[j, i+1] - x[j, i-1])/hx<br>

                        grady = 0.5 * (x[j+1, i] - x[j-1, i])/hy<br>

                        <br>

                        coeffup = 1.0/np.sqrt(1.0 + gradup * gradup +

    gradx * gradx)<br>

                        coeffdown = 1.0/np.sqrt(1.0 + graddown *

    graddown + gradx * gradx)<br>

                        <br>

                        coeffleft  = 1.0/np.sqrt(1.0 + gradleft *

    gradleft + grady * grady)<br>

                        coeffright = 1.0/np.sqrt(1.0 + gradright *

    gradright + grady * grady)<br>

            <br>

                        f[j, i] = (coeffup * gradup - coeffdown *

    graddown) * hx + (coeffright * gradright - coeffleft * gradleft) *

    hy<br>

    <br>

    <br>

    snes = PETSc.SNES().create()<br>

    da = PETSc.DMDA().create(dim=(-5, -5),<br>

                            stencil_type = PETSc.DMDA.StencilType.STAR,<br>

                            stencil_width =1,<br>

                            setup=False)<br>

    da.setFromOptions()<br>

    da.setUp()<br>

    <br>

    snes.setDM(da)<br>

    <br>

    ctx = UserContext(da)<br>

    da.setAppCtx(ctx)<br>

    <br>

    F = da.createGlobalVec()<br>

    snes.setFunction(ctx.formFunction, F)<br>

    <br>

    snes.setFromOptions()<br>

    <br>

    x = da.createGlobalVector()<br>

    <br>

    snes.solve(None, x)<br>

    its = snes.getIterationNumber()<br>

    lits = snes.getLinearSolveIterations()<br>

    <br>

    print "Number of SNES iterations = %d" % its<br>

    print "Number of Linear iterations = %d" % lits<br>

    <br>

    litspit = lits/float(its)<br>

    print "Average Linear its / SNES = %e" % float(litspit)<br>

    <br>

    #####################################<br>

    <br>

    I executed the code with the following options:<br>

    <br>

    python ex25.py <span style="font-family:monospace"><span style="color:#000000;background-color:#ffffff">-snes_view

        -snes_monitor -da_grid_x 50 -da_grid_y 50</span></span><br>

    <br>

    This is the output from Python:<br>

    <br>

    
    <div>

      <span style="font-family:monospace"><span style="color:#000000;background-color:#ffffff">  0 SNES

          Function norm 7.229568285876e+00  </span><br>

         1 SNES Function norm 1.970854029907e-02  <br>

         2 SNES Function norm 5.721344031051e-03  <br>

         3 SNES Function norm 1.179738644817e-03  <br>

         4 SNES Function norm 1.028879149924e-04  <br>

         5 SNES Function norm 7.472012969660e-07  <br>

         6 SNES Function norm 4.744184684658e-11  <br>

        SNES Object: 1 MPI processes

        <br>

         type: newtonls

        <br>

         maximum iterations=50, maximum function evaluations=10000

        <br>

         tolerances: relative=1e-08, absolute=1e-50, solution=1e-08

        <br>

         total number of linear solver iterations=223

        <br>

         total number of function evaluations=37

        <br>

         norm schedule ALWAYS

        <br>

         SNESLineSearch Object:   1 MPI processes

        <br>

           type: bt

        <br>

             interpolation: cubic

        <br>

             alpha=1.000000e-04

        <br>

           maxstep=1.000000e+08, minlambda=1.000000e-12

        <br>

           tolerances: relative=1.000000e-08, absolute=1.000000e-15,

        lambda=1.000000e-08

        <br>

           maximum iterations=40

        <br>

         KSP Object:   1 MPI processes

        <br>

           type: gmres

        <br>

             GMRES: restart=30, using Classical (unmodified)

        Gram-Schmidt Orthogonalization with no iterative refinement

        <br>

             GMRES: happy breakdown tolerance 1e-30

        <br>

           maximum iterations=10000, initial guess is zero

        <br>

           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000

        <br>

           left preconditioning

        <br>

           using PRECONDITIONED norm type for convergence test

        <br>

         PC Object:   1 MPI processes

        <br>

           type: ilu

        <br>

             ILU: out-of-place factorization

        <br>

             0 levels of fill

        <br>

             tolerance for zero pivot 2.22045e-14

        <br>

             matrix ordering: natural

        <br>

             factor fill ratio given 1, needed 1

        <br>

               Factored matrix follows:

        <br>

                 Mat Object:           1 MPI processes

        <br>

                   type: seqaij

        <br>

                   rows=2500, cols=2500

        <br>

                   package used to perform factorization: petsc

        <br>

                   total: nonzeros=12300, allocated nonzeros=12300

        <br>

                   total number of mallocs used during MatSetValues

        calls =0

        <br>

                     not using I-node routines

        <br>

           linear system matrix = precond matrix:

        <br>

           Mat Object:     1 MPI processes

        <br>

             type: seqaij

        <br>

             rows=2500, cols=2500

        <br>

             total: nonzeros=12300, allocated nonzeros=12300

        <br>

             total number of mallocs used during MatSetValues calls =0

        <br>

               not using I-node routines

        <br>

        Number of SNES iterations = 6

        <br>

        Number of Linear iterations = 223

        <br>

        Average Linear its / SNES = 3.716667e+01<br>

      </span></div>

    <br>

    This the output from the original C code with the same options

    (./ex25  <span style="font-family:monospace"><span style="color:#000000;background-color:#ffffff">-snes_view

        -snes_monitor -da_grid_x 50 -da_grid_y 50)</span></span>:<br>

    <br>

    
    <div>

      <span style="font-family:monospace"><span style="color:#000000;background-color:#ffffff">  0 SNES

          Function norm 7.229568285876e+00  </span><br>

         1 SNES Function norm 1.970854029907e-02  <br>

         2 SNES Function norm 5.721344035538e-03  <br>

         3 SNES Function norm 1.179738636852e-03  <br>

         4 SNES Function norm 1.028879138425e-04  <br>

         5 SNES Function norm 7.472012249912e-07  <br>

         6 SNES Function norm 4.743884624164e-11  <br>

        SNES Object: 1 MPI processes

        <br>

         type: newtonls

        <br>

         maximum iterations=50, maximum function evaluations=10000

        <br>

         tolerances: relative=1e-08, absolute=1e-50, solution=1e-08

        <br>

         total number of linear solver iterations=223

        <br>

         total number of function evaluations=7

        <br>

         norm schedule ALWAYS

        <br>

         SNESLineSearch Object:   1 MPI processes

        <br>

           type: bt

        <br>

             interpolation: cubic

        <br>

             alpha=1.000000e-04

        <br>

           maxstep=1.000000e+08, minlambda=1.000000e-12

        <br>

           tolerances: relative=1.000000e-08, absolute=1.000000e-15,

        lambda=1.000000e-08

        <br>

           maximum iterations=40

        <br>

         KSP Object:   1 MPI processes

        <br>

           type: gmres

        <br>

             GMRES: restart=30, using Classical (unmodified)

        Gram-Schmidt Orthogonalization with no iterative refinement

        <br>

             GMRES: happy breakdown tolerance 1e-30

        <br>

           maximum iterations=10000, initial guess is zero

        <br>

           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000

        <br>

           left preconditioning

        <br>

           using PRECONDITIONED norm type for convergence test

        <br>

         PC Object:   1 MPI processes

        <br>

           type: ilu

        <br>

             ILU: out-of-place factorization

        <br>

             0 levels of fill

        <br>

             tolerance for zero pivot 2.22045e-14

        <br>

             matrix ordering: natural

        <br>

             factor fill ratio given 1, needed 1

        <br>

               Factored matrix follows:

        <br>

                 Mat Object:           1 MPI processes

        <br>

                   type: seqaij

        <br>

                   rows=2500, cols=2500

        <br>

                   package used to perform factorization: petsc

        <br>

                   total: nonzeros=12300, allocated nonzeros=12300

        <br>

                   total number of mallocs used during MatSetValues

        calls =0

        <br>

                     not using I-node routines

        <br>

           linear system matrix = precond matrix:

        <br>

           Mat Object:     1 MPI processes

        <br>

             type: seqaij

        <br>

             rows=2500, cols=2500

        <br>

             total: nonzeros=12300, allocated nonzeros=12300

        <br>

             total number of mallocs used during MatSetValues calls =0

        <br>

               not using I-node routines

        <br>

        Number of SNES iterations = 6

        <br>

        Number of Linear iterations = 223

        <br>

        Average Linear its / SNES = 3.716667e+01<br>

      </span></div>

    
    <br>

    Why the number of function evaluations is so different?<br>

    
    I'm using last version of PETSc and petsc4py.<br>

    Thank you.<span class="HOEnZb"><font color="#888888"><br>

    <br>

    Nicola<br>

    <br>

    
    <pre cols="72">-- 

_____________________________________________________________________

Nicola Creati

Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS

IRI (Ricerca Teconologica e Infrastrutture) Department

B.go Grotta Gigante - Brisciki 42/c

34010 Sgonico - Zgonik (TS) - Italy

Tel. <a href="tel:%2B39-0402140213" value="+390402140213" target="_blank">+39-0402140213</a>

Fax  <a href="tel:%2B39-040327307" value="+39040327307" target="_blank">+39-040327307</a></pre>

  </font></span></div>


</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature">What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div>

</div></div>