[petsc-users] Sparse linear system solving

Lidia lidia.varsh at mail.ioffe.ru
Tue May 31 06:34:15 CDT 2022


Matt, Mark, thank you very much for your answers!


We have now run example 5 on our computer cluster and on the local 
server and, again, have not seen any performance increase; for some 
unclear reason, the running times on the local server are much better 
than on the cluster.

Next we will try to run the PETSc example 5 inside a Docker container on 
our server to check whether the problem is in our environment. I will 
write to you with the results of this test as soon as we have them.

The ksp_monitor outputs for example 5 with the current local server 
configuration (for 2 and 4 MPI processes) and for the cluster (for 1 and 
3 MPI processes) are attached.


And one more question: we can potentially use 10 nodes with 96 threads 
per node on our cluster. In your opinion, which combination of MPI 
processes and OpenMP threads would be best for example 5? (An 
illustrative launch line is sketched below.)
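
For context, the kind of launch we have in mind is sketched below. This is 
only an illustration (the rank counts and the absence of a scheduler 
wrapper are assumptions, nothing in this thread was run this way): since 
PETSc does not use OpenMP threads by default, a natural starting point 
seems to be pure MPI with one rank per core and the thread count pinned 
to one,

  OMP_NUM_THREADS=1 mpirun -n 960 ./ex5 -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view

and then, for comparison, runs with fewer ranks per node (e.g. -n 480 or 
-n 240) to see whether memory bandwidth, rather than core count, limits 
the solve.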

Thank you!


Best,
Lidiia

On 31.05.2022 05:42, Mark Adams wrote:
> And if you see "NO" change in performance I suspect the solver/matrix 
> is all on one processor.
> (PETSc does not use threads by default so threads should not change 
> anything).
>
> As Matt said, it is best to start with a PETSc example that does 
> something like what you want (parallel linear solve, see 
> src/ksp/ksp/tutorials for examples), and then add your code to it.
> That way you get the basic infrastructure in place for you, which is 
> pretty obscure to the uninitiated.
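
For reference, here is a minimal sketch of such a parallel KSP driver in 
the spirit of the src/ksp/ksp/tutorials examples. It is not code from this 
thread: it assembles a placeholder 1-D Laplacian just so it runs, and 
leaves the choice of KSP and PC entirely to the options database (error 
checking abbreviated):

static char help[] = "Minimal parallel KSP solve sketch.\n";
#include <petscksp.h>

int main(int argc, char **args)
{
  Mat            A;              /* linear system matrix */
  Vec            x, b;           /* solution and right-hand side */
  KSP            ksp;            /* Krylov solver context */
  PetscInt       i, n = 100, Istart, Iend;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &args, NULL, help); if (ierr) return ierr;
  ierr = PetscOptionsGetInt(NULL, NULL, "-n", &n, NULL);CHKERRQ(ierr);

  /* Distributed matrix; replace the loop below with your own assembly */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
  for (i = Istart; i < Iend; i++) {   /* placeholder 1-D Laplacian rows */
    if (i > 0)     {ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (i < n - 1) {ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Vectors with a layout compatible with A */
  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  /* KSP/PC type, tolerances, monitors all come from the command line */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Such a driver can then be run exactly like the ex5 commands shown in the 
attachments, e.g. with -ksp_type gmres -pc_type gamg -ksp_monitor_true_residual.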
>
> Mark
>
> On Mon, May 30, 2022 at 10:18 PM Matthew Knepley <knepley at gmail.com> 
> wrote:
>
>     On Mon, May 30, 2022 at 10:12 PM Lidia <lidia.varsh at mail.ioffe.ru>
>     wrote:
>
>         Dear colleagues,
>
>         Is there anyone here who has solved large sparse linear
>         systems using PETSc?
>
>
>     There are lots of publications with this kind of data. Here is one
>     recent one: https://arxiv.org/abs/2204.01722
>
>         We have found NO performance improvement when using more and
>         more MPI processes (1, 2, 3) and OpenMP threads (from 1 to 72
>         threads). Has anyone faced this problem? Does anyone know any
>         possible reasons for such behaviour?
>
>
>     Solver behavior is dependent on the input matrix. The only
>     general-purpose solvers
>     are direct, but they do not scale linearly and have high memory
>     requirements.
>
>     Thus, in order to make progress you will have to be specific about
>     your matrices.
>
>         We use an AMG preconditioner and the GMRES solver from the KSP
>         package, as our matrix is large (from 100,000 to 1e+6 rows and
>         columns), sparse, non-symmetric, and includes both positive and
>         negative values. But the performance problems also occur when
>         using CG solvers with symmetric matrices.
>
>
>     There are many PETSc examples, such as example 5 for the
>     Laplacian, that exhibit
>     good scaling with both AMG and GMG.
>
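
(As an illustration only, not a command taken from this thread: such an 
example could be driven with PETSc's built-in algebraic multigrid purely 
from the command line, e.g.

  mpiexec -n 8 ./ex5 -ksp_type gmres -pc_type gamg -ksp_monitor_true_residual -ksp_converged_reason -log_view

and the same options apply unchanged to any code that calls 
KSPSetFromOptions().)
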
>         Could anyone help us set appropriate options for the
>         preconditioner and solver? Currently we use the default
>         parameters; maybe they are not the best, but we do not know a
>         good combination. Or perhaps you could suggest other
>         preconditioner+solver pairs for such tasks?
>
>         I can provide more information: the matrices that we solve, the
>         C++ code that runs the solve using PETSc, and any statistics
>         obtained from our runs.
>
>
>     First, please provide a description of the linear system, and the
>     output of
>
>       -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view
>
>     for each test case.
>
>       Thanks,
>
>          Matt
>
>         Thank you in advance!
>
>         Best regards,
>         Lidiia Varshavchik,
>         Ioffe Institute, St. Petersburg, Russia
>
>
>
>     -- 
>     What most experimenters take for granted before they begin their
>     experiments is infinitely more interesting than any results to
>     which their experiments lead.
>     -- Norbert Wiener
>
>     https://www.cse.buffalo.edu/~knepley/
>
-------------- next part --------------
user@kitesrv:/data/raid1/tmp/petsc/src/ksp/ksp/tutorials$ LD_LIBRARY_PATH=/data/raid1/tmp/petsc/install/lib time /data/raid1/tmp/petsc/install/bin/mpirun -n 4 ./ex5 -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view
  0 KSP preconditioned resid norm 5.925618307774e+02 true resid norm 1.489143042155e+03 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP preconditioned resid norm 2.022444592869e+02 true resid norm 5.516335411837e+02 ||r(i)||/||b|| 3.704369060378e-01
  2 KSP preconditioned resid norm 1.058407283487e+02 true resid norm 2.759478800294e+02 ||r(i)||/||b|| 1.853064965673e-01
  3 KSP preconditioned resid norm 3.693395834711e+01 true resid norm 1.275254917089e+02 ||r(i)||/||b|| 8.563683145197e-02
  4 KSP preconditioned resid norm 1.408997772215e+01 true resid norm 4.766518770804e+01 ||r(i)||/||b|| 3.200846819863e-02
  5 KSP preconditioned resid norm 5.361143512330e+00 true resid norm 2.037934685918e+01 ||r(i)||/||b|| 1.368528494730e-02
  6 KSP preconditioned resid norm 1.510583748885e+00 true resid norm 6.398081426940e+00 ||r(i)||/||b|| 4.296485458965e-03
  7 KSP preconditioned resid norm 5.309564077280e-01 true resid norm 2.119763900049e+00 ||r(i)||/||b|| 1.423479034614e-03
  8 KSP preconditioned resid norm 1.241952019288e-01 true resid norm 4.237260240924e-01 ||r(i)||/||b|| 2.845435341652e-04
  9 KSP preconditioned resid norm 5.821568520561e-02 true resid norm 2.020452441170e-01 ||r(i)||/||b|| 1.356788692539e-04
 10 KSP preconditioned resid norm 1.821252523719e-02 true resid norm 7.650904617862e-02 ||r(i)||/||b|| 5.137790260087e-05
 11 KSP preconditioned resid norm 4.368507888148e-03 true resid norm 1.808437158965e-02 ||r(i)||/||b|| 1.214414671909e-05
Linear solve converged due to CONVERGED_RTOL iterations 11
KSP Object: 4 MPI processes
  type: gmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
  type: bjacobi
    number of blocks = 4
    Local solver information for first block is in the following KSP and PC objects on rank 0:
    Use -ksp_view ::ascii_info_detail to display information for all blocks
  KSP Object: (sub_) 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (sub_) 1 MPI processes
    type: ilu
      out-of-place factorization
      0 levels of fill
      tolerance for zero pivot 2.22045e-14
      matrix ordering: natural
      factor fill ratio given 1., needed 1.
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=6, cols=6
            package used to perform factorization: petsc
            total: nonzeros=16, allocated nonzeros=16
              not using I-node routines
    linear system matrix = precond matrix:
    Mat Object: (sub_) 1 MPI processes
      type: seqaij
      rows=6, cols=6
      total: nonzeros=16, allocated nonzeros=30
      total number of mallocs used during MatSetValues calls=0
        not using I-node routines
  linear system matrix = precond matrix:
  Mat Object: 4 MPI processes
    type: mpiaij
    rows=24, cols=24
    total: nonzeros=98, allocated nonzeros=240
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
Norm of error 0.00593779, Iterations 11
  0 KSP preconditioned resid norm 7.186228668401e+02 true resid norm 3.204758181205e+03 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP preconditioned resid norm 1.689435569629e+02 true resid norm 8.091360976081e+02 ||r(i)||/||b|| 2.524796105845e-01
  2 KSP preconditioned resid norm 4.515826460465e+01 true resid norm 2.188560284713e+02 ||r(i)||/||b|| 6.829096490175e-02
  3 KSP preconditioned resid norm 6.171620381119e+00 true resid norm 3.403822057849e+01 ||r(i)||/||b|| 1.062115100544e-02
  4 KSP preconditioned resid norm 1.386278591005e+00 true resid norm 7.417896285993e+00 ||r(i)||/||b|| 2.314650861802e-03
  5 KSP preconditioned resid norm 2.592583219309e-01 true resid norm 1.555506429237e+00 ||r(i)||/||b|| 4.853740411242e-04
  6 KSP preconditioned resid norm 4.031309698756e-02 true resid norm 2.532670674279e-01 ||r(i)||/||b|| 7.902844867150e-05
  7 KSP preconditioned resid norm 8.307351448472e-03 true resid norm 5.128232470272e-02 ||r(i)||/||b|| 1.600193268980e-05
  8 KSP preconditioned resid norm 7.297751517013e-04 true resid norm 4.388588867606e-03 ||r(i)||/||b|| 1.369397820199e-06
Linear solve converged due to CONVERGED_RTOL iterations 8
KSP Object: 4 MPI processes
  type: gmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
  type: bjacobi
    number of blocks = 4
    Local solver information for first block is in the following KSP and PC objects on rank 0:
    Use -ksp_view ::ascii_info_detail to display information for all blocks
  KSP Object: (sub_) 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (sub_) 1 MPI processes
    type: ilu
      out-of-place factorization
      0 levels of fill
      tolerance for zero pivot 2.22045e-14
      matrix ordering: natural
      factor fill ratio given 1., needed 1.
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=6, cols=6
            package used to perform factorization: petsc
            total: nonzeros=16, allocated nonzeros=16
              not using I-node routines
    linear system matrix = precond matrix:
    Mat Object: (sub_) 1 MPI processes
      type: seqaij
      rows=6, cols=6
      total: nonzeros=16, allocated nonzeros=30
      total number of mallocs used during MatSetValues calls=0
        not using I-node routines
  linear system matrix = precond matrix:
  Mat Object: 4 MPI processes
    type: mpiaij
    rows=24, cols=24
    total: nonzeros=98, allocated nonzeros=240
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
Norm of error 0.000783616, Iterations 8
**************************************** ***********************************************************************************************************************
***                                WIDEN YOUR WINDOW TO 160 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document                                 ***
****************************************************************************************************************************************************************

------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------

./ex5 on a  named kitesrv with 4 processors, by user Tue May 31 13:41:49 2022
Using 3 OpenMP threads
Using Petsc Release Version 3.17.1, unknown 

                         Max       Max/Min     Avg       Total
Time (sec):           3.720e-03     1.002   3.718e-03
Objects:              8.900e+01     1.000   8.900e+01
Flops:                7.334e+03     1.023   7.252e+03  2.901e+04
Flops/sec:            1.975e+06     1.025   1.950e+06  7.801e+06
MPI Msg Count:        1.380e+02     1.468   1.170e+02  4.680e+02
MPI Msg Len (bytes):  4.364e+03     1.641   3.005e+01  1.406e+04
MPI Reductions:       1.050e+02     1.000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
 0:      Main Stage: 1.2380e-04   3.3%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  2.000e+00   1.9%
 1:  Original Solve: 2.9587e-03  79.6%  1.7574e+04  60.6%  2.660e+02  56.8%  2.824e+01       53.4%  5.300e+01  50.5%
 2:    Second Solve: 6.3269e-04  17.0%  1.1432e+04  39.4%  2.020e+02  43.2%  3.244e+01       46.6%  3.200e+01  30.5%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: Original Solve

BuildTwoSided          3 1.0 4.6371e-05 2.1 0.00e+00 0.0 1.0e+01 4.0e+00 3.0e+00  1  0  2  0  3   1  0  4  1  6     0
BuildTwoSidedF         2 1.0 3.5755e-05 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  2   1  0  0  0  4     0
MatMult               24 1.0 4.3441e-04 2.8 1.10e+03 1.1 2.6e+02 2.9e+01 1.0e+00  8 14 56 53  1  10 23 98100  2    10
MatSolve              12 1.0 2.7930e-06 1.7 3.12e+02 1.2 0.0e+00 0.0e+00 0.0e+00  0  4  0  0  0   0  7  0  0  0   412
MatLUFactorNum         1 1.0 2.2540e-06 1.3 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    35
MatILUFactorSym        1 1.0 9.7690e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       1 1.0 2.9545e-05 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  1  0  0  0  1   1  0  0  0  2     0
MatAssemblyEnd         1 1.0 1.8802e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00  5  0  0  0  5   6  0  0  0  9     0
MatGetRowIJ            1 1.0 2.3500e-07 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 9.4980e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                3 3.0 8.1759e-05 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  1  0  0  0  1   1  0  0  0  2     0
VecMDot               11 1.0 8.3470e-05 3.4 7.26e+02 1.0 0.0e+00 0.0e+00 1.1e+01  1 10  0  0 10   1 17  0  0 21    35
VecNorm               26 1.0 9.3217e-05 1.3 3.12e+02 1.0 0.0e+00 0.0e+00 2.6e+01  2  4  0  0 25   3  7  0  0 49    13
VecScale              12 1.0 4.7450e-06 1.9 7.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    61
VecCopy               13 1.0 4.1490e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                25 1.0 4.0900e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY               13 1.0 5.6990e-06 1.3 1.56e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  4  0  0  0   109
VecAYPX               12 1.0 2.0830e-06 1.5 7.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0   138
VecMAXPY              23 1.0 5.4380e-06 1.6 1.72e+03 1.0 0.0e+00 0.0e+00 0.0e+00  0 24  0  0  0   0 39  0  0  0  1262
VecAssemblyBegin       1 1.0 1.6203e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  1   0  0  0  0  2     0
VecAssemblyEnd         1 1.0 1.0170e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin       24 1.0 8.1684e-05 1.1 0.00e+00 0.0 2.6e+02 2.9e+01 1.0e+00  2  0 56 53  1   3  0 98100  2     0
VecScatterEnd         24 1.0 3.4411e-04 7.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  5  0  0  0  0   7  0  0  0  0     0
VecNormalize          12 1.0 5.7443e-05 1.7 2.16e+02 1.0 0.0e+00 0.0e+00 1.2e+01  1  3  0  0 11   2  5  0  0 23    15
SFSetGraph             1 1.0 1.0780e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetUp                1 1.0 3.1964e-05 1.1 0.00e+00 0.0 2.0e+01 9.6e+00 1.0e+00  1  0  4  1  1   1  0  8  3  2     0
SFPack                24 1.0 7.5830e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFUnpack              24 1.0 1.6541e-05 9.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               2 1.0 3.2793e-0413.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  3  0  0  0  2   3  0  0  0  4     0
KSPSolve               1 1.0 1.0170e-03 1.0 4.35e+03 1.0 2.3e+02 3.0e+01 3.6e+01 27 59 49 50 34  34 98 86 93 68    17
KSPGMRESOrthog        11 1.0 9.3328e-05 2.7 1.52e+03 1.0 0.0e+00 0.0e+00 1.1e+01  1 21  0  0 10   2 35  0  0 21    65
PCSetUp                2 1.0 1.6209e-04 1.1 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   5  0  0  0  0     0
PCSetUpOnBlocks        1 1.0 5.6611e-05 1.2 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   2  0  0  0  0     1
PCApply               12 1.0 4.0174e-05 1.4 3.12e+02 1.2 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  7  0  0  0    29

--- Event Stage 2: Second Solve

BuildTwoSided          1 1.0 1.9497e-05 2.0 0.00e+00 0.0 8.0e+00 4.0e+00 1.0e+00  0  0  2  0  1   2  0  4  0  3     0
BuildTwoSidedF         1 1.0 2.1054e-05 1.9 0.00e+00 0.0 1.6e+01 6.6e+01 1.0e+00  0  0  3  8  1   2  0  8 16  3     0
MatMult               18 1.0 1.1072e-04 1.7 8.28e+02 1.1 1.8e+02 3.0e+01 0.0e+00  2 11 38 39  0  14 27 89 84  0    28
MatSolve               9 1.0 1.8810e-06 1.5 2.34e+02 1.2 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  8  0  0  0   459
MatLUFactorNum         1 1.0 1.3380e-06 1.4 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0    58
MatAssemblyBegin       1 1.0 3.0984e-05 1.6 0.00e+00 0.0 1.6e+01 6.6e+01 1.0e+00  1  0  3  8  1   4  0  8 16  3     0
MatAssemblyEnd         1 1.0 2.4243e-05 1.1 1.60e+01 1.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  2   4  1  0  0  6     3
MatZeroEntries         1 1.0 2.1270e-06 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                3 3.0 5.9168e-05 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  1  0  0  0  1   5  0  0  0  3     0
VecMDot                8 1.0 4.9608e-05 3.0 3.96e+02 1.0 0.0e+00 0.0e+00 8.0e+00  1  5  0  0  8   4 14  0  0 25    32
VecNorm               20 1.0 5.5958e-05 1.1 2.40e+02 1.0 0.0e+00 0.0e+00 2.0e+01  1  3  0  0 19   8  8  0  0 62    17
VecScale               9 1.0 2.6620e-06 1.6 5.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    81
VecCopy               10 1.0 3.7840e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecSet                19 1.0 2.4540e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY               10 1.0 2.1130e-06 1.2 1.20e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  4  0  0  0   227
VecAYPX                9 1.0 1.3480e-06 1.5 5.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0   160
VecMAXPY              17 1.0 3.6180e-06 1.3 9.60e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0 13  0  0  0   1 34  0  0  0  1061
VecScatterBegin       18 1.0 2.5301e-05 1.3 0.00e+00 0.0 1.8e+02 3.0e+01 0.0e+00  1  0 38 39  0   4  0 89 84  0     0
VecScatterEnd         18 1.0 7.8795e-05 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   9  0  0  0  0     0
VecNormalize           9 1.0 2.5470e-05 1.3 1.62e+02 1.0 0.0e+00 0.0e+00 9.0e+00  1  2  0  0  9   4  6  0  0 28    25
SFPack                18 1.0 3.6420e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
SFUnpack              18 1.0 1.5920e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               2 1.0 2.6500e-07 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 2.8180e-04 1.0 2.78e+03 1.0 1.7e+02 3.0e+01 2.7e+01  7 38 36 37 26  44 96 84 79 84    39
KSPGMRESOrthog         8 1.0 5.5511e-05 2.6 8.28e+02 1.0 0.0e+00 0.0e+00 8.0e+00  1 11  0  0  8   5 29  0  0 25    60
PCSetUp                2 1.0 4.3870e-06 1.2 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  1  0  0  0    18
PCSetUpOnBlocks        1 1.0 3.4260e-06 1.4 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0    23
PCApply                9 1.0 2.2834e-05 1.5 2.34e+02 1.2 0.0e+00 0.0e+00 0.0e+00  1  3  0  0  0   3  8  0  0  0    38
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Viewer     1              1          856     0.

--- Event Stage 1: Original Solve

              Matrix     5              1         3376     0.
              Vector    47             26        47584     0.
           Index Set     5              2         1844     0.
   Star Forest Graph     3              0            0     0.
       Krylov Solver     2              0            0     0.
      Preconditioner     2              0            0     0.
              Viewer     2              1          856     0.
    Distributed Mesh     1              0            0     0.
     Discrete System     1              0            0     0.
           Weak Form     1              0            0     0.

--- Event Stage 2: Second Solve

              Matrix     0              4        14732     0.
              Vector    18             39        70976     0.
           Index Set     0              3         2808     0.
   Star Forest Graph     0              3         3408     0.
       Krylov Solver     0              2        20534     0.
      Preconditioner     0              2         1968     0.
              Viewer     1              1          856     0.
    Distributed Mesh     0              1         5080     0.
     Discrete System     0              1          976     0.
           Weak Form     0              1          632     0.
========================================================================================================================
Average time to get PetscTime(): 2.95e-08
Average time for MPI_Barrier(): 1.2316e-06
Average time for zero size MPI_Send(): 6.7525e-07
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_view
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-python --prefix=/data/raid1/tmp/petsc/install --with-debugging=no --with-blas-lib=/usr/lib/libblas.so --with-lapack-lib=/usr/lib/liblapack.so --with-openmp=true --with-mpi=true --download-openmpi=yes
-----------------------------------------
Libraries compiled on 2022-05-31 10:31:52 on kitesrv 
Machine characteristics: Linux-4.4.0-116-generic-x86_64-with-Ubuntu-16.04-xenial
Using PETSc directory: /data/raid1/tmp/petsc/install
Using PETSc arch: 
-----------------------------------------

Using C compiler: /data/raid1/tmp/petsc/install/bin/mpicc  -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g -O -fopenmp   
Using Fortran compiler: /data/raid1/tmp/petsc/install/bin/mpif90  -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O -fopenmp    
-----------------------------------------

Using include paths: -I/data/raid1/tmp/petsc/install/include
-----------------------------------------

Using C linker: /data/raid1/tmp/petsc/install/bin/mpicc
Using Fortran linker: /data/raid1/tmp/petsc/install/bin/mpif90
Using libraries: -Wl,-rpath,/data/raid1/tmp/petsc/install/lib -L/data/raid1/tmp/petsc/install/lib -lpetsc -Wl,-rpath,/data/raid1/tmp/petsc/install/lib -L/data/raid1/tmp/petsc/install/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 /usr/lib/liblapack.so /usr/lib/libblas.so -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl
-----------------------------------------

0.18user 0.03system 0:00.12elapsed 177%CPU (0avgtext+0avgdata 20060maxresident)k
0inputs+0outputs (0major+9903minor)pagefaults 0swaps
user@kitesrv:/data/raid1/tmp/petsc/src/ksp/ksp/tutorials$ 
-------------- next part --------------
user@kitesrv:/data/raid1/tmp/petsc/src/ksp/ksp/tutorials$ LD_LIBRARY_PATH=/data/raid1/tmp/petsc/install/lib time /data/raid1/tmp/petsc/install/bin/mpirun -n 2 ./ex5 -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view
  0 KSP preconditioned resid norm 2.540548908415e+02 true resid norm 5.690404203569e+02 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP preconditioned resid norm 5.524657937376e+01 true resid norm 1.536964982563e+02 ||r(i)||/||b|| 2.700976815669e-01
  2 KSP preconditioned resid norm 1.904775180107e+01 true resid norm 6.861661839730e+01 ||r(i)||/||b|| 1.205830305592e-01
  3 KSP preconditioned resid norm 4.708471233594e+00 true resid norm 2.115510975962e+01 ||r(i)||/||b|| 3.717681381290e-02
  4 KSP preconditioned resid norm 1.055333034486e+00 true resid norm 4.779687437415e+00 ||r(i)||/||b|| 8.399556984752e-03
  5 KSP preconditioned resid norm 5.287930275880e-02 true resid norm 2.395448983884e-01 ||r(i)||/||b|| 4.209628873783e-04
  6 KSP preconditioned resid norm 6.852115363810e-03 true resid norm 2.760282135279e-02 ||r(i)||/||b|| 4.850766371829e-05
  7 KSP preconditioned resid norm 8.937861499755e-04 true resid norm 3.024963693385e-03 ||r(i)||/||b|| 5.315903027570e-06
Linear solve converged due to CONVERGED_RTOL iterations 7
KSP Object: 2 MPI processes
  type: gmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 2 MPI processes
  type: bjacobi
    number of blocks = 2
    Local solver information for first block is in the following KSP and PC objects on rank 0:
    Use -ksp_view ::ascii_info_detail to display information for all blocks
  KSP Object: (sub_) 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (sub_) 1 MPI processes
    type: ilu
      out-of-place factorization
      0 levels of fill
      tolerance for zero pivot 2.22045e-14
      matrix ordering: natural
      factor fill ratio given 1., needed 1.
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=6, cols=6
            package used to perform factorization: petsc
            total: nonzeros=18, allocated nonzeros=18
              not using I-node routines
    linear system matrix = precond matrix:
    Mat Object: (sub_) 1 MPI processes
      type: seqaij
      rows=6, cols=6
      total: nonzeros=18, allocated nonzeros=30
      total number of mallocs used during MatSetValues calls=0
        not using I-node routines
  linear system matrix = precond matrix:
  Mat Object: 2 MPI processes
    type: mpiaij
    rows=12, cols=12
    total: nonzeros=46, allocated nonzeros=120
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
Norm of error 0.00121238, Iterations 7
  0 KSP preconditioned resid norm 2.502924928760e+02 true resid norm 1.031693268370e+03 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP preconditioned resid norm 4.655858214018e+01 true resid norm 2.232934339942e+02 ||r(i)||/||b|| 2.164339352015e-01
  2 KSP preconditioned resid norm 7.917304062742e+00 true resid norm 4.856470565410e+01 ||r(i)||/||b|| 4.707281431702e-02
  3 KSP preconditioned resid norm 9.318670651891e-01 true resid norm 6.270700755466e+00 ||r(i)||/||b|| 6.078066948496e-03
  4 KSP preconditioned resid norm 1.408588814076e-01 true resid norm 8.958183659370e-01 ||r(i)||/||b|| 8.682991286279e-04
  5 KSP preconditioned resid norm 4.306995949139e-03 true resid norm 2.763097772714e-02 ||r(i)||/||b|| 2.678216343390e-05
  6 KSP preconditioned resid norm 3.435999542630e-04 true resid norm 2.406928714402e-03 ||r(i)||/||b|| 2.332988678122e-06
Linear solve converged due to CONVERGED_RTOL iterations 6
KSP Object: 2 MPI processes
  type: gmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 2 MPI processes
  type: bjacobi
    number of blocks = 2
    Local solver information for first block is in the following KSP and PC objects on rank 0:
    Use -ksp_view ::ascii_info_detail to display information for all blocks
  KSP Object: (sub_) 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (sub_) 1 MPI processes
    type: ilu
      out-of-place factorization
      0 levels of fill
      tolerance for zero pivot 2.22045e-14
      matrix ordering: natural
      factor fill ratio given 1., needed 1.
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=6, cols=6
            package used to perform factorization: petsc
            total: nonzeros=18, allocated nonzeros=18
              not using I-node routines
    linear system matrix = precond matrix:
    Mat Object: (sub_) 1 MPI processes
      type: seqaij
      rows=6, cols=6
      total: nonzeros=18, allocated nonzeros=30
      total number of mallocs used during MatSetValues calls=0
        not using I-node routines
  linear system matrix = precond matrix:
  Mat Object: 2 MPI processes
    type: mpiaij
    rows=12, cols=12
    total: nonzeros=46, allocated nonzeros=120
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
Norm of error 0.000322889, Iterations 6
**************************************** ***********************************************************************************************************************
***                                WIDEN YOUR WINDOW TO 160 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document                                 ***
****************************************************************************************************************************************************************

------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------

./ex5 on a  named kitesrv with 2 processors, by user Tue May 31 13:42:24 2022
Using 3 OpenMP threads
Using Petsc Release Version 3.17.1, unknown 

                         Max       Max/Min     Avg       Total
Time (sec):           3.459e-03     1.000   3.459e-03
Objects:              7.700e+01     1.000   7.700e+01
Flops:                4.400e+03     1.002   4.396e+03  8.792e+03
Flops/sec:            1.272e+06     1.002   1.271e+06  2.542e+06
MPI Msg Count:        3.600e+01     1.000   3.600e+01  7.200e+01
MPI Msg Len (bytes):  1.104e+03     1.000   3.067e+01  2.208e+03
MPI Reductions:       8.700e+01     1.000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
 0:      Main Stage: 1.2309e-04   3.6%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  2.000e+00   2.3%
 1:  Original Solve: 2.7587e-03  79.8%  4.7880e+03  54.5%  3.800e+01  52.8%  2.821e+01       48.6%  4.100e+01  47.1%
 2:    Second Solve: 5.7332e-04  16.6%  4.0040e+03  45.5%  3.400e+01  47.2%  3.341e+01       51.4%  2.600e+01  29.9%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: Original Solve

BuildTwoSided          3 1.0 2.1347e-05 1.0 0.00e+00 0.0 2.0e+00 4.0e+00 3.0e+00  1  0  3  0  3   1  0  5  1  7     0
BuildTwoSidedF         2 1.0 1.9346e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  2   1  0  0  0  5     0
MatMult               16 1.0 1.5458e-04 1.8 6.40e+02 1.0 3.6e+01 3.0e+01 1.0e+00  4 15 50 48  1   4 27 95 99  2     8
MatSolve               8 1.0 2.2670e-06 1.5 2.40e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   0 10  0  0  0   212
MatLUFactorNum         1 1.0 2.4620e-06 1.3 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0    24
MatILUFactorSym        1 1.0 1.1469e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       1 1.0 1.7848e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  1   1  0  0  0  2     0
MatAssemblyEnd         1 1.0 1.5707e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00  5  0  0  0  6   6  0  0  0 12     0
MatGetRowIJ            1 1.0 3.0100e-07 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 8.4270e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                3 3.0 9.6613e-05 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  2  0  0  0  1   2  0  0  0  2     0
VecMDot                7 1.0 3.1623e-05 2.2 3.08e+02 1.0 0.0e+00 0.0e+00 7.0e+00  1  7  0  0  8   1 13  0  0 17    19
VecNorm               18 1.0 2.5209e-04 1.0 2.16e+02 1.0 0.0e+00 0.0e+00 1.8e+01  7  5  0  0 21   9  9  0  0 44     2
VecScale               8 1.0 2.9390e-06 1.0 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    33
VecCopy                9 1.0 2.5980e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                17 1.0 3.0360e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                9 1.0 5.6680e-06 1.2 1.08e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  5  0  0  0    38
VecAYPX                8 1.0 1.1690e-06 1.1 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    82
VecMAXPY              15 1.0 2.9050e-06 1.0 7.56e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0 17  0  0  0   0 32  0  0  0   520
VecAssemblyBegin       1 1.0 1.5341e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  1   1  0  0  0  2     0
VecAssemblyEnd         1 1.0 9.4300e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin       16 1.0 5.1321e-05 1.0 0.00e+00 0.0 3.6e+01 3.0e+01 1.0e+00  1  0 50 48  1   2  0 95 99  2     0
VecScatterEnd         16 1.0 9.3558e-05 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
VecNormalize           8 1.0 2.0754e-05 1.0 1.44e+02 1.0 0.0e+00 0.0e+00 8.0e+00  1  3  0  0  9   1  6  0  0 20    14
SFSetGraph             1 1.0 1.0930e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetUp                1 1.0 2.0803e-05 1.0 0.00e+00 0.0 4.0e+00 1.0e+01 1.0e+00  1  0  6  2  1   1  0 11  4  2     0
SFPack                16 1.0 2.1140e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFUnpack              16 1.0 1.5950e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               2 1.0 2.2458e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  2   1  0  0  0  5     0
KSPSolve               1 1.0 9.9155e-04 1.0 2.30e+03 1.0 3.0e+01 3.2e+01 2.4e+01 29 52 42 43 28  36 96 79 90 59     5
KSPGMRESOrthog         7 1.0 3.8334e-05 1.9 6.44e+02 1.0 0.0e+00 0.0e+00 7.0e+00  1 15  0  0  8   1 27  0  0 17    34
PCSetUp                2 1.0 1.1958e-04 1.0 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00  3  1  0  0  0   4  1  0  0  0     1
PCSetUpOnBlocks        1 1.0 5.4691e-05 1.0 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00  2  1  0  0  0   2  1  0  0  0     1
PCApply                8 1.0 5.9135e-05 1.5 2.40e+02 1.0 0.0e+00 0.0e+00 0.0e+00  1  5  0  0  0   2 10  0  0  0     8

--- Event Stage 2: Second Solve

BuildTwoSided          1 1.0 4.5420e-06 1.3 0.00e+00 0.0 2.0e+00 4.0e+00 1.0e+00  0  0  3  0  1   1  0  6  1  4     0
BuildTwoSidedF         1 1.0 5.9960e-06 1.2 0.00e+00 0.0 4.0e+00 5.8e+01 1.0e+00  0  0  6 11  1   1  0 12 20  4     0
MatMult               14 1.0 9.5330e-05 1.8 5.60e+02 1.0 2.8e+01 3.2e+01 0.0e+00  2 13 39 41  0  13 28 82 79  0    12
MatSolve               7 1.0 1.4380e-06 1.3 2.10e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   0 10  0  0  0   292
MatLUFactorNum         1 1.0 1.4460e-06 1.0 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0    41
MatAssemblyBegin       1 1.0 1.4872e-05 1.1 0.00e+00 0.0 4.0e+00 5.8e+01 1.0e+00  0  0  6 11  1   2  0 12 20  4     0
MatAssemblyEnd         1 1.0 1.9540e-05 1.0 7.00e+00 1.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  2   3  0  0  0  8     1
MatZeroEntries         1 1.0 2.3710e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                3 3.0 7.2415e-05 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  1  0  0  0  1   8  0  0  0  4     0
VecMDot                6 1.0 1.1563e-05 1.3 2.31e+02 1.0 0.0e+00 0.0e+00 6.0e+00  0  5  0  0  7   2 12  0  0 23    40
VecNorm               16 1.0 3.3041e-05 2.4 1.92e+02 1.0 0.0e+00 0.0e+00 1.6e+01  1  4  0  0 18   4 10  0  0 62    12
VecScale               7 1.0 1.4170e-06 1.1 4.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    59
VecCopy                8 1.0 2.3300e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                15 1.0 1.9460e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                8 1.0 1.4530e-06 1.4 9.60e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  5  0  0  0   132
VecAYPX                7 1.0 1.0070e-06 1.2 4.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    83
VecMAXPY              13 1.0 2.4410e-06 1.2 5.76e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0 13  0  0  0   0 29  0  0  0   472
VecScatterBegin       14 1.0 1.5003e-05 1.1 0.00e+00 0.0 2.8e+01 3.2e+01 0.0e+00  0  0 39 41  0   2  0 82 79  0     0
VecScatterEnd         14 1.0 7.5876e-05 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   9  0  0  0  0     0
VecNormalize           7 1.0 1.0780e-05 1.2 1.26e+02 1.0 0.0e+00 0.0e+00 7.0e+00  0  3  0  0  8   2  6  0  0 27    23
SFPack                14 1.0 1.7980e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFUnpack              14 1.0 1.3100e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               2 1.0 2.2500e-07 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 2.0139e-04 1.0 1.88e+03 1.0 2.6e+01 3.2e+01 2.1e+01  6 43 36 38 24  34 94 76 73 81    19
KSPGMRESOrthog         6 1.0 1.4939e-05 1.2 4.83e+02 1.0 0.0e+00 0.0e+00 6.0e+00  0 11  0  0  7   2 24  0  0 23    65
PCSetUp                2 1.0 5.7350e-06 1.2 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   1  1  0  0  0    10
PCSetUpOnBlocks        1 1.0 3.6650e-06 1.1 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   1  1  0  0  0    16
PCApply                7 1.0 1.4785e-05 1.2 2.10e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   2 10  0  0  0    28
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Viewer     1              1          856     0.

--- Event Stage 1: Original Solve

              Matrix     5              1         3376     0.
              Vector    39             18        32928     0.
           Index Set     5              2         1832     0.
   Star Forest Graph     3              0            0     0.
       Krylov Solver     2              0            0     0.
      Preconditioner     2              0            0     0.
              Viewer     2              1          856     0.
    Distributed Mesh     1              0            0     0.
     Discrete System     1              0            0     0.
           Weak Form     1              0            0     0.

--- Event Stage 2: Second Solve

              Matrix     0              4        14744     0.
              Vector    14             35        63624     0.
           Index Set     0              3         2808     0.
   Star Forest Graph     0              3         3408     0.
       Krylov Solver     0              2        20534     0.
      Preconditioner     0              2         1968     0.
              Viewer     1              1          856     0.
    Distributed Mesh     0              1         5080     0.
     Discrete System     0              1          976     0.
           Weak Form     0              1          632     0.
========================================================================================================================
Average time to get PetscTime(): 2.31e-08
Average time for MPI_Barrier(): 3.43e-07
Average time for zero size MPI_Send(): 5.015e-07
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_view
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-python --prefix=/data/raid1/tmp/petsc/install --with-debugging=no --with-blas-lib=/usr/lib/libblas.so --with-lapack-lib=/usr/lib/liblapack.so --with-openmp=true --with-mpi=true --download-openmpi=yes
-----------------------------------------
Libraries compiled on 2022-05-31 10:31:52 on kitesrv 
Machine characteristics: Linux-4.4.0-116-generic-x86_64-with-Ubuntu-16.04-xenial
Using PETSc directory: /data/raid1/tmp/petsc/install
Using PETSc arch: 
-----------------------------------------

Using C compiler: /data/raid1/tmp/petsc/install/bin/mpicc  -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g -O -fopenmp   
Using Fortran compiler: /data/raid1/tmp/petsc/install/bin/mpif90  -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O -fopenmp    
-----------------------------------------

Using include paths: -I/data/raid1/tmp/petsc/install/include
-----------------------------------------

Using C linker: /data/raid1/tmp/petsc/install/bin/mpicc
Using Fortran linker: /data/raid1/tmp/petsc/install/bin/mpif90
Using libraries: -Wl,-rpath,/data/raid1/tmp/petsc/install/lib -L/data/raid1/tmp/petsc/install/lib -lpetsc -Wl,-rpath,/data/raid1/tmp/petsc/install/lib -L/data/raid1/tmp/petsc/install/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 /usr/lib/liblapack.so /usr/lib/libblas.so -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl
-----------------------------------------

0.07user 0.01system 0:00.09elapsed 92%CPU (0avgtext+0avgdata 19980maxresident)k
0inputs+0outputs (0major+6673minor)pagefaults 0swaps
user@kitesrv:/data/raid1/tmp/petsc/src/ksp/ksp/tutorials$ 
-------------- next part --------------
[lida@head1 tutorials]$ time mpirun -n 3 ./ex5 -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view 2>/dev/null
  0 KSP preconditioned resid norm 4.020939481591e+02 true resid norm 9.763918270858e+02 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP preconditioned resid norm 1.222668532006e+02 true resid norm 2.875957243877e+02 ||r(i)||/||b|| 2.945494999135e-01
  2 KSP preconditioned resid norm 6.915159267765e+01 true resid norm 1.565583438606e+02 ||r(i)||/||b|| 1.603437672434e-01
  3 KSP preconditioned resid norm 1.304338629042e+01 true resid norm 4.745383154681e+01 ||r(i)||/||b|| 4.860121749323e-02
  4 KSP preconditioned resid norm 4.706344725891e+00 true resid norm 1.682392035308e+01 ||r(i)||/||b|| 1.723070583589e-02
  5 KSP preconditioned resid norm 5.039609363554e-01 true resid norm 2.504548411461e+00 ||r(i)||/||b|| 2.565105874489e-03
  6 KSP preconditioned resid norm 1.055361110378e-01 true resid norm 5.260792846119e-01 ||r(i)||/||b|| 5.387993529013e-04
  7 KSP preconditioned resid norm 1.234936672024e-02 true resid norm 3.467184429455e-02 ||r(i)||/||b|| 3.551017463761e-05
  8 KSP preconditioned resid norm 6.180717049285e-03 true resid norm 1.824456980445e-02 ||r(i)||/||b|| 1.868570516296e-05
  9 KSP preconditioned resid norm 1.482144308741e-04 true resid norm 6.740391026576e-04 ||r(i)||/||b|| 6.903366906188e-07
Linear solve converged due to CONVERGED_RTOL iterations 9
KSP Object: 3 MPI processes
  type: gmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 3 MPI processes
  type: bjacobi
    number of blocks = 3
    Local solver information for first block is in the following KSP and PC objects on rank 0:
    Use -ksp_view ::ascii_info_detail to display information for all blocks
  KSP Object: (sub_) 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (sub_) 1 MPI processes
    type: ilu
      out-of-place factorization
      0 levels of fill
      tolerance for zero pivot 2.22045e-14
      matrix ordering: natural
      factor fill ratio given 1., needed 1.
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=6, cols=6
            package used to perform factorization: petsc
            total: nonzeros=16, allocated nonzeros=16
              not using I-node routines
    linear system matrix = precond matrix:
    Mat Object: (sub_) 1 MPI processes
      type: seqaij
      rows=6, cols=6
      total: nonzeros=16, allocated nonzeros=30
      total number of mallocs used during MatSetValues calls=0
        not using I-node routines
  linear system matrix = precond matrix:
  Mat Object: 3 MPI processes
    type: mpiaij
    rows=18, cols=18
    total: nonzeros=72, allocated nonzeros=180
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
Norm of error 0.000208575, Iterations 9
  0 KSP preconditioned resid norm 4.608512576274e+02 true resid norm 2.016631101615e+03 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP preconditioned resid norm 9.412072370874e+01 true resid norm 4.038990960334e+02 ||r(i)||/||b|| 2.002840756101e-01
  2 KSP preconditioned resid norm 2.282248495386e+01 true resid norm 9.733980188154e+01 ||r(i)||/||b|| 4.826852159703e-02
  3 KSP preconditioned resid norm 1.365448582262e+00 true resid norm 7.954929613540e+00 ||r(i)||/||b|| 3.944662763145e-03
  4 KSP preconditioned resid norm 2.252869372987e-01 true resid norm 1.285361707036e+00 ||r(i)||/||b|| 6.373806820723e-04
  5 KSP preconditioned resid norm 1.586897237676e-02 true resid norm 1.098248593144e-01 ||r(i)||/||b|| 5.445956834962e-05
  6 KSP preconditioned resid norm 1.905899805612e-03 true resid norm 1.346481495832e-02 ||r(i)||/||b|| 6.676885498560e-06
Linear solve converged due to CONVERGED_RTOL iterations 6
KSP Object: 3 MPI processes
  type: gmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 3 MPI processes
  type: bjacobi
    number of blocks = 3
    Local solver information for first block is in the following KSP and PC objects on rank 0:
    Use -ksp_view ::ascii_info_detail to display information for all blocks
  KSP Object: (sub_) 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (sub_) 1 MPI processes
    type: ilu
      out-of-place factorization
      0 levels of fill
      tolerance for zero pivot 2.22045e-14
      matrix ordering: natural
      factor fill ratio given 1., needed 1.
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=6, cols=6
            package used to perform factorization: petsc
            total: nonzeros=16, allocated nonzeros=16
              not using I-node routines
    linear system matrix = precond matrix:
    Mat Object: (sub_) 1 MPI processes
      type: seqaij
      rows=6, cols=6
      total: nonzeros=16, allocated nonzeros=30
      total number of mallocs used during MatSetValues calls=0
        not using I-node routines
  linear system matrix = precond matrix:
  Mat Object: 3 MPI processes
    type: mpiaij
    rows=18, cols=18
    total: nonzeros=72, allocated nonzeros=180
    total number of mallocs used during MatSetValues calls=0
      not using I-node (on process 0) routines
Norm of error 0.00172305, Iterations 6
**************************************** ***********************************************************************************************************************
***                                WIDEN YOUR WINDOW TO 160 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document                                 ***
****************************************************************************************************************************************************************

------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------

./ex5 on a  named head1.hpc with 3 processors, by lida Tue May 31 12:18:30 2022
Using Petsc Release Version 3.17.1, unknown 

                         Max       Max/Min     Avg       Total
Time (sec):           1.783e-02     1.000   1.783e-02
Objects:              8.100e+01     1.000   8.100e+01
Flops:                5.590e+03     1.080   5.314e+03  1.594e+04
Flops/sec:            3.136e+05     1.080   2.981e+05  8.942e+05
MPI Msg Count:        7.800e+01     1.857   5.467e+01  1.640e+02
MPI Msg Len (bytes):  3.808e+03     1.827   4.868e+01  7.984e+03
MPI Reductions:       9.300e+01     1.000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
 0:      Main Stage: 2.8862e-04   1.6%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  2.000e+00   2.2%
 1:  Original Solve: 1.3756e-02  77.2%  9.9600e+03  62.5%  9.200e+01  56.1%  4.435e+01       51.1%  4.700e+01  50.5%
 2:    Second Solve: 3.7698e-03  21.1%  5.9820e+03  37.5%  7.200e+01  43.9%  5.422e+01       48.9%  2.600e+01  28.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: Original Solve

BuildTwoSided          3 1.0 1.1443e-04 1.5 0.00e+00 0.0 4.0e+00 8.0e+00 3.0e+00  1  0  2  0  3   1  0  4  1  6     0
BuildTwoSidedF         2 1.0 7.2485e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  2   0  0  0  0  4     0
MatMult               20 1.0 5.2744e-0311.5 1.00e+03 1.3 8.8e+01 4.6e+01 1.0e+00 12 16 54 51  1  15 25 96100  2     0
MatSolve              10 1.0 9.1130e-06 1.2 2.60e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   0  8  0  0  0    86
MatLUFactorNum         1 1.0 2.2792e-05 2.1 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0     3
MatILUFactorSym        1 1.0 3.5026e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       1 1.0 7.2907e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  1   0  0  0  0  2     0
MatAssemblyEnd         1 1.0 5.0211e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00  3  0  0  0  5   4  0  0  0 11     0
MatGetRowIJ            1 1.0 4.2003e-07 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 2.0961e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                3 3.0 5.7218e-04 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  2  0  0  0  1   3  0  0  0  2     0
VecMDot                9 1.0 4.9467e-0372.0 4.95e+02 1.0 0.0e+00 0.0e+00 9.0e+00 10  9  0  0 10  12 15  0  0 19     0
VecNorm               22 1.0 4.1558e-04 2.1 2.64e+02 1.0 0.0e+00 0.0e+00 2.2e+01  2  5  0  0 24   2  8  0  0 47     2
VecScale              10 1.0 1.0288e-05 1.1 6.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    17
VecCopy               11 1.0 8.5477e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                21 1.0 1.1027e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY               11 1.0 7.3556e-06 1.3 1.32e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  4  0  0  0    54
VecAYPX               10 1.0 4.7944e-06 1.1 6.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    38
VecMAXPY              19 1.0 8.6166e-06 1.2 1.19e+03 1.0 0.0e+00 0.0e+00 0.0e+00  0 22  0  0  0   0 36  0  0  0   414
VecAssemblyBegin       1 1.0 3.5299e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  1   0  0  0  0  2     0
VecAssemblyEnd         1 1.0 1.8599e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin       20 1.0 2.5115e-04 1.1 0.00e+00 0.0 8.8e+01 4.6e+01 1.0e+00  1  0 54 51  1   2  0 96100  2     0
VecScatterEnd         20 1.0 4.9759e-0325.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10  0  0  0  0  13  0  0  0  0     0
VecNormalize          10 1.0 1.1400e-04 1.2 1.80e+02 1.0 0.0e+00 0.0e+00 1.0e+01  1  3  0  0 11   1  5  0  0 21     5
SFSetGraph             1 1.0 8.3325e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetUp                1 1.0 1.1758e-04 1.2 0.00e+00 0.0 8.0e+00 2.8e+01 1.0e+00  1  0  5  3  1   1  0  9  5  2     0
SFPack                20 1.0 1.2662e-05 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFUnpack              20 1.0 4.5374e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               2 1.0 2.3160e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  2   1  0  0  0  4     0
KSPSolve               1 1.0 7.0816e-03 1.0 3.38e+03 1.1 7.6e+01 4.8e+01 3.0e+01 39 61 46 46 32  51 97 83 89 64     1
KSPGMRESOrthog         9 1.0 4.9614e-0358.1 1.04e+03 1.0 0.0e+00 0.0e+00 9.0e+00 10 19  0  0 10  12 31  0  0 19     1
PCSetUp                2 1.0 3.2492e-04 1.1 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  1  0  0  0     0
PCSetUpOnBlocks        1 1.0 1.5849e-04 1.0 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  1  0  0  0     0
PCApply               10 1.0 7.6554e-05 1.4 2.60e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   0  8  0  0  0    10

--- Event Stage 2: Second Solve

BuildTwoSided          1 1.0 5.5462e-05 2.6 0.00e+00 0.0 6.0e+00 8.0e+00 1.0e+00  0  0  4  1  1   1  0  8  1  4     0
BuildTwoSidedF         1 1.0 6.0628e-05 2.3 0.00e+00 0.0 1.2e+01 1.0e+02 1.0e+00  0  0  7 15  1   1  0 17 31  4     0
MatMult               14 1.0 9.3008e-04 6.9 7.00e+02 1.3 5.6e+01 4.8e+01 0.0e+00  2 11 34 34  0  11 29 78 69  0     2
MatSolve               7 1.0 3.2000e-06 1.2 1.82e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  9  0  0  0   171
MatLUFactorNum         1 1.0 2.4736e-06 1.2 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0    25
MatAssemblyBegin       1 1.0 8.0139e-05 1.9 0.00e+00 0.0 1.2e+01 1.0e+02 1.0e+00  0  0  7 15  1   2  0 17 31  4     0
MatAssemblyEnd         1 1.0 3.1728e-05 1.1 1.80e+01 1.2 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  2   1  1  0  0  8     2
MatZeroEntries         1 1.0 4.5244e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                3 3.0 4.9627e-04 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  2  0  0  0  1   9  0  0  0  4     0
VecMDot                6 1.0 8.6141e-0420.9 2.31e+02 1.0 0.0e+00 0.0e+00 6.0e+00  2  4  0  0  6   8 12  0  0 23     1
VecNorm               16 1.0 3.6575e-04 3.1 1.92e+02 1.0 0.0e+00 0.0e+00 1.6e+01  2  4  0  0 17   7 10  0  0 62     2
VecScale               7 1.0 4.1863e-06 2.5 4.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    30
VecCopy                8 1.0 5.2368e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                15 1.0 5.1260e-06 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                8 1.0 4.3251e-06 2.1 9.60e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  5  0  0  0    67
VecAYPX                7 1.0 1.7947e-06 1.1 4.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0    70
VecMAXPY              13 1.0 5.2378e-06 1.5 5.76e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0 11  0  0  0   0 29  0  0  0   330
VecScatterBegin       14 1.0 4.4449e-05 1.6 0.00e+00 0.0 5.6e+01 4.8e+01 0.0e+00  0  0 34 34  0   1  0 78 69  0     0
VecScatterEnd         14 1.0 8.6884e-0411.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   9  0  0  0  0     0
VecNormalize           7 1.0 5.4329e-05 1.2 1.26e+02 1.0 0.0e+00 0.0e+00 7.0e+00  0  2  0  0  8   1  6  0  0 27     7
SFPack                14 1.0 5.7872e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFUnpack              14 1.0 2.3032e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               2 1.0 4.6380e-07 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 1.3922e-03 1.1 1.99e+03 1.1 5.2e+01 4.8e+01 2.1e+01  7 35 32 31 23  35 94 72 64 81     4
KSPGMRESOrthog         6 1.0 8.6742e-0417.4 4.83e+02 1.0 0.0e+00 0.0e+00 6.0e+00  2  9  0  0  6   9 24  0  0 23     2
PCSetUp                2 1.0 8.6613e-06 1.2 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0     7
PCSetUpOnBlocks        1 1.0 6.2576e-06 1.2 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0    10
PCApply                7 1.0 3.7187e-05 1.9 1.82e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   1  9  0  0  0    15
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Viewer     1              1          896     0.

--- Event Stage 1: Original Solve

              Matrix     5              1         3616     0.
              Vector    43             22        42896     0.
           Index Set     5              2         1952     0.
   Star Forest Graph     3              0            0     0.
       Krylov Solver     2              0            0     0.
      Preconditioner     2              0            0     0.
              Viewer     2              1          896     0.
    Distributed Mesh     1              0            0     0.
     Discrete System     1              0            0     0.
           Weak Form     1              0            0     0.

--- Event Stage 2: Second Solve

              Matrix     0              4        16328     0.
              Vector    14             35        67840     0.
           Index Set     0              3         3000     0.
   Star Forest Graph     0              3         3688     0.
       Krylov Solver     0              2        20942     0.
      Preconditioner     0              2         2064     0.
              Viewer     1              1          896     0.
    Distributed Mesh     0              1         5128     0.
     Discrete System     0              1         1024     0.
           Weak Form     0              1          664     0.
========================================================================================================================
Average time to get PetscTime(): 3.89293e-08
Average time for MPI_Barrier(): 4.57484e-06
Average time for zero size MPI_Send(): 1.97627e-06
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_view
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with 64 bit PetscInt
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-python --prefix=/home/lida -with-mpi-dir=/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4 LDFLAGS="-L/home/lida/lib64 -L/home/lida/lib -L/home/lida/jdk/lib" CPPFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CXXFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" --with-debugging=no --with-64-bit-indices
-----------------------------------------
Libraries compiled on 2022-05-25 08:44:48 on head1.hpc 
Machine characteristics: Linux-3.10.0-1062.el7.x86_64-x86_64-with-centos-7.7.1908-Core
Using PETSc directory: /home/lida
Using PETSc arch: 
-----------------------------------------

Using C compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 -fPIC -I/home/lida/include -I/home/lida/jdk/include -march=native -O3  
Using Fortran compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90  -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O    -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 
-----------------------------------------

Using include paths: -I/home/lida/include -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/include
-----------------------------------------

Using C linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc
Using Fortran linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90
Using libraries: -Wl,-rpath,/home/lida/lib -L/home/lida/lib -lpetsc -Wl,-rpath,/home/lida/lib64 -L/home/lida/lib64 -Wl,-rpath,/home/lida/lib -L/home/lida/lib -Wl,-rpath,/home/lida/jdk/lib -L/home/lida/jdk/lib -Wl,-rpath,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -Wl,-rpath,/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -L/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -Wl,-rpath,/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -L/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib -lopenblas -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl
-----------------------------------------


real    0m2.268s
user    0m1.456s
sys     0m0.583s
[lida@head1 tutorials]$ 
-------------- next part --------------
[lida@head1 tutorials]$ time mpirun -n 1 ./ex5 -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view 2>/dev/null
  0 KSP preconditioned resid norm 6.889609116885e+00 true resid norm 1.664331697709e+01 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP preconditioned resid norm 3.735626684805e-01 true resid norm 1.277849582311e+00 ||r(i)||/||b|| 7.677854024348e-02
  2 KSP preconditioned resid norm 1.736201489518e-02 true resid norm 6.598680019925e-02 ||r(i)||/||b|| 3.964762570470e-03
  3 KSP preconditioned resid norm 6.503363094580e-04 true resid norm 3.819222199389e-03 ||r(i)||/||b|| 2.294748219147e-04
  4 KSP preconditioned resid norm 7.801321330590e-06 true resid norm 3.420262185151e-05 ||r(i)||/||b|| 2.055036378781e-06
Linear solve converged due to CONVERGED_RTOL iterations 4
KSP Object: 1 MPI processes
  type: gmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: icc
    out-of-place factorization
    0 levels of fill
    tolerance for zero pivot 2.22045e-14
    using Manteuffel shift [POSITIVE_DEFINITE]
    matrix ordering: natural
    factor fill ratio given 1., needed 1.
      Factored matrix follows:
        Mat Object: 1 MPI processes
          type: seqsbaij
          rows=6, cols=6
          package used to perform factorization: petsc
          total: nonzeros=13, allocated nonzeros=13
              block size is 1
  linear system matrix = precond matrix:
  Mat Object: 1 MPI processes
    type: seqaij
    rows=6, cols=6
    total: nonzeros=20, allocated nonzeros=30
    total number of mallocs used during MatSetValues calls=0
      not using I-node routines
Norm of error 7.36852e-06, Iterations 4
  0 KSP preconditioned resid norm 7.247863732790e+00 true resid norm 3.100000000000e+01 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP preconditioned resid norm 1.273569592250e-01 true resid norm 7.519049129476e-01 ||r(i)||/||b|| 2.425499719186e-02
  2 KSP preconditioned resid norm 2.160197987659e-03 true resid norm 1.245148592634e-02 ||r(i)||/||b|| 4.016608363334e-04
  3 KSP preconditioned resid norm 3.358674359432e-05 true resid norm 2.551377456617e-04 ||r(i)||/||b|| 8.230249860056e-06
Linear solve converged due to CONVERGED_RTOL iterations 3
KSP Object: 1 MPI processes
  type: gmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: icc
    out-of-place factorization
    0 levels of fill
    tolerance for zero pivot 2.22045e-14
    using Manteuffel shift [POSITIVE_DEFINITE]
    matrix ordering: natural
    factor fill ratio given 1., needed 1.
      Factored matrix follows:
        Mat Object: 1 MPI processes
          type: seqsbaij
          rows=6, cols=6
          package used to perform factorization: petsc
          total: nonzeros=13, allocated nonzeros=13
              block size is 1
  linear system matrix = precond matrix:
  Mat Object: 1 MPI processes
    type: seqaij
    rows=6, cols=6
    total: nonzeros=20, allocated nonzeros=30
    total number of mallocs used during MatSetValues calls=0
      not using I-node routines
Norm of error 3.41929e-05, Iterations 3
**************************************** ***********************************************************************************************************************
***                                WIDEN YOUR WINDOW TO 160 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document                                 ***
****************************************************************************************************************************************************************

------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------

./ex5 on a  named head1.hpc with 1 processor, by lida Tue May 31 12:18:46 2022
Using Petsc Release Version 3.17.1, unknown 

                         Max       Max/Min     Avg       Total
Time (sec):           1.397e-02     1.000   1.397e-02
Objects:              5.200e+01     1.000   5.200e+01
Flops:                2.076e+03     1.000   2.076e+03  2.076e+03
Flops/sec:            1.486e+05     1.000   1.486e+05  1.486e+05
MPI Msg Count:        0.000e+00     0.000   0.000e+00  0.000e+00
MPI Msg Len (bytes):  0.000e+00     0.000   0.000e+00  0.000e+00
MPI Reductions:       0.000e+00     0.000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
 0:      Main Stage: 2.2292e-04   1.6%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
 1:  Original Solve: 9.2670e-03  66.3%  1.1780e+03  56.7%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
 2:    Second Solve: 4.4709e-03  32.0%  8.9800e+02  43.3%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: Original Solve

MatMult               10 1.0 1.5418e-05 1.0 3.40e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0 16  0  0  0   0 29  0  0  0    22
MatSolve               5 1.0 9.0795e-06 1.0 1.70e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  8  0  0  0   0 14  0  0  0    19
MatCholFctrNum         1 1.0 1.2116e-05 1.0 6.00e+00 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0     0
MatICCFactorSym        1 1.0 1.7540e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       1 1.0 2.3469e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         1 1.0 2.3169e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 1.0 1.6764e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 1.0921e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatView                2 1.0 2.4787e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18  0  0  0  0  27  0  0  0  0     0
VecMDot                4 1.0 6.2725e-06 1.0 1.10e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   0  9  0  0  0    18
VecNorm               12 1.0 1.2450e-05 1.0 1.32e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  6  0  0  0   0 11  0  0  0    11
VecScale               5 1.0 8.2748e-06 1.0 3.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  3  0  0  0     4
VecCopy                6 1.0 4.5085e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                 6 1.0 6.6161e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                6 1.0 4.7898e-06 1.0 7.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  6  0  0  0    15
VecAYPX                5 1.0 3.7868e-06 1.0 3.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  3  0  0  0     8
VecMAXPY               9 1.0 3.1283e-06 1.0 2.88e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0 14  0  0  0   0 24  0  0  0    92
VecAssemblyBegin       1 1.0 1.3318e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd         1 1.0 1.5367e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize           5 1.0 2.7158e-05 1.0 8.50e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  4  0  0  0   0  7  0  0  0     3
KSPSetUp               1 1.0 4.6344e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
KSPSolve               1 1.0 2.0673e-03 1.0 1.12e+03 1.0 0.0e+00 0.0e+00 0.0e+00 15 54  0  0  0  22 95  0  0  0     1
KSPGMRESOrthog         4 1.0 1.6174e-05 1.0 2.30e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0 11  0  0  0   0 20  0  0  0    14
PCSetUp                1 1.0 2.1121e-04 1.0 6.00e+00 1.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  1  0  0  0     0
PCApply                5 1.0 1.4246e-05 1.0 1.70e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  8  0  0  0   0 14  0  0  0    12

--- Event Stage 2: Second Solve

MatMult                8 1.0 3.8277e-06 1.0 2.72e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0 13  0  0  0   0 30  0  0  0    71
MatSolve               4 1.0 1.9213e-06 1.0 1.36e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  7  0  0  0   0 15  0  0  0    71
MatCholFctrNum         1 1.0 3.3919e-06 1.0 6.00e+00 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0     2
MatAssemblyBegin       1 1.0 1.1362e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         1 1.0 4.7777e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries         1 1.0 4.4834e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                2 1.0 2.4329e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17  0  0  0  0  54  0  0  0  0     0
VecMDot                3 1.0 1.1474e-06 1.0 6.60e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  7  0  0  0    58
VecNorm               10 1.0 2.7772e-06 1.0 1.10e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   0 12  0  0  0    40
VecScale               4 1.0 1.5302e-06 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  3  0  0  0    16
VecCopy                5 1.0 3.1414e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                 5 1.0 1.7434e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                5 1.0 2.1365e-06 1.0 6.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  7  0  0  0    28
VecAYPX                4 1.0 9.7509e-07 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  3  0  0  0    25
VecMAXPY               7 1.0 2.1346e-06 1.0 1.80e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  9  0  0  0   0 20  0  0  0    84
VecNormalize           4 1.0 5.5209e-06 1.0 6.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  8  0  0  0    12
KSPSetUp               1 1.0 4.0047e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 3.7580e-04 1.0 8.15e+02 1.0 0.0e+00 0.0e+00 0.0e+00  3 39  0  0  0   8 91  0  0  0     2
KSPGMRESOrthog         3 1.0 4.7795e-06 1.0 1.38e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  7  0  0  0   0 15  0  0  0    29
PCSetUp                1 1.0 6.9309e-06 1.0 6.00e+00 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0     1
PCApply                4 1.0 3.5232e-06 1.0 1.36e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  7  0  0  0   0 15  0  0  0    39
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Viewer     1              1          896     0.

--- Event Stage 1: Original Solve

              Matrix     3              1         3672     0.
              Vector    29             11        20064     0.
       Krylov Solver     1              0            0     0.
      Preconditioner     1              0            0     0.
              Viewer     1              0            0     0.
           Index Set     3              1         1000     0.
    Distributed Mesh     1              0            0     0.
   Star Forest Graph     2              0            0     0.
     Discrete System     1              0            0     0.
           Weak Form     1              0            0     0.

--- Event Stage 2: Second Solve

              Matrix     0              2         8224     0.
              Vector     8             26        47424     0.
       Krylov Solver     0              1        19198     0.
      Preconditioner     0              1         1048     0.
           Index Set     0              2         2000     0.
    Distributed Mesh     0              1         5128     0.
   Star Forest Graph     0              2         2352     0.
     Discrete System     0              1         1024     0.
           Weak Form     0              1          664     0.
========================================================================================================================
Average time to get PetscTime(): 3.68804e-08
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_view
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with 64 bit PetscInt
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-python --prefix=/home/lida -with-mpi-dir=/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4 LDFLAGS="-L/home/lida/lib64 -L/home/lida/lib -L/home/lida/jdk/lib" CPPFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CXXFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" --with-debugging=no --with-64-bit-indices
-----------------------------------------
Libraries compiled on 2022-05-25 08:44:48 on head1.hpc 
Machine characteristics: Linux-3.10.0-1062.el7.x86_64-x86_64-with-centos-7.7.1908-Core
Using PETSc directory: /home/lida
Using PETSc arch: 
-----------------------------------------

Using C compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 -fPIC -I/home/lida/include -I/home/lida/jdk/include -march=native -O3  
Using Fortran compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90  -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O    -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 
-----------------------------------------

Using include paths: -I/home/lida/include -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/include
-----------------------------------------

Using C linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc
Using Fortran linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90
Using libraries: -Wl,-rpath,/home/lida/lib -L/home/lida/lib -lpetsc -Wl,-rpath,/home/lida/lib64 -L/home/lida/lib64 -Wl,-rpath,/home/lida/lib -L/home/lida/lib -Wl,-rpath,/home/lida/jdk/lib -L/home/lida/jdk/lib -Wl,-rpath,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -Wl,-rpath,/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -L/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -Wl,-rpath,/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -L/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib -lopenblas -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl
-----------------------------------------


real    0m1.711s
user    0m0.368s
sys     0m0.259s
[lida@head1 tutorials]$ 


