[petsc-users] Sparse linear system solving
Lidia
lidia.varsh at mail.ioffe.ru
Tue May 31 06:34:15 CDT 2022
Matt, Mark, thank you very much for your answers!
We have now run example #5 both on our computer cluster and on the local
server, and again have not seen any performance increase; for reasons that
are unclear, the running times on the local server are much better than on
the cluster.
Next we will try to run the PETSc example #5 inside a Docker container on
our server to see whether the problem is in our environment. I will send
you the results of this test as soon as we have them.
The ksp_monitor outputs for example 5 in the current local server
configuration (for 2 and 4 MPI processes) and on the cluster (for 1 and
3 MPI processes) are attached.
And one more question. Potentially we can use 10 nodes with 96 threads on
each node of our cluster. In your opinion, which combination of MPI
processes and OpenMP threads would be best for example 5?
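For concreteness, since PETSc does not use threads by default (as Mark
notes below), we are currently leaning towards a pure-MPI run with
OMP_NUM_THREADS=1, along these lines; the process count 10 x 96 = 960 and
the GMRES/GAMG options are only an example of what we would try for our
own matrix:

mpirun -n 960 ./ex5 -ksp_type gmres -pc_type gamg -ksp_monitor_true_residual -ksp_converged_reason -log_view

Would that be a reasonable starting point, or is it better to leave some
cores on each node for OpenMP threads?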
Thank you!
Best,
Lidiia
On 31.05.2022 05:42, Mark Adams wrote:
> And if you see "NO" change in performance I suspect the solver/matrix
> is all on one processor.
> (PETSc does not use threads by default so threads should not change
> anything).
>
> As Matt said, it is best to start with a PETSc example that does
> something like what you want (parallel linear solve, see
> src/ksp/ksp/tutorials for examples), and then add your code to it.
> That way you get the basic infrastructure in place for you, which is
> pretty obscure to the uninitiated.
>
> Mark
>
> On Mon, May 30, 2022 at 10:18 PM Matthew Knepley <knepley at gmail.com>
> wrote:
>
> On Mon, May 30, 2022 at 10:12 PM Lidia <lidia.varsh at mail.ioffe.ru>
> wrote:
>
> Dear colleagues,
>
> Is there anyone here who has solved big sparse linear systems
> using PETSc?
>
>
> There are lots of publications with this kind of data. Here is one
> recent one: https://arxiv.org/abs/2204.01722
>
> We have found NO performance improvement when using more and more
> MPI processes (1-2-3) and OpenMP threads (from 1 to 72 threads).
> Has anyone faced this problem? Does anyone know any possible
> reasons for such behaviour?
>
>
> Solver behavior is dependent on the input matrix. The only
> general-purpose solvers
> are direct, but they do not scale linearly and have high memory
> requirements.
>
> Thus, in order to make progress you will have to be specific about
> your matrices.
>
> We use an AMG preconditioner and the GMRES solver from the KSP
> package, as our matrix is large (from 100 000 to 1e+6 rows and
> columns), sparse, non-symmetric, and includes both positive and
> negative values. But the performance problems also occur when
> using CG solvers with symmetric matrices.
>
>
> There are many PETSc examples, such as example 5 for the
> Laplacian, that exhibit
> good scaling with both AMG and GMG.
>
> Could anyone help us set appropriate options for the
> preconditioner and solver? Currently we use the default
> parameters; maybe they are not the best, but we do not know a
> good combination. Or maybe you could suggest other
> preconditioner+solver pairs for such tasks?
>
> I can provide more information: the matrices that we solve, the
> C++ script that runs the solve with PETSc, and any statistics
> obtained from our runs.
>
>
> First, please provide a description of the linear system, and the
> output of
>
> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view
>
> for each test case.
>
> Thanks,
>
> Matt
>
> Thank you in advance!
>
> Best regards,
> Lidiia Varshavchik,
> Ioffe Institute, St. Petersburg, Russia
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to
> which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>
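Also, following the advice above to start from a tutorial example in
src/ksp/ksp/tutorials and add our own code to it, below is roughly the
skeleton we are building our C++ driver around. This is only a minimal
sketch using the C API: the assembly of our actual matrix is omitted, and
the solver and preconditioner are left to be chosen at run time via
-ksp_type / -pc_type.

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, b;
  KSP      ksp;
  PetscInt n = 100000;   /* example global problem size */

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Parallel sparse matrix; each rank owns a contiguous block of rows */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));
  /* ... MatSetValues() for the locally owned rows goes here ... */
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  /* Right-hand side and solution vectors with a matching parallel layout */
  PetscCall(MatCreateVecs(A, &x, &b));
  /* ... fill b here (VecSet(), or VecSetValues() + vector assembly) ... */

  /* Krylov solver; type, preconditioner and tolerances are taken from
     the command line through KSPSetFromOptions() */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp));
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}

With KSPSetFromOptions() in place we can switch between GMRES and CG, try
AMG via -pc_type gamg, and change tolerances and monitors from the command
line without recompiling.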
-------------- next part --------------
user at kitesrv:/data/raid1/tmp/petsc/src/ksp/ksp/tutorials$ LD_LIBRARY_PATH=/data/raid1/tmp/petsc/install/lib time /data/raid1/tmp/petsc/install/bin/mpirun -n 4 ./ex5 -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view
0 KSP preconditioned resid norm 5.925618307774e+02 true resid norm 1.489143042155e+03 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 2.022444592869e+02 true resid norm 5.516335411837e+02 ||r(i)||/||b|| 3.704369060378e-01
2 KSP preconditioned resid norm 1.058407283487e+02 true resid norm 2.759478800294e+02 ||r(i)||/||b|| 1.853064965673e-01
3 KSP preconditioned resid norm 3.693395834711e+01 true resid norm 1.275254917089e+02 ||r(i)||/||b|| 8.563683145197e-02
4 KSP preconditioned resid norm 1.408997772215e+01 true resid norm 4.766518770804e+01 ||r(i)||/||b|| 3.200846819863e-02
5 KSP preconditioned resid norm 5.361143512330e+00 true resid norm 2.037934685918e+01 ||r(i)||/||b|| 1.368528494730e-02
6 KSP preconditioned resid norm 1.510583748885e+00 true resid norm 6.398081426940e+00 ||r(i)||/||b|| 4.296485458965e-03
7 KSP preconditioned resid norm 5.309564077280e-01 true resid norm 2.119763900049e+00 ||r(i)||/||b|| 1.423479034614e-03
8 KSP preconditioned resid norm 1.241952019288e-01 true resid norm 4.237260240924e-01 ||r(i)||/||b|| 2.845435341652e-04
9 KSP preconditioned resid norm 5.821568520561e-02 true resid norm 2.020452441170e-01 ||r(i)||/||b|| 1.356788692539e-04
10 KSP preconditioned resid norm 1.821252523719e-02 true resid norm 7.650904617862e-02 ||r(i)||/||b|| 5.137790260087e-05
11 KSP preconditioned resid norm 4.368507888148e-03 true resid norm 1.808437158965e-02 ||r(i)||/||b|| 1.214414671909e-05
Linear solve converged due to CONVERGED_RTOL iterations 11
KSP Object: 4 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
type: bjacobi
number of blocks = 4
Local solver information for first block is in the following KSP and PC objects on rank 0:
Use -ksp_view ::ascii_info_detail to display information for all blocks
KSP Object: (sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (sub_) 1 MPI processes
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=6, cols=6
package used to perform factorization: petsc
total: nonzeros=16, allocated nonzeros=16
not using I-node routines
linear system matrix = precond matrix:
Mat Object: (sub_) 1 MPI processes
type: seqaij
rows=6, cols=6
total: nonzeros=16, allocated nonzeros=30
total number of mallocs used during MatSetValues calls=0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 4 MPI processes
type: mpiaij
rows=24, cols=24
total: nonzeros=98, allocated nonzeros=240
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Norm of error 0.00593779, Iterations 11
0 KSP preconditioned resid norm 7.186228668401e+02 true resid norm 3.204758181205e+03 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.689435569629e+02 true resid norm 8.091360976081e+02 ||r(i)||/||b|| 2.524796105845e-01
2 KSP preconditioned resid norm 4.515826460465e+01 true resid norm 2.188560284713e+02 ||r(i)||/||b|| 6.829096490175e-02
3 KSP preconditioned resid norm 6.171620381119e+00 true resid norm 3.403822057849e+01 ||r(i)||/||b|| 1.062115100544e-02
4 KSP preconditioned resid norm 1.386278591005e+00 true resid norm 7.417896285993e+00 ||r(i)||/||b|| 2.314650861802e-03
5 KSP preconditioned resid norm 2.592583219309e-01 true resid norm 1.555506429237e+00 ||r(i)||/||b|| 4.853740411242e-04
6 KSP preconditioned resid norm 4.031309698756e-02 true resid norm 2.532670674279e-01 ||r(i)||/||b|| 7.902844867150e-05
7 KSP preconditioned resid norm 8.307351448472e-03 true resid norm 5.128232470272e-02 ||r(i)||/||b|| 1.600193268980e-05
8 KSP preconditioned resid norm 7.297751517013e-04 true resid norm 4.388588867606e-03 ||r(i)||/||b|| 1.369397820199e-06
Linear solve converged due to CONVERGED_RTOL iterations 8
KSP Object: 4 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
type: bjacobi
number of blocks = 4
Local solver information for first block is in the following KSP and PC objects on rank 0:
Use -ksp_view ::ascii_info_detail to display information for all blocks
KSP Object: (sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (sub_) 1 MPI processes
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=6, cols=6
package used to perform factorization: petsc
total: nonzeros=16, allocated nonzeros=16
not using I-node routines
linear system matrix = precond matrix:
Mat Object: (sub_) 1 MPI processes
type: seqaij
rows=6, cols=6
total: nonzeros=16, allocated nonzeros=30
total number of mallocs used during MatSetValues calls=0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 4 MPI processes
type: mpiaij
rows=24, cols=24
total: nonzeros=98, allocated nonzeros=240
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Norm of error 0.000783616, Iterations 8
**************************************** ***********************************************************************************************************************
*** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
****************************************************************************************************************************************************************
------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------
./ex5 on a named kitesrv with 4 processors, by user Tue May 31 13:41:49 2022
Using 3 OpenMP threads
Using Petsc Release Version 3.17.1, unknown
Max Max/Min Avg Total
Time (sec): 3.720e-03 1.002 3.718e-03
Objects: 8.900e+01 1.000 8.900e+01
Flops: 7.334e+03 1.023 7.252e+03 2.901e+04
Flops/sec: 1.975e+06 1.025 1.950e+06 7.801e+06
MPI Msg Count: 1.380e+02 1.468 1.170e+02 4.680e+02
MPI Msg Len (bytes): 4.364e+03 1.641 3.005e+01 1.406e+04
MPI Reductions: 1.050e+02 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 1.2380e-04 3.3% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2.000e+00 1.9%
1: Original Solve: 2.9587e-03 79.6% 1.7574e+04 60.6% 2.660e+02 56.8% 2.824e+01 53.4% 5.300e+01 50.5%
2: Second Solve: 6.3269e-04 17.0% 1.1432e+04 39.4% 2.020e+02 43.2% 3.244e+01 46.6% 3.200e+01 30.5%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
--- Event Stage 1: Original Solve
BuildTwoSided 3 1.0 4.6371e-05 2.1 0.00e+00 0.0 1.0e+01 4.0e+00 3.0e+00 1 0 2 0 3 1 0 4 1 6 0
BuildTwoSidedF 2 1.0 3.5755e-05 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 2 1 0 0 0 4 0
MatMult 24 1.0 4.3441e-04 2.8 1.10e+03 1.1 2.6e+02 2.9e+01 1.0e+00 8 14 56 53 1 10 23 98100 2 10
MatSolve 12 1.0 2.7930e-06 1.7 3.12e+02 1.2 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 7 0 0 0 412
MatLUFactorNum 1 1.0 2.2540e-06 1.3 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 35
MatILUFactorSym 1 1.0 9.7690e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 1 1.0 2.9545e-05 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 1 1 0 0 0 2 0
MatAssemblyEnd 1 1.0 1.8802e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 5 0 0 0 5 6 0 0 0 9 0
MatGetRowIJ 1 1.0 2.3500e-07 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 9.4980e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 3 3.0 8.1759e-05 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 1 1 0 0 0 2 0
VecMDot 11 1.0 8.3470e-05 3.4 7.26e+02 1.0 0.0e+00 0.0e+00 1.1e+01 1 10 0 0 10 1 17 0 0 21 35
VecNorm 26 1.0 9.3217e-05 1.3 3.12e+02 1.0 0.0e+00 0.0e+00 2.6e+01 2 4 0 0 25 3 7 0 0 49 13
VecScale 12 1.0 4.7450e-06 1.9 7.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 61
VecCopy 13 1.0 4.1490e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 25 1.0 4.0900e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 13 1.0 5.6990e-06 1.3 1.56e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 4 0 0 0 109
VecAYPX 12 1.0 2.0830e-06 1.5 7.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 138
VecMAXPY 23 1.0 5.4380e-06 1.6 1.72e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 24 0 0 0 0 39 0 0 0 1262
VecAssemblyBegin 1 1.0 1.6203e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0
VecAssemblyEnd 1 1.0 1.0170e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 24 1.0 8.1684e-05 1.1 0.00e+00 0.0 2.6e+02 2.9e+01 1.0e+00 2 0 56 53 1 3 0 98100 2 0
VecScatterEnd 24 1.0 3.4411e-04 7.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 7 0 0 0 0 0
VecNormalize 12 1.0 5.7443e-05 1.7 2.16e+02 1.0 0.0e+00 0.0e+00 1.2e+01 1 3 0 0 11 2 5 0 0 23 15
SFSetGraph 1 1.0 1.0780e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFSetUp 1 1.0 3.1964e-05 1.1 0.00e+00 0.0 2.0e+01 9.6e+00 1.0e+00 1 0 4 1 1 1 0 8 3 2 0
SFPack 24 1.0 7.5830e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFUnpack 24 1.0 1.6541e-05 9.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 2 1.0 3.2793e-0413.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 3 0 0 0 2 3 0 0 0 4 0
KSPSolve 1 1.0 1.0170e-03 1.0 4.35e+03 1.0 2.3e+02 3.0e+01 3.6e+01 27 59 49 50 34 34 98 86 93 68 17
KSPGMRESOrthog 11 1.0 9.3328e-05 2.7 1.52e+03 1.0 0.0e+00 0.0e+00 1.1e+01 1 21 0 0 10 2 35 0 0 21 65
PCSetUp 2 1.0 1.6209e-04 1.1 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 5 0 0 0 0 0
PCSetUpOnBlocks 1 1.0 5.6611e-05 1.2 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 1
PCApply 12 1.0 4.0174e-05 1.4 3.12e+02 1.2 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 7 0 0 0 29
--- Event Stage 2: Second Solve
BuildTwoSided 1 1.0 1.9497e-05 2.0 0.00e+00 0.0 8.0e+00 4.0e+00 1.0e+00 0 0 2 0 1 2 0 4 0 3 0
BuildTwoSidedF 1 1.0 2.1054e-05 1.9 0.00e+00 0.0 1.6e+01 6.6e+01 1.0e+00 0 0 3 8 1 2 0 8 16 3 0
MatMult 18 1.0 1.1072e-04 1.7 8.28e+02 1.1 1.8e+02 3.0e+01 0.0e+00 2 11 38 39 0 14 27 89 84 0 28
MatSolve 9 1.0 1.8810e-06 1.5 2.34e+02 1.2 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 8 0 0 0 459
MatLUFactorNum 1 1.0 1.3380e-06 1.4 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 58
MatAssemblyBegin 1 1.0 3.0984e-05 1.6 0.00e+00 0.0 1.6e+01 6.6e+01 1.0e+00 1 0 3 8 1 4 0 8 16 3 0
MatAssemblyEnd 1 1.0 2.4243e-05 1.1 1.60e+01 1.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 2 4 1 0 0 6 3
MatZeroEntries 1 1.0 2.1270e-06 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 3 3.0 5.9168e-05 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 1 5 0 0 0 3 0
VecMDot 8 1.0 4.9608e-05 3.0 3.96e+02 1.0 0.0e+00 0.0e+00 8.0e+00 1 5 0 0 8 4 14 0 0 25 32
VecNorm 20 1.0 5.5958e-05 1.1 2.40e+02 1.0 0.0e+00 0.0e+00 2.0e+01 1 3 0 0 19 8 8 0 0 62 17
VecScale 9 1.0 2.6620e-06 1.6 5.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 81
VecCopy 10 1.0 3.7840e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
VecSet 19 1.0 2.4540e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 10 1.0 2.1130e-06 1.2 1.20e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 4 0 0 0 227
VecAYPX 9 1.0 1.3480e-06 1.5 5.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 160
VecMAXPY 17 1.0 3.6180e-06 1.3 9.60e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 13 0 0 0 1 34 0 0 0 1061
VecScatterBegin 18 1.0 2.5301e-05 1.3 0.00e+00 0.0 1.8e+02 3.0e+01 0.0e+00 1 0 38 39 0 4 0 89 84 0 0
VecScatterEnd 18 1.0 7.8795e-05 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 9 0 0 0 0 0
VecNormalize 9 1.0 2.5470e-05 1.3 1.62e+02 1.0 0.0e+00 0.0e+00 9.0e+00 1 2 0 0 9 4 6 0 0 28 25
SFPack 18 1.0 3.6420e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
SFUnpack 18 1.0 1.5920e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 2 1.0 2.6500e-07 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 2.8180e-04 1.0 2.78e+03 1.0 1.7e+02 3.0e+01 2.7e+01 7 38 36 37 26 44 96 84 79 84 39
KSPGMRESOrthog 8 1.0 5.5511e-05 2.6 8.28e+02 1.0 0.0e+00 0.0e+00 8.0e+00 1 11 0 0 8 5 29 0 0 25 60
PCSetUp 2 1.0 4.3870e-06 1.2 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 18
PCSetUpOnBlocks 1 1.0 3.4260e-06 1.4 2.10e+01 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 23
PCApply 9 1.0 2.2834e-05 1.5 2.34e+02 1.2 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 3 8 0 0 0 38
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Viewer 1 1 856 0.
--- Event Stage 1: Original Solve
Matrix 5 1 3376 0.
Vector 47 26 47584 0.
Index Set 5 2 1844 0.
Star Forest Graph 3 0 0 0.
Krylov Solver 2 0 0 0.
Preconditioner 2 0 0 0.
Viewer 2 1 856 0.
Distributed Mesh 1 0 0 0.
Discrete System 1 0 0 0.
Weak Form 1 0 0 0.
--- Event Stage 2: Second Solve
Matrix 0 4 14732 0.
Vector 18 39 70976 0.
Index Set 0 3 2808 0.
Star Forest Graph 0 3 3408 0.
Krylov Solver 0 2 20534 0.
Preconditioner 0 2 1968 0.
Viewer 1 1 856 0.
Distributed Mesh 0 1 5080 0.
Discrete System 0 1 976 0.
Weak Form 0 1 632 0.
========================================================================================================================
Average time to get PetscTime(): 2.95e-08
Average time for MPI_Barrier(): 1.2316e-06
Average time for zero size MPI_Send(): 6.7525e-07
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_view
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-python --prefix=/data/raid1/tmp/petsc/install --with-debugging=no --with-blas-lib=/usr/lib/libblas.so --with-lapack-lib=/usr/lib/liblapack.so --with-openmp=true --with-mpi=true --download-openmpi=yes
-----------------------------------------
Libraries compiled on 2022-05-31 10:31:52 on kitesrv
Machine characteristics: Linux-4.4.0-116-generic-x86_64-with-Ubuntu-16.04-xenial
Using PETSc directory: /data/raid1/tmp/petsc/install
Using PETSc arch:
-----------------------------------------
Using C compiler: /data/raid1/tmp/petsc/install/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g -O -fopenmp
Using Fortran compiler: /data/raid1/tmp/petsc/install/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O -fopenmp
-----------------------------------------
Using include paths: -I/data/raid1/tmp/petsc/install/include
-----------------------------------------
Using C linker: /data/raid1/tmp/petsc/install/bin/mpicc
Using Fortran linker: /data/raid1/tmp/petsc/install/bin/mpif90
Using libraries: -Wl,-rpath,/data/raid1/tmp/petsc/install/lib -L/data/raid1/tmp/petsc/install/lib -lpetsc -Wl,-rpath,/data/raid1/tmp/petsc/install/lib -L/data/raid1/tmp/petsc/install/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 /usr/lib/liblapack.so /usr/lib/libblas.so -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl
-----------------------------------------
0.18user 0.03system 0:00.12elapsed 177%CPU (0avgtext+0avgdata 20060maxresident)k
0inputs+0outputs (0major+9903minor)pagefaults 0swaps
user at kitesrv:/data/raid1/tmp/petsc/src/ksp/ksp/tutorials$
-------------- next part --------------
user at kitesrv:/data/raid1/tmp/petsc/src/ksp/ksp/tutorials$ LD_LIBRARY_PATH=/data/raid1/tmp/petsc/install/lib time /data/raid1/tmp/petsc/install/bin/mpirun -n 2 ./ex5 -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view
0 KSP preconditioned resid norm 2.540548908415e+02 true resid norm 5.690404203569e+02 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 5.524657937376e+01 true resid norm 1.536964982563e+02 ||r(i)||/||b|| 2.700976815669e-01
2 KSP preconditioned resid norm 1.904775180107e+01 true resid norm 6.861661839730e+01 ||r(i)||/||b|| 1.205830305592e-01
3 KSP preconditioned resid norm 4.708471233594e+00 true resid norm 2.115510975962e+01 ||r(i)||/||b|| 3.717681381290e-02
4 KSP preconditioned resid norm 1.055333034486e+00 true resid norm 4.779687437415e+00 ||r(i)||/||b|| 8.399556984752e-03
5 KSP preconditioned resid norm 5.287930275880e-02 true resid norm 2.395448983884e-01 ||r(i)||/||b|| 4.209628873783e-04
6 KSP preconditioned resid norm 6.852115363810e-03 true resid norm 2.760282135279e-02 ||r(i)||/||b|| 4.850766371829e-05
7 KSP preconditioned resid norm 8.937861499755e-04 true resid norm 3.024963693385e-03 ||r(i)||/||b|| 5.315903027570e-06
Linear solve converged due to CONVERGED_RTOL iterations 7
KSP Object: 2 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 2 MPI processes
type: bjacobi
number of blocks = 2
Local solver information for first block is in the following KSP and PC objects on rank 0:
Use -ksp_view ::ascii_info_detail to display information for all blocks
KSP Object: (sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (sub_) 1 MPI processes
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=6, cols=6
package used to perform factorization: petsc
total: nonzeros=18, allocated nonzeros=18
not using I-node routines
linear system matrix = precond matrix:
Mat Object: (sub_) 1 MPI processes
type: seqaij
rows=6, cols=6
total: nonzeros=18, allocated nonzeros=30
total number of mallocs used during MatSetValues calls=0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 2 MPI processes
type: mpiaij
rows=12, cols=12
total: nonzeros=46, allocated nonzeros=120
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Norm of error 0.00121238, Iterations 7
0 KSP preconditioned resid norm 2.502924928760e+02 true resid norm 1.031693268370e+03 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 4.655858214018e+01 true resid norm 2.232934339942e+02 ||r(i)||/||b|| 2.164339352015e-01
2 KSP preconditioned resid norm 7.917304062742e+00 true resid norm 4.856470565410e+01 ||r(i)||/||b|| 4.707281431702e-02
3 KSP preconditioned resid norm 9.318670651891e-01 true resid norm 6.270700755466e+00 ||r(i)||/||b|| 6.078066948496e-03
4 KSP preconditioned resid norm 1.408588814076e-01 true resid norm 8.958183659370e-01 ||r(i)||/||b|| 8.682991286279e-04
5 KSP preconditioned resid norm 4.306995949139e-03 true resid norm 2.763097772714e-02 ||r(i)||/||b|| 2.678216343390e-05
6 KSP preconditioned resid norm 3.435999542630e-04 true resid norm 2.406928714402e-03 ||r(i)||/||b|| 2.332988678122e-06
Linear solve converged due to CONVERGED_RTOL iterations 6
KSP Object: 2 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 2 MPI processes
type: bjacobi
number of blocks = 2
Local solver information for first block is in the following KSP and PC objects on rank 0:
Use -ksp_view ::ascii_info_detail to display information for all blocks
KSP Object: (sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (sub_) 1 MPI processes
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=6, cols=6
package used to perform factorization: petsc
total: nonzeros=18, allocated nonzeros=18
not using I-node routines
linear system matrix = precond matrix:
Mat Object: (sub_) 1 MPI processes
type: seqaij
rows=6, cols=6
total: nonzeros=18, allocated nonzeros=30
total number of mallocs used during MatSetValues calls=0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 2 MPI processes
type: mpiaij
rows=12, cols=12
total: nonzeros=46, allocated nonzeros=120
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Norm of error 0.000322889, Iterations 6
**************************************** ***********************************************************************************************************************
*** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
****************************************************************************************************************************************************************
------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------
./ex5 on a named kitesrv with 2 processors, by user Tue May 31 13:42:24 2022
Using 3 OpenMP threads
Using Petsc Release Version 3.17.1, unknown
Max Max/Min Avg Total
Time (sec): 3.459e-03 1.000 3.459e-03
Objects: 7.700e+01 1.000 7.700e+01
Flops: 4.400e+03 1.002 4.396e+03 8.792e+03
Flops/sec: 1.272e+06 1.002 1.271e+06 2.542e+06
MPI Msg Count: 3.600e+01 1.000 3.600e+01 7.200e+01
MPI Msg Len (bytes): 1.104e+03 1.000 3.067e+01 2.208e+03
MPI Reductions: 8.700e+01 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 1.2309e-04 3.6% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2.000e+00 2.3%
1: Original Solve: 2.7587e-03 79.8% 4.7880e+03 54.5% 3.800e+01 52.8% 2.821e+01 48.6% 4.100e+01 47.1%
2: Second Solve: 5.7332e-04 16.6% 4.0040e+03 45.5% 3.400e+01 47.2% 3.341e+01 51.4% 2.600e+01 29.9%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
--- Event Stage 1: Original Solve
BuildTwoSided 3 1.0 2.1347e-05 1.0 0.00e+00 0.0 2.0e+00 4.0e+00 3.0e+00 1 0 3 0 3 1 0 5 1 7 0
BuildTwoSidedF 2 1.0 1.9346e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 2 1 0 0 0 5 0
MatMult 16 1.0 1.5458e-04 1.8 6.40e+02 1.0 3.6e+01 3.0e+01 1.0e+00 4 15 50 48 1 4 27 95 99 2 8
MatSolve 8 1.0 2.2670e-06 1.5 2.40e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 10 0 0 0 212
MatLUFactorNum 1 1.0 2.4620e-06 1.3 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 24
MatILUFactorSym 1 1.0 1.1469e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 1 1.0 1.7848e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 1 0 0 0 2 0
MatAssemblyEnd 1 1.0 1.5707e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 5 0 0 0 6 6 0 0 0 12 0
MatGetRowIJ 1 1.0 3.0100e-07 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 8.4270e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 3 3.0 9.6613e-05 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 2 0 0 0 1 2 0 0 0 2 0
VecMDot 7 1.0 3.1623e-05 2.2 3.08e+02 1.0 0.0e+00 0.0e+00 7.0e+00 1 7 0 0 8 1 13 0 0 17 19
VecNorm 18 1.0 2.5209e-04 1.0 2.16e+02 1.0 0.0e+00 0.0e+00 1.8e+01 7 5 0 0 21 9 9 0 0 44 2
VecScale 8 1.0 2.9390e-06 1.0 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 33
VecCopy 9 1.0 2.5980e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 17 1.0 3.0360e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 9 1.0 5.6680e-06 1.2 1.08e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 5 0 0 0 38
VecAYPX 8 1.0 1.1690e-06 1.1 4.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 82
VecMAXPY 15 1.0 2.9050e-06 1.0 7.56e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 17 0 0 0 0 32 0 0 0 520
VecAssemblyBegin 1 1.0 1.5341e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 1 0 0 0 2 0
VecAssemblyEnd 1 1.0 9.4300e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 16 1.0 5.1321e-05 1.0 0.00e+00 0.0 3.6e+01 3.0e+01 1.0e+00 1 0 50 48 1 2 0 95 99 2 0
VecScatterEnd 16 1.0 9.3558e-05 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
VecNormalize 8 1.0 2.0754e-05 1.0 1.44e+02 1.0 0.0e+00 0.0e+00 8.0e+00 1 3 0 0 9 1 6 0 0 20 14
SFSetGraph 1 1.0 1.0930e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFSetUp 1 1.0 2.0803e-05 1.0 0.00e+00 0.0 4.0e+00 1.0e+01 1.0e+00 1 0 6 2 1 1 0 11 4 2 0
SFPack 16 1.0 2.1140e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFUnpack 16 1.0 1.5950e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 2 1.0 2.2458e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 2 1 0 0 0 5 0
KSPSolve 1 1.0 9.9155e-04 1.0 2.30e+03 1.0 3.0e+01 3.2e+01 2.4e+01 29 52 42 43 28 36 96 79 90 59 5
KSPGMRESOrthog 7 1.0 3.8334e-05 1.9 6.44e+02 1.0 0.0e+00 0.0e+00 7.0e+00 1 15 0 0 8 1 27 0 0 17 34
PCSetUp 2 1.0 1.1958e-04 1.0 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00 3 1 0 0 0 4 1 0 0 0 1
PCSetUpOnBlocks 1 1.0 5.4691e-05 1.0 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 1
PCApply 8 1.0 5.9135e-05 1.5 2.40e+02 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 2 10 0 0 0 8
--- Event Stage 2: Second Solve
BuildTwoSided 1 1.0 4.5420e-06 1.3 0.00e+00 0.0 2.0e+00 4.0e+00 1.0e+00 0 0 3 0 1 1 0 6 1 4 0
BuildTwoSidedF 1 1.0 5.9960e-06 1.2 0.00e+00 0.0 4.0e+00 5.8e+01 1.0e+00 0 0 6 11 1 1 0 12 20 4 0
MatMult 14 1.0 9.5330e-05 1.8 5.60e+02 1.0 2.8e+01 3.2e+01 0.0e+00 2 13 39 41 0 13 28 82 79 0 12
MatSolve 7 1.0 1.4380e-06 1.3 2.10e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 10 0 0 0 292
MatLUFactorNum 1 1.0 1.4460e-06 1.0 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 41
MatAssemblyBegin 1 1.0 1.4872e-05 1.1 0.00e+00 0.0 4.0e+00 5.8e+01 1.0e+00 0 0 6 11 1 2 0 12 20 4 0
MatAssemblyEnd 1 1.0 1.9540e-05 1.0 7.00e+00 1.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 2 3 0 0 0 8 1
MatZeroEntries 1 1.0 2.3710e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 3 3.0 7.2415e-05 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 1 8 0 0 0 4 0
VecMDot 6 1.0 1.1563e-05 1.3 2.31e+02 1.0 0.0e+00 0.0e+00 6.0e+00 0 5 0 0 7 2 12 0 0 23 40
VecNorm 16 1.0 3.3041e-05 2.4 1.92e+02 1.0 0.0e+00 0.0e+00 1.6e+01 1 4 0 0 18 4 10 0 0 62 12
VecScale 7 1.0 1.4170e-06 1.1 4.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 59
VecCopy 8 1.0 2.3300e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 15 1.0 1.9460e-06 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 8 1.0 1.4530e-06 1.4 9.60e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 5 0 0 0 132
VecAYPX 7 1.0 1.0070e-06 1.2 4.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 83
VecMAXPY 13 1.0 2.4410e-06 1.2 5.76e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 13 0 0 0 0 29 0 0 0 472
VecScatterBegin 14 1.0 1.5003e-05 1.1 0.00e+00 0.0 2.8e+01 3.2e+01 0.0e+00 0 0 39 41 0 2 0 82 79 0 0
VecScatterEnd 14 1.0 7.5876e-05 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 9 0 0 0 0 0
VecNormalize 7 1.0 1.0780e-05 1.2 1.26e+02 1.0 0.0e+00 0.0e+00 7.0e+00 0 3 0 0 8 2 6 0 0 27 23
SFPack 14 1.0 1.7980e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFUnpack 14 1.0 1.3100e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 2 1.0 2.2500e-07 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 2.0139e-04 1.0 1.88e+03 1.0 2.6e+01 3.2e+01 2.1e+01 6 43 36 38 24 34 94 76 73 81 19
KSPGMRESOrthog 6 1.0 1.4939e-05 1.2 4.83e+02 1.0 0.0e+00 0.0e+00 6.0e+00 0 11 0 0 7 2 24 0 0 23 65
PCSetUp 2 1.0 5.7350e-06 1.2 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 10
PCSetUpOnBlocks 1 1.0 3.6650e-06 1.1 3.20e+01 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 16
PCApply 7 1.0 1.4785e-05 1.2 2.10e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 2 10 0 0 0 28
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Viewer 1 1 856 0.
--- Event Stage 1: Original Solve
Matrix 5 1 3376 0.
Vector 39 18 32928 0.
Index Set 5 2 1832 0.
Star Forest Graph 3 0 0 0.
Krylov Solver 2 0 0 0.
Preconditioner 2 0 0 0.
Viewer 2 1 856 0.
Distributed Mesh 1 0 0 0.
Discrete System 1 0 0 0.
Weak Form 1 0 0 0.
--- Event Stage 2: Second Solve
Matrix 0 4 14744 0.
Vector 14 35 63624 0.
Index Set 0 3 2808 0.
Star Forest Graph 0 3 3408 0.
Krylov Solver 0 2 20534 0.
Preconditioner 0 2 1968 0.
Viewer 1 1 856 0.
Distributed Mesh 0 1 5080 0.
Discrete System 0 1 976 0.
Weak Form 0 1 632 0.
========================================================================================================================
Average time to get PetscTime(): 2.31e-08
Average time for MPI_Barrier(): 3.43e-07
Average time for zero size MPI_Send(): 5.015e-07
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_view
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-python --prefix=/data/raid1/tmp/petsc/install --with-debugging=no --with-blas-lib=/usr/lib/libblas.so --with-lapack-lib=/usr/lib/liblapack.so --with-openmp=true --with-mpi=true --download-openmpi=yes
-----------------------------------------
Libraries compiled on 2022-05-31 10:31:52 on kitesrv
Machine characteristics: Linux-4.4.0-116-generic-x86_64-with-Ubuntu-16.04-xenial
Using PETSc directory: /data/raid1/tmp/petsc/install
Using PETSc arch:
-----------------------------------------
Using C compiler: /data/raid1/tmp/petsc/install/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g -O -fopenmp
Using Fortran compiler: /data/raid1/tmp/petsc/install/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O -fopenmp
-----------------------------------------
Using include paths: -I/data/raid1/tmp/petsc/install/include
-----------------------------------------
Using C linker: /data/raid1/tmp/petsc/install/bin/mpicc
Using Fortran linker: /data/raid1/tmp/petsc/install/bin/mpif90
Using libraries: -Wl,-rpath,/data/raid1/tmp/petsc/install/lib -L/data/raid1/tmp/petsc/install/lib -lpetsc -Wl,-rpath,/data/raid1/tmp/petsc/install/lib -L/data/raid1/tmp/petsc/install/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 /usr/lib/liblapack.so /usr/lib/libblas.so -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl
-----------------------------------------
0.07user 0.01system 0:00.09elapsed 92%CPU (0avgtext+0avgdata 19980maxresident)k
0inputs+0outputs (0major+6673minor)pagefaults 0swaps
user at kitesrv:/data/raid1/tmp/petsc/src/ksp/ksp/tutorials$
-------------- next part --------------
[lida at head1 tutorials]$ time mpirun -n 3 ./ex5 -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view 2>/dev/null
0 KSP preconditioned resid norm 4.020939481591e+02 true resid norm 9.763918270858e+02 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.222668532006e+02 true resid norm 2.875957243877e+02 ||r(i)||/||b|| 2.945494999135e-01
2 KSP preconditioned resid norm 6.915159267765e+01 true resid norm 1.565583438606e+02 ||r(i)||/||b|| 1.603437672434e-01
3 KSP preconditioned resid norm 1.304338629042e+01 true resid norm 4.745383154681e+01 ||r(i)||/||b|| 4.860121749323e-02
4 KSP preconditioned resid norm 4.706344725891e+00 true resid norm 1.682392035308e+01 ||r(i)||/||b|| 1.723070583589e-02
5 KSP preconditioned resid norm 5.039609363554e-01 true resid norm 2.504548411461e+00 ||r(i)||/||b|| 2.565105874489e-03
6 KSP preconditioned resid norm 1.055361110378e-01 true resid norm 5.260792846119e-01 ||r(i)||/||b|| 5.387993529013e-04
7 KSP preconditioned resid norm 1.234936672024e-02 true resid norm 3.467184429455e-02 ||r(i)||/||b|| 3.551017463761e-05
8 KSP preconditioned resid norm 6.180717049285e-03 true resid norm 1.824456980445e-02 ||r(i)||/||b|| 1.868570516296e-05
9 KSP preconditioned resid norm 1.482144308741e-04 true resid norm 6.740391026576e-04 ||r(i)||/||b|| 6.903366906188e-07
Linear solve converged due to CONVERGED_RTOL iterations 9
KSP Object: 3 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 3 MPI processes
type: bjacobi
number of blocks = 3
Local solver information for first block is in the following KSP and PC objects on rank 0:
Use -ksp_view ::ascii_info_detail to display information for all blocks
KSP Object: (sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (sub_) 1 MPI processes
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=6, cols=6
package used to perform factorization: petsc
total: nonzeros=16, allocated nonzeros=16
not using I-node routines
linear system matrix = precond matrix:
Mat Object: (sub_) 1 MPI processes
type: seqaij
rows=6, cols=6
total: nonzeros=16, allocated nonzeros=30
total number of mallocs used during MatSetValues calls=0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 3 MPI processes
type: mpiaij
rows=18, cols=18
total: nonzeros=72, allocated nonzeros=180
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Norm of error 0.000208575, Iterations 9
0 KSP preconditioned resid norm 4.608512576274e+02 true resid norm 2.016631101615e+03 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 9.412072370874e+01 true resid norm 4.038990960334e+02 ||r(i)||/||b|| 2.002840756101e-01
2 KSP preconditioned resid norm 2.282248495386e+01 true resid norm 9.733980188154e+01 ||r(i)||/||b|| 4.826852159703e-02
3 KSP preconditioned resid norm 1.365448582262e+00 true resid norm 7.954929613540e+00 ||r(i)||/||b|| 3.944662763145e-03
4 KSP preconditioned resid norm 2.252869372987e-01 true resid norm 1.285361707036e+00 ||r(i)||/||b|| 6.373806820723e-04
5 KSP preconditioned resid norm 1.586897237676e-02 true resid norm 1.098248593144e-01 ||r(i)||/||b|| 5.445956834962e-05
6 KSP preconditioned resid norm 1.905899805612e-03 true resid norm 1.346481495832e-02 ||r(i)||/||b|| 6.676885498560e-06
Linear solve converged due to CONVERGED_RTOL iterations 6
KSP Object: 3 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 3 MPI processes
type: bjacobi
number of blocks = 3
Local solver information for first block is in the following KSP and PC objects on rank 0:
Use -ksp_view ::ascii_info_detail to display information for all blocks
KSP Object: (sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (sub_) 1 MPI processes
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=6, cols=6
package used to perform factorization: petsc
total: nonzeros=16, allocated nonzeros=16
not using I-node routines
linear system matrix = precond matrix:
Mat Object: (sub_) 1 MPI processes
type: seqaij
rows=6, cols=6
total: nonzeros=16, allocated nonzeros=30
total number of mallocs used during MatSetValues calls=0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 3 MPI processes
type: mpiaij
rows=18, cols=18
total: nonzeros=72, allocated nonzeros=180
total number of mallocs used during MatSetValues calls=0
not using I-node (on process 0) routines
Norm of error 0.00172305, Iterations 6
**************************************** ***********************************************************************************************************************
*** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
****************************************************************************************************************************************************************
------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------
./ex5 on a named head1.hpc with 3 processors, by lida Tue May 31 12:18:30 2022
Using Petsc Release Version 3.17.1, unknown
Max Max/Min Avg Total
Time (sec): 1.783e-02 1.000 1.783e-02
Objects: 8.100e+01 1.000 8.100e+01
Flops: 5.590e+03 1.080 5.314e+03 1.594e+04
Flops/sec: 3.136e+05 1.080 2.981e+05 8.942e+05
MPI Msg Count: 7.800e+01 1.857 5.467e+01 1.640e+02
MPI Msg Len (bytes): 3.808e+03 1.827 4.868e+01 7.984e+03
MPI Reductions: 9.300e+01 1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 2.8862e-04 1.6% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2.000e+00 2.2%
1: Original Solve: 1.3756e-02 77.2% 9.9600e+03 62.5% 9.200e+01 56.1% 4.435e+01 51.1% 4.700e+01 50.5%
2: Second Solve: 3.7698e-03 21.1% 5.9820e+03 37.5% 7.200e+01 43.9% 5.422e+01 48.9% 2.600e+01 28.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
--- Event Stage 1: Original Solve
BuildTwoSided 3 1.0 1.1443e-04 1.5 0.00e+00 0.0 4.0e+00 8.0e+00 3.0e+00 1 0 2 0 3 1 0 4 1 6 0
BuildTwoSidedF 2 1.0 7.2485e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 4 0
MatMult 20 1.0 5.2744e-0311.5 1.00e+03 1.3 8.8e+01 4.6e+01 1.0e+00 12 16 54 51 1 15 25 96100 2 0
MatSolve 10 1.0 9.1130e-06 1.2 2.60e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 8 0 0 0 86
MatLUFactorNum 1 1.0 2.2792e-05 2.1 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 3
MatILUFactorSym 1 1.0 3.5026e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 1 1.0 7.2907e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0
MatAssemblyEnd 1 1.0 5.0211e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 3 0 0 0 5 4 0 0 0 11 0
MatGetRowIJ 1 1.0 4.2003e-07 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.0961e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 3 3.0 5.7218e-04 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 2 0 0 0 1 3 0 0 0 2 0
VecMDot 9 1.0 4.9467e-0372.0 4.95e+02 1.0 0.0e+00 0.0e+00 9.0e+00 10 9 0 0 10 12 15 0 0 19 0
VecNorm 22 1.0 4.1558e-04 2.1 2.64e+02 1.0 0.0e+00 0.0e+00 2.2e+01 2 5 0 0 24 2 8 0 0 47 2
VecScale 10 1.0 1.0288e-05 1.1 6.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 17
VecCopy 11 1.0 8.5477e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 21 1.0 1.1027e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 11 1.0 7.3556e-06 1.3 1.32e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 4 0 0 0 54
VecAYPX 10 1.0 4.7944e-06 1.1 6.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 38
VecMAXPY 19 1.0 8.6166e-06 1.2 1.19e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 22 0 0 0 0 36 0 0 0 414
VecAssemblyBegin 1 1.0 3.5299e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0
VecAssemblyEnd 1 1.0 1.8599e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 20 1.0 2.5115e-04 1.1 0.00e+00 0.0 8.8e+01 4.6e+01 1.0e+00 1 0 54 51 1 2 0 96100 2 0
VecScatterEnd 20 1.0 4.9759e-0325.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 13 0 0 0 0 0
VecNormalize 10 1.0 1.1400e-04 1.2 1.80e+02 1.0 0.0e+00 0.0e+00 1.0e+01 1 3 0 0 11 1 5 0 0 21 5
SFSetGraph 1 1.0 8.3325e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFSetUp 1 1.0 1.1758e-04 1.2 0.00e+00 0.0 8.0e+00 2.8e+01 1.0e+00 1 0 5 3 1 1 0 9 5 2 0
SFPack 20 1.0 1.2662e-05 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFUnpack 20 1.0 4.5374e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 2 1.0 2.3160e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 2 1 0 0 0 4 0
KSPSolve 1 1.0 7.0816e-03 1.0 3.38e+03 1.1 7.6e+01 4.8e+01 3.0e+01 39 61 46 46 32 51 97 83 89 64 1
KSPGMRESOrthog 9 1.0 4.9614e-0358.1 1.04e+03 1.0 0.0e+00 0.0e+00 9.0e+00 10 19 0 0 10 12 31 0 0 19 1
PCSetUp 2 1.0 3.2492e-04 1.1 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 1 0 0 0 0
PCSetUpOnBlocks 1 1.0 1.5849e-04 1.0 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 1 0 0 0 0
PCApply 10 1.0 7.6554e-05 1.4 2.60e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 8 0 0 0 10
--- Event Stage 2: Second Solve
BuildTwoSided 1 1.0 5.5462e-05 2.6 0.00e+00 0.0 6.0e+00 8.0e+00 1.0e+00 0 0 4 1 1 1 0 8 1 4 0
BuildTwoSidedF 1 1.0 6.0628e-05 2.3 0.00e+00 0.0 1.2e+01 1.0e+02 1.0e+00 0 0 7 15 1 1 0 17 31 4 0
MatMult 14 1.0 9.3008e-04 6.9 7.00e+02 1.3 5.6e+01 4.8e+01 0.0e+00 2 11 34 34 0 11 29 78 69 0 2
MatSolve 7 1.0 3.2000e-06 1.2 1.82e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 9 0 0 0 171
MatLUFactorNum 1 1.0 2.4736e-06 1.2 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 25
MatAssemblyBegin 1 1.0 8.0139e-05 1.9 0.00e+00 0.0 1.2e+01 1.0e+02 1.0e+00 0 0 7 15 1 2 0 17 31 4 0
MatAssemblyEnd 1 1.0 3.1728e-05 1.1 1.80e+01 1.2 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 1 1 0 0 8 2
MatZeroEntries 1 1.0 4.5244e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 3 3.0 4.9627e-04 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 2 0 0 0 1 9 0 0 0 4 0
VecMDot 6 1.0 8.6141e-0420.9 2.31e+02 1.0 0.0e+00 0.0e+00 6.0e+00 2 4 0 0 6 8 12 0 0 23 1
VecNorm 16 1.0 3.6575e-04 3.1 1.92e+02 1.0 0.0e+00 0.0e+00 1.6e+01 2 4 0 0 17 7 10 0 0 62 2
VecScale 7 1.0 4.1863e-06 2.5 4.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 30
VecCopy 8 1.0 5.2368e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 15 1.0 5.1260e-06 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 8 1.0 4.3251e-06 2.1 9.60e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 5 0 0 0 67
VecAYPX 7 1.0 1.7947e-06 1.1 4.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 70
VecMAXPY 13 1.0 5.2378e-06 1.5 5.76e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 29 0 0 0 330
VecScatterBegin 14 1.0 4.4449e-05 1.6 0.00e+00 0.0 5.6e+01 4.8e+01 0.0e+00 0 0 34 34 0 1 0 78 69 0 0
VecScatterEnd 14 1.0 8.6884e-0411.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 9 0 0 0 0 0
VecNormalize 7 1.0 5.4329e-05 1.2 1.26e+02 1.0 0.0e+00 0.0e+00 7.0e+00 0 2 0 0 8 1 6 0 0 27 7
SFPack 14 1.0 5.7872e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFUnpack 14 1.0 2.3032e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 2 1.0 4.6380e-07 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 1.3922e-03 1.1 1.99e+03 1.1 5.2e+01 4.8e+01 2.1e+01 7 35 32 31 23 35 94 72 64 81 4
KSPGMRESOrthog 6 1.0 8.6742e-0417.4 4.83e+02 1.0 0.0e+00 0.0e+00 6.0e+00 2 9 0 0 6 9 24 0 0 23 2
PCSetUp 2 1.0 8.6613e-06 1.2 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 7
PCSetUpOnBlocks 1 1.0 6.2576e-06 1.2 2.10e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 10
PCApply 7 1.0 3.7187e-05 1.9 1.82e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 1 9 0 0 0 15
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Viewer 1 1 896 0.
--- Event Stage 1: Original Solve
Matrix 5 1 3616 0.
Vector 43 22 42896 0.
Index Set 5 2 1952 0.
Star Forest Graph 3 0 0 0.
Krylov Solver 2 0 0 0.
Preconditioner 2 0 0 0.
Viewer 2 1 896 0.
Distributed Mesh 1 0 0 0.
Discrete System 1 0 0 0.
Weak Form 1 0 0 0.
--- Event Stage 2: Second Solve
Matrix 0 4 16328 0.
Vector 14 35 67840 0.
Index Set 0 3 3000 0.
Star Forest Graph 0 3 3688 0.
Krylov Solver 0 2 20942 0.
Preconditioner 0 2 2064 0.
Viewer 1 1 896 0.
Distributed Mesh 0 1 5128 0.
Discrete System 0 1 1024 0.
Weak Form 0 1 664 0.
========================================================================================================================
Average time to get PetscTime(): 3.89293e-08
Average time for MPI_Barrier(): 4.57484e-06
Average time for zero size MPI_Send(): 1.97627e-06
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_view
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with 64 bit PetscInt
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-python --prefix=/home/lida -with-mpi-dir=/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4 LDFLAGS="-L/home/lida/lib64 -L/home/lida/lib -L/home/lida/jdk/lib" CPPFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CXXFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" --with-debugging=no --with-64-bit-indices
-----------------------------------------
Libraries compiled on 2022-05-25 08:44:48 on head1.hpc
Machine characteristics: Linux-3.10.0-1062.el7.x86_64-x86_64-with-centos-7.7.1908-Core
Using PETSc directory: /home/lida
Using PETSc arch:
-----------------------------------------
Using C compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 -fPIC -I/home/lida/include -I/home/lida/jdk/include -march=native -O3
Using Fortran compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O -I/home/lida/include -I/home/lida/jdk/include -march=native -O3
-----------------------------------------
Using include paths: -I/home/lida/include -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/include
-----------------------------------------
Using C linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc
Using Fortran linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90
Using libraries: -Wl,-rpath,/home/lida/lib -L/home/lida/lib -lpetsc -Wl,-rpath,/home/lida/lib64 -L/home/lida/lib64 -Wl,-rpath,/home/lida/lib -L/home/lida/lib -Wl,-rpath,/home/lida/jdk/lib -L/home/lida/jdk/lib -Wl,-rpath,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -Wl,-rpath,/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -L/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -Wl,-rpath,/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -L/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib -lopenblas -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl
-----------------------------------------
real 0m2.268s
user 0m1.456s
sys 0m0.583s
[lida at head1 tutorials]$
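The "Original Solve" and "Second Solve" breakdowns in the -log_view tables above come from user-registered logging stages (the manual note about PetscLogStagePush()/PetscLogStagePop() quoted in the log refers to exactly this). Below is a minimal sketch of how such stages can be registered, assuming PETSc 3.17 and its PetscCall() error-checking macro; this is hypothetical illustration code, not the actual ex5 source.

/* Sketch: register the two extra stages that appear in the -log_view
 * tables above, in addition to the automatic "Main Stage" (stage 0). */
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscLogStage stage1, stage2;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  PetscCall(PetscLogStageRegister("Original Solve", &stage1));
  PetscCall(PetscLogStageRegister("Second Solve", &stage2));

  PetscCall(PetscLogStagePush(stage1));
  /* ... first assembly and KSPSolve() would go here ... */
  PetscCall(PetscLogStagePop());

  PetscCall(PetscLogStagePush(stage2));
  /* ... second solve (e.g. with a modified matrix) would go here ... */
  PetscCall(PetscLogStagePop());

  PetscCall(PetscFinalize());
  return 0;
}

Running such a program with -log_view reports time, flops, messages, and object memory separately for each stage, which is how the per-stage tables above were produced.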
-------------- next part --------------
[lida at head1 tutorials]$ time mpirun -n 1 ./ex5 -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view 2>/dev/null
0 KSP preconditioned resid norm 6.889609116885e+00 true resid norm 1.664331697709e+01 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 3.735626684805e-01 true resid norm 1.277849582311e+00 ||r(i)||/||b|| 7.677854024348e-02
2 KSP preconditioned resid norm 1.736201489518e-02 true resid norm 6.598680019925e-02 ||r(i)||/||b|| 3.964762570470e-03
3 KSP preconditioned resid norm 6.503363094580e-04 true resid norm 3.819222199389e-03 ||r(i)||/||b|| 2.294748219147e-04
4 KSP preconditioned resid norm 7.801321330590e-06 true resid norm 3.420262185151e-05 ||r(i)||/||b|| 2.055036378781e-06
Linear solve converged due to CONVERGED_RTOL iterations 4
KSP Object: 1 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
type: icc
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using Manteuffel shift [POSITIVE_DEFINITE]
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqsbaij
rows=6, cols=6
package used to perform factorization: petsc
total: nonzeros=13, allocated nonzeros=13
block size is 1
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=6, cols=6
total: nonzeros=20, allocated nonzeros=30
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Norm of error 7.36852e-06, Iterations 4
0 KSP preconditioned resid norm 7.247863732790e+00 true resid norm 3.100000000000e+01 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.273569592250e-01 true resid norm 7.519049129476e-01 ||r(i)||/||b|| 2.425499719186e-02
2 KSP preconditioned resid norm 2.160197987659e-03 true resid norm 1.245148592634e-02 ||r(i)||/||b|| 4.016608363334e-04
3 KSP preconditioned resid norm 3.358674359432e-05 true resid norm 2.551377456617e-04 ||r(i)||/||b|| 8.230249860056e-06
Linear solve converged due to CONVERGED_RTOL iterations 3
KSP Object: 1 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
type: icc
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using Manteuffel shift [POSITIVE_DEFINITE]
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqsbaij
rows=6, cols=6
package used to perform factorization: petsc
total: nonzeros=13, allocated nonzeros=13
block size is 1
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=6, cols=6
total: nonzeros=20, allocated nonzeros=30
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Norm of error 3.41929e-05, Iterations 3
------------------------------------------------------------------ PETSc Performance Summary: -------------------------------------------------------------------
./ex5 on a named head1.hpc with 1 processor, by lida Tue May 31 12:18:46 2022
Using Petsc Release Version 3.17.1, unknown
Max Max/Min Avg Total
Time (sec): 1.397e-02 1.000 1.397e-02
Objects: 5.200e+01 1.000 5.200e+01
Flops: 2.076e+03 1.000 2.076e+03 2.076e+03
Flops/sec: 1.486e+05 1.000 1.486e+05 1.486e+05
MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00
MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00
MPI Reductions: 0.000e+00 0.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total Count %Total Avg %Total Count %Total
0: Main Stage: 2.2292e-04 1.6% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
1: Original Solve: 9.2670e-03 66.3% 1.1780e+03 56.7% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
2: Second Solve: 4.4709e-03 32.0% 8.9800e+02 43.3% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
AvgLen: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage ---- Total
Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
--- Event Stage 1: Original Solve
MatMult 10 1.0 1.5418e-05 1.0 3.40e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 16 0 0 0 0 29 0 0 0 22
MatSolve 5 1.0 9.0795e-06 1.0 1.70e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 14 0 0 0 19
MatCholFctrNum 1 1.0 1.2116e-05 1.0 6.00e+00 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 0
MatICCFactorSym 1 1.0 1.7540e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 1 1.0 2.3469e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 1 1.0 2.3169e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 1.0 1.6764e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 1.0921e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatView 2 1.0 2.4787e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 18 0 0 0 0 27 0 0 0 0 0
VecMDot 4 1.0 6.2725e-06 1.0 1.10e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 9 0 0 0 18
VecNorm 12 1.0 1.2450e-05 1.0 1.32e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 11 0 0 0 11
VecScale 5 1.0 8.2748e-06 1.0 3.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 3 0 0 0 4
VecCopy 6 1.0 4.5085e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 6 1.0 6.6161e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 6 1.0 4.7898e-06 1.0 7.20e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 6 0 0 0 15
VecAYPX 5 1.0 3.7868e-06 1.0 3.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 3 0 0 0 8
VecMAXPY 9 1.0 3.1283e-06 1.0 2.88e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 14 0 0 0 0 24 0 0 0 92
VecAssemblyBegin 1 1.0 1.3318e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 1 1.0 1.5367e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 5 1.0 2.7158e-05 1.0 8.50e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 7 0 0 0 3
KSPSetUp 1 1.0 4.6344e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
KSPSolve 1 1.0 2.0673e-03 1.0 1.12e+03 1.0 0.0e+00 0.0e+00 0.0e+00 15 54 0 0 0 22 95 0 0 0 1
KSPGMRESOrthog 4 1.0 1.6174e-05 1.0 2.30e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 20 0 0 0 14
PCSetUp 1 1.0 2.1121e-04 1.0 6.00e+00 1.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 1 0 0 0 0
PCApply 5 1.0 1.4246e-05 1.0 1.70e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 14 0 0 0 12
--- Event Stage 2: Second Solve
MatMult 8 1.0 3.8277e-06 1.0 2.72e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 13 0 0 0 0 30 0 0 0 71
MatSolve 4 1.0 1.9213e-06 1.0 1.36e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 15 0 0 0 71
MatCholFctrNum 1 1.0 3.3919e-06 1.0 6.00e+00 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 2
MatAssemblyBegin 1 1.0 1.1362e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 1 1.0 4.7777e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatZeroEntries 1 1.0 4.4834e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 2 1.0 2.4329e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17 0 0 0 0 54 0 0 0 0 0
VecMDot 3 1.0 1.1474e-06 1.0 6.60e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 7 0 0 0 58
VecNorm 10 1.0 2.7772e-06 1.0 1.10e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 12 0 0 0 40
VecScale 4 1.0 1.5302e-06 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 3 0 0 0 16
VecCopy 5 1.0 3.1414e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 5 1.0 1.7434e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 5 1.0 2.1365e-06 1.0 6.00e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 7 0 0 0 28
VecAYPX 4 1.0 9.7509e-07 1.0 2.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 3 0 0 0 25
VecMAXPY 7 1.0 2.1346e-06 1.0 1.80e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 9 0 0 0 0 20 0 0 0 84
VecNormalize 4 1.0 5.5209e-06 1.0 6.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 8 0 0 0 12
KSPSetUp 1 1.0 4.0047e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 3.7580e-04 1.0 8.15e+02 1.0 0.0e+00 0.0e+00 0.0e+00 3 39 0 0 0 8 91 0 0 0 2
KSPGMRESOrthog 3 1.0 4.7795e-06 1.0 1.38e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 15 0 0 0 29
PCSetUp 1 1.0 6.9309e-06 1.0 6.00e+00 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 1
PCApply 4 1.0 3.5232e-06 1.0 1.36e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 15 0 0 0 39
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Viewer 1 1 896 0.
--- Event Stage 1: Original Solve
Matrix 3 1 3672 0.
Vector 29 11 20064 0.
Krylov Solver 1 0 0 0.
Preconditioner 1 0 0 0.
Viewer 1 0 0 0.
Index Set 3 1 1000 0.
Distributed Mesh 1 0 0 0.
Star Forest Graph 2 0 0 0.
Discrete System 1 0 0 0.
Weak Form 1 0 0 0.
--- Event Stage 2: Second Solve
Matrix 0 2 8224 0.
Vector 8 26 47424 0.
Krylov Solver 0 1 19198 0.
Preconditioner 0 1 1048 0.
Index Set 0 2 2000 0.
Distributed Mesh 0 1 5128 0.
Star Forest Graph 0 2 2352 0.
Discrete System 0 1 1024 0.
Weak Form 0 1 664 0.
========================================================================================================================
Average time to get PetscTime(): 3.68804e-08
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_view
-log_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with 64 bit PetscInt
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-python --prefix=/home/lida -with-mpi-dir=/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4 LDFLAGS="-L/home/lida/lib64 -L/home/lida/lib -L/home/lida/jdk/lib" CPPFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CXXFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" --with-debugging=no --with-64-bit-indices
-----------------------------------------
Libraries compiled on 2022-05-25 08:44:48 on head1.hpc
Machine characteristics: Linux-3.10.0-1062.el7.x86_64-x86_64-with-centos-7.7.1908-Core
Using PETSc directory: /home/lida
Using PETSc arch:
-----------------------------------------
Using C compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 -fPIC -I/home/lida/include -I/home/lida/jdk/include -march=native -O3
Using Fortran compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O -I/home/lida/include -I/home/lida/jdk/include -march=native -O3
-----------------------------------------
Using include paths: -I/home/lida/include -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/include
-----------------------------------------
Using C linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc
Using Fortran linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90
Using libraries: -Wl,-rpath,/home/lida/lib -L/home/lida/lib -lpetsc -Wl,-rpath,/home/lida/lib64 -L/home/lida/lib64 -Wl,-rpath,/home/lida/lib -L/home/lida/lib -Wl,-rpath,/home/lida/jdk/lib -L/home/lida/jdk/lib -Wl,-rpath,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -Wl,-rpath,/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -L/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -Wl,-rpath,/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -L/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib -lopenblas -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl
-----------------------------------------
real 0m1.711s
user 0m0.368s
sys 0m0.259s
[lida at head1 tutorials]$
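The -ksp_view output above shows the solver actually used on one process: GMRES with restart 30 and classical Gram-Schmidt, relative tolerance 1e-5, left preconditioning, and an ICC(0) factorization with a Manteuffel shift. Below is a minimal sketch of configuring the same solver in code, on a small hypothetical tridiagonal test matrix rather than the ex5 problem (note that ICC is a sequential preconditioner, so run this on one process or override -pc_type for parallel runs).

/* Sketch: GMRES(30) + ICC(0), as reported by -ksp_view above.
 * Hypothetical test problem, not the ex5 source. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat         A;
  Vec         x, b;
  KSP         ksp;
  PC          pc;
  PetscInt    i, n = 6, col[3];
  PetscScalar v[3];

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Assemble a small 1-D Laplacian-like tridiagonal matrix. */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));
  for (i = 0; i < n; i++) {
    PetscInt ncols = 0;
    if (i > 0)     { col[ncols] = i - 1; v[ncols] = -1.0; ncols++; }
    col[ncols] = i; v[ncols] = 2.0; ncols++;
    if (i < n - 1) { col[ncols] = i + 1; v[ncols] = -1.0; ncols++; }
    PetscCall(MatSetValues(A, 1, &i, ncols, col, v, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.0));

  /* GMRES(30) with ICC(0) and rtol 1e-5, matching the -ksp_view report. */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPGMRES));
  PetscCall(KSPGMRESSetRestart(ksp, 30));
  PetscCall(KSPSetTolerances(ksp, 1.e-5, 1.e-50, 1.e4, 10000));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCICC));
  PetscCall(KSPSetFromOptions(ksp));  /* command-line options still override these settings */
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(MatDestroy(&A));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(PetscFinalize());
  return 0;
}

Compiled as, say, ./sketch, it can be run with the same options used above (-ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view) to produce the same kind of monitoring and profiling output as in these attachments.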