[petsc-users] How to speed up geometric multigrid
Michele Rosso
mrosso at uci.edu
Tue Oct 1 21:02:28 CDT 2013
Barry,
I repeated the previous runs since I noticed that they were not using the
option
-mg_levels_ksp_max_it 3
They are faster than before now, but still slower than my initial test,
and in any case the solution time increases considerably over time.
I attached the diagnostics (run1.txt and run2.txt; please see the files
for the list of the options I used).
I also ran a case using your last proposed options (run3.txt): there is
a divergence condition, since 30 iterations do not seem to be enough to
lower the error below the needed tolerance, and thus after some time
steps my solution blows up.
Please let me know what you think about it.
In the meantime I will try to run my initial test with the option
-mg_levels_ksp_max_it 3 instead of -mg_levels_ksp_max_it 1.
As usual, thank you very much.
Michele
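
For reference, here is a minimal sketch (in C, assuming PETSc 3.4.x as in the attached logs) of how the tolerances and the multigrid options being tested here take effect from code; the operator setup is elided and the numbers simply mirror the runs discussed above:

    #include <petscksp.h>

    int main(int argc, char **argv)
    {
      KSP ksp;

      PetscInitialize(&argc, &argv, NULL, NULL);
      KSPCreate(PETSC_COMM_WORLD, &ksp);
      /* Operator/DMDA setup and KSPSetOperators() elided. */
      /* rtol = 1e-3 and atol = 1e-9 as discussed, at most 30 outer iterations. */
      KSPSetTolerances(ksp, 1.e-3, 1.e-9, PETSC_DEFAULT, 30);
      /* Picks up -pc_type mg, -pc_mg_galerkin, -mg_levels_ksp_max_it 3, etc.
         from the command line or the options database. */
      KSPSetFromOptions(ksp);
      /* KSPSolve(ksp, b, x) would go here once the operators are set. */
      KSPDestroy(&ksp);
      PetscFinalize();
      return 0;
    }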
On 09/30/2013 07:17 PM, Barry Smith wrote:
> I wasn't expecting this. Try
>
> -pc_mg_type full -ksp_type richardson -mg_levels_pc_type bjacobi -mg_levels_ksp_type gmres -mg_levels_ksp_max_it 3
> -mg_coarse_pc_factor_mat_solver_package superlu_dist -mg_coarse_pc_type lu -pc_mg_galerkin -pc_mg_levels 5 -pc_type mg
> -log_summary -pc_mg_log -ksp_monitor_true_residual -options_left -ksp_view -ksp_max_it 30
>
>
>
>
> On Sep 30, 2013, at 8:00 PM, Michele Rosso <mrosso at uci.edu> wrote:
>
>> Barry,
>>
>> sorry again for the very late answer. I tried all the variations you proposed: all of them converge very slowly, except the last one (CG instead of fgmres), which diverges.
>> I attached the diagnostics for the two options that converge: each of the attached files starts with a list of the options I used for the run.
>> As you pointed out earlier, the residual I used to require was too small, therefore I increased atol to 1e-9.
>> After some tests, I noticed that any further increase of the absolute tolerance changes the solution significantly.
>> What would you suggest to try next?
>> Thank you very much,
>>
>> Michele
>>
>>
>>
>>
>>
>>
>> On 09/24/2013 05:08 PM, Barry Smith wrote:
>>> Thanks. The balance of work on the different levels and across processes looks ok. So it is a matter of improving the convergence rate.
>>>
>>> The initial residual norm is very small. Are you sure you need to decrease it to 10^-12 ????
>>>
>>> Start with a really robust multigrid smoother; use
>>>
>>> -pc_mg_type full -ksp_type fgmres -mg_levels_pc_type bjacobi -mg_levels_ksp_type gmres -mg_levels_ksp_max_it 3 PLUS -mg_coarse_pc_factor_mat_solver_package superlu_dist
>>> -mg_coarse_pc_type lu -pc_mg_galerkin -pc_mg_levels 5 -pc_mg_log -pc_type mg
>>>
>>> run with the -log_summary and -pc_mg_log
>>>
>>> Now back off a little on the smoother and use -mg_levels_pc_type sor instead; how does that change the convergence and time?
>>>
>>> Back off even more and replace the -ksp_type fgmres with -ksp_type cg and the -mg_levels_ksp_type gmres with -mg_levels_ksp_type richardson; how does that change the convergence and the time?
>>>
>>> There are some additional variants we can try based on the results from above.
>>>
>>> Barry
>>>
>>>
>>>
>>> On Sep 24, 2013, at 4:29 PM, Michele Rosso <mrosso at uci.edu> wrote:
>>>
>>>> Barry,
>>>>
>>>> I re-ran the test case with the option -pc_mg_log as you suggested.
>>>> I attached the new output ("final_new.txt").
>>>> Thanks for your help.
>>>>
>>>> Michele
>>>>
>>>> On 09/23/2013 09:35 AM, Barry Smith wrote:
>>>>> Run with the additional option -pc_mg_log and send us the log file.
>>>>>
>>>>> Barry
>>>>>
>>>>> Maybe we should make this the default somehow.
>>>>>
>>>>>
>>>>> On Sep 23, 2013, at 10:55 AM, Michele Rosso <mrosso at uci.edu> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am successfully using PETSc to solve a 3D Poisson's equation with CG + MG. This equation arises from a projection algorithm for a multiphase incompressible flow simulation.
>>>>>> I set up the solver as suggested in a previous thread (title: "GAMG speed") and ran a test case (liquid droplet with surface tension falling under the effect of gravity in a quiescent fluid).
>>>>>> The solution of the Poisson equation via multigrid is correct, but it becomes progressively slower as the simulation progresses (I am performing successive solves) due to an increase in the number of iterations.
>>>>>> Since the solution of the Poisson equation is mission-critical, I need to speed it up as much as I can.
>>>>>> Could you please help me out with this?
>>>>>>
>>>>>> I run the test case with the following options:
>>>>>>
>>>>>> -pc_type mg -pc_mg_galerkin -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1
>>>>>> -mg_coarse_pc_type lu -mg_coarse_pc_factor_mat_solver_package superlu_dist
>>>>>> -log_summary -ksp_view -ksp_monitor_true_residual -options_left
>>>>>>
>>>>>> Please find the diagnostics for the final solve in the attached file "final.txt".
>>>>>> Thank you,
>>>>>>
>>>>>> Michele
>>>>>> <final.txt>
>>>> <final_new.txt>
>> <final1.txt><final2.txt>
>
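The two attachments below (run1.txt and run2.txt) use the same options except for the level smoother (-mg_levels_pc_type bjacobi versus sor). Below is a minimal sketch of how the same option sets could be selected programmatically before KSPSetFromOptions() is called, assuming PETSc 3.4's two-argument PetscOptionsSetValue(); the helper name set_mg_options is hypothetical:

    #include <petscsys.h>

    /* Populate the options database with the multigrid settings from run1/run2.
       Call before KSPSetFromOptions(). */
    static PetscErrorCode set_mg_options(PetscBool use_sor)
    {
      PetscOptionsSetValue("-pc_type", "mg");
      PetscOptionsSetValue("-pc_mg_type", "full");
      PetscOptionsSetValue("-pc_mg_galerkin", NULL);
      PetscOptionsSetValue("-pc_mg_levels", "5");
      PetscOptionsSetValue("-mg_coarse_pc_type", "lu");
      PetscOptionsSetValue("-mg_coarse_pc_factor_mat_solver_package", "superlu_dist");
      PetscOptionsSetValue("-mg_levels_ksp_type", "gmres");
      PetscOptionsSetValue("-mg_levels_ksp_max_it", "3");
      /* run2 uses SOR on the levels, run1 uses block Jacobi + ILU(0). */
      PetscOptionsSetValue("-mg_levels_pc_type", use_sor ? "sor" : "bjacobi");
      return 0;
    }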
-------------- next part --------------
OPTIONS:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
0 KSP unpreconditioned resid norm 1.036334906411e-06 true resid norm 1.036334906411e-06 ||r(i)||/||b|| 1.000000000000e+00
1 KSP unpreconditioned resid norm 7.265278566968e-07 true resid norm 7.265278566968e-07 ||r(i)||/||b|| 7.010550857663e-01
2 KSP unpreconditioned resid norm 5.582868896378e-07 true resid norm 5.582868896378e-07 ||r(i)||/||b|| 5.387128101002e-01
3 KSP unpreconditioned resid norm 3.634998226503e-07 true resid norm 3.634998226503e-07 ||r(i)||/||b|| 3.507551665022e-01
4 KSP unpreconditioned resid norm 3.158574836963e-07 true resid norm 3.158574836963e-07 ||r(i)||/||b|| 3.047832141350e-01
5 KSP unpreconditioned resid norm 2.777502658984e-07 true resid norm 2.777502658984e-07 ||r(i)||/||b|| 2.680120723331e-01
6 KSP unpreconditioned resid norm 2.284215345140e-07 true resid norm 2.284215345140e-07 ||r(i)||/||b|| 2.204128540889e-01
7 KSP unpreconditioned resid norm 1.969414140052e-07 true resid norm 1.969414140052e-07 ||r(i)||/||b|| 1.900364571210e-01
8 KSP unpreconditioned resid norm 1.782653889853e-07 true resid norm 1.782653889853e-07 ||r(i)||/||b|| 1.720152316423e-01
9 KSP unpreconditioned resid norm 1.536397911477e-07 true resid norm 1.536397911477e-07 ||r(i)||/||b|| 1.482530311362e-01
10 KSP unpreconditioned resid norm 1.302441466920e-07 true resid norm 1.302441466920e-07 ||r(i)||/||b|| 1.256776606542e-01
11 KSP unpreconditioned resid norm 1.142838199081e-07 true resid norm 1.142838199081e-07 ||r(i)||/||b|| 1.102769183988e-01
12 KSP unpreconditioned resid norm 1.031156644411e-07 true resid norm 1.031156644411e-07 ||r(i)||/||b|| 9.950032928847e-02
13 KSP unpreconditioned resid norm 9.027533291890e-08 true resid norm 9.027533291890e-08 ||r(i)||/||b|| 8.711019223651e-02
14 KSP unpreconditioned resid norm 7.493550605983e-08 true resid norm 7.493550605983e-08 ||r(i)||/||b|| 7.230819457710e-02
15 KSP unpreconditioned resid norm 6.742982059332e-08 true resid norm 6.742982059333e-08 ||r(i)||/||b|| 6.506566571884e-02
16 KSP unpreconditioned resid norm 5.379988711018e-08 true resid norm 5.379988711019e-08 ||r(i)||/||b|| 5.191361091608e-02
17 KSP unpreconditioned resid norm 4.872990226442e-08 true resid norm 4.872990226444e-08 ||r(i)||/||b|| 4.702138465372e-02
18 KSP unpreconditioned resid norm 4.543576707229e-08 true resid norm 4.543576707230e-08 ||r(i)||/||b|| 4.384274503465e-02
19 KSP unpreconditioned resid norm 4.243835036633e-08 true resid norm 4.243835036635e-08 ||r(i)||/||b|| 4.095042066401e-02
20 KSP unpreconditioned resid norm 3.855396651833e-08 true resid norm 3.855396651834e-08 ||r(i)||/||b|| 3.720222707913e-02
21 KSP unpreconditioned resid norm 3.540838965507e-08 true resid norm 3.540838965509e-08 ||r(i)||/||b|| 3.416693718994e-02
22 KSP unpreconditioned resid norm 3.114021467696e-08 true resid norm 3.114021467699e-08 ||r(i)||/||b|| 3.004840856401e-02
23 KSP unpreconditioned resid norm 2.687679086480e-08 true resid norm 2.687679086485e-08 ||r(i)||/||b|| 2.593446452356e-02
24 KSP unpreconditioned resid norm 2.231320925522e-08 true resid norm 2.231320925524e-08 ||r(i)||/||b|| 2.153088650898e-02
25 KSP unpreconditioned resid norm 1.849367046766e-08 true resid norm 1.849367046769e-08 ||r(i)||/||b|| 1.784526445387e-02
26 KSP unpreconditioned resid norm 1.597347720732e-08 true resid norm 1.597347720735e-08 ||r(i)||/||b|| 1.541343161224e-02
27 KSP unpreconditioned resid norm 1.351813033069e-08 true resid norm 1.351813033073e-08 ||r(i)||/||b|| 1.304417157726e-02
28 KSP unpreconditioned resid norm 1.135895547453e-08 true resid norm 1.135895547456e-08 ||r(i)||/||b|| 1.096069948458e-02
29 KSP unpreconditioned resid norm 9.644960881002e-09 true resid norm 9.644960881027e-09 ||r(i)||/||b|| 9.306799202997e-03
30 KSP unpreconditioned resid norm 8.454149815651e-09 true resid norm 8.454149815651e-09 ||r(i)||/||b|| 8.157739127910e-03
31 KSP unpreconditioned resid norm 7.380097753084e-09 true resid norm 7.380097753084e-09 ||r(i)||/||b|| 7.121344371812e-03
32 KSP unpreconditioned resid norm 6.949063499474e-09 true resid norm 6.949063499474e-09 ||r(i)||/||b|| 6.705422596965e-03
33 KSP unpreconditioned resid norm 6.732114039970e-09 true resid norm 6.732114039970e-09 ||r(i)||/||b|| 6.496079595816e-03
34 KSP unpreconditioned resid norm 5.348043445752e-09 true resid norm 5.348043445752e-09 ||r(i)||/||b|| 5.160535858308e-03
35 KSP unpreconditioned resid norm 4.753111075163e-09 true resid norm 4.753111075163e-09 ||r(i)||/||b|| 4.586462393342e-03
36 KSP unpreconditioned resid norm 4.053751219961e-09 true resid norm 4.053751219961e-09 ||r(i)||/||b|| 3.911622772602e-03
37 KSP unpreconditioned resid norm 3.648046750367e-09 true resid norm 3.648046750367e-09 ||r(i)||/||b|| 3.520142694990e-03
38 KSP unpreconditioned resid norm 3.154002693751e-09 true resid norm 3.154002693751e-09 ||r(i)||/||b|| 3.043420301911e-03
39 KSP unpreconditioned resid norm 2.796711364093e-09 true resid norm 2.796711364093e-09 ||r(i)||/||b|| 2.698655952618e-03
40 KSP unpreconditioned resid norm 2.490859697022e-09 true resid norm 2.490859697022e-09 ||r(i)||/||b|| 2.403527741479e-03
41 KSP unpreconditioned resid norm 2.214978925969e-09 true resid norm 2.214978925969e-09 ||r(i)||/||b|| 2.137319617689e-03
42 KSP unpreconditioned resid norm 2.062308903139e-09 true resid norm 2.062308903139e-09 ||r(i)||/||b|| 1.990002353853e-03
43 KSP unpreconditioned resid norm 1.893218419438e-09 true resid norm 1.893218419438e-09 ||r(i)||/||b|| 1.826840346422e-03
44 KSP unpreconditioned resid norm 1.633410997565e-09 true resid norm 1.633410997565e-09 ||r(i)||/||b|| 1.576142024610e-03
45 KSP unpreconditioned resid norm 1.497658367438e-09 true resid norm 1.497658367438e-09 ||r(i)||/||b|| 1.445149013290e-03
46 KSP unpreconditioned resid norm 1.384720917112e-09 true resid norm 1.384720917112e-09 ||r(i)||/||b|| 1.336171259450e-03
47 KSP unpreconditioned resid norm 1.149245204430e-09 true resid norm 1.149245204429e-09 ||r(i)||/||b|| 1.108951553517e-03
48 KSP unpreconditioned resid norm 1.044541114051e-09 true resid norm 1.044541114051e-09 ||r(i)||/||b|| 1.007918490045e-03
49 KSP unpreconditioned resid norm 8.702453784205e-10 true resid norm 8.702453784202e-10 ||r(i)||/||b|| 8.397337318626e-04
KSP Object: 128 MPI processes
type: fgmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=0.001, absolute=1e-50, divergence=10000
right preconditioning
has attached null space
using UNPRECONDITIONED norm type for convergence test
PC Object: 128 MPI processes
type: mg
MG: type is FULL, levels=5 cycles=v
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 128 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 128 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 0, needed 0
Factored matrix follows:
Matrix Object: 128 MPI processes
type: mpiaij
rows=1024, cols=1024
package used to perform factorization: superlu_dist
total: nonzeros=0, allocated nonzeros=0
total number of mallocs used during MatSetValues calls =0
SuperLU_DIST run parameters:
Process grid nprow 16 x npcol 8
Equilibrate matrix TRUE
Matrix input mode 1
Replace tiny pivots TRUE
Use iterative refinement FALSE
Processors in row 16 col partition 8
Row permutation LargeDiag
Column permutation METIS_AT_PLUS_A
Parallel symbolic factorization FALSE
Repeated factorization SamePattern_SameRowPerm
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=1024, cols=1024
total: nonzeros=27648, allocated nonzeros=27648
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_1_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_1_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=64, cols=64
package used to perform factorization: petsc
total: nonzeros=768, allocated nonzeros=768
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 16 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=64, cols=64
total: nonzeros=768, allocated nonzeros=768
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 16 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=8192, cols=8192
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 16 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_2_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_2_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=512, cols=512
package used to perform factorization: petsc
total: nonzeros=9600, allocated nonzeros=9600
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=512, cols=512
total: nonzeros=9600, allocated nonzeros=9600
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=65536, cols=65536
total: nonzeros=1769472, allocated nonzeros=1769472
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_3_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_3_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_3_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=4096, cols=4096
package used to perform factorization: petsc
total: nonzeros=92928, allocated nonzeros=92928
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=4096, cols=4096
total: nonzeros=92928, allocated nonzeros=92928
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=524288, cols=524288
total: nonzeros=14155776, allocated nonzeros=14155776
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_4_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_4_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_4_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=32768, cols=32768
package used to perform factorization: petsc
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=32768, cols=32768
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=4194304, cols=4194304
total: nonzeros=29360128, allocated nonzeros=29360128
total number of mallocs used during MatSetValues calls =0
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=4194304, cols=4194304
total: nonzeros=29360128, allocated nonzeros=29360128
total number of mallocs used during MatSetValues calls =0
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./hit on a interlagos-64idx-pgi-opt named nid12058 with 128 processors, by Unknown Tue Oct 1 12:47:42 2013
Using Petsc Release Version 3.4.2, Jul, 02, 2013
Max Max/Min Avg Total
Time (sec): 5.564e+01 1.00031 5.563e+01
Objects: 1.344e+03 1.00000 1.344e+03
Flops: 7.519e+09 1.00000 7.519e+09 9.624e+11
Flops/sec: 1.352e+08 1.00031 1.352e+08 1.730e+10
MPI Messages: 2.584e+05 1.09854 2.354e+05 3.014e+07
MPI Message Lengths: 4.162e+08 1.00022 1.767e+03 5.326e+10
MPI Reductions: 4.880e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 1.4617e+01 26.3% 1.3758e+11 14.3% 4.615e+05 1.5% 2.117e+02 12.0% 3.681e+03 7.5%
1: MG Apply: 4.1013e+01 73.7% 8.2479e+11 85.7% 2.968e+07 98.5% 1.556e+03 88.0% 4.512e+04 92.5%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %f - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecMDot 322 1.0 6.3548e-01 1.1 2.20e+08 1.0 0.0e+00 0.0e+00 3.2e+02 1 3 0 0 1 4 20 0 0 9 44353
VecNorm 786 1.0 2.7858e-01 2.9 5.15e+07 1.0 0.0e+00 0.0e+00 7.9e+02 0 1 0 0 2 1 5 0 0 21 23668
VecScale 372 1.0 1.7262e-02 1.1 1.22e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 90388
VecCopy 414 1.0 5.8294e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 528 1.0 8.1342e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
VecAXPY 368 1.0 6.1460e-02 1.3 2.41e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 50228
VecAYPX 368 1.0 4.9858e-02 1.5 1.21e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 30958
VecWAXPY 4 1.0 1.1320e-03 2.1 1.31e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14821
VecMAXPY 690 1.0 1.0875e+00 1.2 4.54e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 6 0 0 0 7 42 0 0 0 53393
VecScatterBegin 790 1.0 1.3550e-01 1.3 0.00e+00 0.0 3.8e+05 1.6e+04 0.0e+00 0 0 1 12 0 1 0 82 97 0 0
VecScatterEnd 790 1.0 3.5084e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0
MatMult 694 1.0 2.4980e+00 1.1 2.96e+08 1.0 3.6e+05 1.6e+04 0.0e+00 4 4 1 11 0 16 28 77 91 0 15149
MatMultTranspose 4 1.0 2.2562e-03 1.1 2.53e+05 1.0 1.5e+03 9.9e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 14338
MatLUFactorSym 1 1.0 5.0211e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 1.2263e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatAssemblyBegin 63 1.0 2.0496e-01 8.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 0 1 0 0 0 3 0
MatAssemblyEnd 63 1.0 2.0343e-01 1.1 0.00e+00 0.0 1.2e+04 1.1e+03 7.2e+01 0 0 0 0 0 1 0 3 0 2 0
MatGetRowIJ 1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.0981e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 690 2.1 2.1315e-01 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+02 0 0 0 0 1 1 0 0 0 9 0
MatPtAP 4 1.0 2.0783e-01 1.0 5.11e+06 1.0 2.5e+04 6.0e+03 1.0e+02 0 0 0 0 0 1 0 5 2 3 3144
MatPtAPSymbolic 4 1.0 1.4338e-01 1.0 0.00e+00 0.0 1.5e+04 7.8e+03 6.0e+01 0 0 0 0 0 1 0 3 2 2 0
MatPtAPNumeric 4 1.0 7.0436e-02 1.1 5.11e+06 1.0 9.7e+03 3.1e+03 4.0e+01 0 0 0 0 0 0 0 2 0 1 9277
MatGetLocalMat 4 1.0 2.3359e-02 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 4 1.0 3.0560e-02 3.2 0.00e+00 0.0 1.1e+04 8.4e+03 8.0e+00 0 0 0 0 0 0 0 2 1 0 0
MatGetSymTrans 8 1.0 1.0388e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 322 1.0 1.0764e+00 1.1 4.40e+08 1.0 0.0e+00 0.0e+00 3.2e+02 2 6 0 0 1 7 41 0 0 9 52372
KSPSetUp 6 1.0 3.7848e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+01 0 0 0 0 0 0 0 0 0 2 0
Warning -- total time of even greater than time of entire stage -- something is wrong with the timer
KSPSolve 46 1.0 4.6268e+01 1.0 7.52e+09 1.0 3.0e+07 1.8e+03 4.8e+04 83100100 99 99 31670065158291309 20800
PCSetUp 1 1.0 4.2101e-01 1.0 5.36e+06 1.0 3.4e+04 4.6e+03 3.3e+02 1 0 0 0 1 3 0 7 2 9 1629
Warning -- total time of even greater than time of entire stage -- something is wrong with the timer
PCApply 322 1.0 4.1021e+01 1.0 6.44e+09 1.0 3.0e+07 1.6e+03 4.5e+04 74 86 98 88 92 28160064317351226 20106
MGSetup Level 0 1 1.0 1.2542e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 1 0 0 0 0 0
MGSetup Level 1 1 1.0 2.4819e-03 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 2 1 1.0 5.2500e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 3 1 1.0 5.1689e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 4 1 1.0 1.1669e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
--- Event Stage 1: MG Apply
VecMDot 19320 1.0 2.9361e+00 1.6 3.30e+08 1.0 0.0e+00 0.0e+00 1.9e+04 4 4 0 0 40 6 5 0 0 43 14401
VecNorm 25760 1.0 1.5193e+00 1.5 2.20e+08 1.0 0.0e+00 0.0e+00 2.6e+04 2 3 0 0 53 3 3 0 0 57 18556
VecScale 25760 1.0 1.7513e-01 1.1 1.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 80490
VecCopy 8050 1.0 2.0040e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 46368 1.0 5.5128e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
VecAXPY 10948 1.0 1.9405e-01 1.2 1.07e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 70388
VecAYPX 3220 1.0 5.9333e-02 1.1 1.38e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 29698
VecMAXPY 25760 1.0 5.2558e-01 1.1 4.96e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 8 0 0 0 120695
VecScatterBegin 36064 1.0 2.7229e+00 2.4 0.00e+00 0.0 3.0e+07 1.6e+03 0.0e+00 4 0 98 88 0 6 0100100 0 0
VecScatterEnd 36064 1.0 2.0120e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0
VecNormalize 25760 1.0 1.7211e+00 1.4 3.30e+08 1.0 0.0e+00 0.0e+00 2.6e+04 3 4 0 0 53 4 5 0 0 57 24571
MatMult 28336 1.0 1.9003e+01 1.1 2.75e+09 1.0 2.7e+07 1.7e+03 0.0e+00 33 37 89 85 0 45 43 90 96 0 18501
MatMultAdd 3220 1.0 7.4839e-01 1.2 9.29e+07 1.0 1.2e+06 5.3e+02 0.0e+00 1 1 4 1 0 2 1 4 1 0 15893
MatMultTranspose 4508 1.0 1.4513e+00 1.2 1.74e+08 1.0 1.7e+06 6.6e+02 0.0e+00 2 2 6 2 0 3 3 6 2 0 15372
MatSolve 27370 1.0 1.4336e+01 1.1 2.15e+09 1.0 0.0e+00 0.0e+00 0.0e+00 25 29 0 0 0 34 33 0 0 0 19206
MatLUFactorNum 4 1.0 1.0956e-02 1.2 1.86e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 21770
MatILUFactorSym 4 1.0 5.2656e-02 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 4 1.0 1.0967e-05 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 4 1.0 5.1585e-0230.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 19320 1.0 3.2936e+00 1.5 6.61e+08 1.0 0.0e+00 0.0e+00 1.9e+04 5 9 0 0 40 7 10 0 0 43 25678
KSPSetUp 4 1.0 5.0068e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 8050 1.0 3.6157e+01 1.0 5.79e+09 1.0 2.3e+07 1.7e+03 4.5e+04 65 77 77 74 92 88 90 78 84100 20482
PCSetUp 4 1.0 7.6262e-02 2.3 1.86e+06 1.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 3127
PCSetUpOnBlocks 6440 1.0 8.1504e-02 2.1 1.86e+06 1.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 2926
PCApply 27370 1.0 1.5573e+01 1.1 2.15e+09 1.0 0.0e+00 0.0e+00 0.0e+00 27 29 0 0 0 37 33 0 0 0 17681
MGSmooth Level 0 1610 1.0 2.2521e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 5 0 0 0 0 0
MGSmooth Level 1 2576 1.0 1.4212e+00 1.0 5.43e+07 1.0 9.6e+06 1.9e+02 1.8e+04 3 1 32 3 37 3 1 32 4 40 4890
MGResid Level 1 1288 1.0 7.1253e-02 1.3 4.45e+06 1.0 1.3e+06 1.9e+02 0.0e+00 0 0 4 0 0 0 0 4 1 0 7996
MGInterp Level 1 3220 1.0 1.6670e-01 3.5 1.37e+06 1.0 1.2e+06 6.4e+01 0.0e+00 0 0 4 0 0 0 0 4 0 0 1052
MGSmooth Level 2 1932 1.0 2.2821e+00 1.0 3.82e+08 1.0 7.3e+06 6.4e+02 1.4e+04 4 5 24 9 28 5 6 24 10 30 21402
MGResid Level 2 966 1.0 1.4905e-01 1.5 2.67e+07 1.0 9.9e+05 6.4e+02 0.0e+00 0 0 3 1 0 0 0 3 1 0 22936
MGInterp Level 2 2576 1.0 1.8992e-01 2.3 8.74e+06 1.0 9.9e+05 2.1e+02 0.0e+00 0 0 3 0 0 0 0 3 0 0 5889
MGSmooth Level 3 1288 1.0 1.2010e+01 1.0 2.23e+09 1.0 4.9e+06 2.3e+03 9.0e+03 22 30 16 21 18 29 35 17 24 20 23726
MGResid Level 3 644 1.0 8.5489e-01 1.1 1.42e+08 1.0 6.6e+05 2.3e+03 0.0e+00 1 2 2 3 0 2 2 2 3 0 21327
MGInterp Level 3 1932 1.0 4.4246e-01 1.4 5.21e+07 1.0 7.4e+05 7.7e+02 0.0e+00 1 1 2 1 0 1 1 2 1 0 15071
MGSmooth Level 4 644 1.0 1.8477e+01 1.0 3.12e+09 1.0 1.3e+06 1.6e+04 4.5e+03 33 42 4 41 9 45 48 4 46 10 21640
MGResid Level 4 322 1.0 1.1910e+00 1.1 1.48e+08 1.0 1.6e+05 1.6e+04 0.0e+00 2 2 1 5 0 3 2 1 6 0 15876
MGInterp Level 4 1288 1.0 2.1723e+00 1.1 2.74e+08 1.0 4.9e+05 2.9e+03 0.0e+00 4 4 2 3 0 5 4 2 3 0 16165
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 979 991 247462152 0
Vector Scatter 19 19 22572 0
Matrix 38 42 19508928 0
Matrix Null Space 1 1 652 0
Distributed Mesh 5 5 830792 0
Bipartite Graph 10 10 8560 0
Index Set 47 59 844496 0
IS L to G Mapping 5 5 405756 0
Krylov Solver 11 11 102272 0
DMKSP interface 3 3 2088 0
Preconditioner 11 11 11864 0
Viewer 185 184 144256 0
--- Event Stage 1: MG Apply
Vector 12 0 0 0
Matrix 4 0 0 0
Index Set 14 2 1792 0
========================================================================================================================
Average time to get PetscTime(): 2.14577e-07
Average time for MPI_Barrier(): 0.000448608
Average time for zero size MPI_Send(): 2.44565e-06
#PETSc Option Table entries:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure run at: Wed Aug 28 23:25:43 2013
Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=0 --known-mpi-c-double-complex=0 --with-batch="1 " --known-mpi-shared="0 " --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/pgi64/lib -lacml" --COPTFLAGS="-O3 -fastsse" --FOPTFLAGS="-O3 -fastsse" --CXXOPTFLAGS="-O3 -fastsse" --with-x="0 " --with-debugging="0 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries=0 --with-dynamic-loading=0 --with-mpi-compilers="1 " --known-mpi-shared-libraries=0 --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --with-cc=cc --with-cxx=CC --with-fc=ftn PETSC_ARCH=interlagos-64idx-pgi-opt
-----------------------------------------
Libraries compiled on Wed Aug 28 23:25:43 2013 on h2ologin3
Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2
Using PETSc arch: interlagos-64idx-pgi-opt
-----------------------------------------
Using C compiler: cc -O3 -fastsse ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn -O3 -fastsse ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/opt/cray/udreg/2.3.2-1.0402.7311.2.1.gem/include -I/opt/cray/ugni/5.0-1.0402.7128.7.6.gem/include -I/opt/cray/pmi/4.0.1-1.0000.9421.73.3.gem/include -I/opt/cray/dmapp/4.0.1-1.0402.7439.5.1.gem/include -I/opt/cray/gni-headers/2.1-1.0402.7082.6.2.gem/include -I/opt/cray/xpmem/0.1-2.0402.44035.2.1.gem/include -I/opt/cray/rca/1.0.0-2.0402.42153.2.106.gem/include -I/opt/cray-hss-devel/7.0.0/include -I/opt/cray/krca/1.0.0-2.0402.42157.2.94.gem/include -I/opt/cray/mpt/6.0.1/gni/mpich2-pgi/121/include -I/opt/acml/5.3.0/pgi64_fma4/include -I/opt/cray/libsci/12.1.01/pgi/121/interlagos/include -I/opt/fftw/3.3.0.3/interlagos/include -I/usr/include/alps -I/opt/pgi/13.6.0/linux86-64/13.6/include -I/opt/cray/xe-sysroot/4.2.24/usr/include
-----------------------------------------
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lsuperlu_dist_3.3 -L/opt/acml/5.3.0/pgi64/lib -lacml -lpthread -lparmetis -lmetis -ldl
-----------------------------------------
#PETSc Option Table entries:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
There are no unused options.
-------------- next part --------------
OPTIONS:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type sor
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
0 KSP unpreconditioned resid norm 5.609297891476e-07 true resid norm 5.609297891476e-07 ||r(i)||/||b|| 1.000000000000e+00
1 KSP unpreconditioned resid norm 2.581822560386e-07 true resid norm 2.581822560386e-07 ||r(i)||/||b|| 4.602755300819e-01
2 KSP unpreconditioned resid norm 2.428085555532e-07 true resid norm 2.428085555532e-07 ||r(i)||/||b|| 4.328679992592e-01
3 KSP unpreconditioned resid norm 2.335657385482e-07 true resid norm 2.335657385482e-07 ||r(i)||/||b|| 4.163903273940e-01
4 KSP unpreconditioned resid norm 2.234855409323e-07 true resid norm 2.234855409323e-07 ||r(i)||/||b|| 3.984198116345e-01
5 KSP unpreconditioned resid norm 2.078097503833e-07 true resid norm 2.078097503833e-07 ||r(i)||/||b|| 3.704737284484e-01
6 KSP unpreconditioned resid norm 2.060299595561e-07 true resid norm 2.060299595561e-07 ||r(i)||/||b|| 3.673007986779e-01
7 KSP unpreconditioned resid norm 1.304371586686e-07 true resid norm 1.304371586686e-07 ||r(i)||/||b|| 2.325374069842e-01
8 KSP unpreconditioned resid norm 1.282069723612e-07 true resid norm 1.282069723612e-07 ||r(i)||/||b|| 2.285615327294e-01
9 KSP unpreconditioned resid norm 8.994010091283e-08 true resid norm 8.994010091283e-08 ||r(i)||/||b|| 1.603411026708e-01
10 KSP unpreconditioned resid norm 7.624200700050e-08 true resid norm 7.624200700050e-08 ||r(i)||/||b|| 1.359207666905e-01
11 KSP unpreconditioned resid norm 7.614198614178e-08 true resid norm 7.614198614178e-08 ||r(i)||/||b|| 1.357424540734e-01
12 KSP unpreconditioned resid norm 6.892442767094e-08 true resid norm 6.892442767094e-08 ||r(i)||/||b|| 1.228753205917e-01
13 KSP unpreconditioned resid norm 6.891270426886e-08 true resid norm 6.891270426886e-08 ||r(i)||/||b|| 1.228544206461e-01
14 KSP unpreconditioned resid norm 5.299112064401e-08 true resid norm 5.299112064401e-08 ||r(i)||/||b|| 9.447014879446e-02
15 KSP unpreconditioned resid norm 4.214889484631e-08 true resid norm 4.214889484631e-08 ||r(i)||/||b|| 7.514112400835e-02
16 KSP unpreconditioned resid norm 2.789939104957e-08 true resid norm 2.789939104957e-08 ||r(i)||/||b|| 4.973775967928e-02
17 KSP unpreconditioned resid norm 2.786722600854e-08 true resid norm 2.786722600854e-08 ||r(i)||/||b|| 4.968041731371e-02
18 KSP unpreconditioned resid norm 2.457366893493e-08 true resid norm 2.457366893493e-08 ||r(i)||/||b|| 4.380881424085e-02
19 KSP unpreconditioned resid norm 2.430122634853e-08 true resid norm 2.430122634853e-08 ||r(i)||/||b|| 4.332311604534e-02
20 KSP unpreconditioned resid norm 1.694910033682e-08 true resid norm 1.694910033683e-08 ||r(i)||/||b|| 3.021608169997e-02
21 KSP unpreconditioned resid norm 1.383837294163e-08 true resid norm 1.383837294164e-08 ||r(i)||/||b|| 2.467041902458e-02
22 KSP unpreconditioned resid norm 1.156264774412e-08 true resid norm 1.156264774414e-08 ||r(i)||/||b|| 2.061336011715e-02
23 KSP unpreconditioned resid norm 7.091619620968e-09 true resid norm 7.091619620978e-09 ||r(i)||/||b|| 1.264261545416e-02
24 KSP unpreconditioned resid norm 7.065272914870e-09 true resid norm 7.065272914878e-09 ||r(i)||/||b|| 1.259564575027e-02
25 KSP unpreconditioned resid norm 6.618876503623e-09 true resid norm 6.618876503634e-09 ||r(i)||/||b|| 1.179983062353e-02
26 KSP unpreconditioned resid norm 6.462504586080e-09 true resid norm 6.462504586091e-09 ||r(i)||/||b|| 1.152105791335e-02
27 KSP unpreconditioned resid norm 6.440405749487e-09 true resid norm 6.440405749498e-09 ||r(i)||/||b|| 1.148166111713e-02
28 KSP unpreconditioned resid norm 5.744119795026e-09 true resid norm 5.744119795044e-09 ||r(i)||/||b|| 1.024035432986e-02
29 KSP unpreconditioned resid norm 2.587935833329e-09 true resid norm 2.587935833348e-09 ||r(i)||/||b|| 4.613653764548e-03
30 KSP unpreconditioned resid norm 7.019604467372e-10 true resid norm 7.019604467372e-10 ||r(i)||/||b|| 1.251423012859e-03
31 KSP unpreconditioned resid norm 7.001655404576e-10 true resid norm 7.001655404576e-10 ||r(i)||/||b|| 1.248223135950e-03
32 KSP unpreconditioned resid norm 6.876834537207e-10 true resid norm 6.876834537207e-10 ||r(i)||/||b|| 1.225970642004e-03
33 KSP unpreconditioned resid norm 6.378517201634e-10 true resid norm 6.378517201634e-10 ||r(i)||/||b|| 1.137132904160e-03
34 KSP unpreconditioned resid norm 5.450153316415e-10 true resid norm 5.450153316415e-10 ||r(i)||/||b|| 9.716284322674e-04
KSP Object: 128 MPI processes
type: fgmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=0.001, absolute=1e-50, divergence=10000
right preconditioning
has attached null space
using UNPRECONDITIONED norm type for convergence test
PC Object: 128 MPI processes
type: mg
MG: type is FULL, levels=5 cycles=v
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 128 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 128 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 0, needed 0
Factored matrix follows:
Matrix Object: 128 MPI processes
type: mpiaij
rows=1024, cols=1024
package used to perform factorization: superlu_dist
total: nonzeros=0, allocated nonzeros=0
total number of mallocs used during MatSetValues calls =0
SuperLU_DIST run parameters:
Process grid nprow 16 x npcol 8
Equilibrate matrix TRUE
Matrix input mode 1
Replace tiny pivots TRUE
Use iterative refinement FALSE
Processors in row 16 col partition 8
Row permutation LargeDiag
Column permutation METIS_AT_PLUS_A
Parallel symbolic factorization FALSE
Repeated factorization SamePattern_SameRowPerm
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=1024, cols=1024
total: nonzeros=27648, allocated nonzeros=27648
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 128 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=8192, cols=8192
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 16 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 128 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=65536, cols=65536
total: nonzeros=1769472, allocated nonzeros=1769472
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_3_) 128 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=524288, cols=524288
total: nonzeros=14155776, allocated nonzeros=14155776
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_4_) 128 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=4194304, cols=4194304
total: nonzeros=29360128, allocated nonzeros=29360128
total number of mallocs used during MatSetValues calls =0
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=4194304, cols=4194304
total: nonzeros=29360128, allocated nonzeros=29360128
total number of mallocs used during MatSetValues calls =0
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./hit on a interlagos-64idx-pgi-opt named nid12058 with 128 processors, by Unknown Tue Oct 1 13:05:12 2013
Using Petsc Release Version 3.4.2, Jul, 02, 2013
Max Max/Min Avg Total
Time (sec): 2.265e+02 1.00006 2.265e+02
Objects: 3.366e+03 1.00000 3.366e+03
Flops: 2.626e+10 1.00000 2.626e+10 3.361e+12
Flops/sec: 1.159e+08 1.00006 1.159e+08 1.484e+10
MPI Messages: 8.260e+05 1.00000 8.260e+05 1.057e+08
MPI Message Lengths: 1.464e+09 1.00000 1.773e+03 1.874e+11
MPI Reductions: 1.710e+05 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 7.1195e+01 31.4% 4.1158e+11 12.2% 1.404e+06 1.3% 2.136e+02 12.1% 1.244e+04 7.3%
1: MG Apply: 1.5530e+02 68.6% 2.9491e+12 87.8% 1.043e+08 98.7% 1.559e+03 87.9% 1.585e+05 92.7%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %f - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecMDot 1132 1.0 1.6774e+00 1.2 5.77e+08 1.0 0.0e+00 0.0e+00 1.1e+03 1 2 0 0 1 2 18 0 0 9 44049
VecNorm 2873 1.0 9.3535e-01 2.5 1.88e+08 1.0 0.0e+00 0.0e+00 2.9e+03 0 1 0 0 2 1 6 0 0 23 25766
VecScale 1339 1.0 6.0955e-02 1.0 4.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 92136
VecCopy 1534 1.0 2.0990e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 1958 1.0 3.0763e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 1333 1.0 2.2397e-01 1.3 8.74e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 3 0 0 0 49926
VecAYPX 1333 1.0 1.7871e-01 1.4 4.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 31285
VecWAXPY 6 1.0 1.6773e-03 1.9 1.97e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 15004
VecMAXPY 2465 1.0 2.9482e+00 1.2 1.22e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 4 38 0 0 0 52831
VecScatterBegin 2877 1.0 5.2887e-01 1.3 0.00e+00 0.0 1.4e+06 1.6e+04 0.0e+00 0 0 1 12 0 1 0 98 99 0 0
VecScatterEnd 2877 1.0 1.1966e+00 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatMult 2471 1.0 8.7551e+00 1.1 1.05e+09 1.0 1.3e+06 1.6e+04 0.0e+00 4 4 1 11 0 12 33 90 92 0 15389
MatMultTranspose 4 1.0 2.2500e-03 1.1 2.53e+05 1.0 1.5e+03 9.9e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 14377
MatLUFactorSym 1 1.0 5.0402e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 1.2829e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 218 1.0 8.1404e-0118.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.2e+02 0 0 0 0 0 1 0 0 0 3 0
MatAssemblyEnd 218 1.0 7.6480e-01 1.2 0.00e+00 0.0 1.2e+04 1.1e+03 7.2e+01 0 0 0 0 0 1 0 1 0 1 0
MatGetRowIJ 1 1.0 3.0994e-06 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.2173e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 1407 1.0 4.3674e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+03 0 0 0 0 1 1 0 0 0 11 0
MatPtAP 4 1.0 2.1245e-01 1.0 5.11e+06 1.0 2.5e+04 6.0e+03 1.0e+02 0 0 0 0 0 0 0 2 1 1 3076
MatPtAPSymbolic 4 1.0 1.4296e-01 1.0 0.00e+00 0.0 1.5e+04 7.8e+03 6.0e+01 0 0 0 0 0 0 0 1 1 0 0
MatPtAPNumeric 4 1.0 7.0840e-02 1.0 5.11e+06 1.0 9.7e+03 3.1e+03 4.0e+01 0 0 0 0 0 0 0 1 0 0 9224
MatGetLocalMat 4 1.0 2.2167e-02 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 4 1.0 2.9867e-02 3.1 0.00e+00 0.0 1.1e+04 8.4e+03 8.0e+00 0 0 0 0 0 0 0 1 0 0 0
MatGetSymTrans 8 1.0 9.5510e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 1132 1.0 2.8047e+00 1.1 1.15e+09 1.0 0.0e+00 0.0e+00 1.1e+03 1 4 0 0 1 4 36 0 0 9 52688
KSPSetUp 6 1.0 3.5489e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+01 0 0 0 0 0 0 0 0 0 0 0
Warning -- total time of even greater than time of entire stage -- something is wrong with the timer
KSPSolve 201 1.0 1.7101e+02 1.0 2.63e+10 1.0 1.1e+08 1.8e+03 1.7e+05 75100100 99 98 24081775248221352 19652
PCSetUp 1 1.0 4.2664e-01 1.0 5.36e+06 1.0 3.4e+04 4.6e+03 3.2e+02 0 0 0 0 0 1 0 2 1 3 1607
Warning -- total time of even greater than time of entire stage -- something is wrong with the timer
PCApply 1132 1.0 1.5532e+02 1.0 2.30e+10 1.0 1.0e+08 1.6e+03 1.6e+05 69 88 99 88 93 21871774317301274 18987
MGSetup Level 0 1 1.0 1.3110e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 1 1 1.0 1.7891e-03 7.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 2 1 1.0 2.5201e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 3 1 1.0 3.9697e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 4 1 1.0 1.0642e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
--- Event Stage 1: MG Apply
VecMDot 67920 1.0 1.0526e+01 1.5 1.16e+09 1.0 0.0e+00 0.0e+00 6.8e+04 4 4 0 0 40 6 5 0 0 43 14122
VecNorm 90560 1.0 5.5366e+00 1.4 7.74e+08 1.0 0.0e+00 0.0e+00 9.1e+04 2 3 0 0 53 3 3 0 0 57 17901
VecScale 90560 1.0 6.1576e-01 1.1 3.87e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 80480
VecCopy 28300 1.0 7.0869e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 72448 1.0 5.1643e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 38488 1.0 7.2360e-01 1.2 3.75e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 2 0 0 0 66359
VecAYPX 11320 1.0 2.0807e-01 1.1 4.84e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 29771
VecMAXPY 90560 1.0 1.7752e+00 1.1 1.74e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 8 0 0 0 125622
VecScatterBegin 126784 1.0 9.4371e+00 2.3 0.00e+00 0.0 1.0e+08 1.6e+03 0.0e+00 4 0 99 88 0 5 0100100 0 0
VecScatterEnd 126784 1.0 6.8194e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 4 0 0 0 0 0
VecNormalize 90560 1.0 6.2612e+00 1.4 1.16e+09 1.0 0.0e+00 0.0e+00 9.1e+04 2 4 0 0 53 4 5 0 0 57 23745
MatMult 99616 1.0 5.7795e+01 1.1 9.66e+09 1.0 9.4e+07 1.7e+03 0.0e+00 25 37 89 85 0 36 42 90 96 0 21385
MatMultAdd 11320 1.0 2.6260e+00 1.2 3.27e+08 1.0 4.3e+06 5.3e+02 0.0e+00 1 1 4 1 0 2 1 4 1 0 15923
MatMultTranspose 15848 1.0 4.8380e+00 1.1 6.13e+08 1.0 6.1e+06 6.6e+02 0.0e+00 2 2 6 2 0 3 3 6 2 0 16212
MatSolve 5660 1.0 7.7166e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 5 0 0 0 0 0
MatSOR 90560 1.0 6.6838e+01 1.0 7.96e+09 1.0 0.0e+00 0.0e+00 0.0e+00 29 30 0 0 0 42 35 0 0 0 15237
KSPGMRESOrthog 67920 1.0 1.1756e+01 1.4 2.32e+09 1.0 0.0e+00 0.0e+00 6.8e+04 4 9 0 0 40 7 10 0 0 43 25290
KSPSolve 28300 1.0 1.3913e+02 1.0 2.07e+10 1.0 8.1e+07 1.7e+03 1.6e+05 61 79 77 74 93 89 90 78 84100 19070
PCApply 96220 1.0 7.4543e+01 1.0 7.96e+09 1.0 0.0e+00 0.0e+00 0.0e+00 32 30 0 0 0 47 35 0 0 0 13662
MGSmooth Level 0 5660 1.0 7.8165e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 5 0 0 0 0 0
MGSmooth Level 1 9056 1.0 3.9531e+00 1.1 1.93e+08 1.0 3.4e+07 1.9e+02 6.3e+04 2 1 32 3 37 2 1 32 4 40 6255
MGResid Level 1 4528 1.0 2.5980e-01 1.4 1.56e+07 1.0 4.6e+06 1.9e+02 0.0e+00 0 0 4 0 0 0 0 4 1 0 7710
MGInterp Level 1 11320 1.0 8.6609e-01 5.1 4.82e+06 1.0 4.3e+06 6.4e+01 0.0e+00 0 0 4 0 0 0 0 4 0 0 712
MGSmooth Level 2 6792 1.0 7.4435e+00 1.0 1.36e+09 1.0 2.6e+07 6.4e+02 4.8e+04 3 5 24 9 28 5 6 24 10 30 23301
MGResid Level 2 3396 1.0 5.4746e-01 1.4 9.39e+07 1.0 3.5e+06 6.4e+02 0.0e+00 0 0 3 1 0 0 0 3 1 0 21953
MGInterp Level 2 9056 1.0 6.0639e-01 2.0 3.07e+07 1.0 3.5e+06 2.1e+02 0.0e+00 0 0 3 0 0 0 0 3 0 0 6484
MGSmooth Level 3 4528 1.0 3.5954e+01 1.0 7.90e+09 1.0 1.7e+07 2.3e+03 3.2e+04 16 30 16 21 19 23 34 17 24 20 28111
MGResid Level 3 2264 1.0 2.0338e+00 1.2 5.01e+08 1.0 2.3e+06 2.3e+03 0.0e+00 1 2 2 3 0 1 2 2 3 0 31516
MGInterp Level 3 6792 1.0 1.5462e+00 1.4 1.83e+08 1.0 2.6e+06 7.7e+02 0.0e+00 1 1 2 1 0 1 1 2 1 0 15161
MGSmooth Level 4 2264 1.0 8.5215e+01 1.0 1.13e+10 1.0 4.6e+06 1.6e+04 1.6e+04 38 43 4 41 9 55 49 4 46 10 16948
MGResid Level 4 1132 1.0 4.1353e+00 1.1 5.19e+08 1.0 5.8e+05 1.6e+04 0.0e+00 2 2 1 5 0 3 2 1 6 0 16074
MGInterp Level 4 4528 1.0 7.5078e+00 1.1 9.64e+08 1.0 1.7e+06 2.9e+03 0.0e+00 3 4 2 3 0 5 4 2 3 0 16442
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 3211 3223 861601128 0
Vector Scatter 19 19 22572 0
Matrix 38 38 14004608 0
Matrix Null Space 1 1 652 0
Distributed Mesh 5 5 830792 0
Bipartite Graph 10 10 8560 0
Index Set 47 47 534480 0
IS L to G Mapping 5 5 405756 0
Krylov Solver 7 7 97216 0
DMKSP interface 3 3 2088 0
Preconditioner 7 7 7352 0
Viewer 1 0 0 0
--- Event Stage 1: MG Apply
Vector 12 0 0 0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 1.62125e-05
Average time for zero size MPI_Send(): 2.36742e-06
#PETSc Option Table entries:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type sor
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure run at: Wed Aug 28 23:25:43 2013
Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=0 --known-mpi-c-double-complex=0 --with-batch="1 " --known-mpi-shared="0 " --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/pgi64/lib -lacml" --COPTFLAGS="-O3 -fastsse" --FOPTFLAGS="-O3 -fastsse" --CXXOPTFLAGS="-O3 -fastsse" --with-x="0 " --with-debugging="0 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries=0 --with-dynamic-loading=0 --with-mpi-compilers="1 " --known-mpi-shared-libraries=0 --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --with-cc=cc --with-cxx=CC --with-fc=ftn PETSC_ARCH=interlagos-64idx-pgi-opt
-----------------------------------------
Libraries compiled on Wed Aug 28 23:25:43 2013 on h2ologin3
Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2
Using PETSc arch: interlagos-64idx-pgi-opt
-----------------------------------------
Using C compiler: cc -O3 -fastsse ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn -O3 -fastsse ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/opt/cray/udreg/2.3.2-1.0402.7311.2.1.gem/include -I/opt/cray/ugni/5.0-1.0402.7128.7.6.gem/include -I/opt/cray/pmi/4.0.1-1.0000.9421.73.3.gem/include -I/opt/cray/dmapp/4.0.1-1.0402.7439.5.1.gem/include -I/opt/cray/gni-headers/2.1-1.0402.7082.6.2.gem/include -I/opt/cray/xpmem/0.1-2.0402.44035.2.1.gem/include -I/opt/cray/rca/1.0.0-2.0402.42153.2.106.gem/include -I/opt/cray-hss-devel/7.0.0/include -I/opt/cray/krca/1.0.0-2.0402.42157.2.94.gem/include -I/opt/cray/mpt/6.0.1/gni/mpich2-pgi/121/include -I/opt/acml/5.3.0/pgi64_fma4/include -I/opt/cray/libsci/12.1.01/pgi/121/interlagos/include -I/opt/fftw/3.3.0.3/interlagos/include -I/usr/include/alps -I/opt/pgi/13.6.0/linux86-64/13.6/include -I/opt/cray/xe-sysroot/4.2.24/usr/include
-----------------------------------------
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lsuperlu_dist_3.3 -L/opt/acml/5.3.0/pgi64/lib -lacml -lpthread -lparmetis -lmetis -ldl
-----------------------------------------
There are no unused options.
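
For reference, the options in the tables above are consumed at runtime by KSPSetFromOptions(); nothing is hard-coded in the application. Below is a minimal, self-contained sketch of that mechanism (hypothetical file and variable names, PETSc 3.4-style interfaces, error checking omitted). It assembles only a 1-D Laplacian so that it runs stand-alone; the geometric-multigrid options used in these runs additionally require a DM (or explicitly set interpolation operators) attached to the KSP, which is not shown here.

/* minimal_ksp.c -- hypothetical sketch, not the code used for these runs */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat         A;
  Vec         x, b;
  KSP         ksp;
  PetscInt    i, n = 128, Istart, Iend, col[3];
  PetscScalar v[3];

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Assemble the standard tridiagonal 1-D Laplacian in parallel. */
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetFromOptions(A);
  MatSetUp(A);
  MatGetOwnershipRange(A, &Istart, &Iend);
  for (i = Istart; i < Iend; i++) {
    v[0] = -1.0; v[1] = 2.0; v[2] = -1.0;
    col[0] = i - 1; col[1] = i; col[2] = i + 1;
    if (i == 0)          MatSetValues(A, 1, &i, 2, &col[1], &v[1], INSERT_VALUES);
    else if (i == n - 1) MatSetValues(A, 1, &i, 2, col, v, INSERT_VALUES);
    else                 MatSetValues(A, 1, &i, 3, col, v, INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  VecCreate(PETSC_COMM_WORLD, &b);
  VecSetSizes(b, PETSC_DECIDE, n);
  VecSetFromOptions(b);
  VecDuplicate(b, &x);
  VecSet(b, 1.0);

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN); /* 3-argument form from PETSc 3.5 on */
  KSPSetFromOptions(ksp);  /* -ksp_type, -pc_type mg, -mg_levels_*, ... take effect here */
  KSPSolve(ksp, b, x);

  KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); MatDestroy(&A);
  PetscFinalize();
  return 0;
}

Because the solver is configured this way, every option table in these logs can be changed on the command line or in a .petscrc file without recompiling.
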
-------------- next part --------------
OPTIONS:
-ksp_atol 1e-9
-ksp_max_it 30
-ksp_monitor_true_residual
-ksp_type richardson
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
0 KSP unpreconditioned resid norm 6.954195782521e-06 true resid norm 6.954195782521e-06 ||r(i)||/||b|| 1.000000000000e+00
1 KSP unpreconditioned resid norm 4.019686111644e-06 true resid norm 4.019686111644e-06 ||r(i)||/||b|| 5.780231442070e-01
2 KSP unpreconditioned resid norm 3.085633839543e-06 true resid norm 3.085633839543e-06 ||r(i)||/||b|| 4.437082210568e-01
3 KSP unpreconditioned resid norm 3.108638870884e-06 true resid norm 3.108638870884e-06 ||r(i)||/||b|| 4.470163003891e-01
4 KSP unpreconditioned resid norm 2.907823260441e-06 true resid norm 2.907823260441e-06 ||r(i)||/||b|| 4.181394012158e-01
5 KSP unpreconditioned resid norm 3.095105825237e-06 true resid norm 3.095105825237e-06 ||r(i)||/||b|| 4.450702744114e-01
6 KSP unpreconditioned resid norm 2.816885443509e-06 true resid norm 2.816885443509e-06 ||r(i)||/||b|| 4.050627177608e-01
7 KSP unpreconditioned resid norm 3.330756254322e-06 true resid norm 3.330756254322e-06 ||r(i)||/||b|| 4.789563536151e-01
8 KSP unpreconditioned resid norm 2.734103597927e-06 true resid norm 2.734103597927e-06 ||r(i)||/||b|| 3.931588473249e-01
9 KSP unpreconditioned resid norm 3.029844525689e-06 true resid norm 3.029844525689e-06 ||r(i)||/||b|| 4.356858248519e-01
10 KSP unpreconditioned resid norm 2.626258637892e-06 true resid norm 2.626258637892e-06 ||r(i)||/||b|| 3.776509491569e-01
11 KSP unpreconditioned resid norm 2.620796722232e-06 true resid norm 2.620796722232e-06 ||r(i)||/||b|| 3.768655361731e-01
12 KSP unpreconditioned resid norm 2.599366584696e-06 true resid norm 2.599366584696e-06 ||r(i)||/||b|| 3.737839235457e-01
13 KSP unpreconditioned resid norm 2.815136272808e-06 true resid norm 2.815136272808e-06 ||r(i)||/||b|| 4.048111903728e-01
14 KSP unpreconditioned resid norm 2.592704976330e-06 true resid norm 2.592704976330e-06 ||r(i)||/||b|| 3.728259970544e-01
15 KSP unpreconditioned resid norm 2.647297548295e-06 true resid norm 2.647297548295e-06 ||r(i)||/||b|| 3.806763040737e-01
16 KSP unpreconditioned resid norm 2.577657729007e-06 true resid norm 2.577657729007e-06 ||r(i)||/||b|| 3.706622317833e-01
17 KSP unpreconditioned resid norm 2.637186195877e-06 true resid norm 2.637186195877e-06 ||r(i)||/||b|| 3.792223110120e-01
18 KSP unpreconditioned resid norm 2.569979492081e-06 true resid norm 2.569979492081e-06 ||r(i)||/||b|| 3.695581160572e-01
19 KSP unpreconditioned resid norm 2.639092183189e-06 true resid norm 2.639092183189e-06 ||r(i)||/||b|| 3.794963883275e-01
20 KSP unpreconditioned resid norm 2.557359938672e-06 true resid norm 2.557359938672e-06 ||r(i)||/||b|| 3.677434485091e-01
21 KSP unpreconditioned resid norm 2.619919367497e-06 true resid norm 2.619919367497e-06 ||r(i)||/||b|| 3.767393742469e-01
22 KSP unpreconditioned resid norm 2.540615865281e-06 true resid norm 2.540615865281e-06 ||r(i)||/||b|| 3.653356829077e-01
23 KSP unpreconditioned resid norm 2.578329382313e-06 true resid norm 2.578329382313e-06 ||r(i)||/||b|| 3.707588142389e-01
24 KSP unpreconditioned resid norm 2.525920830833e-06 true resid norm 2.525920830833e-06 ||r(i)||/||b|| 3.632225651716e-01
25 KSP unpreconditioned resid norm 2.658560319798e-06 true resid norm 2.658560319798e-06 ||r(i)||/||b|| 3.822958689890e-01
26 KSP unpreconditioned resid norm 2.522426607571e-06 true resid norm 2.522426607571e-06 ||r(i)||/||b|| 3.627201025762e-01
27 KSP unpreconditioned resid norm 2.616030191476e-06 true resid norm 2.616030191476e-06 ||r(i)||/||b|| 3.761801182031e-01
28 KSP unpreconditioned resid norm 2.507602013260e-06 true resid norm 2.507602013260e-06 ||r(i)||/||b|| 3.605883543806e-01
29 KSP unpreconditioned resid norm 2.624604598576e-06 true resid norm 2.624604598576e-06 ||r(i)||/||b|| 3.774131014793e-01
30 KSP unpreconditioned resid norm 2.502026180934e-06 true resid norm 2.502026180934e-06 ||r(i)||/||b|| 3.597865603990e-01
KSP Object: 128 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=30, initial guess is zero
tolerances: relative=1e-05, absolute=1e-09, divergence=10000
left preconditioning
has attached null space
using UNPRECONDITIONED norm type for convergence test
PC Object: 128 MPI processes
type: mg
MG: type is FULL, levels=5 cycles=v
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 128 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 128 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 0, needed 0
Factored matrix follows:
Matrix Object: 128 MPI processes
type: mpiaij
rows=1024, cols=1024
package used to perform factorization: superlu_dist
total: nonzeros=0, allocated nonzeros=0
total number of mallocs used during MatSetValues calls =0
SuperLU_DIST run parameters:
Process grid nprow 16 x npcol 8
Equilibrate matrix TRUE
Matrix input mode 1
Replace tiny pivots TRUE
Use iterative refinement FALSE
Processors in row 16 col partition 8
Row permutation LargeDiag
Column permutation METIS_AT_PLUS_A
Parallel symbolic factorization FALSE
Repeated factorization SamePattern_SameRowPerm
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=1024, cols=1024
total: nonzeros=27648, allocated nonzeros=27648
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_1_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_1_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=64, cols=64
package used to perform factorization: petsc
total: nonzeros=768, allocated nonzeros=768
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 16 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=64, cols=64
total: nonzeros=768, allocated nonzeros=768
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 16 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=8192, cols=8192
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 16 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_2_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_2_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=512, cols=512
package used to perform factorization: petsc
total: nonzeros=9600, allocated nonzeros=9600
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=512, cols=512
total: nonzeros=9600, allocated nonzeros=9600
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=65536, cols=65536
total: nonzeros=1769472, allocated nonzeros=1769472
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_3_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_3_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_3_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=4096, cols=4096
package used to perform factorization: petsc
total: nonzeros=92928, allocated nonzeros=92928
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=4096, cols=4096
total: nonzeros=92928, allocated nonzeros=92928
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=524288, cols=524288
total: nonzeros=14155776, allocated nonzeros=14155776
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_4_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_4_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_4_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=32768, cols=32768
package used to perform factorization: petsc
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=32768, cols=32768
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=4194304, cols=4194304
total: nonzeros=29360128, allocated nonzeros=29360128
total number of mallocs used during MatSetValues calls =0
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=4194304, cols=4194304
total: nonzeros=29360128, allocated nonzeros=29360128
total number of mallocs used during MatSetValues calls =0
0 KSP unpreconditioned resid norm 2.917180555663e-04 true resid norm 2.917180555663e-04 ||r(i)||/||b|| 1.000000000000e+00
1 KSP unpreconditioned resid norm 1.049878179956e-02 true resid norm 1.049878179956e-02 ||r(i)||/||b|| 3.598948230742e+01
2 KSP unpreconditioned resid norm 4.603139618725e-01 true resid norm 4.603139618725e-01 ||r(i)||/||b|| 1.577941279565e+03
3 KSP unpreconditioned resid norm 2.274779569665e+01 true resid norm 2.274779569665e+01 ||r(i)||/||b|| 7.797870328075e+04
KSP Object: 128 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=30, initial guess is zero
tolerances: relative=1e-05, absolute=1e-09, divergence=10000
left preconditioning
has attached null space
using UNPRECONDITIONED norm type for convergence test
PC Object: 128 MPI processes
type: mg
MG: type is FULL, levels=5 cycles=v
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 128 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 128 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 0, needed 0
Factored matrix follows:
Matrix Object: 128 MPI processes
type: mpiaij
rows=1024, cols=1024
package used to perform factorization: superlu_dist
total: nonzeros=0, allocated nonzeros=0
total number of mallocs used during MatSetValues calls =0
SuperLU_DIST run parameters:
Process grid nprow 16 x npcol 8
Equilibrate matrix TRUE
Matrix input mode 1
Replace tiny pivots TRUE
Use iterative refinement FALSE
Processors in row 16 col partition 8
Row permutation LargeDiag
Column permutation METIS_AT_PLUS_A
Parallel symbolic factorization FALSE
Repeated factorization SamePattern_SameRowPerm
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=1024, cols=1024
total: nonzeros=27648, allocated nonzeros=27648
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_1_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_1_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=64, cols=64
package used to perform factorization: petsc
total: nonzeros=768, allocated nonzeros=768
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 16 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=64, cols=64
total: nonzeros=768, allocated nonzeros=768
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 16 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=8192, cols=8192
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 16 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_2_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_2_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=512, cols=512
package used to perform factorization: petsc
total: nonzeros=9600, allocated nonzeros=9600
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=512, cols=512
total: nonzeros=9600, allocated nonzeros=9600
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=65536, cols=65536
total: nonzeros=1769472, allocated nonzeros=1769472
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_3_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_3_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_3_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=4096, cols=4096
package used to perform factorization: petsc
total: nonzeros=92928, allocated nonzeros=92928
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=4096, cols=4096
total: nonzeros=92928, allocated nonzeros=92928
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=524288, cols=524288
total: nonzeros=14155776, allocated nonzeros=14155776
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 128 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=3
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_4_) 128 MPI processes
type: bjacobi
block Jacobi: number of blocks = 128
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_levels_4_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_4_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=32768, cols=32768
package used to perform factorization: petsc
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=32768, cols=32768
total: nonzeros=221184, allocated nonzeros=221184
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=4194304, cols=4194304
total: nonzeros=29360128, allocated nonzeros=29360128
total number of mallocs used during MatSetValues calls =0
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Matrix Object: 128 MPI processes
type: mpiaij
rows=4194304, cols=4194304
total: nonzeros=29360128, allocated nonzeros=29360128
total number of mallocs used during MatSetValues calls =0
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./hit on a interlagos-64idx-pgi-opt named nid25319 with 128 processors, by Unknown Tue Oct 1 20:23:44 2013
Using Petsc Release Version 3.4.2, Jul, 02, 2013
Max Max/Min Avg Total
Time (sec): 4.942e+01 1.00014 4.942e+01
Objects: 1.206e+03 1.00000 1.206e+03
Flops: 6.078e+09 1.00000 6.078e+09 7.779e+11
Flops/sec: 1.230e+08 1.00014 1.230e+08 1.574e+10
MPI Messages: 2.322e+05 1.11091 2.092e+05 2.678e+07
MPI Message Lengths: 3.703e+08 1.00025 1.769e+03 4.739e+10
MPI Reductions: 4.337e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
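For scale: the fine-grid system in the -ksp_view output above has 4,194,304 rows, so a single real VecAXPY on a fine-grid vector counts as 2 x 4,194,304 ≈ 8.4e+06 flops under this convention.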
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 1.3192e+01 26.7% 4.5330e+10 5.8% 4.225e+05 1.6% 2.144e+02 12.1% 3.293e+03 7.6%
1: MG Apply: 3.6228e+01 73.3% 7.3260e+11 94.2% 2.636e+07 98.4% 1.555e+03 87.9% 4.008e+04 92.4%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %f - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 1e-6 * (sum of flops over all processors)/(max time over all processors)
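As a worked example of reading these columns: in the "MGSmooth Level 4" row of the MG Apply stage below, the smoother was applied 572 times, took a maximum of 1.6467e+01 s (max/min ratio 1.0, i.e. well balanced across processes), performed at most 2.77e+09 flops per process, and accounts for 33% of total time and 46% of total flops, at an aggregate rate of 21568 Mflop/s (about 21.6 Gflop/s).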
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecNorm 710 1.0 2.7513e-01 4.2 4.65e+07 1.0 0.0e+00 0.0e+00 7.1e+02 0 1 0 0 2 1 13 0 0 22 21647
VecCopy 378 1.0 6.4831e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 160 1.0 2.7622e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 286 1.0 5.0174e-02 1.2 1.87e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 5 0 0 0 47817
VecAYPX 618 1.0 8.3843e-02 1.9 2.03e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 6 0 0 0 30916
VecScatterBegin 714 1.0 1.3197e-01 1.3 0.00e+00 0.0 3.4e+05 1.6e+04 0.0e+00 0 0 1 12 0 1 0 81 97 0 0
VecScatterEnd 714 1.0 3.0915e-01 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatMult 618 1.0 2.2833e+00 1.1 2.63e+08 1.0 3.2e+05 1.6e+04 0.0e+00 4 4 1 11 0 16 74 75 90 0 14758
MatMultTranspose 4 1.0 2.2891e-03 1.1 2.53e+05 1.0 1.5e+03 9.9e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 14132
MatLUFactorSym 1 1.0 5.1403e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 1.1998e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatAssemblyBegin 63 1.0 1.9263e-01 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 0 1 0 0 0 3 0
MatAssemblyEnd 63 1.0 2.1651e-01 1.2 0.00e+00 0.0 1.2e+04 1.1e+03 7.2e+01 0 0 0 0 0 1 0 3 0 2 0
MatGetRowIJ 1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.0981e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 690 2.1 2.0276e-01 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+02 0 0 0 0 1 1 0 0 0 10 0
MatPtAP 4 1.0 2.0942e-01 1.0 5.11e+06 1.0 2.5e+04 6.0e+03 1.0e+02 0 0 0 0 0 2 1 6 3 3 3120
MatPtAPSymbolic 4 1.0 1.4736e-01 1.1 0.00e+00 0.0 1.5e+04 7.8e+03 6.0e+01 0 0 0 0 0 1 0 4 2 2 0
MatPtAPNumeric 4 1.0 6.9803e-02 1.1 5.11e+06 1.0 9.7e+03 3.1e+03 4.0e+01 0 0 0 0 0 1 1 2 1 1 9361
MatGetLocalMat 4 1.0 2.3130e-02 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 4 1.0 2.8250e-02 3.0 0.00e+00 0.0 1.1e+04 8.4e+03 8.0e+00 0 0 0 0 0 0 0 3 2 0 0
MatGetSymTrans 8 1.0 9.6016e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 6 1.0 1.7802e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.4e+01 0 0 0 0 0 0 0 0 0 1 0
Warning -- total time of event greater than time of entire stage -- something is wrong with the timer
KSPSolve 46 1.0 3.9455e+01 1.0 6.08e+09 1.0 2.7e+07 1.8e+03 4.3e+04 80100100 99 99 299171663218181298 19717
PCSetUp 1 1.0 4.1868e-01 1.0 5.36e+06 1.0 3.4e+04 4.6e+03 3.3e+02 1 0 0 0 1 3 2 8 3 10 1638
Warning -- total time of event greater than time of entire stage -- something is wrong with the timer
PCApply 286 1.0 3.6235e+01 1.0 5.72e+09 1.0 2.6e+07 1.6e+03 4.0e+04 73 94 98 88 92 275161662387251217 20218
MGSetup Level 0 1 1.0 1.2280e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 1 0 0 0 0 0
MGSetup Level 1 1 1.0 2.5809e-03 5.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 2 1 1.0 5.2381e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 3 1 1.0 5.4312e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
MGSetup Level 4 1 1.0 1.1581e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
--- Event Stage 1: MG Apply
VecMDot 17160 1.0 2.7126e+00 2.0 2.93e+08 1.0 0.0e+00 0.0e+00 1.7e+04 5 5 0 0 40 6 5 0 0 43 13846
VecNorm 22880 1.0 1.5497e+00 1.7 1.96e+08 1.0 0.0e+00 0.0e+00 2.3e+04 2 3 0 0 53 3 3 0 0 57 16159
VecScale 22880 1.0 1.6083e-01 1.1 9.78e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 77850
VecCopy 7150 1.0 1.9496e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 41184 1.0 4.9180e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
VecAXPY 9724 1.0 1.8098e-01 1.3 9.48e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 67033
VecAYPX 2860 1.0 5.3519e-02 1.2 1.22e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 29243
VecMAXPY 22880 1.0 4.8374e-01 1.2 4.40e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 8 0 0 0 116471
VecScatterBegin 32032 1.0 2.1183e+00 2.1 0.00e+00 0.0 2.6e+07 1.6e+03 0.0e+00 4 0 98 88 0 5 0 100 100 0 0
VecScatterEnd 32032 1.0 1.8369e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0
VecNormalize 22880 1.0 1.7390e+00 1.6 2.93e+08 1.0 0.0e+00 0.0e+00 2.3e+04 3 5 0 0 53 4 5 0 0 57 21600
MatMult 25168 1.0 1.6588e+01 1.1 2.44e+09 1.0 2.4e+07 1.7e+03 0.0e+00 33 40 89 84 0 44 43 90 96 0 18825
MatMultAdd 2860 1.0 6.8537e-01 1.2 8.25e+07 1.0 1.1e+06 5.3e+02 0.0e+00 1 1 4 1 0 2 1 4 1 0 15414
MatMultTranspose 4004 1.0 1.2716e+00 1.2 1.55e+08 1.0 1.5e+06 6.6e+02 0.0e+00 2 3 6 2 0 3 3 6 2 0 15583
MatSolve 24310 1.0 1.3130e+01 1.1 1.91e+09 1.0 0.0e+00 0.0e+00 0.0e+00 25 31 0 0 0 34 33 0 0 0 18627
MatLUFactorNum 4 1.0 1.0485e-02 1.1 1.86e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 22747
MatILUFactorSym 4 1.0 4.9901e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 4 1.0 1.2398e-05 6.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 4 1.0 5.0456e-02 24.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 17160 1.0 3.0344e+00 1.8 5.87e+08 1.0 0.0e+00 0.0e+00 1.7e+04 5 10 0 0 40 7 10 0 0 43 24755
KSPSetUp 4 1.0 5.9605e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 7150 1.0 3.1997e+01 1.0 5.14e+09 1.0 2.1e+07 1.7e+03 4.0e+04 64 85 77 74 92 88 90 78 84 100 20558
PCSetUp 4 1.0 7.4455e-02 1.9 1.86e+06 1.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 3203
PCSetUpOnBlocks 5720 1.0 7.9641e-02 1.8 1.86e+06 1.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 2995
PCApply 24310 1.0 1.4260e+01 1.1 1.91e+09 1.0 0.0e+00 0.0e+00 0.0e+00 27 31 0 0 0 37 33 0 0 0 17151
MGSmooth Level 0 1430 1.0 2.0421e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 5 0 0 0 0 0
MGSmooth Level 1 2288 1.0 1.2168e+00 1.1 4.82e+07 1.0 8.5e+06 1.9e+02 1.6e+04 2 1 32 3 37 3 1 32 4 40 5072
MGResid Level 1 1144 1.0 5.2052e-02 1.3 3.95e+06 1.0 1.2e+06 1.9e+02 0.0e+00 0 0 4 0 0 0 0 4 1 0 9722
MGInterp Level 1 2860 1.0 1.5052e-01 3.6 1.22e+06 1.0 1.1e+06 6.4e+01 0.0e+00 0 0 4 0 0 0 0 4 0 0 1035
MGSmooth Level 2 1716 1.0 1.9140e+00 1.0 3.39e+08 1.0 6.4e+06 6.4e+02 1.2e+04 4 6 24 9 28 5 6 24 10 30 22666
MGResid Level 2 858 1.0 1.1919e-01 1.5 2.37e+07 1.0 8.8e+05 6.4e+02 0.0e+00 0 0 3 1 0 0 0 3 1 0 25474
MGInterp Level 2 2288 1.0 1.6201e-01 2.2 7.76e+06 1.0 8.8e+05 2.1e+02 0.0e+00 0 0 3 0 0 0 0 3 0 0 6132
MGSmooth Level 3 1144 1.0 1.0630e+01 1.0 1.98e+09 1.0 4.4e+06 2.3e+03 8.0e+03 21 33 16 21 18 29 35 17 24 20 23810
MGResid Level 3 572 1.0 7.5736e-01 1.1 1.27e+08 1.0 5.9e+05 2.3e+03 0.0e+00 1 2 2 3 0 2 2 2 3 0 21382
MGInterp Level 3 1716 1.0 3.9544e-01 1.4 4.63e+07 1.0 6.6e+05 7.7e+02 0.0e+00 1 1 2 1 0 1 1 2 1 0 14978
MGSmooth Level 4 572 1.0 1.6467e+01 1.0 2.77e+09 1.0 1.2e+06 1.6e+04 4.0e+03 33 46 4 41 9 45 48 4 46 10 21568
MGResid Level 4 286 1.0 1.0729e+00 1.1 1.31e+08 1.0 1.5e+05 1.6e+04 0.0e+00 2 2 1 5 0 3 2 1 6 0 15654
MGInterp Level 4 1144 1.0 1.9588e+00 1.1 2.44e+08 1.0 4.4e+05 2.9e+03 0.0e+00 4 4 2 3 0 5 4 2 3 0 15922
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 841 853 211053336 0
Vector Scatter 19 19 22572 0
Matrix 38 42 19508928 0
Matrix Null Space 1 1 652 0
Distributed Mesh 5 5 830792 0
Bipartite Graph 10 10 8560 0
Index Set 47 59 844496 0
IS L to G Mapping 5 5 405756 0
Krylov Solver 11 11 84080 0
DMKSP interface 3 3 2088 0
Preconditioner 11 11 11864 0
Viewer 185 184 144256 0
--- Event Stage 1: MG Apply
Vector 12 0 0 0
Matrix 4 0 0 0
Index Set 14 2 1792 0
========================================================================================================================
Average time to get PetscTime(): 1.90735e-07
Average time for MPI_Barrier(): 1.37806e-05
Average time for zero size MPI_Send(): 2.67848e-06
#PETSc Option Table entries:
-ksp_atol 1e-9
-ksp_max_it 30
-ksp_monitor_true_residual
-ksp_type richardson
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure run at: Wed Aug 28 23:25:43 2013
Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=0 --known-mpi-c-double-complex=0 --with-batch="1 " --known-mpi-shared="0 " --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/pgi64/lib -lacml" --COPTFLAGS="-O3 -fastsse" --FOPTFLAGS="-O3 -fastsse" --CXXOPTFLAGS="-O3 -fastsse" --with-x="0 " --with-debugging="0 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries=0 --with-dynamic-loading=0 --with-mpi-compilers="1 " --known-mpi-shared-libraries=0 --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --with-cc=cc --with-cxx=CC --with-fc=ftn PETSC_ARCH=interlagos-64idx-pgi-opt
-----------------------------------------
Libraries compiled on Wed Aug 28 23:25:43 2013 on h2ologin3
Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2
Using PETSc arch: interlagos-64idx-pgi-opt
-----------------------------------------
Using C compiler: cc -O3 -fastsse ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn -O3 -fastsse ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/opt/cray/udreg/2.3.2-1.0402.7311.2.1.gem/include -I/opt/cray/ugni/5.0-1.0402.7128.7.6.gem/include -I/opt/cray/pmi/4.0.1-1.0000.9421.73.3.gem/include -I/opt/cray/dmapp/4.0.1-1.0402.7439.5.1.gem/include -I/opt/cray/gni-headers/2.1-1.0402.7082.6.2.gem/include -I/opt/cray/xpmem/0.1-2.0402.44035.2.1.gem/include -I/opt/cray/rca/1.0.0-2.0402.42153.2.106.gem/include -I/opt/cray-hss-devel/7.0.0/include -I/opt/cray/krca/1.0.0-2.0402.42157.2.94.gem/include -I/opt/cray/mpt/6.0.1/gni/mpich2-pgi/121/include -I/opt/acml/5.3.0/pgi64_fma4/include -I/opt/cray/libsci/12.1.01/pgi/121/interlagos/include -I/opt/fftw/3.3.0.3/interlagos/include -I/usr/include/alps -I/opt/pgi/13.6.0/linux86-64/13.6/include -I/opt/cray/xe-sysroot/4.2.24/usr/include
-----------------------------------------
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lsuperlu_dist_3.3 -L/opt/acml/5.3.0/pgi64/lib -lacml -lpthread -lparmetis -lmetis -ldl
-----------------------------------------
There are no unused options.
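
Note that the second solve in this log stops only because of the iteration and divergence limits (-ksp_max_it 30, divergence tolerance 10000) while the true residual is growing. A minimal sketch of how the calling code can detect this after each KSPSolve() and react instead of continuing with an unconverged solution (hypothetical helper name CheckSolve, PETSc 3.4-style interfaces):

/* Hypothetical helper: call after each KSPSolve() to check why it stopped. */
#include <petscksp.h>

PetscErrorCode CheckSolve(KSP ksp)
{
  KSPConvergedReason reason;
  PetscInt           its;
  PetscErrorCode     ierr;

  PetscFunctionBeginUser;
  ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
  if (reason < 0) {
    /* reason < 0 covers KSP_DIVERGED_ITS (hit -ksp_max_it), KSP_DIVERGED_DTOL, ... */
    ierr = PetscPrintf(PETSC_COMM_WORLD,
                       "Linear solve failed: reason %d after %D iterations\n",
                       (int)reason, its);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

Alternatively, the run-time option -ksp_converged_reason prints the same information after every solve.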