[petsc-users] Memory and Speed Issue of using MG as preconditioner
Alan Z. Wei
zhenglun.wei at gmail.com
Wed Nov 6 12:35:12 CST 2013
Thanks Dave,
I reran the problem with -pc_mg_log; the output files are attached.
I found that smoothing on the finest (last) level always consumes the
most time, i.e. 'MGSmooth Level 5' in out-level5 and 'MGSmooth Level 2'
in out-level2. I also tested several other -mg_levels_pc_type values,
such as 'bjacobi' and 'asm', but 'jacobi', the one I was already using,
actually works best, so I have decided to keep it. Do you have any
suggestions for speeding up this smoothing process other than changing
-mg_levels_pc_type?
Also, following your suggestion I changed -mg_levels_ksp_type, but
replacing 'chebyshev' with 'cg' makes little difference. Moreover, this
part of the -ksp_view output never changes when I modify
-mg_levels_ksp_type:
PC Object: (mg_coarse_) 32 MPI processes
type: redundant
Redundant preconditioner: First (color=0) of 32 PCs follows
KSP Object: (mg_coarse_redundant_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_redundant_) 1 MPI processes
type: lu
LU: out-of-place factorization
As you mentioned, the redundant LU used as the coarse grid solver is
the primary cause of the large memory requirement in the 2-level case.
How can I change the coarse grid solver to reduce the memory requirement
or speed up the solver?
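
For reference, the sizes involved can be checked with a little arithmetic. The sketch below is not PETSc code; it only uses numbers quoted in this thread and in the attached -ksp_view output (coarse grids 65x33x33 and 9x5x5, 63619375 nonzeros in the LU factor, 1815937 in the coarse matrix). The refinement rule n -> 2*(n-1)+1 and the 12 bytes per stored nonzero (an 8-byte value plus a 4-byte column index) are assumptions about a default non-periodic, double-precision, 32-bit-index setup, and the function names are made up for illustration.

```python
# Rough size check for the redundant-LU coarse solve (illustrative only).

def refine(n, times):
    """Points per direction after `times` uniform refinements of a
    non-periodic grid, ASSUMING each refinement maps n -> 2*(n-1)+1."""
    for _ in range(times):
        n = 2 * (n - 1) + 1
    return n

def unknowns(grid):
    """Number of unknowns on a grid with one degree of freedom per point."""
    nx, ny, nz = grid
    return nx * ny * nz

coarse_level2 = (65, 33, 33)   # coarse grid used with -da_refine 2
coarse_level5 = (9, 5, 5)      # coarse grid used with -da_refine 5

fine_level2 = tuple(refine(n, 2) for n in coarse_level2)
fine_level5 = tuple(refine(n, 5) for n in coarse_level5)

# Nonzero counts copied from the attached -ksp_view output (out-level2).
coarse_nnz = 1815937
factor_nnz = 63619375

# ASSUMPTION: 8-byte value + 4-byte column index per stored nonzero.
BYTES_PER_NNZ = 8 + 4
factor_bytes = factor_nnz * BYTES_PER_NNZ
fill_ratio = factor_nnz / coarse_nnz

print("fine grids:", fine_level2, fine_level5)
print("coarse unknowns:", unknowns(coarse_level2), unknowns(coarse_level5))
print("LU fill ratio: %.4f" % fill_ratio)
print("LU factor storage: about %.0f MB per copy" % (factor_bytes / 1e6))
```

Under these assumptions the factor alone takes roughly 763 MB, which is the same order as the 822871308 bytes reported for the Matrix objects in out-level2. The view also reports 32 redundant PCs, which suggests that each process holds its own copy of that factor.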
thanks again,
Alan
> Hey Alan,
>
> 1/ One difference in the memory footprint is likely coming from your
> coarse grid solver which is redundant LU.
> The 2 level case has a coarse grid problem with 70785 unknowns whilst
> the 5 level case has a coarse grid problem with 225 unknowns.
>
> 2/ The solve time difference will be affected by your coarse grid
> size. Add the command line argument
> -pc_mg_log
> to profile the setup time spent on the coarse grid and all other levels.
> See
> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMG.html
>
> 3/ You can change the smoother on all levels by using the command line
> argument with the appropriate prefix, eg
> -mg_levels_ksp_type cg
> Note the prefix is displayed in the result of -ksp_view
>
> Also, your mesh size can be altered at run time using arguments like
> -da_grid_x 5
> You shouldn't have to modify the source code each time
> See
> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMDACreate3d.html
>
>
> Cheers,
> Dave
>
>
> On 6 November 2013 04:21, Alan <zhenglun.wei at gmail.com> wrote:
>
> Dear all,
> I hope you're having a nice day.
> Recently, I came across a problem when using MG as a preconditioner.
> Basically, to achieve the same finest mesh with -pc_type mg, the memory
> usage for -da_refine 2 is much higher than that for -da_refine 5. To my
> limited knowledge, more refinement should consume more memory, which
> contradicts the behavior of -pc_type mg in PETSc.
> Here, I provide two output files. Both come from
> /src/ksp/ksp/example/tutorial/ex45.c run with 32 processes.
> The command line for out-level2 is
> mpiexec -np 32 ./ex45 -pc_type mg -ksp_type cg -da_refine 2
> -pc_mg_galerkin -ksp_rtol 1.0e-7 -mg_levels_pc_type jacobi
> -mg_levels_ksp_type chebyshev -dm_view -log_summary -pc_mg_log
> -pc_mg_monitor -ksp_view -ksp_monitor > out &
> and in ex45.c, the DMDACreate3d call is changed to:
> ierr =
> DMDACreate3d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_STAR,-65,-33,-33,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr);
> On the other hand, the command line for out-level5 is
> mpiexec -np 32 ./ex45 -pc_type mg -ksp_type cg -da_refine 5
> -pc_mg_galerkin -ksp_rtol 1.0e-7 -mg_levels_pc_type jacobi
> -mg_levels_ksp_type chebyshev -dm_view -log_summary -pc_mg_log
> -pc_mg_monitor -ksp_view -ksp_monitor > out &
> and in ex45.c, the DMDACreate3d call is changed to:
> ierr =
> DMDACreate3d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_STAR,-9,-5,-5,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr);
> In summary, the finest meshes obtained in both cases are
> 257*129*129, as documented in both files. However, out-level2 shows
> that the Matrix objects requested 822871308 bytes of memory while
> out-level5 only needs 36052932 bytes.
> Furthermore, although the KSP solver takes 5 iterations in both
> files, the wall time elapsed for out-level2 is around 150s, while
> out-level5 takes about 4.7s.
> Finally, a minor question. In both files, under 'Down solver
> (pre-smoother) on level 1' and 'Down solver (pre-smoother) on level 2',
> the types of "KSP Object: (mg_levels_1_est_)" and "KSP Object:
> (mg_levels_2_est_)" are both 'gmres'. Since I'm using a uniform
> Cartesian mesh, would replacing 'gmres' with 'cg' here help speed up
> the solver? If so, which PETSc option changes the type of this KSP
> object?
>
> sincerely appreciate,
> Alan
>
>
>
>
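
On the mg_levels_N_est_ question quoted above: judging from the attached -ksp_view output, that GMRES object seems to be used only to estimate the largest eigenvalue for the Chebyshev smoother (it is capped at 10 iterations), rather than in the smoothing sweeps themselves. The small check below assumes that the printed factors [0 0.1; 0 1.1] mean min = a*emin + b*emax and max = c*emin + d*emax; with that reading it reproduces the reported bounds. The function name and the back-calculated estimates are illustrative and are not taken from PETSc.

```python
# Check that the Chebyshev bounds printed by -ksp_view are consistent with
# scaling a single largest-eigenvalue estimate (illustrative only).

def chebyshev_bounds(emin_est, emax_est, a=0.0, b=0.1, c=0.0, d=1.1):
    """ASSUMED transform: min = a*emin + b*emax, max = c*emin + d*emax."""
    return a * emin_est + b * emax_est, c * emin_est + d * emax_est

# (min, max) as printed for mg_levels_1_ and mg_levels_2_ in out-level2.
reported = {1: (0.231542, 2.54696), 2: (0.155706, 1.71277)}

mismatch = {}
for level, (rep_min, rep_max) in reported.items():
    emax_est = rep_max / 1.1          # back out the raw estimate
    new_min, new_max = chebyshev_bounds(0.0, emax_est)
    mismatch[level] = abs(new_min - rep_min)
    print("level %d: estimate %.5f -> bounds (%.6f, %.5f)"
          % (level, emax_est, new_min, new_max))
```

If that reading is right, the estimator runs only during setup, so switching it to 'cg' would probably not change the solve time much. Going by the prefix shown in -ksp_view, the corresponding option would presumably be -mg_levels_1_est_ksp_type (and likewise for level 2); this is inferred from the prefix and has not been verified here.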
-------------- next part --------------
Processor [0] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 0 65, Z range of indices: 0 33
Processor [1] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 0 65, Z range of indices: 0 33
Processor [2] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 0 65, Z range of indices: 0 33
Processor [3] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 0 65, Z range of indices: 0 33
Processor [4] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 65 129, Z range of indices: 0 33
Processor [5] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 65 129, Z range of indices: 0 33
Processor [6] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 65 129, Z range of indices: 0 33
Processor [7] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 65 129, Z range of indices: 0 33
Processor [8] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 0 65, Z range of indices: 33 65
Processor [9] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 0 65, Z range of indices: 33 65
Processor [10] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 0 65, Z range of indices: 33 65
Processor [11] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 0 65, Z range of indices: 33 65
Processor [12] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 65 129, Z range of indices: 33 65
Processor [13] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 65 129, Z range of indices: 33 65
Processor [14] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 65 129, Z range of indices: 33 65
Processor [15] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 65 129, Z range of indices: 33 65
Processor [16] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 0 65, Z range of indices: 65 97
Processor [17] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 0 65, Z range of indices: 65 97
Processor [18] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 0 65, Z range of indices: 65 97
Processor [19] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 0 65, Z range of indices: 65 97
Processor [20] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 65 129, Z range of indices: 65 97
Processor [21] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 65 129, Z range of indices: 65 97
Processor [22] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 65 129, Z range of indices: 65 97
Processor [23] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 65 129, Z range of indices: 65 97
Processor [24] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 0 65, Z range of indices: 97 129
Processor [25] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 0 65, Z range of indices: 97 129
Processor [26] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 0 65, Z range of indices: 97 129
Processor [27] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 0 65, Z range of indices: 97 129
Processor [28] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 65 129, Z range of indices: 97 129
Processor [29] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 65 129, Z range of indices: 97 129
Processor [30] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 65 129, Z range of indices: 97 129
Processor [31] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 65 129, Z range of indices: 97 129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
Processor [0] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 0 33, Z range of indices: 0 17
Processor [1] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 0 33, Z range of indices: 0 17
Processor [2] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 0 33, Z range of indices: 0 17
Processor [3] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 0 33, Z range of indices: 0 17
Processor [4] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 33 65, Z range of indices: 0 17
Processor [5] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 33 65, Z range of indices: 0 17
Processor [6] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 33 65, Z range of indices: 0 17
Processor [7] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 33 65, Z range of indices: 0 17
Processor [8] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 0 33, Z range of indices: 17 33
Processor [9] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 0 33, Z range of indices: 17 33
Processor [10] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 0 33, Z range of indices: 17 33
Processor [11] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 0 33, Z range of indices: 17 33
Processor [12] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 33 65, Z range of indices: 17 33
Processor [13] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 33 65, Z range of indices: 17 33
Processor [14] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 33 65, Z range of indices: 17 33
Processor [15] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 33 65, Z range of indices: 17 33
Processor [16] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 0 33, Z range of indices: 33 49
Processor [17] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 0 33, Z range of indices: 33 49
Processor [18] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 0 33, Z range of indices: 33 49
Processor [19] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 0 33, Z range of indices: 33 49
Processor [20] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 33 65, Z range of indices: 33 49
Processor [21] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 33 65, Z range of indices: 33 49
Processor [22] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 33 65, Z range of indices: 33 49
Processor [23] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 33 65, Z range of indices: 33 49
Processor [24] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 0 33, Z range of indices: 49 65
Processor [25] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 0 33, Z range of indices: 49 65
Processor [26] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 0 33, Z range of indices: 49 65
Processor [27] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 0 33, Z range of indices: 49 65
Processor [28] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 33 65, Z range of indices: 49 65
Processor [29] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 33 65, Z range of indices: 49 65
Processor [30] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 33 65, Z range of indices: 49 65
Processor [31] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 33 65, Z range of indices: 49 65
Processor [0] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 0 17, Z range of indices: 0 9
Processor [1] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 0 17, Z range of indices: 0 9
Processor [2] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 0 17, Z range of indices: 0 9
Processor [3] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 0 17, Z range of indices: 0 9
Processor [4] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 17 33, Z range of indices: 0 9
Processor [5] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 17 33, Z range of indices: 0 9
Processor [6] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 17 33, Z range of indices: 0 9
Processor [7] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 17 33, Z range of indices: 0 9
Processor [8] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 0 17, Z range of indices: 9 17
Processor [9] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 0 17, Z range of indices: 9 17
Processor [10] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 0 17, Z range of indices: 9 17
Processor [11] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 0 17, Z range of indices: 9 17
Processor [12] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 17 33, Z range of indices: 9 17
Processor [13] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 17 33, Z range of indices: 9 17
Processor [14] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 17 33, Z range of indices: 9 17
Processor [15] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 17 33, Z range of indices: 9 17
Processor [16] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 0 17, Z range of indices: 17 25
Processor [17] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 0 17, Z range of indices: 17 25
Processor [18] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 0 17, Z range of indices: 17 25
Processor [19] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 0 17, Z range of indices: 17 25
Processor [20] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 17 33, Z range of indices: 17 25
Processor [21] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 17 33, Z range of indices: 17 25
Processor [22] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 17 33, Z range of indices: 17 25
Processor [23] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 17 33, Z range of indices: 17 25
Processor [24] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 0 17, Z range of indices: 25 33
Processor [25] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 0 17, Z range of indices: 25 33
Processor [26] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 0 17, Z range of indices: 25 33
Processor [27] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 0 17, Z range of indices: 25 33
Processor [28] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 17 33, Z range of indices: 25 33
Processor [29] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 17 33, Z range of indices: 25 33
Processor [30] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 17 33, Z range of indices: 25 33
Processor [31] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 17 33, Z range of indices: 25 33
0 KSP Residual norm 2.036594349596e+03
1 KSP Residual norm 8.756270777762e+01
2 KSP Residual norm 3.092374574522e+00
3 KSP Residual norm 1.220382147945e-01
4 KSP Residual norm 2.871729837207e-02
KSP Object: 32 MPI processes
type: cg
maximum iterations=10000
tolerances: relative=1e-07, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: 32 MPI processes
type: mg
MG: type is MULTIPLICATIVE, levels=3 cycles=v
Cycles per PCApply=1
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 32 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 32 MPI processes
type: redundant
Redundant preconditioner: First (color=0) of 32 PCs follows
KSP Object: (mg_coarse_redundant_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_redundant_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: nd
factor fill ratio given 5, needed 35.0339
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=70785, cols=70785
package used to perform factorization: petsc
total: nonzeros=63619375, allocated nonzeros=63619375
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=70785, cols=70785
total: nonzeros=1815937, allocated nonzeros=1911195
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=70785, cols=70785
total: nonzeros=1815937, allocated nonzeros=1815937
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 32 MPI processes
type: chebyshev
Chebyshev: eigenvalue estimates: min = 0.231542, max = 2.54696
Chebyshev: estimated using: [0 0.1; 0 1.1]
KSP Object: (mg_levels_1_est_) 32 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=545025, cols=545025
total: nonzeros=14340865, allocated nonzeros=14340865
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=545025, cols=545025
total: nonzeros=14340865, allocated nonzeros=14340865
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 32 MPI processes
type: chebyshev
Chebyshev: eigenvalue estimates: min = 0.155706, max = 1.71277
Chebyshev: estimated using: [0 0.1; 0 1.1]
KSP Object: (mg_levels_2_est_) 32 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=4276737, cols=4276737
total: nonzeros=29771265, allocated nonzeros=29771265
total number of mallocs used during MatSetValues calls =0
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=4276737, cols=4276737
total: nonzeros=29771265, allocated nonzeros=29771265
total number of mallocs used during MatSetValues calls =0
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=4276737, cols=4276737
total: nonzeros=29771265, allocated nonzeros=29771265
total number of mallocs used during MatSetValues calls =0
Residual norm 0.000941649
Total Time Elapsed: 570.948138
Total Time Elapsed: 571.226165
Total Time Elapsed: 571.226142
Total Time Elapsed: 571.227743
Total Time Elapsed: 571.155077
Total Time Elapsed: 571.230839
Total Time Elapsed: 571.230841
Total Time Elapsed: 571.379246
Total Time Elapsed: 571.231696
Total Time Elapsed: 571.162397
Total Time Elapsed: 570.963429
Total Time Elapsed: 571.234322
Total Time Elapsed: 570.952416
Total Time Elapsed: 571.235487
Total Time Elapsed: 570.955526
Total Time Elapsed: 570.952019
Total Time Elapsed: 570.956828
Total Time Elapsed: 571.384060
Total Time Elapsed: 571.168546
Total Time Elapsed: 570.955564
Total Time Elapsed: 571.386214
Total Time Elapsed: 571.167004
Total Time Elapsed: 570.958710
Total Time Elapsed: 571.389300
Total Time Elapsed: 571.173462
Total Time Elapsed: 571.394695
Total Time Elapsed: 571.169628
Total Time Elapsed: 571.397931
Total Time Elapsed: 571.401814
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
Total Time Elapsed: 571.401327
./ex45 on a linux-gnu-c-nodebug named n042 with 32 processors, by zlwei Wed Nov 6 11:48:06 2013
Using Petsc Development GIT revision: d696997672013bb4513d3ff57c61cc10e09b71f6 GIT Date: 2013-06-13 10:28:37 -0500
Max Max/Min Avg Total
Total Time Elapsed: 571.186741
Total Time Elapsed: 571.188217
Time (sec): 5.710e+02 1.00007 5.709e+02
Objects: 1.750e+02 1.00000 1.750e+02
Flops: 7.470e+10 1.00022 7.468e+10 2.390e+12
Flops/sec: 1.308e+08 1.00018 1.308e+08 4.186e+09
MPI Messages: 1.594e+03 1.77506 1.234e+03 3.948e+04
MPI Message Lengths: 3.375e+07 1.17496 2.581e+04 1.019e+09
MPI Reductions: 3.500e+02 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.8451e+02 49.8% 2.3636e+12 98.9% 1.188e+04 30.1% 1.963e+04 76.0% 2.430e+02 69.4%
1: MG Apply: 2.8643e+02 50.2% 2.6346e+10 1.1% 2.760e+04 69.9% 6.185e+03 24.0% 1.060e+02 30.3%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %f - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
KSPSetUp 5 1.0 4.9497e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 5 0 0 0 0 7 0
Warning -- total time of even greater than time of entire stage -- something is wrong with the timer
KSPSolve 1 1.0 5.7077e+02 1.0 7.47e+10 1.0 3.9e+04 2.6e+04 3.2e+02100100 99100 93 201101327131134 4187
VecTDot 9 1.0 1.3759e-01 2.2 2.51e+06 1.1 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 3 0 0 0 0 4 560
VecNorm 6 1.0 3.9218e-0117.3 1.67e+06 1.1 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 2 0 0 0 0 2 131
VecCopy 2 1.0 6.6240e-03 9.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 12 1.0 1.3879e-0242.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 9 1.0 3.7683e-0214.0 2.51e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2043
VecAYPX 3 1.0 7.1573e-03 7.7 8.37e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3585
VecScatterBegin 7 1.0 5.3980e-03 9.0 0.00e+00 0.0 8.7e+02 1.7e+04 0.0e+00 0 0 2 1 0 0 0 7 2 0 0
VecScatterEnd 7 1.0 3.0406e-0248.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatMult 5 1.0 1.0779e-01 4.6 8.98e+06 1.1 6.4e+02 2.3e+04 0.0e+00 0 0 2 1 0 0 0 5 2 0 2564
MatMultTranspose 2 1.0 1.7044e-02 6.5 1.04e+06 1.0 2.3e+02 2.1e+03 0.0e+00 0 0 1 0 0 0 0 2 0 0 1896
MatLUFactorSym 1 1.0 3.1107e+00 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 1 1 0 0 0 1 0
MatLUFactorNum 1 1.0 5.5701e+02 7.3 7.38e+10 1.0 0.0e+00 0.0e+00 0.0e+00 49 99 0 0 0 98100 0 0 0 4241
MatAssemblyBegin 11 1.0 5.8176e-01 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 3 0 0 0 0 5 0
MatAssemblyEnd 11 1.0 2.9766e-01 2.0 0.00e+00 0.0 2.2e+03 1.0e+03 4.0e+01 0 0 6 0 11 0 0 18 0 16 0
MatGetRowIJ 1 1.0 6.0652e-02 7.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 3.3588e-01 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 1 0 0 0 0 1 0
MatView 8 1.3 4.5514e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 2 0 0 0 0 2 0
MatPtAP 2 1.0 7.9281e-01 1.0 2.06e+07 1.1 3.9e+03 8.8e+03 5.0e+01 0 0 10 3 14 0 0 33 4 21 800
MatPtAPSymbolic 2 1.0 4.4065e-01 1.1 0.00e+00 0.0 2.2e+03 1.2e+04 3.0e+01 0 0 6 2 9 0 0 18 3 12 0
MatPtAPNumeric 2 1.0 3.9084e-01 1.1 2.06e+07 1.1 1.7e+03 5.2e+03 2.0e+01 0 0 4 1 6 0 0 14 1 8 1623
MatGetRedundant 1 1.0 1.8018e+00 1.3 0.00e+00 0.0 3.0e+03 2.3e+05 4.0e+00 0 0 8 68 1 1 0 25 89 2 0
MatGetLocalMat 2 1.0 5.1957e-02 6.7 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 2 0
MatGetBrAoCol 2 1.0 6.7300e-02 2.9 0.00e+00 0.0 1.5e+03 1.4e+04 4.0e+00 0 0 4 2 1 0 0 13 3 2 0
MatGetSymTrans 4 1.0 2.9506e-02 8.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCSetUp 1 1.0 5.6251e+02 7.0 7.38e+10 1.0 1.1e+04 7.2e+04 1.8e+02 50 99 27 74 51 100100 89 98 73 4201
Warning -- total time of even greater than time of entire stage -- something is wrong with the timer
PCApply 5 1.0 4.8945e+0265.0 8.32e+08 1.0 2.8e+04 8.8e+03 1.1e+02 50 1 70 24 30 101 1232 32 44 54
MGSetup Level 0 1 1.0 5.6137e+02 7.1 7.38e+10 1.0 5.1e+03 1.4e+05 2.7e+01 49 99 13 71 8 99100 43 93 11 4208
MGSetup Level 1 1 1.0 8.1902e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 2 0 0 0 0 2 0
MGSetup Level 2 1 1.0 1.7457e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 2 0 0 0 0 2 0
--- Event Stage 1: MG Apply
KSPGMRESOrthog 20 1.0 4.4386e-01 1.9 3.47e+07 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 6 0 4 0 0 19 2390
KSPSetUp 2 1.0 4.8228e+029246.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01 49 0 0 0 6 97 0 0 0 19 0
KSPSolve 25 1.0 4.8849e+0270.1 8.08e+08 1.0 2.3e+04 9.7e+03 1.1e+02 50 1 58 22 30 99 97 83 90100 52
VecMDot 20 1.0 4.1889e-01 3.0 1.74e+07 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 6 0 2 0 0 19 1266
VecNorm 22 1.0 2.4155e-01 2.4 3.47e+06 1.1 0.0e+00 0.0e+00 2.2e+01 0 0 0 0 6 0 0 0 0 21 439
VecScale 62 1.0 6.6617e-0213.4 4.90e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 2244
VecCopy 12 1.0 1.9117e-0210.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 51 1.0 3.4841e-0229.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 84 1.0 9.3068e-02 9.6 1.33e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 4352
VecAYPX 80 1.0 8.7920e-02 8.3 7.90e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 2742
VecMAXPY 22 1.0 1.4221e-01 7.5 2.05e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 4408
VecPointwiseMult 82 1.0 1.4117e-01 9.0 6.48e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 1400
VecScatterBegin 112 1.0 1.0815e-01 8.5 0.00e+00 0.0 2.8e+04 8.8e+03 0.0e+00 0 0 70 24 0 0 0 100 100 0 0
VecScatterEnd 112 1.0 5.3921e+00 12.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
VecNormalize 22 1.0 2.4458e-01 1.9 5.21e+06 1.1 0.0e+00 0.0e+00 2.2e+01 0 0 0 0 6 0 1 0 0 21 651
MatMult 82 1.0 4.5575e+00 4.0 1.12e+08 1.1 2.0e+04 7.5e+03 0.0e+00 0 0 52 15 0 1 13 74 62 0 750
MatMultAdd 10 1.0 2.4382e+00 30.3 5.21e+06 1.0 1.2e+03 2.1e+03 0.0e+00 0 0 3 0 0 0 1 4 1 0 66
MatMultTranspose 10 1.0 1.6895e-01 17.1 5.21e+06 1.0 1.2e+03 2.1e+03 0.0e+00 0 0 3 0 0 0 1 4 1 0 956
MatSolve 5 1.0 5.0062e+00 7.7 6.36e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 77 0 0 0 4064
PCApply 87 1.0 5.3308e+00 5.4 6.42e+08 1.0 5.0e+03 1.8e+04 4.0e+00 0 1 13 9 1 1 78 18 36 4 3854
MGSmooth Level 0 5 1.0 5.1727e+00 5.6 6.36e+08 1.0 5.0e+03 1.8e+04 0.0e+00 0 1 13 9 0 1 77 18 36 0 3934
MGSmooth Level 1 10 1.0 4.2581e+00 9.2 4.25e+07 1.1 1.3e+04 2.1e+03 5.3e+01 0 0 34 3 15 1 5 48 11 50 297
MGResid Level 1 5 1.0 1.1527e-01 3.0 4.80e+06 1.1 1.8e+03 2.1e+03 0.0e+00 0 0 5 0 0 0 1 7 2 0 1244
MGInterp Level 1 10 1.0 2.3899e+00 26.1 1.20e+06 1.1 1.2e+03 8.7e+02 0.0e+00 0 0 3 0 0 0 0 4 0 0 15
MGSmooth Level 2 10 1.0 4.8321e+02 325.1 1.29e+08 1.1 4.6e+03 2.3e+04 5.3e+01 49 0 12 10 15 98 15 17 43 50 8
MGResid Level 2 5 1.0 1.5715e-01 2.6 9.67e+06 1.1 6.4e+02 2.3e+04 0.0e+00 0 0 2 1 0 0 1 2 6 0 1894
MGInterp Level 2 10 1.0 1.4948e-01 3.1 9.22e+06 1.0 1.2e+03 3.3e+03 0.0e+00 0 0 3 0 0 0 1 4 2 0 1919
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 1 1 564 0
Krylov Solver 7 7 66456 0
DMKSP interface 2 2 1296 0
Vector 46 78 36381912 0
Vector Scatter 13 13 13676 0
Matrix 21 21 822871308 0
Distributed Mesh 3 3 1391808 0
Bipartite Graph 6 6 4752 0
Viewer 2 1 728 0
Index Set 32 32 1989612 0
IS L to G Mapping 3 3 690348 0
Preconditioner 7 7 6432 0
--- Event Stage 1: MG Apply
Vector 32 0 0 0
========================================================================================================================
Average time to get PetscTime(): 5.00679e-07
Average time for MPI_Barrier(): 0.000592613
Average time for zero size MPI_Send(): 0.000596561
#PETSc Option Table entries:
-da_refine 2
-dm_view
-ksp_monitor
-ksp_rtol 1.0e-7
-ksp_type cg
-ksp_view
-log_summary
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-pc_mg_galerkin
-pc_mg_log
-pc_mg_monitor
-pc_type mg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Thu Jun 13 15:51:55 2013
Configure options: --download-f-blas-lapack --download-hypre --download-mpich --with-cc=gcc --with-debugging=no --with-fc=gfortran PETSC_ARCH=linux-gnu-c-nodebug
-----------------------------------------
Libraries compiled on Thu Jun 13 15:51:55 2013 on login1.ittc.ku.edu
Machine characteristics: Linux-2.6.32-220.13.1.el6.x86_64-x86_64-with-redhat-6.2-Santiago
Using PETSc directory: /bio/work1/zlwei/PETSc/petsc-dev
Using PETSc arch: linux-gnu-c-nodebug
-----------------------------------------
Using C compiler: /bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/bin/mpif90 -fPIC -Wall -Wno-unused-variable -O ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/include -I/bio/work1/zlwei/PETSc/petsc-dev/include -I/bio/work1/zlwei/PETSc/petsc-dev/include -I/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/include
-----------------------------------------
Using C linker: /bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/bin/mpicc
Using Fortran linker: /bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/bin/mpif90
Using libraries: -Wl,-rpath,/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/lib -L/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/lib -lpetsc -Wl,-rpath,/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/lib -L/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/lib -lHYPRE -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lmpichcxx -lstdc++ -lflapack -lfblas -lX11 -lpthread -lmpichf90 -lgfortran -lm -lm -lmpichcxx -lstdc++ -lmpichcxx -lstdc++ -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl
-----------------------------------------
-------------- next part --------------
Processor [0] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 0 65, Z range of indices: 0 33
Processor [1] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 0 65, Z range of indices: 0 33
Processor [2] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 0 65, Z range of indices: 0 33
Processor [3] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 0 65, Z range of indices: 0 33
Processor [4] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 65 129, Z range of indices: 0 33
Processor [5] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 65 129, Z range of indices: 0 33
Processor [6] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 65 129, Z range of indices: 0 33
Processor [7] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 65 129, Z range of indices: 0 33
Processor [8] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 0 65, Z range of indices: 33 65
Processor [9] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 0 65, Z range of indices: 33 65
Processor [10] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 0 65, Z range of indices: 33 65
Processor [11] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 0 65, Z range of indices: 33 65
Processor [12] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 65 129, Z range of indices: 33 65
Processor [13] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 65 129, Z range of indices: 33 65
Processor [14] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 65 129, Z range of indices: 33 65
Processor [15] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 65 129, Z range of indices: 33 65
Processor [16] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 0 65, Z range of indices: 65 97
Processor [17] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 0 65, Z range of indices: 65 97
Processor [18] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 0 65, Z range of indices: 65 97
Processor [19] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 0 65, Z range of indices: 65 97
Processor [20] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 65 129, Z range of indices: 65 97
Processor [21] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 65 129, Z range of indices: 65 97
Processor [22] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 65 129, Z range of indices: 65 97
Processor [23] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 65 129, Z range of indices: 65 97
Processor [24] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 0 65, Z range of indices: 97 129
Processor [25] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 0 65, Z range of indices: 97 129
Processor [26] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 0 65, Z range of indices: 97 129
Processor [27] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 0 65, Z range of indices: 97 129
Processor [28] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 65, Y range of indices: 65 129, Z range of indices: 97 129
Processor [29] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 129, Y range of indices: 65 129, Z range of indices: 97 129
Processor [30] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 129 193, Y range of indices: 65 129, Z range of indices: 97 129
Processor [31] M 257 N 129 P 129 m 4 n 2 p 4 w 1 s 1
X range of indices: 193 257, Y range of indices: 65 129, Z range of indices: 97 129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
mx = 257, my = 129, mz =129
Processor [0] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 0 33, Z range of indices: 0 17
Processor [1] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 0 33, Z range of indices: 0 17
Processor [2] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 0 33, Z range of indices: 0 17
Processor [3] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 0 33, Z range of indices: 0 17
Processor [4] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 33 65, Z range of indices: 0 17
Processor [5] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 33 65, Z range of indices: 0 17
Processor [6] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 33 65, Z range of indices: 0 17
Processor [7] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 33 65, Z range of indices: 0 17
Processor [8] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 0 33, Z range of indices: 17 33
Processor [9] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 0 33, Z range of indices: 17 33
Processor [10] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 0 33, Z range of indices: 17 33
Processor [11] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 0 33, Z range of indices: 17 33
Processor [12] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 33 65, Z range of indices: 17 33
Processor [13] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 33 65, Z range of indices: 17 33
Processor [14] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 33 65, Z range of indices: 17 33
Processor [15] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 33 65, Z range of indices: 17 33
Processor [16] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 0 33, Z range of indices: 33 49
Processor [17] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 0 33, Z range of indices: 33 49
Processor [18] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 0 33, Z range of indices: 33 49
Processor [19] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 0 33, Z range of indices: 33 49
Processor [20] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 33 65, Z range of indices: 33 49
Processor [21] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 33 65, Z range of indices: 33 49
Processor [22] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 33 65, Z range of indices: 33 49
Processor [23] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 33 65, Z range of indices: 33 49
Processor [24] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 0 33, Z range of indices: 49 65
Processor [25] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 0 33, Z range of indices: 49 65
Processor [26] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 0 33, Z range of indices: 49 65
Processor [27] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 0 33, Z range of indices: 49 65
Processor [28] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 33, Y range of indices: 33 65, Z range of indices: 49 65
Processor [29] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 65, Y range of indices: 33 65, Z range of indices: 49 65
Processor [30] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 65 97, Y range of indices: 33 65, Z range of indices: 49 65
Processor [31] M 129 N 65 P 65 m 4 n 2 p 4 w 1 s 1
X range of indices: 97 129, Y range of indices: 33 65, Z range of indices: 49 65
Processor [0] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 0 17, Z range of indices: 0 9
Processor [1] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 0 17, Z range of indices: 0 9
Processor [2] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 0 17, Z range of indices: 0 9
Processor [3] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 0 17, Z range of indices: 0 9
Processor [4] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 17 33, Z range of indices: 0 9
Processor [5] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 17 33, Z range of indices: 0 9
Processor [6] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 17 33, Z range of indices: 0 9
Processor [7] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 17 33, Z range of indices: 0 9
Processor [8] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 0 17, Z range of indices: 9 17
Processor [9] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 0 17, Z range of indices: 9 17
Processor [10] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 0 17, Z range of indices: 9 17
Processor [11] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 0 17, Z range of indices: 9 17
Processor [12] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 17 33, Z range of indices: 9 17
Processor [13] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 17 33, Z range of indices: 9 17
Processor [14] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 17 33, Z range of indices: 9 17
Processor [15] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 17 33, Z range of indices: 9 17
Processor [16] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 0 17, Z range of indices: 17 25
Processor [17] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 0 17, Z range of indices: 17 25
Processor [18] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 0 17, Z range of indices: 17 25
Processor [19] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 0 17, Z range of indices: 17 25
Processor [20] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 17 33, Z range of indices: 17 25
Processor [21] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 17 33, Z range of indices: 17 25
Processor [22] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 17 33, Z range of indices: 17 25
Processor [23] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 17 33, Z range of indices: 17 25
Processor [24] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 0 17, Z range of indices: 25 33
Processor [25] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 0 17, Z range of indices: 25 33
Processor [26] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 0 17, Z range of indices: 25 33
Processor [27] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 0 17, Z range of indices: 25 33
Processor [28] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 17, Y range of indices: 17 33, Z range of indices: 25 33
Processor [29] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 33, Y range of indices: 17 33, Z range of indices: 25 33
Processor [30] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 33 49, Y range of indices: 17 33, Z range of indices: 25 33
Processor [31] M 65 N 33 P 33 m 4 n 2 p 4 w 1 s 1
X range of indices: 49 65, Y range of indices: 17 33, Z range of indices: 25 33
Processor [0] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 9, Y range of indices: 0 9, Z range of indices: 0 5
Processor [1] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 17, Y range of indices: 0 9, Z range of indices: 0 5
Processor [2] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 25, Y range of indices: 0 9, Z range of indices: 0 5
Processor [3] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 25 33, Y range of indices: 0 9, Z range of indices: 0 5
Processor [4] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 9, Y range of indices: 9 17, Z range of indices: 0 5
Processor [5] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 17, Y range of indices: 9 17, Z range of indices: 0 5
Processor [6] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 25, Y range of indices: 9 17, Z range of indices: 0 5
Processor [7] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 25 33, Y range of indices: 9 17, Z range of indices: 0 5
Processor [8] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 9, Y range of indices: 0 9, Z range of indices: 5 9
Processor [9] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 17, Y range of indices: 0 9, Z range of indices: 5 9
Processor [10] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 25, Y range of indices: 0 9, Z range of indices: 5 9
Processor [11] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 25 33, Y range of indices: 0 9, Z range of indices: 5 9
Processor [12] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 9, Y range of indices: 9 17, Z range of indices: 5 9
Processor [13] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 17, Y range of indices: 9 17, Z range of indices: 5 9
Processor [14] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 25, Y range of indices: 9 17, Z range of indices: 5 9
Processor [15] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 25 33, Y range of indices: 9 17, Z range of indices: 5 9
Processor [16] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 9, Y range of indices: 0 9, Z range of indices: 9 13
Processor [17] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 17, Y range of indices: 0 9, Z range of indices: 9 13
Processor [18] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 25, Y range of indices: 0 9, Z range of indices: 9 13
Processor [19] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 25 33, Y range of indices: 0 9, Z range of indices: 9 13
Processor [20] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 9, Y range of indices: 9 17, Z range of indices: 9 13
Processor [21] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 17, Y range of indices: 9 17, Z range of indices: 9 13
Processor [22] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 25, Y range of indices: 9 17, Z range of indices: 9 13
Processor [23] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 25 33, Y range of indices: 9 17, Z range of indices: 9 13
Processor [24] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 9, Y range of indices: 0 9, Z range of indices: 13 17
Processor [25] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 17, Y range of indices: 0 9, Z range of indices: 13 17
Processor [26] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 25, Y range of indices: 0 9, Z range of indices: 13 17
Processor [27] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 25 33, Y range of indices: 0 9, Z range of indices: 13 17
Processor [28] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 9, Y range of indices: 9 17, Z range of indices: 13 17
Processor [29] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 17, Y range of indices: 9 17, Z range of indices: 13 17
Processor [30] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 17 25, Y range of indices: 9 17, Z range of indices: 13 17
Processor [31] M 33 N 17 P 17 m 4 n 2 p 4 w 1 s 1
X range of indices: 25 33, Y range of indices: 9 17, Z range of indices: 13 17
Processor [0] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 5, Y range of indices: 0 5, Z range of indices: 0 3
Processor [1] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 9, Y range of indices: 0 5, Z range of indices: 0 3
Processor [2] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 13, Y range of indices: 0 5, Z range of indices: 0 3
Processor [3] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 13 17, Y range of indices: 0 5, Z range of indices: 0 3
Processor [4] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 5, Y range of indices: 5 9, Z range of indices: 0 3
Processor [5] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 9, Y range of indices: 5 9, Z range of indices: 0 3
Processor [6] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 13, Y range of indices: 5 9, Z range of indices: 0 3
Processor [7] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 13 17, Y range of indices: 5 9, Z range of indices: 0 3
Processor [8] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 5, Y range of indices: 0 5, Z range of indices: 3 5
Processor [9] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 9, Y range of indices: 0 5, Z range of indices: 3 5
Processor [10] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 13, Y range of indices: 0 5, Z range of indices: 3 5
Processor [11] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 13 17, Y range of indices: 0 5, Z range of indices: 3 5
Processor [12] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 5, Y range of indices: 5 9, Z range of indices: 3 5
Processor [13] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 9, Y range of indices: 5 9, Z range of indices: 3 5
Processor [14] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 13, Y range of indices: 5 9, Z range of indices: 3 5
Processor [15] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 13 17, Y range of indices: 5 9, Z range of indices: 3 5
Processor [16] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 5, Y range of indices: 0 5, Z range of indices: 5 7
Processor [17] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 9, Y range of indices: 0 5, Z range of indices: 5 7
Processor [18] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 13, Y range of indices: 0 5, Z range of indices: 5 7
Processor [19] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 13 17, Y range of indices: 0 5, Z range of indices: 5 7
Processor [20] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 5, Y range of indices: 5 9, Z range of indices: 5 7
Processor [21] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 9, Y range of indices: 5 9, Z range of indices: 5 7
Processor [22] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 13, Y range of indices: 5 9, Z range of indices: 5 7
Processor [23] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 13 17, Y range of indices: 5 9, Z range of indices: 5 7
Processor [24] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 5, Y range of indices: 0 5, Z range of indices: 7 9
Processor [25] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 9, Y range of indices: 0 5, Z range of indices: 7 9
Processor [26] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 13, Y range of indices: 0 5, Z range of indices: 7 9
Processor [27] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 13 17, Y range of indices: 0 5, Z range of indices: 7 9
Processor [28] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 5, Y range of indices: 5 9, Z range of indices: 7 9
Processor [29] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 9, Y range of indices: 5 9, Z range of indices: 7 9
Processor [30] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 9 13, Y range of indices: 5 9, Z range of indices: 7 9
Processor [31] M 17 N 9 P 9 m 4 n 2 p 4 w 1 s 1
X range of indices: 13 17, Y range of indices: 5 9, Z range of indices: 7 9
Processor [0] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 3, Y range of indices: 0 3, Z range of indices: 0 2
Processor [1] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 3 5, Y range of indices: 0 3, Z range of indices: 0 2
Processor [2] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 7, Y range of indices: 0 3, Z range of indices: 0 2
Processor [3] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 7 9, Y range of indices: 0 3, Z range of indices: 0 2
Processor [4] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 3, Y range of indices: 3 5, Z range of indices: 0 2
Processor [5] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 3 5, Y range of indices: 3 5, Z range of indices: 0 2
Processor [6] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 7, Y range of indices: 3 5, Z range of indices: 0 2
Processor [7] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 7 9, Y range of indices: 3 5, Z range of indices: 0 2
Processor [8] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 3, Y range of indices: 0 3, Z range of indices: 2 3
Processor [9] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 3 5, Y range of indices: 0 3, Z range of indices: 2 3
Processor [10] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 7, Y range of indices: 0 3, Z range of indices: 2 3
Processor [11] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 7 9, Y range of indices: 0 3, Z range of indices: 2 3
Processor [12] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 3, Y range of indices: 3 5, Z range of indices: 2 3
Processor [13] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 3 5, Y range of indices: 3 5, Z range of indices: 2 3
Processor [14] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 7, Y range of indices: 3 5, Z range of indices: 2 3
Processor [15] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 7 9, Y range of indices: 3 5, Z range of indices: 2 3
Processor [16] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 3, Y range of indices: 0 3, Z range of indices: 3 4
Processor [17] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 3 5, Y range of indices: 0 3, Z range of indices: 3 4
Processor [18] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 7, Y range of indices: 0 3, Z range of indices: 3 4
Processor [19] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 7 9, Y range of indices: 0 3, Z range of indices: 3 4
Processor [20] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 3, Y range of indices: 3 5, Z range of indices: 3 4
Processor [21] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 3 5, Y range of indices: 3 5, Z range of indices: 3 4
Processor [22] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 7, Y range of indices: 3 5, Z range of indices: 3 4
Processor [23] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 7 9, Y range of indices: 3 5, Z range of indices: 3 4
Processor [24] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 3, Y range of indices: 0 3, Z range of indices: 4 5
Processor [25] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 3 5, Y range of indices: 0 3, Z range of indices: 4 5
Processor [26] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 7, Y range of indices: 0 3, Z range of indices: 4 5
Processor [27] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 7 9, Y range of indices: 0 3, Z range of indices: 4 5
Processor [28] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 0 3, Y range of indices: 3 5, Z range of indices: 4 5
Processor [29] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 3 5, Y range of indices: 3 5, Z range of indices: 4 5
Processor [30] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 5 7, Y range of indices: 3 5, Z range of indices: 4 5
Processor [31] M 9 N 5 P 5 m 4 n 2 p 4 w 1 s 1
X range of indices: 7 9, Y range of indices: 3 5, Z range of indices: 4 5
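The grid sizes printed above halve the number of intervals in each direction from one level to the next, so the unknown count per level can be checked by hand. A minimal Python sketch follows; it is not part of the original run. The helper names `coarsen` and `hierarchy` are made up for illustration, the rule n -> (n - 1)/2 + 1 is inferred from the printed M, N, P values, and one degree of freedom per node is assumed (w 1 in the output):

```python
from math import prod


def coarsen(dims):
    """Halve the number of intervals per direction: n -> (n - 1) // 2 + 1.

    Assumption: this rule is inferred from the M, N, P values in the log.
    """
    return tuple((n - 1) // 2 + 1 for n in dims)


def hierarchy(fine_dims, levels):
    """Return the grid dimensions from finest to coarsest level."""
    grids = [tuple(fine_dims)]
    for _ in range(levels - 1):
        grids.append(coarsen(grids[-1]))
    return grids


# Finest grid and level count taken from the output (M 257 N 129 P 129, levels=6).
grids = hierarchy((257, 129, 129), levels=6)
unknowns = [prod(g) for g in grids]  # one unknown per node (w = 1)

for g, n in zip(grids, unknowns):
    print(f"{g[0]:4d} x {g[1]:4d} x {g[2]:4d} -> {n} unknowns")
```

Under these assumptions the coarsest level comes out at 9 x 5 x 5 = 225 unknowns and the next one at 17 x 9 x 9 = 1377, which agree with the rows=225 and rows=1377 entries in the -ksp_view output. The 65 x 33 x 33 level gives 70785, the coarse grid size quoted for the 2-level case.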
0 KSP Residual norm 1.990474015208e+03
1 KSP Residual norm 1.163078153200e+02
2 KSP Residual norm 2.809444096980e+00
3 KSP Residual norm 2.139770554363e-01
4 KSP Residual norm 4.835908670273e-02
KSP Object: 32 MPI processes
type: cg
maximum iterations=10000
tolerances: relative=1e-07, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: 32 MPI processes
type: mg
MG: type is MULTIPLICATIVE, levels=6 cycles=v
Cycles per PCApply=1
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 32 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 32 MPI processes
type: redundant
Redundant preconditioner: First (color=0) of 32 PCs follows
KSP Object: (mg_coarse_redundant_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_redundant_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot
matrix ordering: nd
factor fill ratio given 5, needed 3.15101
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=225, cols=225
package used to perform factorization: petsc
total: nonzeros=13313, allocated nonzeros=13313
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 174 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=225, cols=225
total: nonzeros=4225, allocated nonzeros=6075
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=225, cols=225
total: nonzeros=4225, allocated nonzeros=4225
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 32 MPI processes
type: chebyshev
Chebyshev: eigenvalue estimates: min = 0.283115, max = 3.11426
Chebyshev: estimated using: [0 0.1; 0 1.1]
KSP Object: (mg_levels_1_est_) 32 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=1377, cols=1377
total: nonzeros=30625, allocated nonzeros=30625
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=1377, cols=1377
total: nonzeros=30625, allocated nonzeros=30625
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 32 MPI processes
type: chebyshev
Chebyshev: eigenvalue estimates: min = 0.285627, max = 3.1419
Chebyshev: estimated using: [0 0.1; 0 1.1]
KSP Object: (mg_levels_2_est_) 32 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=9537, cols=9537
total: nonzeros=232897, allocated nonzeros=232897
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=9537, cols=9537
total: nonzeros=232897, allocated nonzeros=232897
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 32 MPI processes
type: chebyshev
Chebyshev: eigenvalue estimates: min = 0.275571, max = 3.03128
Chebyshev: estimated using: [0 0.1; 0 1.1]
KSP Object: (mg_levels_3_est_) 32 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_3_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=70785, cols=70785
total: nonzeros=1815937, allocated nonzeros=1815937
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_3_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=70785, cols=70785
total: nonzeros=1815937, allocated nonzeros=1815937
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 32 MPI processes
type: chebyshev
Chebyshev: eigenvalue estimates: min = 0.231542, max = 2.54696
Chebyshev: estimated using: [0 0.1; 0 1.1]
KSP Object: (mg_levels_4_est_) 32 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_4_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=545025, cols=545025
total: nonzeros=14340865, allocated nonzeros=14340865
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_4_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=545025, cols=545025
total: nonzeros=14340865, allocated nonzeros=14340865
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 5 -------------------------------
KSP Object: (mg_levels_5_) 32 MPI processes
type: chebyshev
Chebyshev: eigenvalue estimates: min = 0.155706, max = 1.71277
Chebyshev: estimated using: [0 0.1; 0 1.1]
KSP Object: (mg_levels_5_est_) 32 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_5_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=4276737, cols=4276737
total: nonzeros=29771265, allocated nonzeros=29771265
total number of mallocs used during MatSetValues calls =0
maximum iterations=2
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using NONE norm type for convergence test
PC Object: (mg_levels_5_) 32 MPI processes
type: jacobi
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=4276737, cols=4276737
total: nonzeros=29771265, allocated nonzeros=29771265
total number of mallocs used during MatSetValues calls =0
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Matrix Object: 32 MPI processes
type: mpiaij
rows=4276737, cols=4276737
total: nonzeros=29771265, allocated nonzeros=29771265
total number of mallocs used during MatSetValues calls =0
Residual norm 0.000947651
Total Time Elapsed: 5.583765
Total Time Elapsed: 5.584491
Total Time Elapsed: 5.245622
Total Time Elapsed: 5.245764
Total Time Elapsed: 5.585519
Total Time Elapsed: 5.245926
Total Time Elapsed: 5.585674
Total Time Elapsed: 4.979911
Total Time Elapsed: 4.983441
Total Time Elapsed: 4.980154
Total Time Elapsed: 4.983256
Total Time Elapsed: 4.979283
Total Time Elapsed: 4.983043
Total Time Elapsed: 4.982123
Total Time Elapsed: 5.247401
Total Time Elapsed: 5.247372
Total Time Elapsed: 5.247512
Total Time Elapsed: 5.248704
Total Time Elapsed: 5.588453
Total Time Elapsed: 5.248888
Total Time Elapsed: 4.987055
Total Time Elapsed: 5.589496
Total Time Elapsed: 5.113992
Total Time Elapsed: 5.117246
Total Time Elapsed: 5.591055
Total Time Elapsed: 5.591275
Total Time Elapsed: 5.116060
Total Time Elapsed: 5.119928
Total Time Elapsed: 5.114534
Total Time Elapsed: 5.118732
Total Time Elapsed: 5.123729
Total Time Elapsed: 5.125026
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex45 on a linux-gnu-c-nodebug named n042 with 32 processors, by zlwei Wed Nov 6 11:35:09 2013
Using Petsc Development GIT revision: d696997672013bb4513d3ff57c61cc10e09b71f6 GIT Date: 2013-06-13 10:28:37 -0500
Max Max/Min Avg Total
Time (sec): 4.978e+00 1.00287 4.969e+00
Objects: 3.570e+02 1.00000 3.570e+02
Flops: 2.438e+08 1.08372 2.320e+08 7.424e+09
Flops/sec: 4.911e+07 1.08434 4.669e+07 1.494e+09
MPI Messages: 4.234e+03 2.12444 3.045e+03 9.745e+04
MPI Message Lengths: 9.073e+06 1.78864 2.350e+03 2.291e+08
MPI Reductions: 7.370e+02 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.5054e+00 50.4% 1.2102e+09 16.3% 2.114e+04 21.7% 6.174e+02 26.3% 4.710e+02 63.9%
1: MG Apply: 2.4639e+00 49.6% 6.2142e+09 83.7% 7.631e+04 78.3% 1.733e+03 73.7% 2.650e+02 36.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %f - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
KSPSetUp 8 1.0 6.8008e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+01 1 0 0 0 5 2 0 0 0 8 0
Warning -- total time of even greater than time of entire stage -- something is wrong with the timer
KSPSolve 1 1.0 4.7911e+00 1.0 2.41e+08 1.1 9.7e+04 2.3e+03 7.1e+02 96 99 99 98 96 191607458373150 1535
VecTDot 9 1.0 1.3888e-01 4.1 2.51e+06 1.1 0.0e+00 0.0e+00 9.0e+00 2 1 0 0 1 4 6 0 0 2 554
VecNorm 6 1.0 3.6860e-0110.9 1.67e+06 1.1 0.0e+00 0.0e+00 6.0e+00 4 1 0 0 1 8 4 0 0 1 139
VecCopy 2 1.0 3.1111e-03 4.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 27 1.0 1.3712e-0241.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 9 1.0 3.5091e-0213.5 2.51e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 6 0 0 0 2194
VecAYPX 3 1.0 2.2887e-0224.0 8.37e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 1121
VecScatterBegin 10 1.0 3.3240e-03 5.2 0.00e+00 0.0 1.2e+03 1.2e+04 0.0e+00 0 0 1 7 0 0 0 6 25 0 0
VecScatterEnd 10 1.0 3.6685e-0251.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatMult 5 1.0 9.9055e-02 5.5 8.98e+06 1.1 6.4e+02 2.3e+04 0.0e+00 1 4 1 6 0 2 23 3 24 0 2790
MatMultTranspose 5 1.0 2.3519e-02 8.9 1.06e+06 1.0 5.8e+02 9.0e+02 0.0e+00 0 0 1 0 0 0 3 3 1 0 1397
MatLUFactorSym 1 1.0 3.5100e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0
MatLUFactorNum 1 1.0 3.3560e-03 6.0 4.79e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 4564
MatAssemblyBegin 23 1.0 5.5205e-01 4.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 6 0 0 0 3 13 0 0 0 5 0
MatAssemblyEnd 23 1.0 2.5726e-01 1.6 0.00e+00 0.0 5.1e+03 4.5e+02 8.8e+01 4 0 5 1 12 8 0 24 4 19 0
MatGetRowIJ 1 1.0 1.5712e-04 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 5.7721e-04 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 14 1.2 7.8599e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 2 0 0 0 0 3 0
MatPtAP 5 1.0 8.9238e-01 1.0 2.14e+07 1.1 1.1e+04 3.4e+03 1.2e+02 18 9 11 16 17 36 54 51 61 27 734
MatPtAPSymbolic 5 1.0 5.3509e-01 1.1 0.00e+00 0.0 6.5e+03 4.1e+03 7.5e+01 10 0 7 12 10 21 0 31 45 16 0
MatPtAPNumeric 5 1.0 3.8300e-01 1.1 2.14e+07 1.1 4.3e+03 2.2e+03 5.0e+01 7 9 4 4 7 15 54 20 16 11 1710
MatGetRedundant 1 1.0 2.0813e-02 3.6 0.00e+00 0.0 3.0e+03 5.5e+02 4.0e+00 0 0 3 1 1 0 0 14 3 1 0
MatGetLocalMat 5 1.0 7.4365e-02 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 1 0 0 0 1 1 0 0 0 2 0
MatGetBrAoCol 5 1.0 9.1380e-02 2.3 0.00e+00 0.0 4.8e+03 4.6e+03 1.0e+01 1 0 5 10 1 3 0 23 37 2 0
MatGetSymTrans 10 1.0 2.0689e-02 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCSetUp 1 1.0 1.4868e+00 1.0 2.29e+07 1.1 2.0e+04 2.1e+03 4.0e+02 30 9 20 18 54 59 58 94 70 85 473
PCApply 5 1.0 2.6097e+00 1.1 2.04e+08 1.1 7.6e+04 2.2e+03 2.6e+02 50 84 78 74 36 98513361281 56 2381
MGSetup Level 0 1 1.0 5.2823e-02 1.4 4.79e+05 1.0 5.1e+03 3.4e+02 2.7e+01 1 0 5 1 4 2 1 24 3 6 290
MGSetup Level 1 1 1.0 9.4671e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0
MGSetup Level 2 1 1.0 4.6070e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0
MGSetup Level 3 1 1.0 4.6048e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0
MGSetup Level 4 1 1.0 6.0458e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0
MGSetup Level 5 1 1.0 2.0175e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 1 0 0 0 1 0
--- Event Stage 1: MG Apply
KSPGMRESOrthog 50 1.0 5.0590e-01 1.9 3.54e+07 1.1 0.0e+00 0.0e+00 5.0e+01 8 15 0 0 7 16 17 0 0 19 2132
KSPSetUp 5 1.0 1.7280e-01 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+01 2 0 0 0 7 4 0 0 0 19 0
KSPSolve 55 1.0 2.2699e+00 1.2 1.79e+08 1.1 6.3e+04 2.3e+03 2.6e+02 43 73 64 63 36 86 87 82 85100 2389
VecMDot 50 1.0 4.8221e-01 2.4 1.77e+07 1.1 0.0e+00 0.0e+00 5.0e+01 7 7 0 0 7 14 9 0 0 19 1119
VecNorm 55 1.0 2.5947e-01 2.1 3.54e+06 1.1 0.0e+00 0.0e+00 5.5e+01 4 1 0 0 7 8 2 0 0 21 416
VecScale 155 1.0 5.0950e-02 9.8 4.99e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 1 2 0 0 0 2983
VecCopy 30 1.0 1.3113e-02 7.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 105 1.0 1.9075e-0219.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 210 1.0 8.1555e-02 8.1 1.35e+07 1.1 0.0e+00 0.0e+00 0.0e+00 1 6 0 0 0 2 7 0 0 0 5050
VecAYPX 200 1.0 8.0734e-02 7.4 8.05e+06 1.1 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 2 4 0 0 0 3037
VecMAXPY 55 1.0 1.2952e-01 9.1 2.09e+07 1.1 0.0e+00 0.0e+00 0.0e+00 1 9 0 0 0 2 10 0 0 0 4921
VecPointwiseMult 205 1.0 1.2314e-01 8.1 6.60e+06 1.1 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 3 3 0 0 0 1633
VecScatterBegin 265 1.0 1.2376e-01 6.9 0.00e+00 0.0 7.6e+04 2.2e+03 0.0e+00 1 0 78 74 0 2 0100100 0 0
VecScatterEnd 265 1.0 1.2838e+00 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 0 33 0 0 0 0 0
VecNormalize 55 1.0 2.6221e-01 1.8 5.31e+06 1.1 0.0e+00 0.0e+00 5.5e+01 4 2 0 0 7 8 3 0 0 21 617
MatMult 205 1.0 1.4328e+00 1.4 1.18e+08 1.1 6.6e+04 2.5e+03 0.0e+00 25 48 67 71 0 51 58 86 97 0 2503
MatMultAdd 25 1.0 1.4013e-01 2.4 5.31e+06 1.0 2.9e+03 9.0e+02 0.0e+00 2 2 3 1 0 3 3 4 2 0 1172
MatMultTranspose 25 1.0 1.2134e-0111.7 5.31e+06 1.0 2.9e+03 9.0e+02 0.0e+00 1 2 3 1 0 3 3 4 2 0 1354
MatSolve 5 1.0 6.1536e-04 3.8 1.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6865
PCApply 210 1.0 1.7584e-01 2.9 6.73e+06 1.1 5.0e+03 5.6e+01 1.0e+01 2 3 5 0 1 5 3 6 0 4 1167
MGSmooth Level 0 5 1.0 1.6812e-02 3.7 1.32e+05 1.0 5.0e+03 5.6e+01 0.0e+00 0 0 5 0 0 0 0 6 0 0 251
MGSmooth Level 1 10 1.0 7.2483e-02 1.1 1.45e+05 2.5 1.3e+04 5.3e+01 5.3e+01 1 0 14 0 7 3 0 17 0 20 39
MGResid Level 1 5 1.0 8.1933e-03 2.0 1.57e+04 2.6 1.8e+03 5.3e+01 0.0e+00 0 0 2 0 0 0 0 2 0 0 37
MGInterp Level 1 10 1.0 1.0109e-0210.1 3.92e+03 1.8 1.2e+03 2.4e+01 0.0e+00 0 0 1 0 0 0 0 2 0 0 8
MGSmooth Level 2 10 1.0 7.6197e-02 1.1 8.54e+05 1.6 1.3e+04 1.6e+02 5.3e+01 1 0 14 1 7 3 0 17 1 20 273
MGResid Level 2 5 1.0 6.6566e-03 2.0 9.46e+04 1.6 1.8e+03 1.6e+02 0.0e+00 0 0 2 0 0 0 0 2 0 0 350
MGInterp Level 2 10 1.0 7.4329e-03 4.0 2.37e+04 1.4 1.2e+03 7.1e+01 0.0e+00 0 0 1 0 0 0 0 2 0 0 82
MGSmooth Level 3 10 1.0 1.4607e-01 1.4 5.79e+06 1.3 1.3e+04 5.7e+02 5.3e+01 3 2 14 3 7 5 3 17 4 20 1102
MGResid Level 3 5 1.0 1.9478e-02 3.1 6.50e+05 1.3 1.8e+03 5.7e+02 0.0e+00 0 0 2 0 0 0 0 2 1 0 932
MGInterp Level 3 10 1.0 1.0718e-02 3.6 1.62e+05 1.2 1.2e+03 2.4e+02 0.0e+00 0 0 1 0 0 0 0 2 0 0 435
MGSmooth Level 4 10 1.0 6.5122e-01 1.4 4.25e+07 1.1 1.3e+04 2.1e+03 5.3e+01 12 17 14 12 7 23 20 17 17 20 1943
MGResid Level 4 5 1.0 1.2598e-01 5.1 4.80e+06 1.1 1.8e+03 2.1e+03 0.0e+00 1 2 2 2 0 2 2 2 2 0 1138
MGInterp Level 4 10 1.0 4.8600e-02 5.7 1.20e+06 1.1 1.2e+03 8.7e+02 0.0e+00 0 0 1 0 0 1 1 2 1 0 747
MGSmooth Level 5 10 1.0 1.5219e+00 1.5 1.29e+08 1.1 4.6e+03 2.3e+04 5.3e+01 26 53 5 46 7 52 64 6 62 20 2608
MGResid Level 5 5 1.0 1.6775e-01 2.4 9.67e+06 1.1 6.4e+02 2.3e+04 0.0e+00 2 4 1 6 0 4 5 1 9 0 1775
MGInterp Level 5 10 1.0 1.9435e-01 3.1 9.22e+06 1.0 1.2e+03 3.3e+03 0.0e+00 2 4 1 2 0 4 5 2 2 0 1476
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 1 1 564 0
Krylov Solver 13 13 160752 0
DMKSP interface 4 4 2592 0
Vector 91 171 35911200 0
Vector Scatter 25 25 26300 0
Matrix 45 45 36052932 0
Distributed Mesh 6 6 1412736 0
Bipartite Graph 12 12 9504 0
Viewer 2 1 728 0
Index Set 59 59 788488 0
IS L to G Mapping 6 6 695256 0
Preconditioner 13 13 11736 0
--- Event Stage 1: MG Apply
Vector 80 0 0 0
========================================================================================================================
Average time to get PetscTime(): 5.00679e-07
Average time for MPI_Barrier(): 0.000635576
Average time for zero size MPI_Send(): 0.000971369
#PETSc Option Table entries:
-da_refine 5
-dm_view
-ksp_monitor
-ksp_rtol 1.0e-7
-ksp_type cg
-ksp_view
-log_summary
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type jacobi
-pc_mg_galerkin
-pc_mg_log
-pc_mg_monitor
-pc_type mg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Thu Jun 13 15:51:55 2013
Configure options: --download-f-blas-lapack --download-hypre --download-mpich --with-cc=gcc --with-debugging=no --with-fc=gfortran PETSC_ARCH=linux-gnu-c-nodebug
-----------------------------------------
Libraries compiled on Thu Jun 13 15:51:55 2013 on login1.ittc.ku.edu
Machine characteristics: Linux-2.6.32-220.13.1.el6.x86_64-x86_64-with-redhat-6.2-Santiago
Using PETSc directory: /bio/work1/zlwei/PETSc/petsc-dev
Using PETSc arch: linux-gnu-c-nodebug
-----------------------------------------
Using C compiler: /bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/bin/mpif90 -fPIC -Wall -Wno-unused-variable -O ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/include -I/bio/work1/zlwei/PETSc/petsc-dev/include -I/bio/work1/zlwei/PETSc/petsc-dev/include -I/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/include
-----------------------------------------
Using C linker: /bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/bin/mpicc
Using Fortran linker: /bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/bin/mpif90
Using libraries: -Wl,-rpath,/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/lib -L/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/lib -lpetsc -Wl,-rpath,/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/lib -L/bio/work1/zlwei/PETSc/petsc-dev/linux-gnu-c-nodebug/lib -lHYPRE -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lmpichcxx -lstdc++ -lflapack -lfblas -lX11 -lpthread -lmpichf90 -lgfortran -lm -lm -lmpichcxx -lstdc++ -lmpichcxx -lstdc++ -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl
-----------------------------------------
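
The observation that smoothing on the finest level dominates can be checked against the `MGSmooth Level k` rows of the log above. The following is a minimal sketch, not part of the original run: the dictionary and the helper `smoothing_shares` are hypothetical names introduced here, and the times are the "Max" column values copied by hand from the "Event Stage 1: MG Apply" table. Because these are per-event maxima over all ranks, their sum only approximates the total smoothing time.

```python
# Max time (seconds) of each "MGSmooth Level k" event, copied by hand from the
# "Event Stage 1: MG Apply" rows of the log. Level 0 is the coarse-grid solve.
mgsmooth_max_time = {
    0: 1.6812e-02,
    1: 7.2483e-02,
    2: 7.6197e-02,
    3: 1.4607e-01,
    4: 6.5122e-01,
    5: 1.5219e+00,
}


def smoothing_shares(times):
    """Return each level's fraction of the summed per-level smoothing time.

    Hypothetical helper for illustration; the inputs are maxima over ranks,
    so the resulting fractions are estimates rather than exact accounting.
    """
    total = sum(times.values())
    return {level: t / total for level, t in times.items()}


shares = smoothing_shares(mgsmooth_max_time)
for level in sorted(shares):
    print(f"MGSmooth Level {level}: {shares[level]:6.1%}")
```

On these numbers, levels 4 and 5 together account for most of the smoothing time, with level 5 alone above 60%. That is in line with the matrix sizes reported by `-ksp_view` (4276737 rows on level 5 against 545025 on level 4).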