[petsc-users] Scaling with number of cores
TAY wee-beng
zonexo at gmail.com
Sun Nov 1 19:35:47 CST 2015
Hi,
Sorry, I forgot and used the old a.out. I have attached the new log for
48 cores (log48), together with the 96-core log (log96).
Why does the number of processes increase so much? Is there something
wrong with my coding?
Only the Poisson eqn's RHS changes; the LHS doesn't. So if I want to
reuse the preconditioner, what must I do? Or what must I not do?
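
For concreteness, is something like the following minimal sketch the right idea, i.e. set the operators once and call KSPSetReusePreconditioner? (This is not my actual code - the small 1D matrix and the variable names are just placeholders to show which calls I mean.)

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            b, x;
  KSP            ksp;
  PetscInt       i, n = 100, Istart, Iend, step;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  /* Assemble the LHS once; a small 1D Laplacian stands in for the Poisson matrix */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
  for (i = Istart; i < Iend; i++) {
    if (i > 0)   {ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (i < n-1) {ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);

  /* Poisson KSP: operators set once, preconditioner kept between solves */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOptionsPrefix(ksp, "poisson_");CHKERRQ(ierr);       /* picks up -poisson_* options  */
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);                 /* LHS never changes            */
  ierr = KSPSetReusePreconditioner(ksp, PETSC_TRUE);CHKERRQ(ierr); /* keep the (AMG) setup         */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);                     /* e.g. -poisson_pc_type hypre  */

  for (step = 0; step < 10; step++) {
    ierr = VecSet(b, 1.0 + step);CHKERRQ(ierr); /* only the RHS changes every time step       */
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);   /* PCSetUp should run only on the first solve */
  }

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

If I understand correctly, with this the BoomerAMG setup would only be done on the first KSPSolve, as long as I don't reassemble the matrix every time step.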
Lastly, I only simulated 2 time steps previously. Now I have run 10
time steps (log48_10). Is it building the preconditioner at every timestep?
Also, what about the momentum eqn? Is it working well?
I will try the gamg later too.
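
If I understand correctly, since the Poisson solver uses the poisson_ prefix (as shown in the attached ksp_view output), the gamg run would be something like this (just my guess at the exact command line):

mpiexec -n 48 ./a.out -poisson_pc_type gamg -poisson_ksp_view -log_summary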
Thank you
Yours sincerely,
TAY wee-beng
On 2/11/2015 12:30 AM, Barry Smith wrote:
> You used gmres with 48 processes but richardson with 96. You need to be careful and make sure you don't change the solvers when you change the number of processors, since you can get very different, inconsistent results.
>
> Anyway, all the time is being spent in the BoomerAMG algebraic multigrid setup, and it is scaling badly. When you double the problem size and the number of processes, it went from 3.2445e+01 to 4.3599e+02 seconds, more than a 13x increase.
>
> PCSetUp 3 1.0 3.2445e+01 1.0 9.58e+06 2.0 0.0e+00 0.0e+00 4.0e+00 62 8 0 0 4 62 8 0 0 5 11
>
> PCSetUp 3 1.0 4.3599e+02 1.0 9.58e+06 2.0 0.0e+00 0.0e+00 4.0e+00 85 18 0 0 6 85 18 0 0 6 2
>
> Now, is the Poisson problem changing at each timestep, or can you use the same preconditioner built with BoomerAMG for all the time steps? Algebraic multigrid has a large setup time that often doesn't matter if you have many time steps, but if you have to rebuild it at each timestep it is too large.
>
> You might also try -pc_type gamg and see how PETSc's algebraic multigrid scales for your problem/machine.
>
> Barry
>
>
>
>> On Nov 1, 2015, at 7:30 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>
>>
>> On 1/11/2015 10:00 AM, Barry Smith wrote:
>>>> On Oct 31, 2015, at 8:43 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>
>>>>
>>>> On 1/11/2015 12:47 AM, Matthew Knepley wrote:
>>>>> On Sat, Oct 31, 2015 at 11:34 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> I understand that, as mentioned in the FAQ, due to the limitations in memory, the scaling is not linear. So I am trying to write a proposal to use a supercomputer.
>>>>> Its specs are:
>>>>> Compute nodes: 82,944 nodes (SPARC64 VIIIfx; 16GB of memory per node)
>>>>>
>>>>> 8 cores / processor
>>>>> Interconnect: Tofu (6-dimensional mesh/torus)
>>>>> Each cabinet contains 96 computing nodes,
>>>>> One of the requirements is to give the performance of my current code with my current set of data, and there is a formula to calculate the estimated parallel efficiency when using the new, larger set of data.
>>>>> There are 2 ways to give performance:
>>>>> 1. Strong scaling, which is defined as how the elapsed time varies with the number of processors for a fixed
>>>>> total problem size.
>>>>> 2. Weak scaling, which is defined as how the elapsed time varies with the number of processors for a
>>>>> fixed problem size per processor.
>>>>> I ran my cases with 48 and 96 cores on my current cluster, taking 140 and 90 mins respectively. This is classified as strong scaling.
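>>>>> As a rough two-point check of what those timings mean (this is just the simple ratio, not the proposal's formula):
>>>>>
>>>>> speedup (48 -> 96 cores) = 140 / 90 ≈ 1.56
>>>>> parallel efficiency of the doubling = 1.56 / 2 ≈ 0.78, i.e. about 78%
>>>>>
>>>>> The 52.7% figure below is from Amdahl's law, which is a different calculation from this simple ratio.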
>>>>> Cluster specs:
>>>>> CPU: AMD 6234 2.4GHz
>>>>> 8 cores / processor (CPU)
>>>>> 6 CPU / node
>>>>> So 48 cores / node
>>>>> Not sure about the memory / node
>>>>>
>>>>> The parallel efficiency ‘En’ for a given degree of parallelism ‘n’ indicates how efficiently the program is
>>>>> accelerated by parallel processing. ‘En’ is given by the following formula. Although the derivations are
>>>>> different for strong and weak scaling, the derived formulae are the same.
>>>>> From the estimated time, my parallel efficiency using Amdahl's law on the current old cluster was 52.7%.
>>>>> So are my results acceptable?
>>>>> For the large data set, if using 2205 nodes (2205 x 8 cores), my expected parallel efficiency is only 0.5%. The proposal recommends a value of > 50%.
>>>>> The problem with this analysis is that the estimated serial fraction from Amdahl's Law changes as a function
>>>>> of problem size, so you cannot take the strong scaling from one problem and apply it to another without a
>>>>> model of this dependence.
>>>>>
>>>>> Weak scaling does model changes with problem size, so I would measure weak scaling on your current
>>>>> cluster, and extrapolate to the big machine. I realize that this does not make sense for many scientific
>>>>> applications, but neither does requiring a certain parallel efficiency.
>>>> OK, I checked the results for my weak scaling and the expected parallel efficiency is even worse. From the formula used, it's obvious that it applies some sort of exponential decrease in the extrapolation. So unless I can achieve a near >90% speed-up when I double the cores and problem size for my current 48/96-core setup, extrapolating from about 96 nodes to 10,000 nodes will give a much lower expected parallel efficiency for the new case.
>>>>
>>>> However, it's mentioned in the FAQ that, due to memory requirements, it's impossible to get a >90% speed-up when I double the cores and problem size (i.e. a linear increase in performance), which means that I can't get a >90% speed-up when doubling the cores and problem size for my current 48/96-core setup. Is that so?
>>> What is the output of -ksp_view -log_summary on the problem and then on the problem doubled in size and number of processors?
>>>
>>> Barry
>> Hi,
>>
>> I have attached the output
>>
>> 48 cores: log48
>> 96 cores: log96
>>
>> There are 2 solvers - the momentum linear eqn uses bcgs, while the Poisson eqn uses hypre BoomerAMG.
>>
>> Problem size doubled from 158x266x150 to 158x266x300.
>>>> So is it fair to say that the main problem does not lie in my programming skills, but rather in the way the linear equations are solved?
>>>>
>>>> Thanks.
>>>>> Thanks,
>>>>>
>>>>> Matt
>>>>> Is this type of scaling (>50% parallel efficiency) possible in PETSc when using 17640 (2205 x 8) cores?
>>>>> Btw, I do not have access to the system.
>>>>>
>>>>>
>>>>>
>>>>> Sent using CloudMagic Email
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>> -- Norbert Wiener
>> <log48.txt><log96.txt>
-------------- next part --------------
0.000000000000000E+000 0.353000000000000 0.000000000000000E+000
90.0000000000000 0.000000000000000E+000 0.000000000000000E+000
1.00000000000000 0.400000000000000 0 -400000
AB,AA,BB -2.47900002275128 2.50750002410496
3.46600006963126 3.40250006661518
size_x,size_y,size_z 158 266 150
body_cg_ini 0.523700833348298 0.778648765134454
7.03282656467989
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
0.000000000000000E+000 10000000000.0000 1.800000000000000E-002
maximum ngh_surfaces and ngh_vertics are 45 22
minimum ngh_surfaces and ngh_vertics are 28 10
body_cg_ini 0.896813342835977 -0.976707581163755
7.03282656467989
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
0.000000000000000E+000 10000000000.0000 1.800000000000000E-002
maximum ngh_surfaces and ngh_vertics are 45 22
minimum ngh_surfaces and ngh_vertics are 28 10
min IIB_cell_no 0
max IIB_cell_no 429
final initial IIB_cell_no 2145
min I_cell_no 0
max I_cell_no 460
final initial I_cell_no 2300
size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
2145 2300 2145 2300
IIB_I_cell_no_uvw_total1 3090 3094 3078 3080
3074 3073
IIB_I_cell_no_uvw_total2 3102 3108 3089 3077
3060 3086
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
1 0.00150000 0.26454057 0.26151125 1.18591342 -0.76697866E+03 -0.32601415E+02 0.62972429E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
2 0.00150000 0.32176840 0.32263677 1.26788535 -0.60296986E+03 0.32645061E+02 0.62967216E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
3 0.00150000 0.36158843 0.37649782 1.31962547 -0.40206982E+03 0.10005980E+03 0.62965570E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
4 0.00150000 0.38435320 0.41322368 1.35717436 -0.21463805E+03 0.16271834E+03 0.62964387E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
5 0.00150000 0.39753585 0.43993066 1.39058201 -0.42701340E+02 0.22029669E+03 0.62963392E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
6 0.00150000 0.41909332 0.46009046 1.41762692 0.11498677E+03 0.27310502E+03 0.62962522E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
7 0.00150000 0.43914280 0.47568685 1.43956921 0.25987995E+03 0.32149970E+03 0.62961747E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
8 0.00150000 0.45521621 0.48796822 1.45767552 0.39328978E+03 0.36583669E+03 0.62961048E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
9 0.00150000 0.46814054 0.49777707 1.47492488 0.51635507E+03 0.40645276E+03 0.62960413E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
10 0.00150000 0.47856863 0.50571050 1.49141099 0.63006093E+03 0.44366044E+03 0.62959832E+07
body 1
implicit forces and moment 1
-2.47548682245920 1.64962238999444 0.511583428314605
-0.312588669622222 -1.55898599939365 3.29919937188979
body 2
implicit forces and moment 2
-1.50915464344919 -2.39346816361116 0.496546039715906
0.854598810463335 -1.23770041331909 -3.16285577016750
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./a.out on a petsc-3.6.2_shared_rel named n12-04 with 48 processors, by wtay Mon Nov 2 02:22:24 2015
Using Petsc Release Version 3.6.2, Oct, 02, 2015
Max Max/Min Avg Total
Time (sec): 1.003e+02 1.00020 1.003e+02
Objects: 5.200e+01 1.00000 5.200e+01
Flops: 4.731e+08 1.75932 3.622e+08 1.739e+10
Flops/sec: 4.718e+06 1.75942 3.612e+06 1.734e+08
MPI Messages: 4.450e+02 14.35484 6.071e+01 2.914e+03
MPI Message Lengths: 3.714e+07 2.00000 5.991e+05 1.746e+09
MPI Reductions: 2.310e+02 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 1.0029e+02 100.0% 1.7387e+10 100.0% 2.914e+03 100.0% 5.991e+05 100.0% 2.300e+02 99.6%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 18 1.0 1.0074e+00 1.6 1.17e+08 1.9 1.7e+03 9.9e+05 0.0e+00 1 25 58 96 0 1 25 58 96 0 4308
MatSolve 27 1.0 6.7583e-01 1.5 1.61e+08 1.9 0.0e+00 0.0e+00 0.0e+00 1 34 0 0 0 1 34 0 0 0 8698
MatLUFactorNum 9 1.0 9.4694e-01 2.1 8.62e+07 2.0 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 1 18 0 0 0 3256
MatILUFactorSym 1 1.0 8.0723e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatConvert 1 1.0 5.6746e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 10 1.0 1.6173e+0020.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01 1 0 0 0 9 1 0 0 0 9 0
MatAssemblyEnd 10 1.0 6.4045e-01 1.4 0.00e+00 0.0 3.8e+02 1.7e+05 1.6e+01 1 0 13 4 7 1 0 13 4 7 0
MatGetRowIJ 3 1.0 1.5974e-0516.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 1.1506e-02 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 37 1.9 3.7605e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.9e+01 0 0 0 0 8 0 0 0 0 8 0
KSPSetUp 19 1.0 4.1611e-02 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 19 1.0 7.8608e+01 1.0 4.73e+08 1.8 1.7e+03 9.9e+05 4.9e+01 78100 58 96 21 78100 58 96 21 221
VecDot 18 1.0 3.0766e-01 4.8 1.82e+07 1.3 0.0e+00 0.0e+00 1.8e+01 0 4 0 0 8 0 4 0 0 8 2213
VecDotNorm2 9 1.0 2.5026e-0110.8 1.82e+07 1.3 0.0e+00 0.0e+00 9.0e+00 0 4 0 0 4 0 4 0 0 4 2721
VecNorm 18 1.0 6.8599e-01 8.3 1.82e+07 1.3 0.0e+00 0.0e+00 1.8e+01 0 4 0 0 8 0 4 0 0 8 993
VecCopy 18 1.0 5.5553e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 66 1.0 9.9062e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPBYCZ 18 1.0 1.5036e-01 1.8 3.63e+07 1.3 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 9056
VecWAXPY 18 1.0 1.2700e-01 1.5 1.82e+07 1.3 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 5361
VecAssemblyBegin 38 1.0 3.7598e-01 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 49 0 0 0 0 50 0
VecAssemblyEnd 38 1.0 1.7548e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 18 1.0 6.0631e-02 2.9 0.00e+00 0.0 1.7e+03 9.9e+05 0.0e+00 0 0 58 96 0 0 0 58 96 0 0
VecScatterEnd 18 1.0 3.8051e-01 6.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCSetUp 19 1.0 3.2629e+01 1.0 8.62e+07 2.0 0.0e+00 0.0e+00 4.0e+00 32 18 0 0 2 32 18 0 0 2 94
PCSetUpOnBlocks 9 1.0 1.0313e+00 2.1 8.62e+07 2.0 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 1 18 0 0 0 2990
PCApply 27 1.0 7.1712e-01 1.5 1.61e+08 1.9 0.0e+00 0.0e+00 0.0e+00 1 34 0 0 0 1 34 0 0 0 8197
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 7 7 182147036 0
Krylov Solver 3 3 3464 0
Vector 20 20 41709448 0
Vector Scatter 2 2 2176 0
Index Set 7 7 4705612 0
Preconditioner 3 3 3208 0
Viewer 10 9 6840 0
========================================================================================================================
Average time to get PetscTime(): 1.90735e-07
Average time for MPI_Barrier(): 1.35899e-05
Average time for zero size MPI_Send(): 6.83467e-06
#PETSc Option Table entries:
-log_summary
-momentum_ksp_view
-poisson_ksp_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.2_shared_rel --known-mpi-shared=1 --with-shared-libraries --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Sun Oct 18 17:34:07 2015 on hpc12
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.2
Using PETSc arch: petsc-3.6.2_shared_rel
-----------------------------------------
Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -fPIC -wd1572 -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/include -I/home/wtay/Codes/petsc-3.6.2/include -I/home/wtay/Codes/petsc-3.6.2/include -I/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/include -I/home/wtay/Lib/petsc-3.6.2_shared_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------
Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/lib -L/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.2_shared_rel/lib -L/home/wtay/Lib/petsc-3.6.2_shared_rel/lib -lHYPRE -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -lmpi -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl
-----------------------------------------
-------------- next part --------------
0.000000000000000E+000 0.353000000000000 0.000000000000000E+000
90.0000000000000 0.000000000000000E+000 0.000000000000000E+000
1.00000000000000 0.400000000000000 0 -400000
AB,AA,BB -2.47900002275128 2.50750002410496
3.46600006963126 3.40250006661518
size_x,size_y,size_z 158 266 150
body_cg_ini 0.523700833348298 0.778648765134454
7.03282656467989
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
0.000000000000000E+000 10000000000.0000 1.800000000000000E-002
maximum ngh_surfaces and ngh_vertics are 45 22
minimum ngh_surfaces and ngh_vertics are 28 10
body_cg_ini 0.896813342835977 -0.976707581163755
7.03282656467989
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
0.000000000000000E+000 10000000000.0000 1.800000000000000E-002
maximum ngh_surfaces and ngh_vertics are 45 22
minimum ngh_surfaces and ngh_vertics are 28 10
min IIB_cell_no 0
max IIB_cell_no 429
final initial IIB_cell_no 2145
min I_cell_no 0
max I_cell_no 460
final initial I_cell_no 2300
size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
2145 2300 2145 2300
IIB_I_cell_no_uvw_total1 3090 3094 3078 3080
3074 3073
IIB_I_cell_no_uvw_total2 3102 3108 3089 3077
3060 3086
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
1 0.00150000 0.26454057 0.26151125 1.18591342 -0.76697866E+03 -0.32601415E+02 0.62972429E+07
KSP Object:(momentum_) 48 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 48 MPI processes
type: bjacobi
block Jacobi: number of blocks = 48
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=18912600, cols=18912600
total: nonzeros=1.30008e+08, allocated nonzeros=2.64776e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 48 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 48 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 48 MPI processes
type: mpiaij
rows=6304200, cols=6304200
total: nonzeros=4.39181e+07, allocated nonzeros=8.82588e+07
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
2 0.00150000 0.32176840 0.32263677 1.26788535 -0.60296986E+03 0.32645061E+02 0.62967216E+07
body 1
implicit forces and moment 1
-4.26282587609784 3.25239287178069 4.24467550120379
2.64197101323915 -4.33946240378535 6.02325229135247
body 2
implicit forces and moment 2
-2.73310784825621 -4.58132758579707 4.09752056091207
-0.631346424947326 -4.97864096805248 -6.05243915873125
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./a.out on a petsc-3.6.2_shared_rel named n12-04 with 48 processors, by wtay Mon Nov 2 02:08:45 2015
Using Petsc Release Version 3.6.2, Oct, 02, 2015
Max Max/Min Avg Total
Time (sec): 5.583e+01 1.00052 5.581e+01
Objects: 4.400e+01 1.00000 4.400e+01
Flops: 5.257e+07 1.75932 4.025e+07 1.932e+09
Flops/sec: 9.420e+05 1.75989 7.211e+05 3.461e+07
MPI Messages: 5.300e+01 7.57143 1.371e+01 6.580e+02
MPI Message Lengths: 5.310e+06 2.00000 3.793e+05 2.496e+08
MPI Reductions: 6.300e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 5.5813e+01 100.0% 1.9319e+09 100.0% 6.580e+02 100.0% 3.793e+05 100.0% 6.200e+01 98.4%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 2 1.0 1.1111e-01 1.5 1.30e+07 1.9 1.9e+02 9.9e+05 0.0e+00 0 25 29 75 0 0 25 29 75 0 4340
MatSolve 3 1.0 6.9118e-02 1.3 1.79e+07 1.9 0.0e+00 0.0e+00 0.0e+00 0 34 0 0 0 0 34 0 0 0 9450
MatLUFactorNum 1 1.0 1.0166e-01 1.9 9.58e+06 2.0 0.0e+00 0.0e+00 0.0e+00 0 18 0 0 0 0 18 0 0 0 3370
MatILUFactorSym 1 1.0 7.7649e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatConvert 1 1.0 5.6372e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 2 1.0 3.5564e-01 290.6 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 6 0 0 0 0 6 0
MatAssemblyEnd 2 1.0 2.9979e-01 1.0 0.00e+00 0.0 3.8e+02 1.7e+05 1.6e+01 1 0 57 25 25 1 0 57 25 26 0
MatGetRowIJ 3 1.0 2.2888e-05 24.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 1.3555e-02 4.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 5 1.7 3.3672e-03 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 5 0 0 0 0 5 0
KSPSetUp 3 1.0 4.7309e-02 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 3 1.0 4.2701e+01 1.0 5.26e+07 1.8 1.9e+02 9.9e+05 9.0e+00 77100 29 75 14 77100 29 75 15 45
VecDot 2 1.0 2.6857e-02 11.2 2.02e+06 1.3 0.0e+00 0.0e+00 2.0e+00 0 4 0 0 3 0 4 0 0 3 2817
VecDotNorm2 1 1.0 2.4464e-02 15.0 2.02e+06 1.3 0.0e+00 0.0e+00 1.0e+00 0 4 0 0 2 0 4 0 0 2 3092
VecNorm 2 1.0 1.0654e-01 18.5 2.02e+06 1.3 0.0e+00 0.0e+00 2.0e+00 0 4 0 0 3 0 4 0 0 3 710
VecCopy 2 1.0 4.1361e-03 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 10 1.0 2.0313e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPBYCZ 2 1.0 1.5906e-02 1.8 4.03e+06 1.3 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 9512
VecWAXPY 2 1.0 1.3226e-02 1.3 2.02e+06 1.3 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 5720
VecAssemblyBegin 6 1.0 2.8764e-02 9.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 29 0 0 0 0 29 0
VecAssemblyEnd 6 1.0 4.0054e-05 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 2 1.0 6.8028e-03 2.7 0.00e+00 0.0 1.9e+02 9.9e+05 0.0e+00 0 0 29 75 0 0 0 29 75 0 0
VecScatterEnd 2 1.0 3.6922e-02 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCSetUp 3 1.0 3.2228e+01 1.0 9.58e+06 2.0 0.0e+00 0.0e+00 4.0e+00 58 18 0 0 6 58 18 0 0 6 11
PCSetUpOnBlocks 1 1.0 1.8228e-01 1.6 9.58e+06 2.0 0.0e+00 0.0e+00 0.0e+00 0 18 0 0 0 0 18 0 0 0 1880
PCApply 3 1.0 7.3298e-02 1.3 1.79e+07 1.9 0.0e+00 0.0e+00 0.0e+00 0 34 0 0 0 0 34 0 0 0 8911
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 7 7 182147036 0
Krylov Solver 3 3 3464 0
Vector 20 20 41709448 0
Vector Scatter 2 2 2176 0
Index Set 7 7 4705612 0
Preconditioner 3 3 3208 0
Viewer 2 1 760 0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 9.77516e-06
Average time for zero size MPI_Send(): 6.91414e-06
#PETSc Option Table entries:
-log_summary
-momentum_ksp_view
-poisson_ksp_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.2_shared_rel --known-mpi-shared=1 --with-shared-libraries --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Sun Oct 18 17:34:07 2015 on hpc12
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.2
Using PETSc arch: petsc-3.6.2_shared_rel
-----------------------------------------
Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -fPIC -wd1572 -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/include -I/home/wtay/Codes/petsc-3.6.2/include -I/home/wtay/Codes/petsc-3.6.2/include -I/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/include -I/home/wtay/Lib/petsc-3.6.2_shared_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------
Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/lib -L/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.2_shared_rel/lib -L/home/wtay/Lib/petsc-3.6.2_shared_rel/lib -lHYPRE -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -lmpi -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl
-----------------------------------------
-------------- next part --------------
0.000000000000000E+000 0.353000000000000 0.000000000000000E+000
90.0000000000000 0.000000000000000E+000 0.000000000000000E+000
1.00000000000000 0.400000000000000 0 -400000
AB,AA,BB -3.41000006697141 3.44100006844383
3.46600006963126 3.40250006661518
size_x,size_y,size_z 158 266 301
body_cg_ini 0.523700833348298 0.778648765134454
7.03282656467989
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
0.000000000000000E+000 10000000000.0000 1.800000000000000E-002
maximum ngh_surfaces and ngh_vertics are 41 20
minimum ngh_surfaces and ngh_vertics are 28 10
body_cg_ini 0.896813342835977 -0.976707581163755
7.03282656467989
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
0.000000000000000E+000 10000000000.0000 1.800000000000000E-002
maximum ngh_surfaces and ngh_vertics are 41 20
minimum ngh_surfaces and ngh_vertics are 28 10
min IIB_cell_no 0
max IIB_cell_no 415
final initial IIB_cell_no 2075
min I_cell_no 0
max I_cell_no 468
final initial I_cell_no 2340
size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
2075 2340 2075 2340
IIB_I_cell_no_uvw_total1 7635 7644 7643 8279
8271 8297
IIB_I_cell_no_uvw_total2 7647 7646 7643 8271
8274 8266
KSP Object:(poisson_) 96 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 96 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 96 MPI processes
type: mpiaij
rows=12650428, cols=12650428
total: nonzeros=8.82137e+07, allocated nonzeros=1.77106e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
1 0.00150000 0.35826998 0.36414728 1.27156134 -0.24352631E+04 -0.99308685E+02 0.12633660E+08
KSP Object:(momentum_) 96 MPI processes
type: bcgs
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(momentum_) 96 MPI processes
type: bjacobi
block Jacobi: number of blocks = 96
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (momentum_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (momentum_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
package used to perform factorization: petsc
total: nonzeros=3.24201e+06, allocated nonzeros=3.24201e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=504336, cols=504336
total: nonzeros=3.24201e+06, allocated nonzeros=3.53035e+06
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 96 MPI processes
type: mpiaij
rows=37951284, cols=37951284
total: nonzeros=2.61758e+08, allocated nonzeros=5.31318e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
KSP Object:(poisson_) 96 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object:(poisson_) 96 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
HYPRE BoomerAMG: Cycle type V
HYPRE BoomerAMG: Maximum number of levels 25
HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
HYPRE BoomerAMG: Threshold for strong coupling 0.25
HYPRE BoomerAMG: Interpolation truncation factor 0
HYPRE BoomerAMG: Interpolation: max elements per row 0
HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
HYPRE BoomerAMG: Maximum row sums 0.9
HYPRE BoomerAMG: Sweeps down 1
HYPRE BoomerAMG: Sweeps up 1
HYPRE BoomerAMG: Sweeps on coarse 1
HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
HYPRE BoomerAMG: Relax weight (all) 1
HYPRE BoomerAMG: Outer relax weight (all) 1
HYPRE BoomerAMG: Using CF-relaxation
HYPRE BoomerAMG: Measure type local
HYPRE BoomerAMG: Coarsen type Falgout
HYPRE BoomerAMG: Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 96 MPI processes
type: mpiaij
rows=12650428, cols=12650428
total: nonzeros=8.82137e+07, allocated nonzeros=1.77106e+08
total number of mallocs used during MatSetValues calls =0
not using I-node (on process 0) routines
2 0.00150000 0.49306841 0.48961181 1.45614703 -0.20361132E+04 0.77916035E+02 0.12632159E+08
body 1
implicit forces and moment 1
-4.19326176380900 3.26285229643405 4.91657786206150
3.33023211607813 -4.66288821809535 5.95105697339790
body 2
implicit forces and moment 2
-2.71360610740664 -4.53650746988691 4.76048497342022
-1.13560954211517 -5.55259427154780 -5.98958778241742
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./a.out on a petsc-3.6.2_shared_rel named n12-05 with 96 processors, by wtay Sun Nov 1 14:26:28 2015
Using Petsc Release Version 3.6.2, Oct, 02, 2015
Max Max/Min Avg Total
Time (sec): 5.132e+02 1.00006 5.132e+02
Objects: 4.400e+01 1.00000 4.400e+01
Flops: 5.257e+07 1.75932 4.049e+07 3.887e+09
Flops/sec: 1.024e+05 1.75933 7.890e+04 7.574e+06
MPI Messages: 1.010e+02 14.42857 1.385e+01 1.330e+03
MPI Message Lengths: 5.310e+06 2.00000 3.793e+05 5.045e+08
MPI Reductions: 6.300e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 5.1317e+02 100.0% 3.8869e+09 100.0% 1.330e+03 100.0% 3.793e+05 100.0% 6.200e+01 98.4%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 2 1.0 1.1735e-01 1.6 1.30e+07 1.9 3.8e+02 9.9e+05 0.0e+00 0 25 29 75 0 0 25 29 75 0 8276
MatSolve 3 1.0 7.3929e-02 1.7 1.79e+07 1.9 0.0e+00 0.0e+00 0.0e+00 0 34 0 0 0 0 34 0 0 0 17787
MatLUFactorNum 1 1.0 1.0028e-01 1.9 9.58e+06 2.0 0.0e+00 0.0e+00 0.0e+00 0 18 0 0 0 0 18 0 0 0 6881
MatILUFactorSym 1 1.0 8.3889e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatConvert 1 1.0 6.0471e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 2 1.0 4.2017e-01 32.8 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 6 0 0 0 0 6 0
MatAssemblyEnd 2 1.0 3.2434e-01 1.0 0.00e+00 0.0 7.6e+02 1.7e+05 1.6e+01 0 0 57 25 25 0 0 57 25 26 0
MatGetRowIJ 3 1.0 1.3113e-05 13.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 1.6168e-02 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 5 1.7 1.0472e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 5 0 0 0 0 5 0
KSPSetUp 3 1.0 5.0210e-02 7.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 3 1.0 4.9085e+02 1.0 5.26e+07 1.8 3.8e+02 9.9e+05 9.0e+00 96100 29 75 14 96100 29 75 15 8
VecDot 2 1.0 3.2738e-02 10.5 2.02e+06 1.3 0.0e+00 0.0e+00 2.0e+00 0 4 0 0 3 0 4 0 0 3 4637
VecDotNorm2 1 1.0 3.0590e-02 15.2 2.02e+06 1.3 0.0e+00 0.0e+00 1.0e+00 0 4 0 0 2 0 4 0 0 2 4963
VecNorm 2 1.0 1.2529e-01 20.6 2.02e+06 1.3 0.0e+00 0.0e+00 2.0e+00 0 4 0 0 3 0 4 0 0 3 1212
VecCopy 2 1.0 5.5149e-03 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 10 1.0 2.0353e-02 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPBYCZ 2 1.0 1.8868e-02 2.1 4.03e+06 1.3 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 16091
VecWAXPY 2 1.0 1.4920e-02 1.6 2.02e+06 1.3 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 10174
VecAssemblyBegin 6 1.0 2.2800e-01 18.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 29 0 0 0 0 29 0
VecAssemblyEnd 6 1.0 5.3167e-05 6.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 2 1.0 1.0023e-02 4.1 0.00e+00 0.0 3.8e+02 9.9e+05 0.0e+00 0 0 29 75 0 0 0 29 75 0 0
VecScatterEnd 2 1.0 4.0989e-02 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCSetUp 3 1.0 4.3599e+02 1.0 9.58e+06 2.0 0.0e+00 0.0e+00 4.0e+00 85 18 0 0 6 85 18 0 0 6 2
PCSetUpOnBlocks 1 1.0 1.9143e-01 1.9 9.58e+06 2.0 0.0e+00 0.0e+00 0.0e+00 0 18 0 0 0 0 18 0 0 0 3605
PCApply 3 1.0 7.6561e-02 1.6 1.79e+07 1.9 0.0e+00 0.0e+00 0.0e+00 0 34 0 0 0 0 34 0 0 0 17175
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 7 7 182147036 0
Krylov Solver 3 3 3464 0
Vector 20 20 41709448 0
Vector Scatter 2 2 2176 0
Index Set 7 7 4705612 0
Preconditioner 3 3 3208 0
Viewer 2 1 760 0
========================================================================================================================
Average time to get PetscTime(): 1.90735e-07
Average time for MPI_Barrier(): 0.0004426
Average time for zero size MPI_Send(): 1.45609e-05
#PETSc Option Table entries:
-log_summary
-momentum_ksp_view
-poisson_ksp_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.2_shared_rel --known-mpi-shared=1 --with-shared-libraries --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Sun Oct 18 17:34:07 2015 on hpc12
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.2
Using PETSc arch: petsc-3.6.2_shared_rel
-----------------------------------------
Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -fPIC -wd1572 -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/include -I/home/wtay/Codes/petsc-3.6.2/include -I/home/wtay/Codes/petsc-3.6.2/include -I/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/include -I/home/wtay/Lib/petsc-3.6.2_shared_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------
Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/lib -L/home/wtay/Codes/petsc-3.6.2/petsc-3.6.2_shared_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.2_shared_rel/lib -L/home/wtay/Lib/petsc-3.6.2_shared_rel/lib -lHYPRE -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -lmpi -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -L/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl
-----------------------------------------
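Note for readers of the -poisson_ksp_view output above: what follows is only a minimal sketch, in C, of how a KSP carrying the "poisson_" options prefix is typically created, so that prefixed runtime options such as -poisson_ksp_view (listed in the option tables above) attach to that particular solver. It is not the actual code behind ./a.out, which is a Fortran program not included here; the toy diagonal matrix merely stands in for the real Poisson operator, and the richardson/hypre BoomerAMG configuration shown in the views could have been set either in code or through such prefixed options.

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat         A;
  Vec         b, x;
  KSP         ksp;
  PetscInt    i, n = 100, Istart, Iend;
  PetscScalar two = 2.0, one = 1.0;

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Stand-in SPD matrix; a real driver would assemble the Poisson operator here */
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetFromOptions(A);
  MatSetUp(A);
  MatGetOwnershipRange(A, &Istart, &Iend);
  for (i = Istart; i < Iend; i++) MatSetValues(A, 1, &i, 1, &i, &two, INSERT_VALUES);
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  MatCreateVecs(A, &x, &b);
  VecSet(b, one);

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOptionsPrefix(ksp, "poisson_");   /* options are now read as -poisson_... */
  KSPSetOperators(ksp, A, A);             /* same matrix for operator and preconditioner */
  KSPSetFromOptions(ksp);                 /* picks up e.g. -poisson_ksp_type richardson,
                                             -poisson_pc_type hypre,
                                             -poisson_pc_hypre_type boomeramg */
  KSPSolve(ksp, b, x);

  KSPDestroy(&ksp);
  VecDestroy(&x);
  VecDestroy(&b);
  MatDestroy(&A);
  PetscFinalize();
  return 0;
}

Giving each solver its own prefix in this way is what allows the momentum_ and poisson_ systems in the logs above to be viewed and configured independently from the command line.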